Population Genomic Diversity of France
- Conditions
- Genetics, Population
- Interventions
- Genetic: Collection of a salivary sample
- Registration Number
- NCT04183023
- Lead Sponsor
- Institut National de la Santé et de la Recherche Médicale, France
- Brief Summary
Access to DNA sequences from individuals who share common ancestry with patients is valuable when analysing individual genomes for diagnosis. Allele frequency distributions from the geographic areas where patients have ancestry are needed to help select the variants most likely to be involved in disease, which should then be tested in functional assays.
To provide such a general population panel for France, the POPGEN project will select 10,000 individuals from the CONSTANCES cohort with ancestry in different regions of France outside the western part. These participants will have their DNA extracted from saliva kits and genotyped using a SNP chip. Of these 10,000 individuals, 4,000 will have their whole genome sequenced. This study is one of the four pilot projects of the France Genomic Medicine plan (FMG 2025). The FMG 2025 plan aims to introduce genome sequencing into routine clinical practice to accelerate and improve diagnoses. The POPGEN project will provide the reference data from the general French population needed to help filter out common variants from the genomes of patients.
- Detailed Description
The development of high-throughput sequencing technologies has opened up new possibilities to sequence the human genome and identify all genetic variants in an individual genome. Each genome contains more than four million differences from the reference sequence, i.e. variants, and the real challenge is to find "the needles in stacks of needles". A majority of these variants are neutral, with no impact on individual phenotype, but some of them could impair individual health and lead to disease. Guidelines have been proposed for implicating sequence variants in disease and limiting false positive reports. These guidelines emphasise the need to compare the distributions of variants between patients and large control datasets matched as closely as possible to the patients in terms of ancestry.
Several large reference datasets, such as those from the 1000 Genomes Project Consortium, the Exome Sequencing Project (ESP) and the Exome Aggregation Consortium (ExAC), are publicly available. They report the variants found in the exomes or whole genomes of individuals with ancestries in different populations, along with their respective frequencies. Europeans are well represented in these databases. However, there is no information on the geographic region of Europe from which individuals originate, except for the Finns, who are considered separately from the rest of Europe. Previous studies using SNP chips have nevertheless shown that allele frequencies differ between continental European populations and that these differences could lead to false positive results in association studies. These allele frequency differences at common variants are also detectable between regions within a country, as found by several studies of different European populations, including France. Describing these fine-scale geographic stratification patterns is very important to allow efficient matching of cases and controls in association studies. This is especially true when the interest is in rare variants, since most of these are young variants that have recently appeared in local populations and have not spread over large geographic regions.
Access to DNA sequences from individuals who share common ancestry with patients is also valuable when analysing individual genomes for diagnosis. Allele frequency distributions from the geographic areas where patients have ancestry are therefore needed to help select the variants most likely to be involved in disease, which should then be tested in functional assays.
A first project focusing on the western part of France is ongoing at the Institut du Thorax in Nantes. Preliminary results have shown that using this panel of individuals improves variant filtering in the genomes of patients with similar ancestries. Efforts should therefore be made to cover other geographic regions of France.
To provide such a general population panel for France, the POPGEN project will select 10,000 individuals from the CONSTANCES cohort with ancestry in different regions of France outside the western part. These participants will have their DNA extracted from saliva kits and genotyped using a SNP chip. Of these 10,000 individuals, 4,000 will have their whole genome sequenced. This study is one of the four pilot projects of the France Genomic Medicine plan (FMG 2025). The FMG 2025 plan aims to introduce genome sequencing into routine clinical practice to accelerate and improve diagnoses. The POPGEN project will provide the reference data from the general French population needed to help filter out common variants from the genomes of patients.
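The filtering step described above can be illustrated with a minimal Python sketch. It is not the POPGEN pipeline: the `Variant` type, the `filter_common_variants` helper, the 1% frequency threshold, and the example frequencies are all hypothetical and chosen only for illustration. The idea is that a patient variant is kept as a candidate only if it is rare or absent in a reference panel matched for ancestry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A variant identified by position and alleles (hypothetical type)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def filter_common_variants(patient_variants, panel_frequencies, max_af=0.01):
    """Keep patient variants whose panel allele frequency is at most max_af.

    Variants absent from the panel are treated as having frequency 0.0
    and are therefore kept. The 1% default threshold is an assumption.
    """
    return [
        v for v in patient_variants
        if panel_frequencies.get(v, 0.0) <= max_af
    ]


# Toy data: three patient variants, two of which appear in the panel.
patient = [
    Variant("1", 1000, "A", "G"),   # common in the panel
    Variant("2", 2000, "C", "T"),   # rare in the panel
    Variant("3", 3000, "G", "A"),   # not observed in the panel
]
panel = {patient[0]: 0.25, patient[1]: 0.001}

candidates = filter_common_variants(patient, panel)
```

With these toy inputs, the common variant on chromosome 1 is removed and the other two remain as candidates. A panel drawn from the same region as the patient matters here because a variant that is rare in a broad European dataset may be common locally, and would otherwise survive the filter.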
Recruitment & Eligibility
- Status
- ACTIVE_NOT_RECRUITING
- Sex
- All
- Target Recruitment
- 10250
- Inclusion Criteria
- participants included in the CONSTANCES cohort who have agreed to transmit their data for research purposes,
- participants meeting the geographic criteria of the study,
- participants who have given their consent to participate in this study.
- Exclusion Criteria
- participants who have not returned their informed consent form, or whose informed consent form is non-compliant,
- participants who have not provided free written informed consent, such as individuals placed under tutorship or guardianship.
Study & Design
- Study Type
- OBSERVATIONAL
- Study Design
- Not specified
- Arm & Interventions
Group: Single arm
Intervention: Collection of a salivary sample
Description: There is a single arm in which all participants will be included. The intervention consists of the collection of a salivary sample. Volunteers who agree to participate will collect their saliva with the self-collection device. They will then return their saliva sample and a dated and signed copy of the informed consent form in the pre-paid return envelope. DNA will be extracted from all saliva samples by an automated process; the DNA samples will then be genotyped, and a subset of 4,000 DNA samples will be sequenced. Genotyping will measure general genetic variation, including single nucleotide polymorphisms (SNPs). SNP genotyping will be carried out using Illumina high-density chips on the CNRGH production platform. Sequencing will target a mean coverage of 30X for each sample, with a minimum mean coverage of 25X. Finally, a bioinformatics analysis will be performed on the sequencing data.
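As a rough illustration of the coverage figures above, mean coverage can be estimated as the number of reads multiplied by the read length and divided by the genome size. The Python sketch below applies that estimate and then compares a sample against the 30X target and 25X minimum. The genome size (3.1 Gb), read length (150 bp), function names, and the three-way classification are assumptions made for this example; the record does not describe how coverage is actually assessed.

```python
def mean_coverage(n_reads, read_length, genome_size):
    """Estimate mean coverage as total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def classify_sample(coverage, target=30.0, minimum=25.0):
    """Compare a sample's mean coverage with the stated thresholds.

    The labels returned here are hypothetical; only the 30X and 25X
    values come from the study description.
    """
    if coverage >= target:
        return "meets target"
    if coverage >= minimum:
        return "acceptable"
    return "below minimum"


# Assumed values: 3.1 Gb genome, 150 bp reads.
GENOME_SIZE = 3_100_000_000
READ_LENGTH = 150

cov = mean_coverage(620_000_000, READ_LENGTH, GENOME_SIZE)
status = classify_sample(cov)
```

Under these assumptions, about 620 million 150 bp reads correspond to a mean coverage of 30X. The estimate ignores unmapped reads, duplicates, and uneven coverage across the genome, so real pipelines generally compute coverage from aligned data instead.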
- Primary Outcome Measures
Name: Genotypes and allele frequencies of the participants
Time: Through study completion, an average of 1 year
Method: 4,000 individuals will be selected based on the places of birth of their grandparents in order to ensure homogeneous coverage of the different geographic regions. The DNA of these 4,000 individuals will be sequenced. Genotypes and allele frequencies at the genomic positions where variation is observed in the POPGEN population will be computed by simple counting of their occurrences in the dataset.
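The "simple counting" mentioned above could look like the following Python sketch. It assumes biallelic sites in diploid individuals, with each genotype coded as the number of alternate alleles (0, 1 or 2) and `None` for a missing call. This coding and the function names are assumptions for illustration, not details taken from the study.

```python
from collections import Counter


def genotype_counts(genotypes):
    """Count how often each called genotype (0, 1 or 2) occurs at a site."""
    return Counter(g for g in genotypes if g is not None)


def alt_allele_frequency(genotypes):
    """Alternate allele frequency by direct counting.

    Each called diploid genotype contributes two alleles; missing calls
    (None) are excluded from both numerator and denominator.
    Returns None when no genotype is called at the site.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return sum(called) / (2 * len(called))


# Toy site: eight individuals, one with a missing call.
site = [0, 1, 2, 0, None, 1, 0, 0]

counts = genotype_counts(site)
freq = alt_allele_frequency(site)
```

In this toy example, seven individuals are called and carry four alternate alleles out of fourteen, so the alternate allele frequency is 4/14. Sex chromosomes and multiallelic sites would need different handling and are left out of this sketch.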
- Secondary Outcome Measures
- Not specified
Trial Locations
- Locations (1)
Inserm - UMR1078 GGB
🇫🇷Brest, France