GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing

2021 ◽  
Author(s):  
Jordi Valls-Margarit ◽  
Iván Galván-Femenía ◽  
Daniel Matias ◽  
Natalia Blay ◽  
Montserrat Puiggròs ◽  
...  

The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization and phasing of structural variants (SVs). By integrating multiple variant identification methods and logistic regression models, we present a catalogue of 35,431,441 variants, including 89,178 SVs (≥50 bp), 30,325,064 SNVs and 5,017,199 indels, across 785 Illumina high-coverage (30X) whole genomes from the Iberian GCAT Cohort, with a median of 3.52M SNVs, 606,336 indels and 6,393 SVs per individual. The haplotype panel is able to impute up to 14,360,728 SNVs/indels and 23,179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.

2020 ◽  
Vol 6 (22) ◽  
pp. eaaz7835 ◽  
Author(s):  
Sungwon Jeon ◽  
Youngjune Bhak ◽  
Yeonsong Choi ◽  
Yeonsu Jeon ◽  
Seunghoon Kim ◽  
...  

We present the initial phase of the Korean Genome Project (Korea1K), including 1094 whole genomes (sequenced at an average depth of 31×), along with data of 79 quantitative clinical traits. We identified 39 million single-nucleotide variants and indels of which half were singleton or doubleton and detected Korean-specific patterns based on several types of genomic variations. A genome-wide association study illustrated the power of whole-genome sequences for analyzing clinical traits, identifying nine more significant candidate alleles than previously reported from the same linkage disequilibrium blocks. Also, Korea1K, as a reference, showed better imputation accuracy for Koreans than the 1KGP panel. As proof of utility, germline variants in cancer samples could be filtered out more effectively when the Korea1K variome was used as a panel of normals compared to non-Korean variome sets. Overall, this study shows that Korea1K can be a useful genotypic and phenotypic resource for clinical and ethnogenetic studies.


Author(s):  
Alicia R. Martin ◽  
Elizabeth G. Atkinson ◽  
Sinéad B. Chapman ◽  
Anne Stevenson ◽  
Rocky E. Stroud ◽  
...  

Background: Genetic studies of biomedical phenotypes in underrepresented populations identify disproportionate numbers of novel associations. However, current genomics infrastructure (including most genotyping arrays and sequenced reference panels) best serves populations of European descent. A critical step for facilitating genetic studies in underrepresented populations is to ensure that genetic technologies accurately capture variation in all populations. Here, we quantify the accuracy of low-coverage sequencing in diverse African populations.
Results: We sequenced the whole genomes of 91 individuals to high coverage (≥20X) from the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study, in which participants were recruited from Ethiopia, Kenya, South Africa, and Uganda. We empirically tested two data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole-genome sequencing data. We show that low-coverage sequencing at a depth of ≥4X captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1X) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation, with 4X sequencing detecting 45% of singletons and 95% of common variants identified in high-coverage African whole genomes.
Conclusion: These results indicate that low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, including those that capture variation most common in Europeans and Africans. Low-coverage sequencing effectively identifies novel variation (particularly in underrepresented populations), and presents opportunities to enhance variant discovery at a similar cost to traditional approaches.


2020 ◽  
Author(s):  
Saikat Chakraborty ◽  
Analabha Basu ◽  

The invention of agriculture (IOA) by anatomically modern humans (AMH) around 10,000 years before present (ybp) is known to have led to an increase in AMH's carrying capacity and hence its population size. Reconstruction of historical demography using high-coverage (~30X) whole-genome sequences (WGS) from >700 individuals from different South Asian (SAS) and Southeast Asian (SEA) populations reveals that although several present-day populous groups did indeed experience a positive Neolithic Demographic Transition (NDT), most hunter-gatherers (HGs) experienced a demographic decrease. Differential fertility between HGs and non-HGs and exposure of HGs to novel pathogens from non-HGs could have resulted in such contrasting patterns. However, we think the most parsimonious explanation of the drastic decrease in population size of HGs is their displacement/enslavement by non-HGs.
Significance Statement: The invention of agriculture, around 10,000 years ago, facilitated more food production, which could feed larger populations. This had far-reaching socio-political and demographic impacts, including a ~10,000-fold increase in global population size in the last 10,000 years. However, this increase in population size was not universal, and present-day hunter-gatherer populations, in contrast, have dwindled in size, often drastically. The signatures of these changes in population size are discernible from the genomes of present-day individuals. Using genomic data, we show that for the majority of Asian hunter-gatherers, population sizes drastically decreased following the invention of agriculture. We argue that a combination of displacement, enslavement and disease resulted in the decimation of hunter-gatherer societies.


2019 ◽  
Author(s):  
Qingbo Wang ◽  
Emma Pierce-Hoffman ◽  
Beryl B. Cummings ◽  
Konrad J. Karczewski ◽  
Jessica Alföldi ◽  
...  

Multi-nucleotide variants (MNVs), defined as two or more nearby variants existing on the same haplotype in an individual, are a clinically and biologically important class of genetic variation. However, existing tools for variant interpretation typically do not accurately classify MNVs, and understanding of their mutational origins remains limited. Here, we systematically survey MNVs in 125,748 whole exomes and 15,708 whole genomes from the Genome Aggregation Database (gnomAD). We identify 1,996,125 MNVs across the genome with constituent variants falling within 2 bp distance of one another, of which 31,510 exist within the same codon, including 405 predicted to result in gain of a nonsense mutation, 1,818 predicted to rescue a nonsense mutation event that would otherwise be caused by one of the constituent variants, and 16,481 additional variants predicted to alter protein sequences. We show that the distribution of MNVs is highly non-uniform across the genome, and that this non-uniformity can be largely explained by a variety of known mutational mechanisms, such as CpG deamination, replication error by polymerase zeta, or polymerase slippage at repeat junctions. We also provide an estimate of the dinucleotide mutation rate caused by polymerase zeta. Finally, we show that differential CpG methylation drives MNV differences across functional categories. Our results demonstrate the importance of incorporating haplotype-aware annotation for accurate functional interpretation of genetic variation, and refine our understanding of genome-wide mutational mechanisms of MNVs.


2020 ◽  
Vol 117 (5) ◽  
pp. 2560-2569 ◽  
Author(s):  
Michael D. Kessler ◽  
Douglas P. Loesch ◽  
James A. Perry ◽  
Nancy L. Heard-Costa ◽  
Daniel Taliun ◽  
...  

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h2), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.


2017 ◽  
Author(s):  
Mateus Jose Abdalla Diniz ◽  
Andiara Calado Saloma Rodrigues ◽  
Ary Gadelha ◽  
Shaza Issam Alsabban ◽  
Camila Guindalini ◽  
...  

Both common and rare genetic variation play a role in the causes of mood disorders. Very large families pose unique opportunities and analytical challenges but may provide a way to identify regions and mutations associated with mood disorders. We identified a family with a high prevalence (~30%) of mood disorders in a rural village in Brazil, featuring decreasing age of onset over generations. The pattern of inheritance was complex, with 32 Bipolar type I cases, 11 Bipolar type II cases and 59 recurrent and/or severe Depression cases in addition to other phenotypes. We enrolled 333 participants with DNA samples from a broader pedigree of 960 subjects for genotyping using the Affymetrix 10K array. Non-parametric linkage was carried out via MERLIN and parametric linkage with both MERLIN and MCLINKAGE. We exome sequenced a subset of the family (n=27) in order to identify rare variation within the linkage regions shared by affected family members. We identified four genome-wide significant and four suggestive linkage regions on chromosomes 1, 2, 3, 11 and 12 for different phenotype definitions. However, no region received strong joint support in both the parametric and non-parametric analyses. Exome sequencing revealed potentially deleterious variants in 11p15.4 for MDD and 1q21.1-1q21.3 and 12p23.1-p22.3, implicated in cell signaling, adhesion, translation and neurogenesis processes. Overall, our results suggest promising, but not definitive or confirmed, evidence that rare genetic variation contributes to the high prevalence of mood disorders in this multi-generational family. We note that a substantial role for common genetic variation is likely given the strength of the linkage signals observed.
The World Health Organisation reports depression and bipolar disorder as the second and seventh most important causes of years lost due to disability worldwide [1]. The heritability of bipolar disorder is between 60-90%, with a lower but still substantial heritability for major depression (40-45%) [2]; [3]. First-degree relatives of bipolar disorder probands have a 5-10-fold increase in risk of developing the illness compared to relatives of controls but also show a threefold increase in unipolar depression, indicating that bipolar disorder does not "breed true" [4]. Large collaborative genome-wide association studies (GWAS) have uncovered several common genetic variants of small effect [5]. Genome-wide estimates of heritability suggest that up to 60% of the genetic risk is contributed by common variants [6]. Overall, the current picture for bipolar disorder (and almost all complex traits) is a genetic architecture formed of both common and rare variants.
Linkage studies have been pursued on the basis that there may be variants of greater effect shared between and within affected families. However, these studies have usually focused on collections of comparatively small families or sib pairs, and few consistent findings have emerged [7]. Large multigenerational families (e.g. of >30 affected individuals) theoretically offer a powerful means for mapping complex disease loci that are individually rare but common in a single family. These loci may be more highly penetrant and of larger effect than loci found with GWAS [8]. Here we report the results of the Brazilian Bipolar Family (BBF) study on a five-generation family of 639 members, of which 333 were enrolled in the current analyses. Our objectives were to perform a linkage analysis with genome coverage and try to identify new genes/mutations related to bipolar and other mood disorders in the family. Here we report our findings and preliminary results of sequencing of linkage regions.


2021 ◽  
Vol 11 (1) ◽  
pp. 33
Author(s):  
Nayoung Han ◽  
Jung Mi Oh ◽  
In-Wha Kim

For predicting phenotypes and executing precision medicine, combined analysis of single nucleotide variant (SNV) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean genome and epidemiology study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.


Molecules ◽  
2021 ◽  
Vol 26 (9) ◽  
pp. 2431
Author(s):  
Natalia A. Shnayder ◽  
Marina M. Petrova ◽  
Tatiana E. Popova ◽  
Tatiana K. Davidova ◽  
Olga P. Bobrova ◽  
...  

Chronic pain syndromes are an important medical problem generated by various molecular, genetic, and pathophysiologic mechanisms. Back pain, neuropathic pain, and posttraumatic pain are the most important pathological processes associated with chronic pain in adults. Standard approaches to their treatment do not solve the problem of pain chronicity. This is the reason for the search for new personalized strategies for the prevention and treatment of chronic pain. The nitric oxide (NO) system can play one of the key roles in the development of peripheral pain and its chronicity. The purpose of the study is to review publications devoted to changes in the NO system in patients with chronic peripheral pain syndromes. We have carried out a search for articles published in the e-Library, PubMed, Oxford Press, Clinical Case, Springer, Elsevier, and Google Scholar databases. The search was carried out using keywords and their combinations. The role of NO and NO synthase (NOS) isoforms in peripheral pain development and chronicity was demonstrated primarily from animal models to humans. The most studied is the neuronal NOS (nNOS). The role of inducible NOS (iNOS) and endothelial NOS (eNOS) is still under investigation. Associative genetic studies have shown that single nucleotide variants (SNVs) of the NOS1, NOS2, and NOS3 genes encoding nNOS, iNOS, and eNOS may be associated with acute and chronic peripheral pain. Prospects for the use of NOS inhibitors to modulate the effect of drugs used to treat peripheral pain syndrome are discussed. Associative genetic studies of SNVs in the NOS1, NOS2, and NOS3 genes are important for understanding genetic predictors of peripheral pain chronicity and the development of new personalized pharmacotherapy strategies.


Pathogens ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 363
Author(s):  
Sulochana K. Wasala ◽  
Dana K. Howe ◽  
Louise-Marie Dandurand ◽  
Inga A. Zasada ◽  
Dee R. Denver

Globodera pallida is among the most significant plant-parasitic nematodes worldwide, causing major damage to potato production. Since it was discovered in Idaho in 2006, eradication efforts have aimed to contain and eradicate G. pallida through phytosanitary action and soil fumigation. In this study, we investigated genome-wide patterns of G. pallida genetic variation across Idaho fields to evaluate whether the infestation resulted from a single or multiple introduction(s) and to investigate potential evolutionary responses since the time of infestation. A total of 53 G. pallida samples (~1,042,000 individuals) were collected and analyzed, representing five different fields in Idaho, a greenhouse population, and a field in Scotland that was used for external comparison. According to genome-wide allele frequency and fixation index (Fst) analyses, most of the genetic variation was shared among the G. pallida populations in Idaho fields pre-fumigation, indicating that the infestation likely resulted from a single introduction. Temporal patterns of genome-wide polymorphisms involving (1) pre-fumigation field samples collected in 2007 and 2014 and (2) pre- and post-fumigation samples revealed nucleotide variants (SNPs, single-nucleotide polymorphisms) with significantly differentiated allele frequencies indicating genetic differentiation. This study provides insights into the genetic origins and adaptive potential of G. pallida invading new environments.

