Current Study Designs, Methods, and Future Directions of Genetic Association Mapping

Author(s):  
Jami Jackson ◽  
Alison Motsinger-Reif

Rapid progress in genotyping technologies, including the scaling up of assay technologies to genome-wide levels and next-generation sequencing, has motivated a burst in methods development and application to detect genotype-phenotype associations in a wide array of diseases and other phenotypes. In this chapter, the authors review the study design and genotyping options that are used in association mapping, along with the appropriate methods to perform mapping within these study designs. The authors discuss both candidate gene and genome-wide studies, focused on DNA-level variation. Quality control, genotyping technologies, and single-SNP and multiple-SNP analyses have facilitated the successes in identifying numerous loci that influence disease risk. However, the variants identified have generally explained only a small fraction of the heritable component of disease risk. The authors discuss emerging trends and future directions in rare-variant analysis aimed at detecting variants that predict traits with more complex etiologies.

2021 ◽  
Author(s):  
Aleksejs Sazonovs ◽  
Christine R Stevens ◽  
Guhan R Venkataraman ◽  
Kai Yuan ◽  
Brandon Avila ◽  
...  

Genome-wide association studies (GWAS) have identified hundreds of loci associated with Crohn's disease (CD); however, as with all complex diseases, deriving pathogenic mechanisms from these non-coding GWAS discoveries has been challenging. To complement GWAS and better define actionable biological targets, we analysed sequence data from more than 30,000 CD cases and 80,000 population controls. We observe rare coding variants in established CD susceptibility genes as well as ten genes where coding variation directly implicates the gene in disease risk for the first time.


2020 ◽  
Author(s):  
Samuel Hokin ◽  
Alan Cleary ◽  
Joann Mudge

Complex diseases, with many associated genetic and environmental factors, are a challenging target for genomic risk assessment. Genome-wide association studies (GWAS) associate disease status with, and compute risk from, individual common variants, which can be problematic for diseases with many interacting or rare variants. In addition, GWAS typically employ a reference genome which is not built from the subjects of the study, whose genetic background may differ from the reference and whose genetic characterization may be limited. We present a complementary method based on disease association with collections of genotypes, called frequented regions, on a pangenomic graph built from subjects' genomes. We introduce the pangenomic genotype graph, which is better suited than sequence graphs to human disease studies. Our method draws out collections of features, across multiple genomic segments, which are associated with disease status. We show that the frequented regions method consistently improves machine-learning classification of disease status over GWAS classification, allowing incorporation of rare or interacting variants. Notably, genomic segments that have few or no variants of genome-wide significance (p < 5×10⁻⁸) provide much-improved classification with frequented regions, encouraging their application across the entire genome. Frequented regions may also be utilized for purposes such as choice of treatment in addition to prediction of disease risk.


2016 ◽  
Author(s):  
Elizabeth O’Brien ◽  
Richard A. Kerber ◽  
Raymond L. White

Abstract The problem of “missing heritability” in genome-wide analyses of complex diseases is thought to be attributable to some combination of: rare variants of moderate to large effect, common variants of very small effect, and epigenetic, epistatic, or shared environmental effects. Rare variants do not affect large numbers of people by definition, but identified genes and pathways frequently lead to important insights into pathogenesis, and become targets of chemoprevention or therapy. Family studies remain an efficient way to identify rare variants with sizable effects on disease risk. We present a genome-wide study of breast cancer in 22 large high-risk families including 154 women diagnosed with breast cancer. Appropriate marker spacing was achieved by simulation studies of founder haplotypes to reduce the chance that linkage disequilibrium produced spurious linkage peaks. For each family, we generated 100 simulations of null linkage genome-wide to estimate the probability that individual results were due to chance. We identified a total of 12 putative susceptibility regions with per-family genome-wide probability < 0.05. These regions were located on 10 chromosomes; 10 of the 22 families showed linkage at these locations; two or more families showed linkage to 6 regions on 5 chromosomes (4q, 5q, 6p, 14q, 18p, and 18q). These results indicate that there is considerable heterogeneity among families in genomic regions and thus variants predisposing to breast cancer. Moreover, they suggest that uncommon high- or medium-risk genetic variants remain to be found, and that family designs can be an efficient way to identify them.
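The simulation step in the abstract above (100 null replicates per family, used to judge whether a linkage result could be due to chance) can be illustrated with a generic Monte Carlo p-value. This is only a sketch under stated assumptions: the abstract does not say which estimator or linkage statistic the authors used, so the add-one estimator, the function name `empirical_p_value`, and the toy null scores below are all hypothetical.

```python
import random


def empirical_p_value(observed, null_stats):
    """Monte Carlo p-value with the common add-one correction.

    observed   -- statistic from the real data (e.g. a peak linkage score)
    null_stats -- the same statistic computed on each null simulation
    Assumption: larger values of the statistic mean stronger evidence.
    """
    exceed = sum(1 for s in null_stats if s >= observed)
    # Counting the observed data as one more replicate keeps p above zero.
    return (exceed + 1) / (len(null_stats) + 1)


# Toy stand-in for 100 genome-wide null simulations of one family.
rng = random.Random(0)
null_scores = [abs(rng.gauss(0.0, 1.0)) for _ in range(100)]

p = empirical_p_value(3.5, null_scores)
print(f"empirical p = {p:.4f}")
```

With 100 replicates the smallest value this estimator can return is 1/101, so a 0.05 threshold is resolvable but much smaller probabilities are not.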


2019 ◽  
Vol 4 (1) ◽  
Author(s):  
C. L. van Eyk ◽  
M. A. Corbett ◽  
M. S. B. Frank ◽  
D. L. Webber ◽  
M. Newman ◽  
...  

Abstract A growing body of evidence points to a considerable and heterogeneous genetic aetiology of cerebral palsy (CP). To identify recurrently variant CP genes, we designed a custom gene panel of 112 candidate genes. We tested 366 clinically unselected singleton cases with CP, including 271 cases not previously examined using next-generation sequencing technologies. Overall, 5.2% of the naïve cases (14/271) harboured a genetic variant of clinical significance in a known disease gene, with a further 4.8% of individuals (13/271) having a variant in a candidate gene classified as intolerant to variation. In the aggregate cohort of individuals from this study and our previous genomic investigations, six recurrently hit genes contributed at least 4% of disease burden to CP: COL4A1, TUBA1A, AGAP1, L1CAM, MAOB and KIF1A. Significance of Rare VAriants (SORVA) burden analysis identified four genes with a genome-wide significant burden of variants, AGAP1, ERLIN1, ZDHHC9 and PROC, of which we functionally assessed AGAP1 using a zebrafish model. Our investigations reinforce that CP is a heterogeneous neurodevelopmental disorder with known as well as novel genetic determinants.


2020 ◽  
Vol 127 (11) ◽  
pp. 1517-1526
Author(s):  
Yoshiro Morimoto ◽  
Shinji Ono ◽  
Naohiro Kurotaki ◽  
Akira Imamura ◽  
Hiroki Ozawa

Abstract Panic disorder (PD) is a common and debilitating neuropsychiatric disorder characterized by panic attacks coupled with excessive anxiety. Both genetic factors and environmental factors play an important role in PD pathogenesis and response to treatment. However, PD is clinically heterogeneous and genetically complex, and the exact genetic or environmental causes of this disorder remain unclear. Various approaches for detecting disease-causing genes have recently been made available. In particular, genome-wide association studies (GWAS) have attracted attention for the identification of disease-associated loci of multifactorial disorders. This review introduces GWAS of PD, followed by a discussion about the limitations of GWAS and the major challenges facing geneticists in the post-GWAS era. Alternative strategies to address these challenges are then proposed, such as epigenome-wide association studies (EWAS) and rare variant association studies (RVAS) using next-generation sequencing. To date, however, few reports have described these analyses, and the evidence remains insufficient to confidently identify or exclude rare variants or epigenetic changes in PD. Further analyses are therefore required, using sample sizes in the tens of thousands, extensive functional annotations, and highly targeted hypothesis testing.


2020 ◽  
Vol 66 (1) ◽  
pp. 11-23
Author(s):  
Yukihide Momozawa ◽  
Keijiro Mizukami

Abstract Genome-wide association studies have identified >10,000 genetic variants associated with various phenotypes and diseases. Although the majority are common variants, rare variants with a minor allele frequency >0.1% have been investigated by imputation and by using disease-specific custom SNP arrays. Rare variants, revealed mainly by sequencing analysis, have played unique roles in the genetics of complex diseases in humans due to their distinctive features, in contrast to common variants. These unique roles are hypothesis-free evidence for gene causality, a precise target of functional analysis for understanding disease mechanisms, a new favorable target for drug development, and a genetic marker with high disease risk for personalized medicine. As whole-genome sequencing continues to identify more rare variants, the roles associated with rare variants will also increase. However, a better estimation of the functional impact of rare variants across the whole genome is needed to enhance their contribution to improvements in human health.


Author(s):  
Zixuan Zeng ◽  
Thammannoon Hengsadeekul

Environmental issues and social responsibility have a significant impact on the natural ecological system and economic development. Hence, it is important to find a balanced path between them. Previous studies have sought to explore environmental or social responsibility rather than seek solutions from a systematic perspective, and there seems to be a lack of a systematic, quantitative review of systematic solutions or details. To identify the multiple impacts and relationships between environmental issues and social responsibility and illustrate emerging trends and challenges, this article proposes a scientometrics review based on 1,336 articles published from 2001 to 2020, through co-occurrence analysis and co-citation analysis together with cluster and burstiness analysis to reveal the depth and breadth of emerging research. This research demonstrates that the research paradigm of environmental issues and social responsibility extends from a single stakeholder level to a systematic strategic perspective of multiple organizations and stakeholders. The results provide researchers and practitioners with a deeper understanding of future directions and implications. Keywords: Environmental issues; social responsibility; strategy; scientometrics; review


Nature ◽  
2021 ◽  
Vol 590 (7845) ◽  
pp. 290-299
Author(s):  
Daniel Taliun ◽  
Daniel N. Harris ◽  
Michael D. Kessler ◽  
Jedidiah Carlson ◽  
...  

Abstract The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Nutrients ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1984
Author(s):  
Majid Nikpay ◽  
Sepehr Ravati ◽  
Robert Dent ◽  
Ruth McPherson

Here, we performed a genome-wide search for methylation sites that contribute to the risk of obesity. We integrated methylation quantitative trait locus (mQTL) data with BMI GWAS information through a SNP-based multiomics approach to identify genomic regions where mQTLs for a methylation site co-localize with obesity risk SNPs. We then tested whether the identified site contributed to BMI through Mendelian randomization. We identified multiple methylation sites causally contributing to the risk of obesity. We validated these findings through a replication stage. By integrating expression quantitative trait locus (eQTL) data, we noted that lower methylation at cg21178254 site upstream of CCNL1 contributes to obesity by increasing the expression of this gene. Higher methylation at cg02814054 increases the risk of obesity by lowering the expression of MAST3, whereas lower methylation at cg06028605 contributes to obesity by decreasing the expression of SLC5A11. Finally, we noted that rare variants within 2p23.3 impact obesity by making the cg01884057 site more susceptible to methylation, which consequently lowers the expression of POMC, ADCY3 and DNAJC27. In this study, we identify methylation sites associated with the risk of obesity and reveal the mechanism whereby a number of these sites exert their effects. This study provides a framework to perform an omics-wide association study for a phenotype and to understand the mechanism whereby a rare variant causes a disease.
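The Mendelian randomization step described above can be illustrated with the simplest single-instrument estimator, the Wald ratio. This is a sketch only: the abstract does not state which MR estimator the authors applied, and the function name `wald_ratio` and the summary statistics passed to it are hypothetical illustration values, not results from the study.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument Mendelian randomization (Wald ratio).

    beta_exposure -- SNP effect on the exposure (e.g. methylation level)
    beta_outcome  -- SNP effect on the outcome (e.g. BMI)
    se_outcome    -- standard error of beta_outcome
    The standard error uses the first-order approximation, which ignores
    uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical mQTL and GWAS summary statistics for one SNP.
estimate, se, p_value = wald_ratio(0.5, 0.1, 0.02)
print(f"causal estimate = {estimate:.3f}, SE = {se:.3f}, p = {p_value:.2e}")
```

The ratio rescales the SNP's effect on the outcome by its effect on the exposure, which is interpretable as a causal effect only if the SNP influences the outcome solely through that exposure.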

