On the Threshold from Genome-Wide Association Studies to Whole-Genome Sequencing. Looking for Signal in All the Right Places

2014 ◽  
Vol 189 (4) ◽  
pp. 381-383 ◽  
Author(s):  
Nadia N. Hansel ◽  
Rasika A. Mathias
2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Gabriel Costa Monteiro Moreira ◽  
Clarissa Boschiero ◽  
Aline Silva Mello Cesar ◽  
James M. Reecy ◽  
Thaís Fernanda Godoy ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Zhuliang Yang ◽  
Leqin Zou ◽  
Tiantian Sun ◽  
Wenwen Xu ◽  
Linghu Zeng ◽  
...  

Comb traits have potential economic value in the breeding of indigenous chickens in China. Identifying and understanding relevant molecular markers for comb traits can be beneficial for genetic improvement. The purpose of this study was to utilize genome-wide association studies (GWAS) to detect promising loci and candidate genes related to comb traits, namely, comb thickness (CT), comb weight (CW), comb height, comb length (CL), and comb area. Genome-wide single-nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELs) in 300 Nandan-Yao chickens were detected using whole-genome sequencing. In total, we identified 134 SNPs and 25 INDELs that were strongly associated with the five comb traits. A remarkable region spanning from 29.6 to 31.4 Mb on chromosome 6 was found to be significantly associated with comb traits in both SNP- and INDEL-based GWAS. In this region, two lead SNPs (6:30,354,876 for CW and CT and 6:30,264,318 for CL) and one lead INDEL (a deletion from 30,376,404 to 30,376,405 bp for CL and CT) were identified. Additionally, two genes were identified as potential candidates for comb development. The nearby gene fibroblast growth factor receptor 2 (FGFR2)—associated with epithelial cell migration and proliferation—and the gene cytochrome b5 reductase 2 (CYB5R2)—identified on chromosome 5 from INDEL-based GWAS—are significantly correlated with collagen maturation. The findings of this study could provide promising genes and biomarkers to accelerate genetic improvement of comb development based on molecular marker-assisted breeding in Nandan-Yao chickens.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Julia Höglund ◽  
Nima Rafati ◽  
Mathias Rask-Andersen ◽  
Stefan Enroth ◽  
Torgny Karlsson ◽  
...  

Abstract Genome-wide association studies (GWAS) have identified associations between thousands of common genetic variants and human traits. However, common variants usually explain a limited fraction of the heritability of a trait. A powerful resource for identifying trait-associated variants is whole genome sequencing (WGS) data in cohorts comprised of families or individuals from a limited geographical area. To evaluate the power of WGS compared to imputations, we performed GWAS on WGS data for 72 inflammatory biomarkers, in a kinship-structured cohort. When using WGS data, we identified 18 novel associations that were not detected when analyzing the same biomarkers with genotyped or imputed SNPs. Five of the novel top variants were low frequency variants with a minor allele frequency (MAF) of <5%. Our results suggest that, even when applying a GWAS approach, we gain power and precision using WGS data, presumably due to more accurate determination of genotypes. The lack of a comparable dataset for replication of our results is a limitation in our study. However, this further highlights that there is a need for more genetic epidemiological studies based on WGS data.
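
To make the MAF threshold in the abstract above concrete, here is a minimal sketch that computes minor allele frequency for one biallelic variant from diploid genotype dosages (0, 1, or 2 copies of the alternate allele) and flags it as low frequency. The function names and the 5% default are illustrative assumptions, not part of the study's pipeline.

```python
from typing import Sequence


def minor_allele_frequency(dosages: Sequence[int]) -> float:
    """Return the MAF of one biallelic variant in diploid samples.

    Each dosage is the count of alternate alleles carried by a sample
    (0, 1, or 2). Hypothetical helper for illustration only.
    """
    if not dosages:
        raise ValueError("no genotypes supplied")
    if any(d not in (0, 1, 2) for d in dosages):
        raise ValueError("dosages must be 0, 1, or 2")
    # Alternate-allele frequency: alternate copies over total chromosomes.
    alt_freq = sum(dosages) / (2 * len(dosages))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def is_low_frequency(dosages: Sequence[int], threshold: float = 0.05) -> bool:
    """True when the variant's MAF is below the threshold (assumed 5%)."""
    return minor_allele_frequency(dosages) < threshold
```

For example, one heterozygous carrier among 50 samples gives a MAF of 1/100, which falls below the 5% cutoff.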


2021 ◽  
Author(s):  
Marsha M. Wheeler ◽  
Adrienne M Stilp ◽  
Shuquan Rao ◽  
Bjarni V Halldorsson ◽  
Doruk V Beyter ◽  
...  

Genome-wide association studies (GWAS) have identified thousands of single nucleotide variants and small indels that contribute to the genetic architecture of hematologic traits. While structural variants (SVs) are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of SVs to quantitative blood cell trait variation is unknown. Here we utilized SVs detected from whole genome sequencing (WGS) in ancestrally diverse participants of the NHLBI TOPMed program (N=50,675). Using single variant tests, we assessed the association of common and rare SVs with red cell-, white cell-, and platelet-related quantitative traits. The results show 33 independent SVs (23 common and 10 rare) reaching genome-wide significance. The majority of significant association signals (N=27) replicated in independent datasets from deCODE genetics and the UK Biobank. Moreover, most trait-associated SVs (N=24) are within 1 Mb of previously reported GWAS loci. SV analyses additionally discovered an association between a complex structural variant on 17p11.2 and white blood cell-related phenotypes. Based on functional annotation, the majority of significant SVs are located in non-coding regions (N=26) and predicted to impact regulatory elements and/or local chromatin domain boundaries in blood cells. We predict that several trait-associated SVs represent the causal variant. This is supported by genome-editing experiments which provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.
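
The proximity check described above (whether a variant lies within 1 Mb of a previously reported locus) can be sketched as follows. This is only an illustration: the function name, the dictionary layout, and the choice to treat the 1 Mb boundary as inclusive are assumptions, and real SVs span intervals rather than single positions.

```python
from bisect import bisect_left
from typing import Dict, List

WINDOW_BP = 1_000_000  # 1 Mb window, as in the abstract


def near_known_locus(
    chrom: str,
    pos: int,
    known_loci: Dict[str, List[int]],
    window: int = WINDOW_BP,
) -> bool:
    """Return True if `pos` is within `window` bp of any known locus.

    `known_loci` maps a chromosome name to a SORTED list of locus
    positions (hypothetical structure for this sketch).
    """
    positions = known_loci.get(chrom, [])
    # First known locus at or to the right of the window's left edge.
    i = bisect_left(positions, pos - window)
    # It counts as a hit only if it is also inside the right edge.
    return i < len(positions) and positions[i] <= pos + window
```

Keeping each chromosome's positions sorted lets the lookup use binary search, which matters when many variants are checked against a large catalog.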


2020 ◽  
Vol 29 (3) ◽  
pp. 515-526 ◽  
Author(s):  
Tianzhong Yang ◽  
Chong Wu ◽  
Peng Wei ◽  
Wei Pan

Abstract Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and transcriptomic data to showcase their improved statistical power of identifying gene–trait associations while, importantly, offering further biological insights. TWAS have thus far focused on common variants as available from GWAS. Compared with common variants, the findings for or even applications to low-frequency variants are limited and their underlying role in regulating gene expression is less clear. To fill this gap, we extend TWAS to integrating whole genome sequencing data with transcriptomic data for low-frequency variants. Using the data from the Framingham Heart Study, we demonstrate that low-frequency variants play an important and universal role in predicting gene expression, which is not completely due to linkage disequilibrium with the nearby common variants. By including low-frequency variants, in addition to common variants, we increase the predictivity of gene expression for 79% of the examined genes. Incorporating this piece of functional genomic information, we perform association testing for five lipid traits in two UK10K whole genome sequencing cohorts, hypothesizing that cis-expression quantitative trait loci, including low-frequency variants, are more likely to be trait-associated. We discover that two genes, LDLR and TTC22, are genome-wide significantly associated with low-density lipoprotein cholesterol based on 3203 subjects and that the association signals are largely independent of common variants. We further demonstrate that a joint analysis of both common and low-frequency variants identifies association signals that would be missed by testing on either common variants or low-frequency variants alone.


2018 ◽  
Vol 4 (2) ◽  
pp. e224 ◽  
Author(s):  
Patrick May ◽  
Sabrina Pichler ◽  
Daniela Hartl ◽  
Dheeraj R. Bobbili ◽  
Manuel Mayhaus ◽  
...  

Objective: The aim of this study was to identify variants associated with familial late-onset Alzheimer disease (AD) using whole-genome sequencing. Methods: Several families with an autosomal dominant inheritance pattern of AD were analyzed by whole-genome sequencing. Variants were filtered to prioritize rare, likely pathogenic variants in genes already known to be associated with AD, and candidates were confirmed by Sanger sequencing using standard protocols. Results: We identified 2 rare ABCA7 variants (rs143718918 and rs538591288) with varying penetrance in 2 independent German AD families, respectively. The single nucleotide variant (SNV) rs143718918 causes a missense mutation, and the deletion rs538591288 causes a frameshift mutation of ABCA7. Both variants have previously been reported in larger cohorts but with incomplete segregation information. ABCA7 is one of more than 20 AD risk loci that have so far been identified by genome-wide association studies, and both common and rare variants of ABCA7 have previously been described in different populations with higher frequencies in AD cases than in controls and varying penetrance. Furthermore, ABCA7 is known to be involved in several AD-relevant pathways. Conclusions: We conclude that both variants might contribute to the development of AD in the examined family members. Together with previous findings, our data confirm ABCA7 as one of the most relevant AD risk genes.


2019 ◽  
Author(s):  
Ruifei Yang ◽  
Xiaoli Guo ◽  
Di Zhu ◽  
Cheng Bian ◽  
Yiqiang Zhao ◽  
...  

Abstract High-density markers discovered in large samples are essential for mapping complex traits at gene-level resolution in agricultural livestock and crops. However, the unavailability of large reference panels and array designs for a target population of agricultural species limits the improvement of array-based genotype imputation. Recent studies showed that very low coverage sequencing (LCS) of a large number of individuals is a cost-effective approach to discover variation in much greater detail in association studies. Here, we performed cohort-wide whole-genome sequencing at an average depth of 0.73× and identified more than 11.3 M SNPs. We also evaluated the data set and performed genome-wide association analysis (GWAS) in 2885 Duroc boars. We compared two different pipelines, selected a suitable method (BaseVar/STITCH) for LCS analyses, and determined that sequencing 1000 individuals at 0.2× depth is enough to identify SNPs with high accuracy in this population. Among the seven association signals derived from the genome-wide association analysis of the LCS variants, which were associated with four economic traits, we found two QTLs with narrow intervals that were possibly responsible for the teat number and back fat thickness traits, and we identified 7 missense variants in a single sequencing step. This strategy (BaseVar/STITCH) is generally applicable to any population and any species that lacks a suitable reference panel. These findings show that the LCS strategy is a suitable approach for constructing new genetic resources to facilitate genome-wide association studies, fine mapping of QTLs, and genomic selection, and imply that it can be widely used for agricultural animal breeding in the future.


2020 ◽  
Vol 27 (9) ◽  
pp. 1425-1430
Author(s):  
Inès Krissaane ◽  
Carlos De Niz ◽  
Alba Gutiérrez-Sacristán ◽  
Gabor Korodi ◽  
Nneka Ede ◽  
...  

Abstract Objective: Advancements in human genomics have generated a surge of available data, fueling the growth and accessibility of databases for more comprehensive, in-depth genetic studies. Methods: We provide a straightforward and innovative methodology to optimize cloud configuration in order to conduct genome-wide association studies. We utilized Spark clusters on both Google Cloud Platform and Amazon Web Services, as well as Hail (http://doi.org/10.5281/zenodo.2646680), for analysis and exploration of genomic variant datasets. Results: Comparative evaluation of numerous cloud-based cluster configurations demonstrates a successful and unprecedented compromise between speed and cost for performing genome-wide association studies on 4 distinct whole-genome sequencing datasets. Results are consistent across the 2 cloud providers and could be highly useful for accelerating research in genetics. Conclusions: We present a timely piece for one of the most frequently asked questions when moving to the cloud: what is the trade-off between speed and cost?
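
One way to reason about the speed-versus-cost question raised above is to keep only the cluster configurations that no other configuration beats on both runtime and cost (a Pareto front). The sketch below is not the authors' methodology; the class, its fields, and the pricing model (workers × hourly price × runtime) are hypothetical simplifications that ignore storage, egress, and startup overhead.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ClusterRun:
    """One benchmarked configuration (all fields are hypothetical)."""
    name: str
    workers: int
    hourly_price_per_worker: float  # assumed flat price, e.g. in USD
    runtime_hours: float

    @property
    def cost(self) -> float:
        # Simplified billing: every worker is billed for the full runtime.
        return self.workers * self.hourly_price_per_worker * self.runtime_hours


def pareto_front(runs: List[ClusterRun]) -> List[ClusterRun]:
    """Return runs not dominated on (runtime, cost), fastest first."""
    front = []
    for r in runs:
        dominated = any(
            o.runtime_hours <= r.runtime_hours
            and o.cost <= r.cost
            and (o.runtime_hours < r.runtime_hours or o.cost < r.cost)
            for o in runs
        )
        if not dominated:
            front.append(r)
    return sorted(front, key=lambda r: r.runtime_hours)
```

Every configuration left on the front is a defensible choice; which one to pick depends on how much a saved hour is worth relative to the extra spend.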

