Whole genome sequencing identifies common and rare structural variants contributing to hematologic traits in the NHLBI TOPMed program

Genome-wide association studies (GWAS) have identified thousands of single nucleotide variants and small indels that contribute to the genetic architecture of hematologic traits. While structural variants (SVs) are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of SVs to quantitative blood cell trait variation is unknown. Here we utilized SVs detected from whole genome sequencing (WGS) in ancestrally diverse participants of the NHLBI TOPMed program (N=50,675). Using single variant tests, we assessed the association of common and rare SVs with red cell-, white cell-, and platelet-related quantitative traits. The results show 33 independent SVs (23 common and 10 rare) reaching genome-wide significance. The majority of significant association signals (N=27) replicated in independent datasets from deCODE genetics and the UK Biobank. Moreover, most trait-associated SVs (N=24) are within 1 Mb of previously reported GWAS loci. SV analyses additionally discovered an association between a complex structural variant on 17p11.2 and white blood cell-related phenotypes. Based on functional annotation, the majority of significant SVs are located in non-coding regions (N=26) and predicted to impact regulatory elements and/or local chromatin domain boundaries in blood cells. We predict that several trait-associated SVs represent the causal variant. This is supported by genome-editing experiments which provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.


Integration of genome wide association studies and whole genome sequencing provides novel insights into fat deposition in chicken

Scientific Reports, DOI 10.1038/s41598-018-34364-0, 2018, Vol 8 (1), Cited By ~ 8

Author(s): Gabriel Costa Monteiro Moreira, Clarissa Boschiero, Aline Silva Mello Cesar, James M. Reecy, Thaís Fernanda Godoy, ...

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Association Studies, Fat Deposition, Genome Wide Association, Genome Wide Association Studies, Whole Genome, Genome Wide


Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data

Genome Biology, DOI 10.1186/s13059-017-1216-0, 2017, Vol 18 (1), Cited By ~ 46

Author(s): Yang Wu, Zhili Zheng, Peter M. Visscher, Jian Yang

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Association Studies, Genome Wide Association, Whole Genome Sequencing Data, Genome Wide Association Studies, Whole Genome, Sequencing Data, Genome Wide


On the Threshold from Genome-Wide Association Studies to Whole-Genome Sequencing. Looking for Signal in All the Right Places

American Journal of Respiratory and Critical Care Medicine, DOI 10.1164/rccm.201401-0048ed, 2014, Vol 189 (4), pp. 381-383, Cited By ~ 1

Author(s): Nadia N. Hansel, Rasika A. Mathias

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Association Studies, Genome Wide Association, Genome Wide Association Studies, Whole Genome, Genome Wide, The Right


Genome-Wide Association Study Using Whole-Genome Sequencing Identifies a Genomic Region on Chromosome 6 Associated With Comb Traits in Nandan-Yao Chicken

Frontiers in Genetics, DOI 10.3389/fgene.2021.682501, 2021, Vol 12

Author(s): Zhuliang Yang, Leqin Zou, Tiantian Sun, Wenwen Xu, Linghu Zeng, ...

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Genetic Improvement, Genome Wide Association Study, Association Studies, Genome Wide Association, Genome Wide Association Studies, Chromosome 6, Whole Genome, Genome Wide

Comb traits have potential economic value in the breeding of indigenous chickens in China. Identifying and understanding relevant molecular markers for comb traits can be beneficial for genetic improvement. The purpose of this study was to utilize genome-wide association studies (GWAS) to detect promising loci and candidate genes related to comb traits, namely, comb thickness (CT), comb weight (CW), comb height, comb length (CL), and comb area. Genome-wide single-nucleotide polymorphisms (SNPs) and small insertions/deletions (INDELs) in 300 Nandan-Yao chickens were detected using whole-genome sequencing. In total, we identified 134 SNPs and 25 INDELs that were strongly associated with the five comb traits. A remarkable region spanning from 29.6 to 31.4 Mb on chromosome 6 was found to be significantly associated with comb traits in both SNP- and INDEL-based GWAS. In this region, two lead SNPs (6:30,354,876 for CW and CT and 6:30,264,318 for CL) and one lead INDEL (a deletion from 30,376,404 to 30,376,405 bp for CL and CT) were identified. Additionally, two genes were identified as potential candidates for comb development. The nearby gene fibroblast growth factor receptor 2 (FGFR2)—associated with epithelial cell migration and proliferation—and the gene cytochrome b5 reductase 2 (CYB5R2)—identified on chromosome 5 from INDEL-based GWAS—are significantly correlated with collagen maturation. The findings of this study could provide promising genes and biomarkers to accelerate genetic improvement of comb development based on molecular marker-assisted breeding in Nandan-Yao chickens.


Improved power and precision with whole genome sequencing data in genome-wide association studies of inflammatory biomarkers

Scientific Reports, DOI 10.1038/s41598-019-53111-7, 2019, Vol 9 (1), Cited By ~ 4

Author(s): Julia Höglund, Nima Rafati, Mathias Rask-Andersen, Stefan Enroth, Torgny Karlsson, ...

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Association Studies, Accurate Determination, Genome Wide Association, Inflammatory Biomarkers, Genome Wide Association Studies, Whole Genome, Genome Wide, Common Genetic Variants

Genome-wide association studies (GWAS) have identified associations between thousands of common genetic variants and human traits. However, common variants usually explain a limited fraction of the heritability of a trait. A powerful resource for identifying trait-associated variants is whole genome sequencing (WGS) data in cohorts comprised of families or individuals from a limited geographical area. To evaluate the power of WGS compared to imputations, we performed GWAS on WGS data for 72 inflammatory biomarkers, in a kinship-structured cohort. When using WGS data, we identified 18 novel associations that were not detected when analyzing the same biomarkers with genotyped or imputed SNPs. Five of the novel top variants were low frequency variants with a minor allele frequency (MAF) of <5%. Our results suggest that, even when applying a GWAS approach, we gain power and precision using WGS data, presumably due to more accurate determination of genotypes. The lack of a comparable dataset for replication of our results is a limitation in our study. However, this further highlights that there is a need for more genetic epidemiological studies based on WGS data.
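As an illustration of the minor allele frequency (MAF) cutoff mentioned in this abstract, the following is a minimal sketch of how a MAF could be computed for one biallelic variant and compared against a 5% threshold. It is not the authors' pipeline: the function names `minor_allele_frequency` and `is_low_frequency` are hypothetical, and the genotype coding (0, 1, or 2 alternate alleles per diploid individual, no missing calls) is an assumption made for the example.

```python
from typing import Sequence


def minor_allele_frequency(genotypes: Sequence[int]) -> float:
    """Return the MAF for one biallelic variant.

    Assumption: each genotype is the number of alternate alleles
    (0, 1, or 2) carried by one diploid individual, with no missing calls.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded as 0, 1, or 2")
    # Each diploid individual contributes two alleles.
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def is_low_frequency(genotypes: Sequence[int], threshold: float = 0.05) -> bool:
    """Flag a variant whose MAF is below the threshold (5% by default)."""
    return minor_allele_frequency(genotypes) < threshold


if __name__ == "__main__":
    # Hypothetical cohort of 20 individuals with one heterozygous carrier.
    cohort = [0] * 19 + [1]
    print(minor_allele_frequency(cohort))
    print(is_low_frequency(cohort))
```

Real analyses typically rely on established tools for this step and must also handle missing genotypes and relatedness, which this sketch ignores.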


Integrating DNA sequencing and transcriptomic data for association analyses of low-frequency variants and lipid traits

Human Molecular Genetics, DOI 10.1093/hmg/ddz314, 2020, Vol 29 (3), pp. 515-526, Cited By ~ 4

Author(s): Tianzhong Yang, Chong Wu, Peng Wei, Wei Pan

Keyword(s): Gene Expression, Whole Genome Sequencing, Genome Sequencing, Association Studies, Low Frequency, Genome Wide Association Studies, Whole Genome, Common Variants, Transcriptomic Data, Genome Wide

Transcriptome-wide association studies (TWAS) integrate genome-wide association studies (GWAS) and transcriptomic data to showcase their improved statistical power of identifying gene–trait associations while, importantly, offering further biological insights. TWAS have thus far focused on common variants as available from GWAS. Compared with common variants, the findings for or even applications to low-frequency variants are limited and their underlying role in regulating gene expression is less clear. To fill this gap, we extend TWAS to integrating whole genome sequencing data with transcriptomic data for low-frequency variants. Using the data from the Framingham Heart Study, we demonstrate that low-frequency variants play an important and universal role in predicting gene expression, which is not completely due to linkage disequilibrium with the nearby common variants. By including low-frequency variants, in addition to common variants, we increase the predictivity of gene expression for 79% of the examined genes. Incorporating this piece of functional genomic information, we perform association testing for five lipid traits in two UK10K whole genome sequencing cohorts, hypothesizing that cis-expression quantitative trait loci, including low-frequency variants, are more likely to be trait-associated. We discover that two genes, LDLR and TTC22, are genome-wide significantly associated with low-density lipoprotein cholesterol based on 3203 subjects and that the association signals are largely independent of common variants. We further demonstrate that a joint analysis of both common and low-frequency variants identifies association signals that would be missed by testing on either common variants or low-frequency variants alone.


Rare ABCA7 variants in 2 German families with Alzheimer disease

Neurology Genetics, DOI 10.1212/nxg.0000000000000224, 2018, Vol 4 (2), pp. e224, Cited By ~ 4

Author(s): Patrick May, Sabrina Pichler, Daniela Hartl, Dheeraj R. Bobbili, Manuel Mayhaus, ...

Keyword(s): Alzheimer Disease, Whole Genome Sequencing, Genome Sequencing, Rare Variants, Late Onset, Association Studies, Genome Wide Association Studies, Whole Genome, Pathogenic Variants, Genome Wide

Objective: The aim of this study was to identify variants associated with familial late-onset Alzheimer disease (AD) using whole-genome sequencing. Methods: Several families with an autosomal dominant inheritance pattern of AD were analyzed by whole-genome sequencing. Variants were prioritized for rare, likely pathogenic variants in genes already known to be associated with AD and confirmed by Sanger sequencing using standard protocols. Results: We identified 2 rare ABCA7 variants (rs143718918 and rs538591288) with varying penetrance in 2 independent German AD families, respectively. The single nucleotide variant (SNV) rs143718918 causes a missense mutation, and the deletion rs538591288 causes a frameshift mutation of ABCA7. Both variants have previously been reported in larger cohorts but with incomplete segregation information. ABCA7 is one of more than 20 AD risk loci that have so far been identified by genome-wide association studies, and both common and rare variants of ABCA7 have previously been described in different populations with higher frequencies in AD cases than in controls and varying penetrance. Furthermore, ABCA7 is known to be involved in several AD-relevant pathways. Conclusions: We conclude that both SNVs might contribute to the development of AD in the examined family members. Together with previous findings, our data confirm ABCA7 as one of the most relevant AD risk genes.


Modelling Human Regulatory Variation in Mouse: Finding the Function in Genome-Wide Association Studies and Whole-Genome Sequencing

PLoS Genetics, DOI 10.1371/journal.pgen.1002544, 2012, Vol 8 (3), pp. e1002544, Cited By ~ 11

Author(s): Jean-François Schmouth, Russell J. Bonaguro, Ximena Corso-Diaz, Elizabeth M. Simpson

Keyword(s): Whole Genome Sequencing, Genome Sequencing, Association Studies, Genome Wide Association, Genome Wide Association Studies, Whole Genome, Regulatory Variation, Genome Wide


Linked-read whole-genome sequencing resolves common and private structural variants in multiple myeloma

DOI 10.1101/2021.12.09.471893, 2021

Author(s): Lucía Peña Pérez, Nicolai Frengen, Julia Hauenstein, Charlotte Gran, Charlotte Gustafsson, ...

Keyword(s): Multiple Myeloma, Whole Genome Sequencing, Genome Sequencing, Cohort Analysis, Genomic Medicine, Molecular Classification, Regulatory Elements, Copy Number Variations, Whole Genome, Structural Variants

Multiple myeloma (MM) is an incurable and aggressive plasma cell malignancy characterized by a complex karyotype with multiple structural variants (SVs) and copy number variations (CNVs). Linked-read whole-genome sequencing (lrWGS) allows for refined detection and reconstruction of SVs by providing long-range genetic information from standard short-read sequencing. This makes lrWGS an attractive solution for capturing the full genomic complexity of MM. Here we show that high-quality lrWGS data can be generated from low numbers of FACS sorted cells without DNA purification. Using this protocol, we analyzed FACS sorted MM cells from 37 MM patients with lrWGS. We found high concordance between lrWGS and FISH for the detection of recurrent translocations and CNVs. Outside of the regions investigated by FISH, we identified >150 additional SVs and CNVs across the cohort. Analysis of the lrWGS data allowed for resolving the structure of diverse SVs affecting the MYC and t(11;14) loci causing the duplication of genes and gene regulatory elements. In addition, we identified private SVs causing the dysregulation of genes recurrently involved in translocations with the IGH locus and show that these can alter the molecular classification of the MM. Overall, we conclude that lrWGS allows for the detection of aberrations critical for MM prognostics and provides a feasible route for providing comprehensive genetics. Implementing lrWGS could provide more accurate clinical prognostics, facilitate genomic medicine initiatives, and greatly improve the stratification of patients included in clinical trials.


Whole-genome sequencing reveals new Alzheimer’s disease-associated rare variants in loci related to synaptic function and neuronal development

DOI 10.1101/2020.11.03.20225540, 2020

Author(s): Dmitry Prokopenko, Sarah L. Morgan, Kristina Mullin, Oliver Hofmann, Brad Chapman, ...

Keyword(s): Alzheimer's Disease, Whole Genome Sequencing, Genome Sequencing, Rare Variants, Spatial Clustering, Synaptic Function, Genome Wide Association Studies, Whole Genome, Genome Wide

Introduction: Genome-wide association studies have led to numerous genetic loci associated with Alzheimer’s disease (AD). Whole-genome sequencing (WGS) now permits genome-wide analyses to identify rare variants contributing to AD risk. Methods: We performed single-variant and spatial clustering-based testing on rare variants (minor allele frequency ≤1%) in a family-based WGS-based association study of 2,247 subjects from 605 multiplex AD families, followed by replication in 1,669 unrelated individuals. Results: We identified 13 new AD candidate loci that yielded consistent rare-variant signals in discovery and replication cohorts (4 from single-variant, 9 from spatial-clustering), implicating these genes: FNBP1L, SEL1L, LINC00298, PRKCH, C15ORF41, C2CD3, KIF2A, APC, LHX9, NALCN, CTNNA2, SYTL3, CLSTN2. Discussion: Downstream analyses of these novel loci highlight synaptic function, in contrast to common AD-associated variants, which implicate innate immunity. These loci have not been previously associated with AD, emphasizing the ability of WGS to identify AD-associated rare variants, particularly outside of coding regions.
