Dynamic Scan Procedure for Detecting Rare-Variant Association Regions in Whole-Genome Sequencing Studies

2019 ◽  
Vol 104 (5) ◽  
pp. 802-814 ◽  
Author(s):  
Zilin Li ◽  
Xihao Li ◽  
Yaowu Liu ◽  
Jincheng Shen ◽  
Han Chen ◽  
...  

Abstract: Whole genome sequencing (WGS) studies are being widely conducted to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses have limited power for rare variants, so variant-set based analyses are commonly used instead. However, existing variant-set based approaches require genetic regions to be pre-specified for analysis, and hence are not directly applicable to WGS data because of the large number of intergenic and intronic regions, which contain a massive number of non-coding variants. The commonly used sliding window method requires pre-specifying fixed window sizes, which are often unknown a priori, are difficult to specify in practice, and are limiting because the sizes of genetic association regions are likely to vary across the genome and across phenotypes. We propose a computationally efficient and dynamic scan statistic method (Scan the Genome (SCANG)) for analyzing WGS data that flexibly detects the sizes and locations of rare-variant association regions without the need to specify a fixed window size in advance. The proposed method controls the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the sizes of the detected rare-variant association regions to vary across the genome. Through extensive simulation studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative rare-variant association detection methods while controlling the genome-wise type I error rate. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.
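The idea of scanning over windows of varying size with a genome-wise threshold can be illustrated with a toy sketch. The abstract does not describe SCANG's test statistic or how its threshold is computed, so everything here is an assumption: the window score is the absolute correlation between the phenotype and a simple burden count, and the threshold comes from a generic permutation of the phenotype (which leaves the correlation structure among variants intact). The function names `window_score`, `scan_windows`, and `permutation_threshold` are hypothetical and are not part of any SCANG software.

```python
import random


def window_score(genotypes, phenotype, start, end):
    """Absolute correlation between the phenotype and the burden
    (allele count) of each individual over variants [start, end)."""
    n = len(phenotype)
    burden = [sum(row[start:end]) for row in genotypes]
    mean_b = sum(burden) / n
    mean_y = sum(phenotype) / n
    cov = sum((b - mean_b) * (y - mean_y) for b, y in zip(burden, phenotype))
    var_b = sum((b - mean_b) ** 2 for b in burden)
    var_y = sum((y - mean_y) ** 2 for y in phenotype)
    if var_b == 0 or var_y == 0:
        return 0.0  # no variation, so no evidence of association
    return abs(cov) / (var_b * var_y) ** 0.5


def scan_windows(genotypes, phenotype, min_size, max_size):
    """Return (score, start, end) of the best window among all
    contiguous windows with min_size..max_size variants."""
    n_variants = len(genotypes[0])
    best = (0.0, 0, 0)
    for size in range(min_size, max_size + 1):
        for start in range(n_variants - size + 1):
            score = window_score(genotypes, phenotype, start, start + size)
            if score > best[0]:
                best = (score, start, start + size)
    return best


def permutation_threshold(genotypes, phenotype, min_size, max_size,
                          n_perm=100, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the maximum window score
    under the null, obtained by shuffling the phenotype."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(scan_windows(genotypes, shuffled, min_size, max_size)[0])
    maxima.sort()
    return maxima[min(n_perm - 1, int((1 - alpha) * n_perm))]


# Toy data (assumed): 40 individuals, 10 variants, each individual carries
# exactly one variant; the phenotype is driven by variants 3, 4 and 5.
n_variants = 10
genotypes = [[1 if j == i % n_variants else 0 for j in range(n_variants)]
             for i in range(40)]
phenotype = [1.0 if i % n_variants in (3, 4, 5) else 0.0 for i in range(40)]

best_score, best_start, best_end = scan_windows(genotypes, phenotype, 1, 5)
threshold = permutation_threshold(genotypes, phenotype, 1, 5)
print(best_start, best_end, best_score > threshold)
```

In this toy example the scan recovers the window covering variants 3 to 5 without being told its size. Taking the maximum over all windows in each permutation is what makes the threshold apply to the whole scan rather than to a single window.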


2017 ◽  
Author(s):  
Pradeep Natarajan ◽  
Gina M. Peloso ◽  
S. Maryam Zekavat ◽  
May Montasser ◽  
Andrea Ganna ◽  
...  

Deep-coverage whole genome sequencing at the population level is now feasible and offers potential advantages for locus discovery, particularly in the analysis of rare mutations in non-coding regions. Here, we performed whole genome sequencing in 16,324 participants from four ancestries at mean depth >29X and analyzed correlations of genotypes with four quantitative traits: plasma levels of total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. We conducted a discovery analysis including common or rare variants in coding as well as non-coding regions and developed a framework to interpret genome sequence for dyslipidemia risk. Common variant association yielded loci previously described, with the exception of a few variants not captured earlier by arrays or imputation. In coding sequence, rare variant association yielded known Mendelian dyslipidemia genes and, in non-coding sequence, we detected no rare variant association signals after application of four approaches to aggregate variants in non-coding regions. We developed a new, genome-wide polygenic score for LDL-C and observed that a high polygenic score conferred similar effect size to a monogenic mutation (~30 mg/dl higher LDL-C for each); however, among those with extremely high LDL-C, a high polygenic score was considerably more prevalent than a monogenic mutation (23% versus 2% of participants, respectively).
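A polygenic score is conventionally a weighted sum of allele dosages across variants. The abstract does not say how the LDL-C score was constructed or what cutoff defines a "high" score, so the following is only a minimal sketch under assumptions: the dosages, the weights, and the top-fraction cutoff are made up for illustration, and the function names `polygenic_score` and `high_score_flags` are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def high_score_flags(scores, top_fraction=0.05):
    """Flag individuals whose score is in the top fraction of the sample.
    The cutoff rule is an assumption, not taken from the abstract."""
    k = max(1, int(len(scores) * top_fraction))
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


# Hypothetical per-variant weights and three individuals' dosages.
weights = [0.5, -1.0, 2.0]
cohort = [[0, 1, 2], [1, 0, 0], [2, 2, 0]]
scores = [polygenic_score(d, weights) for d in cohort]
print(scores, high_score_flags(scores, top_fraction=0.34))
```

Defining "high" relative to the sample distribution, as here, makes the flag depend on the cohort; a fixed score cutoff would be the alternative design.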


2018 ◽  
Author(s):  
Zagaa Odgerel ◽  
Nora Hernandez ◽  
Jemin Park ◽  
Ruth Ottman ◽  
Elan D. Louis ◽  
...  

Abstract: Essential tremor (ET) is one of the most common movement disorders. The etiology of ET remains largely unexplained. Whole genome sequencing (WGS) is likely to be of value in understanding a large proportion of ET with Mendelian and complex disease inheritance patterns. In ET families with Mendelian inheritance patterns, WGS may lead to gene identification where WES analysis failed to identify the causative variant due to incomplete coverage of the entire coding region of the genome. Alternatively, in ET families with complex disease inheritance patterns with gene x gene and gene x environment interactions, enrichment of functional rare coding and non-coding variants may explain the heritability of ET. We performed WGS in eight ET families (n=40 individuals) enrolled in the Family Study of Essential Tremor. The analysis included filtering WGS data based on allele frequency in population databases, rare variant classification and association testing using the Mixed-Model Kernel Based Adaptive Cluster (MM-KBAC) test, and prioritization of candidate genes identified within families using phenolyzer. WGS analysis identified candidate genes for ET in 5/8 (62.5%) of the families analyzed. WES analysis in a subset of these families in our previously published study failed to identify candidate genes. In one family, we identified a deleterious and damaging variant (c.1367G>A, p.(Arg456Gln)) in the candidate gene CACNA1G, which encodes the pore-forming subunit of T-type Ca(2+) channels, CaV3.1, is expressed in various motor pathways, and has been previously implicated in neuronal autorhythmicity and ET. Other candidate genes identified include SLIT3 (family D), which encodes an axon guidance molecule; in three families, phenolyzer prioritized genes that are associated with hereditary neuropathies (family A, KARS; family B, KIF5A; and family F, NTRK1). This work has identified candidate genes and pathways for ET that can now be prioritized for functional studies.


2020 ◽  
Author(s):  
Yingxi Yang ◽  
Yuchen Yang ◽  
Le Huang ◽  
Jai G. Broome ◽  
Adolfo Correa ◽  
...  

Abstract: With advances in whole genome sequencing (WGS) technology, multiple statistical methods for aggregate association testing have been developed. Many common approaches aggregate variants in a given genomic window of a fixed or varying size and do not rely on existing knowledge to define appropriate test units. As a result, most identified regions are not clearly linked to genes, which limits biological understanding. Functional information from new technologies (such as Hi-C and its derivatives), which can help link enhancers to the genes they affect, can be leveraged to predefine variant sets for aggregate testing in WGS. Therefore, in this paper we propose the eSCAN (Scan the Enhancers) method for genome-wide assessment of enhancer regions in sequencing studies, combining the advantages of dynamic window selection in SCANG with those of increased incorporation of genomic annotation. eSCAN searches biologically meaningful windows, increasing power and aiding biological interpretation, as demonstrated by simulation studies under a wide range of scenarios. We also apply eSCAN to association analysis of blood cell traits using TOPMed WGS data from the Women’s Health Initiative (WHI) and the Jackson Heart Study (JHS). Results from this real data example show that eSCAN is able to capture more significant signals, and that these signals are shorter and drive the association of larger regions detected by other methods.


PLoS ONE ◽  
2019 ◽  
Vol 14 (8) ◽  
pp. e0220512 ◽  
Author(s):  
Zagaa Odgerel ◽  
Shilpa Sonti ◽  
Nora Hernandez ◽  
Jemin Park ◽  
Ruth Ottman ◽  
...  

2019 ◽  
Vol 104 (2) ◽  
pp. 260-274 ◽  
Author(s):  
Han Chen ◽  
Jennifer E. Huffman ◽  
Jennifer A. Brody ◽  
Chaolong Wang ◽  
Seunggeun Lee ◽  
...  

2018 ◽  
Vol 12 (1) ◽  
Author(s):  
Taketoshi Okita ◽  
Noriko Ohashi ◽  
Daijiro Kabata ◽  
Ayumi Shintani ◽  
Kazuto Kato
