iCall: a genotype-calling algorithm for rare, low-frequency and common variants on the Illumina exome array

2014 ◽  
Vol 30 (12) ◽  
pp. 1714-1720 ◽  
Author(s):  
Jin Zhou ◽  
Erwin Tantoso ◽  
Lai-Ping Wong ◽  
Rick Twee-Hee Ong ◽  
Jin-Xin Bei ◽  
...  
2012 ◽  
Vol 28 (12) ◽  
pp. 1598-1603 ◽  
Author(s):  
T. S. Shah ◽  
J. Z. Liu ◽  
J. A. B. Floyd ◽  
J. A. Morris ◽  
N. Wirth ◽  
...  

2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Tae-Joon Park ◽  
Lyong Heo ◽  
Sanghoon Moon ◽  
Young Jin Kim ◽  
Ji Hee Oh ◽  
...  

Exome-based genotyping arrays are cost-effective and have recently been used as alternative platforms to whole-exome sequencing. However, the automated clustering algorithm used with exome arrays calls rare and low-frequency variants inaccurately. To address this shortcoming, we present a practical approach for accurate genotype calling using the Illumina Infinium HumanExome BeadChip, together with comparison results and a statistical summary of our genotype data sets. Our data set comprises 14,647 Korean samples. To overcome the limitations of automated clustering, we performed manual genotype clustering of 46,076 variants that were identified using GenomeStudio software. To evaluate the effect of applying custom cluster files, we tested them on 804 independent Korean samples genotyped on the same platform. Our study is the first to suggest practical guidelines for exome chip quality control in Asian populations and provides valuable insight for association studies using exome chips.


2014 ◽  
Vol 15 (1) ◽  
pp. 158 ◽  
Author(s):  
Ruijie Liu ◽  
Zhiyin Dai ◽  
Meredith Yeager ◽  
Rafael A Irizarry ◽  
Matthew E Ritchie

2018 ◽  
Author(s):  
Luke J. O’Connor ◽  
Armin P. Schoech ◽  
Farhad Hormozdiari ◽  
Steven Gazal ◽  
Nick Patterson ◽  
...  

Complex traits and common disease are highly polygenic: thousands of common variants are causal, and their effect sizes are almost always small. Polygenicity could be explained by negative selection, which constrains common-variant effect sizes and may reshape their distribution across the genome. We refer to this phenomenon as flattening, as genetic signal is flattened relative to the underlying biology. We introduce a mathematical definition of polygenicity, the effective number of associated SNPs, and a robust statistical method to estimate it. This definition of polygenicity differs from the number of causal SNPs, a standard definition; it depends strongly on SNPs with large effects. In analyses of 33 complex traits (average N=361k), we determined that common variants are ∼4x more polygenic than low-frequency variants, consistent with pervasive flattening. Moreover, functionally important regions of the genome have increased polygenicity in proportion to their increased heritability, implying that heritability enrichment reflects differences in the number of associations rather than their magnitude (which is constrained by selection). We conclude that negative selection constrains the genetic signal of biologically important regions and genes, reshaping genetic architecture.
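As a rough illustration of why a definition like this depends strongly on large-effect SNPs, the sketch below computes a kurtosis-based quantity, M_e = 3M / κ with κ = E[β⁴] / E[β²]², from per-SNP effect sizes while ignoring linkage disequilibrium. The formula as written here, the function name `effective_num_snps`, and the toy inputs are assumptions made for illustration; the abstract does not state the estimator, and this is not presented as the authors' method.

```python
import numpy as np


def effective_num_snps(beta):
    """Kurtosis-based effective number of associated SNPs (illustrative).

    Assumes independent SNPs (no LD) and true effect sizes `beta`;
    both are simplifying assumptions, not taken from the abstract.
    """
    beta = np.asarray(beta, dtype=float)
    m = beta.size
    second = np.mean(beta ** 2)  # E[beta^2]
    if second == 0.0:
        raise ValueError("all effect sizes are zero")
    fourth = np.mean(beta ** 4)  # E[beta^4]
    kappa = fourth / second ** 2  # kurtosis-like ratio
    # The factor 3 normalises so that Gaussian effects give roughly M.
    return 3.0 * m / kappa


# Hypothetical effects: signal spread evenly vs. concentrated in one SNP.
even = effective_num_snps([1.0, 1.0, 1.0, 1.0])
concentrated = effective_num_snps([1.0, 0.0, 0.0, 0.0])
```

With the same number of SNPs, concentrating the signal in a single SNP raises κ and lowers the effective number, which is the sense in which the quantity is sensitive to large effects.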


Author(s):  
Tianzhong Yang ◽  
Peng Wei ◽  
Wei Pan

Abstract. Motivation: The abundance of omics data has facilitated integrative analyses of single and multiple molecular layers with genome-wide association studies focusing on common variants. Built on its successes, we propose a general analysis framework to leverage multi-omics data with sequencing data to improve the statistical power of discovering new associations and understanding of the disease susceptibility due to low-frequency variants. The proposed test features its robustness to model misspecification, high power across a wide range of scenarios and the potential of offering insights into the underlying genetic architecture and disease mechanisms. Results: Using the Framingham Heart Study data, we show that low-frequency variants are predictive of DNA methylation, even after conditioning on the nearby common variants. In addition, DNA methylation and gene expression provide complementary information to functional genomics. In the Avon Longitudinal Study of Parents and Children with a sample size of 1497, one gene CLPTM1 is identified to be associated with low-density lipoprotein cholesterol levels by the proposed powerful adaptive gene-based test integrating information from gene expression, methylation and enhancer–promoter interactions. It is further replicated in the TwinsUK study with 1706 samples. The signal is driven by both low-frequency and common variants. Availability and implementation: Models are available at https://github.com/ytzhong/DNAm. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2009 ◽  
Vol 25 (3) ◽  
pp. 309-314 ◽  
Author(s):  
Jumamurat R. Bayjanov ◽  
Michiel Wels ◽  
Marjo Starrenburg ◽  
Johan E. T. van Hylckama Vlieg ◽  
Roland J. Siezen ◽  
...  

PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0230035
Author(s):  
Julie Hahn ◽  
Yi-Ping Fu ◽  
Michael R. Brown ◽  
Joshua C. Bis ◽  
Paul S. de Vries ◽  
...  

Background: Genome-wide association studies have identified multiple genomic loci associated with coronary artery disease, but most are common variants in non-coding regions that provide limited information on causal genes and etiology of the disease. To overcome the limited scope that common variants provide, we focused our investigation on low-frequency and rare sequence variations primarily residing in coding regions of the genome. Methods and results: Using samples of individuals of European ancestry from ten cohorts within the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium, both cross-sectional and prospective analyses were conducted to examine associations between genetic variants and myocardial infarction (MI), coronary heart disease (CHD), and all-cause mortality following these events. For prevalent events, a total of 27,349 participants of European ancestry, including 1,831 prevalent MI cases and 2,518 prevalent CHD cases, were used. For incident cases, a total of 55,736 participants of European ancestry were included (3,031 incident MI cases and 5,425 incident CHD cases). There were 1,860 all-cause deaths among the 3,751 MI and CHD cases from six cohorts that contributed to the analysis of all-cause mortality. Single variant and gene-based analyses were performed separately in each cohort and then meta-analyzed for each outcome. A low-frequency intronic variant (rs988583) in PLCL1 was significantly associated with prevalent MI (OR = 1.80, 95% confidence interval: 1.43, 2.27; P = 7.12 × 10⁻⁷). We conducted gene-based burden tests for genes with a cumulative minor allele count (cMAC) ≥ 5 and variants with minor allele frequency (MAF) < 5%. TMPRSS5 and LDLRAD1 were significantly associated with prevalent MI and CHD, respectively, and RC3H2 and ANGPTL4 were significantly associated with incident MI and CHD, respectively. No loci were significantly associated with all-cause mortality following a MI or CHD event.
Conclusion: This study identified one known locus (ANGPTL4) and four new loci (PLCL1, RC3H2, TMPRSS5, and LDLRAD1) associated with cardiovascular disease risk that warrant further investigation.
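The variant and gene filters described for the burden tests (MAF < 5% per variant, cMAC ≥ 5 per gene) could be sketched roughly as follows. This is a minimal illustration, not the consortium's pipeline: the function name `burden_eligible`, the input layout (one gene's samples × variants matrix of minor-allele counts), and the choice to compute cMAC only over variants that pass the MAF filter are all assumptions.

```python
import numpy as np


def burden_eligible(genotypes, maf_max=0.05, cmac_min=5):
    """Apply illustrative MAF and cMAC filters to one gene.

    genotypes: samples x variants array of minor-allele counts (0, 1, 2).
    Returns (per-variant keep mask, cMAC over kept variants, gene passes?).
    """
    g = np.asarray(genotypes)
    n_samples = g.shape[0]
    mac = g.sum(axis=0)            # minor allele count per variant
    maf = mac / (2.0 * n_samples)  # frequency over 2N chromosomes
    keep = maf < maf_max           # variant-level filter: MAF < 5%
    # Assumption: cMAC is summed over the retained variants only.
    cmac = int(mac[keep].sum())
    return keep, cmac, bool(cmac >= cmac_min)


# Hypothetical gene with 50 samples and 3 variants.
g = np.zeros((50, 3), dtype=int)
g[:2, 0] = 1   # 2 heterozygous carriers, MAF = 0.02
g[:30, 1] = 1  # 30 heterozygous carriers, MAF = 0.30 (too common)
g[:3, 2] = 1   # 3 heterozygous carriers, MAF = 0.03
keep, cmac, eligible = burden_eligible(g)
```

In this toy gene the common variant is dropped, and the two remaining variants together contribute five minor alleles, so the gene just meets the cMAC threshold.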

