A Bayesian method for rare variant analysis using functional annotations and its application to Autism

2019 ◽  
Author(s):  
Shengtong Han ◽  
Nicholas Knoblauch ◽  
Gao Wang ◽  
Siming Zhao ◽  
Yuwen Liu ◽  
...  

Abstract: Rare genetic variants make significant contributions to human diseases. Compared to common variants, rare variants have larger effect sizes and are generally free of linkage disequilibrium (LD), which makes it easier to identify causal variants. Numerous methods have been developed to analyze rare variants in a gene or region in association studies, with the goal of finding risk genes by aggregating information across all variants of a gene. These methods, however, often make unrealistic assumptions, e.g. that all rare variants in a risk gene have non-zero effects. In practice, current methods for gene-based analysis often fail to show any advantage over simple single-variant analysis. In this work, we develop a Bayesian method: MIxture model based Rare variant Analysis on GEnes (MIRAGE). MIRAGE captures the heterogeneity of variant effects by treating all variants of a gene as a mixture of risk and non-risk variants, and models the prior probability of being a risk variant as a function of external information about the variants, such as allele frequencies and predicted deleterious effects. MIRAGE uses an empirical Bayes approach to estimate these prior probabilities by combining information across genes. We demonstrate, in both simulations and an analysis of an exome-sequencing dataset of Autism, that MIRAGE significantly outperforms current methods for rare variant analysis. In particular, the top genes identified by MIRAGE are highly enriched with known or plausible Autism risk genes. Our results highlight several novel Autism genes with high Bayesian posterior probabilities and functional connections with Autism. MIRAGE is available at https://xinhe-lab.github.io/mirage.
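The mixture idea in this abstract can be illustrated with a minimal Python sketch. It is not the MIRAGE implementation: the functional form (each variant contributes a weighted average of "non-risk" and "risk" evidence), the function names `gene_bayes_factor` and `posterior_risk_gene`, and the parameters `etas` and `delta` are all assumptions made for illustration. The per-variant prior probabilities are taken as given here, whereas the abstract describes estimating them by empirical Bayes across genes.

```python
import math

def gene_bayes_factor(variant_bfs, etas):
    """Gene-level Bayes factor under an assumed two-component mixture.

    variant_bfs: per-variant Bayes factors (risk vs. non-risk evidence).
    etas: per-variant prior probabilities of being a risk variant
          (hypothetical inputs; e.g. set by annotation category).
    """
    log_bf = 0.0
    for bf, eta in zip(variant_bfs, etas):
        # A non-risk variant contributes a factor of 1, a risk variant
        # contributes its Bayes factor; mix the two with weight eta.
        log_bf += math.log((1.0 - eta) + eta * bf)
    return math.exp(log_bf)

def posterior_risk_gene(variant_bfs, etas, delta):
    """Posterior probability that the gene is a risk gene.

    delta: assumed prior probability that any gene is a risk gene.
    """
    bf = gene_bayes_factor(variant_bfs, etas)
    return delta * bf / (delta * bf + (1.0 - delta))
```

Under these assumptions, a gene whose variants all have Bayes factors of 1 keeps its prior probability `delta`, and a single strongly supported variant raises the posterior only in proportion to its prior weight `eta`, which is how the sketch avoids assuming that every rare variant in a risk gene has an effect.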

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Na Zhu ◽  
Emilia M. Swietlik ◽  
Carrie L. Welch ◽  
Michael W. Pauciulo ◽  
...  

Abstract
Background: Pulmonary arterial hypertension (PAH) is a lethal vasculopathy characterized by pathogenic remodeling of pulmonary arterioles leading to increased pulmonary pressures, right ventricular hypertrophy, and heart failure. PAH can be associated with other diseases (APAH: connective tissue diseases, congenital heart disease, and others) but often the etiology is idiopathic (IPAH). Mutations in bone morphogenetic protein receptor 2 (BMPR2) are the cause of most heritable cases but the vast majority of other cases are genetically undefined.
Methods: To identify new risk genes, we utilized an international consortium of 4241 PAH cases with exome or genome sequencing data from the National Biological Sample and Data Repository for PAH, Columbia University Irving Medical Center, and the UK NIHR BioResource – Rare Diseases Study. The strength of this combined cohort is a doubling of the number of IPAH cases compared to either national cohort alone. We identified protein-coding variants and performed rare variant association analyses in unrelated participants of European ancestry, including 1647 IPAH cases and 18,819 controls. We also analyzed de novo variants in 124 pediatric trios enriched for IPAH and APAH-CHD.
Results: Seven genes with rare deleterious variants were associated with IPAH at a false discovery rate below 0.1: three known genes (BMPR2, GDF2, and TBX4), two recently identified candidate genes (SOX17, KDR), and two new candidate genes (fibulin 2, FBLN2; platelet-derived growth factor D, PDGFD). The new genes were identified based solely on rare deleterious missense variants, a variant type that could not be adequately assessed in either cohort alone. The candidate genes exhibit expression patterns in lung and heart similar to that of known PAH risk genes, and most variants occur in conserved protein domains. For pediatric PAH, predicted deleterious de novo variants exhibited a significant burden compared to the background mutation rate (2.45×, p = 2.5e-5). At least eight novel pediatric candidate genes carrying de novo variants have plausible roles in lung/heart development.
Conclusions: Rare variant analysis of a large international consortium identified two new candidate genes: FBLN2 and PDGFD. The new genes have known functions in vasculogenesis and remodeling. Trio analysis predicted that ~15% of pediatric IPAH may be explained by de novo variants.
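A burden of de novo variants relative to a background mutation rate is commonly summarized as an enrichment ratio with a one-sided Poisson p-value. The Python sketch below shows that calculation under the assumption that a Poisson test is used; the abstract does not state which test was applied. The function names `enrichment` and `poisson_upper_tail` are hypothetical, and no counts from the study are reproduced here.

```python
import math

def enrichment(observed, expected):
    """Ratio of observed de novo variant count to the expected count."""
    return observed / expected

def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected), with expected > 0.

    The upper tail is summed directly rather than computed as
    1 - CDF, which keeps small p-values from being lost to rounding.
    """
    k = observed
    # Probability mass at k, computed in log space for stability.
    term = math.exp(-expected + k * math.log(expected) - math.lgamma(k + 1))
    total = 0.0
    # Stop once further terms no longer change the sum.
    while term > total * 1e-16:
        total += term
        k += 1
        term *= expected / k
    return min(total, 1.0)
```

With a hypothetical 10 observed variants against 4.0 expected from the background rate, `enrichment(10, 4.0)` gives 2.5 and `poisson_upper_tail(10, 4.0)` gives the corresponding one-sided p-value.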


Author(s):  
Na Zhu ◽  
Emilia M. Swietlik ◽  
Carrie L. Welch ◽  
Michael W. Pauciulo ◽  
Jacob J. Hagen ◽  
...  

Abstract
Background: Group 1 pulmonary arterial hypertension (PAH) is a lethal vasculopathy characterized by pathogenic remodeling of pulmonary arterioles leading to increased pulmonary pressures, right ventricular hypertrophy and heart failure. Recent high-throughput sequencing studies have identified additional PAH risk genes and suggested differences in genetic causes by age of onset. However, known risk genes explain only 15-20% of non-familial idiopathic PAH cases.
Methods: To identify new risk genes, we utilized an international consortium of 4,241 PAH cases with 4,175 sequenced exomes (n=2,572 National Biological Sample and Data Repository for PAH; n=469 Columbia University Irving Medical Center, enriched for pediatric trios) and 1,134 sequenced genomes (UK NIHR Bioresource – Rare Diseases Study). Most of the cases were adult-onset disease (93%), and 55% idiopathic (IPAH) and 35% associated with other diseases (APAH). We identified protein-coding variants and performed rare variant association analyses in unrelated participants of European ancestry, including 2,789 cases and 18,819 controls (11,101 unaffected parents from the Simons Powering Autism Research for Knowledge study and 7,718 gnomAD individuals). We analyzed de novo variants in 124 pediatric trios.
Results: Seven genes with rare deleterious variants were significantly associated (false discovery rate <0.1) with IPAH, including three known genes (BMPR2, GDF2, and TBX4), two recently identified candidate genes (SOX17, KDR), and two new candidate genes (FBLN2, fibulin 2; PDGFD, platelet-derived growth factor D). The candidate genes exhibit expression patterns in lung and heart similar to that of known PAH risk genes, and most of the variants occur in conserved protein domains. Variants in the known PAH gene ACVRL1 showed association with APAH. Predicted deleterious de novo variants in pediatric cases exhibited a significant burden compared to the background mutation rate (2.5x, p=7.0E-6). At least eight novel candidate genes carrying de novo variants have plausible roles in lung/heart development.
Conclusions: Rare variant analysis of a large international consortium identifies two new candidate genes: FBLN2 and PDGFD. The new genes have known functions in vasculogenesis and remodeling but have not been previously implicated in PAH. Trio analysis predicts that ~15% of pediatric IPAH may be explained by de novo variants.


2018 ◽  
Vol 138 (12) ◽  
pp. 2674-2677 ◽  
Author(s):  
Manuela Pigors ◽  
John E.A. Common ◽  
Xuan Fei Colin C. Wong ◽  
Sajid Malik ◽  
Claire A. Scott ◽  
...  

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Jong Jin Oh ◽  
Manu Shivakumar ◽  
Jason Miller ◽  
Shefali Verma ◽  
Hakmin Lee ◽  
...  

Abstract: Since prostate cancer is highly heritable, common variants associated with prostate cancer have been studied in various populations, including those in Korea. However, rare and low-frequency variants have a significant influence on the heritability of the disease. The contributions of rare variants to prostate cancer susceptibility have not yet been systematically evaluated in a Korean population. In this work, we present a large-scale exome-wide rare variant analysis of 7,258 individuals (985 cases with prostate cancer and 6,273 controls). In total, 19 rare variant loci spanning 7 genes contributed to an association with prostate cancer susceptibility. In addition to replicating previously known susceptibility genes (e.g., CDYL2, MST1R, GPER1, and PARD3B), 3 novel genes were identified (FDR q < 0.05): the non-coding RNAs ENTPD3-AS1 and LOC102724438 and the protein-coding gene SPATA3. Additionally, 6 pathways were identified based on the identified variants and genes, including the estrogen signaling pathway, signaling by MST1, IL-15 production, the MSP-RON signaling pathway, and IL-12 signaling and production in macrophages, which are known to be associated with prostate cancer. In summary, we report novel genes and rare variants that potentially play a role in prostate cancer susceptibility in the Korean population. These observations demonstrate a path towards one of the fundamental goals of precision medicine, which is to identify biomarkers for a subset of the population with a greater risk of disease than others.
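An FDR q-value threshold such as the one quoted in this abstract is often obtained with the Benjamini-Hochberg procedure. The Python sketch below shows that procedure as an illustration only; the abstract does not say which FDR method the authors used, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    The result is in the same order as the input list.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

A gene would then be reported at "FDR q < 0.05" when its entry in the returned list falls below 0.05.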


2021 ◽  
Author(s):  
Sheila M. Gaynor ◽  
Kenneth E. Westerman ◽  
Lea L. Ackovic ◽  
Xihao Li ◽  
Zilin Li ◽  
...  

Abstract
Summary: We developed the STAAR WDL workflow to facilitate the analysis of rare variants in whole genome sequencing association studies. The open-access STAAR workflow written in the workflow description language (WDL) allows a user to perform rare variant testing for both gene-centric and genetic region approaches, enabling genome-wide, candidate, and conditional analyses. It incorporates functional annotations into the workflow as introduced in the STAAR method in order to boost the rare variant analysis power. This tool was specifically developed and optimized to be implemented on cloud-based platforms such as BioData Catalyst Powered by Terra. It provides easy-to-use functionality for rare variant analysis that can be incorporated into an exhaustive whole genome sequencing analysis pipeline.
Availability and implementation: The workflow is freely available from https://dockstore.org/workflows/github.com/sheilagaynor/STAAR_workflow.


2021 ◽  
pp. 1-10
Author(s):  
Zoe Guan ◽  
Ronglai Shen ◽  
Colin B. Begg

Background: Many cancer types show considerable heritability, and extensive research has been done to identify germline susceptibility variants. Linkage studies have discovered many rare high-risk variants, and genome-wide association studies (GWAS) have discovered many common low-risk variants. However, it is believed that a considerable proportion of the heritability of cancer remains unexplained by known susceptibility variants. The “rare variant hypothesis” proposes that much of the missing heritability lies in rare variants that cannot reliably be detected by linkage analysis or GWAS. Until recently, high sequencing costs have precluded extensive surveys of rare variants, but technological advances have now made it possible to analyze rare variants on a much greater scale.
Objectives: In this study, we investigated associations between rare variants and 14 cancer types.
Methods: We ran association tests using whole-exome sequencing data from The Cancer Genome Atlas (TCGA) and validated the findings using data from the Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG).
Results: We identified four significant associations in TCGA, only one of which was replicated in PCAWG (BRCA1 and ovarian cancer).
Conclusions: Our results provide little evidence in favor of the rare variant hypothesis. Much larger sample sizes may be needed to detect undiscovered rare cancer variants.


2017 ◽  
Vol 42 (3) ◽  
pp. 276-287 ◽  
Author(s):  
Yiwen Luo ◽  
Arnab Maity ◽  
Michael C. Wu ◽  
Chris Smith ◽  
Qing Duan ◽  
...  

2011 ◽  
Vol 5 (S9) ◽  
Author(s):  
Han Chen ◽  
Audrey E Hendricks ◽  
Yansong Cheng ◽  
Adrienne L Cupples ◽  
Josée Dupuis ◽  
...  

2019 ◽  
Vol 39 (6) ◽  
pp. 801-813 ◽  
Author(s):  
Elise Lim ◽  
Han Chen ◽  
Josée Dupuis ◽  
Ching‐Ti Liu
