mTADA: a framework for analyzing de novo mutations in multiple traits

2018 ◽  
Author(s):  
Hoang T. Nguyen ◽  
Amanda Dobbyn ◽  
Joseph Buxbaum ◽  
Dalila Pinto ◽  
Shaun M Purcell ◽  
...  

Abstract: Joint analysis of multiple traits can result in the identification of associations not found through the analysis of each trait in isolation. In addition, approaches that consider multiple traits can aid in the characterization of shared genetic etiology among those traits. In recent years, parent-offspring trio studies have reported an enrichment of de novo mutations (DNMs) in neuropsychiatric disorders. The analysis of DNM data in the context of neuropsychiatric disorders has implicated multiple putatively causal genes, and a number of reported genes are shared across disorders. However, a joint analysis method designed to integrate de novo mutation data from multiple studies has yet to be implemented. We here introduce multiple-trait TADA (mTADA), which jointly analyzes two traits using DNMs from non-overlapping family samples. mTADA uses two single-trait analysis data sets to estimate the proportion of overlapping risk genes, and reports genes shared between and specific to the relevant disorders. We applied mTADA to >13,000 trios for six disorders: schizophrenia (SCZ), autism spectrum disorder (ASD), developmental disorders (DD), intellectual disability (ID), epilepsy (EPI), and congenital heart disease (CHD). We report the proportion of overlapping risk genes and the specific risk genes shared for each pair of disorders. A total of 153 genes were found to be shared in at least one pair of disorders. The largest percentages of shared risk genes were observed for pairs of DD, ID, ASD, and CHD (>20%), whereas SCZ, CHD, and EPI did not show strong overlaps in risk gene sets between them. Furthermore, mTADA identified additional SCZ, EPI and CHD risk genes through integration with DD de novo mutation data. For CHD, using DD information, 31 risk genes with posterior probabilities > 0.8 were identified, and 20 of these 31 genes were not in the list of known CHD genes. 
We find evidence that most significant CHD risk genes are strongly expressed in prenatal stages of the human genes. Finally, we validated our findings for CHD and EPI in independent cohorts comprising 1241 CHD trios, 226 CHD singletons and 197 EPI trios. Multiple novel risk genes identified by mTADA also had de novo mutations in these independent data sets. The joint analysis method introduced here, mTADA, is able to identify risk genes shared by two traits as well as additional risk genes not found through single-trait analysis only. A number of risk genes reported by mTADA are identified only through joint analysis, specifically when ASD, DD, or ID are one of the two traits examined. This suggests that novel genes for the trait or a new trait might converge to a core gene list of the three traits.
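
The idea of scoring each gene as shared between, or specific to, two traits can be illustrated with a toy calculation. The sketch below is not the mTADA implementation: it assumes a simplified four-component mixture (a gene is a risk gene for neither trait, trait 1 only, trait 2 only, or both), Poisson-distributed DNM counts, and made-up values for the expected counts, relative risks, and prior proportions. Every function name, parameter, and number here is hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson(lam) model."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def joint_posteriors(x1, x2, mu1, mu2, gamma1, gamma2, pi):
    """Posterior probabilities of the four hypotheses for one gene.

    x1, x2         observed DNM counts in trait 1 and trait 2 (assumed inputs)
    mu1, mu2       expected DNM counts under the null (assumed inputs)
    gamma1, gamma2 relative risks for risk genes (assumed inputs)
    pi             prior proportions, ordered as
                   (neither, trait 1 only, trait 2 only, both)
    """
    null1 = poisson_pmf(x1, mu1)
    alt1 = poisson_pmf(x1, mu1 * gamma1)
    null2 = poisson_pmf(x2, mu2)
    alt2 = poisson_pmf(x2, mu2 * gamma2)

    # Prior times likelihood for each of the four hypotheses.
    weights = [
        pi[0] * null1 * null2,
        pi[1] * alt1 * null2,
        pi[2] * null1 * alt2,
        pi[3] * alt1 * alt2,
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical gene with 3 DNMs in trait 1 and 2 DNMs in trait 2.
posterior = joint_posteriors(
    x1=3, x2=2, mu1=0.05, mu2=0.04, gamma1=20, gamma2=20,
    pi=(0.90, 0.03, 0.03, 0.04),
)
```

Under these assumed values, the last entry of `posterior` is the probability that the gene is shared by both traits; a threshold such as the 0.8 mentioned in the abstract could then be applied to it.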

PLoS Genetics ◽  
2021 ◽  
Vol 17 (11) ◽  
pp. e1009849
Author(s):  
Yuhan Xie ◽  
Mo Li ◽  
Weilai Dong ◽  
Wei Jiang ◽  
Hongyu Zhao

Recent studies have demonstrated that multiple early-onset diseases have shared risk genes, based on findings from de novo mutations (DNMs). Therefore, we may leverage information from one trait to improve statistical power to identify genes for another trait. However, there are few methods that can jointly analyze DNMs from multiple traits. In this study, we develop a framework called M-DATA (Multi-trait framework for De novo mutation Association Test with Annotations) to increase the statistical power of association analysis by integrating data from multiple correlated traits and their functional annotations. Using the number of DNMs from multiple diseases, we develop a method based on an Expectation-Maximization algorithm both to infer the degree of association between two diseases and to estimate the gene association probability for each disease. We apply our method to a case study of jointly analyzing data from congenital heart disease (CHD) and autism. Our method was able to identify 23 genes for CHD from joint analysis, including 12 novel genes, which is substantially more than single-trait analysis, leading to novel insights into CHD disease etiology.
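
To make the Expectation-Maximization step concrete, here is a minimal sketch of EM applied to a much simpler problem than the one M-DATA addresses. It assumes a four-component Poisson mixture over two traits with known rates and estimates only the mixing proportions; it uses no functional annotations and is not the authors' model. All names and values are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k events under a Poisson(lam) model."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def component_likelihoods(x1, x2, mu1, mu2, gamma1, gamma2):
    """Likelihood of one gene's counts under each of the four components:
    (neither trait, trait 1 only, trait 2 only, both traits)."""
    null1, alt1 = poisson_pmf(x1, mu1), poisson_pmf(x1, mu1 * gamma1)
    null2, alt2 = poisson_pmf(x2, mu2), poisson_pmf(x2, mu2 * gamma2)
    return [null1 * null2, alt1 * null2, null1 * alt2, alt1 * alt2]


def em_mixing_proportions(counts, mu1, mu2, gamma1, gamma2, n_iter=200):
    """Estimate the four mixing proportions by EM, holding rates fixed."""
    lik = [component_likelihoods(x1, x2, mu1, mu2, gamma1, gamma2)
           for x1, x2 in counts]
    pi = [0.25] * 4  # uniform starting point
    for _ in range(n_iter):
        totals = [0.0] * 4
        # E-step: responsibility of each component for each gene.
        for row in lik:
            weights = [p * l for p, l in zip(pi, row)]
            norm = sum(weights)
            for j in range(4):
                totals[j] += weights[j] / norm
        # M-step: new proportions are the average responsibilities.
        pi = [t / len(lik) for t in totals]
    return pi


# Hypothetical data: 90 genes with no DNMs, 10 genes with DNMs in both traits.
toy_counts = [(0, 0)] * 90 + [(3, 2)] * 10
pi_hat = em_mixing_proportions(toy_counts, mu1=0.05, mu2=0.04,
                               gamma1=20, gamma2=20)
```

In this toy setting the last entry of `pi_hat` plays the role of the estimated degree of sharing between the two traits, and the per-gene responsibilities from the E-step play the role of gene-level association probabilities.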


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Tan-Hoang Nguyen ◽  
Amanda Dobbyn ◽  
Ruth C. Brown ◽  
Brien P. Riley ◽  
Joseph D. Buxbaum ◽  
...  

2021 ◽  
Author(s):  
Yuhan Xie ◽  
Mo Li ◽  
Weilai Dong ◽  
Wei Jiang ◽  
Hongyu Zhao

Recent studies have demonstrated that multiple early-onset diseases have shared risk genes, based on findings from de novo mutations (DNMs). Therefore, we may leverage information from one trait to improve statistical power to identify genes for another trait. However, there are few methods that can jointly analyze DNMs from multiple traits. In this study, we develop a framework called M-DATA (Multi-trait framework for De novo mutation Association Test with Annotations) to increase the statistical power of association analysis by integrating data from multiple correlated traits and their functional annotations. Using the number of DNMs from multiple diseases, we develop a method based on an Expectation-Maximization algorithm both to infer the degree of association between two diseases and to estimate the gene association probability for each disease. We apply our method to a case study of jointly analyzing data from congenital heart disease (CHD) and autism. Our method was able to identify 23 genes from joint analysis, including 12 novel genes, which is substantially more than single-trait analysis, leading to novel insights into CHD disease etiology.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Tianyun Wang ◽  
Kendra Hoekzema ◽  
Davide Vecchio ◽  
Huidan Wu ◽  
...  

Abstract: Most genes associated with neurodevelopmental disorders (NDDs) were identified with an excess of de novo mutations (DNMs) but the significance in case–control mutation burden analysis is unestablished. Here, we sequence 63 genes in 16,294 NDD cases and an additional 62 genes in 6,211 NDD cases. By combining these with published data, we assess a total of 125 genes in over 16,000 NDD cases and compare the mutation burden to nonpsychiatric controls from ExAC. We identify 48 genes (25 newly reported) showing significant burden of ultra-rare (MAF < 0.01%) gene-disruptive mutations (FDR 5%), six of which reach family-wise error rate (FWER) significance (p < 1.25E−06). Among these 125 targeted genes, we also reevaluate DNM excess in 17,426 NDD trios with 6,499 new autism trios. We identify 90 genes enriched for DNMs (FDR 5%; e.g., GABRG2 and UIMC1); of which, 61 reach FWER significance (p < 3.64E−07; e.g., CASZ1). In addition to doubling the number of patients for many NDD risk genes, we present phenotype–genotype correlations for seven risk genes (CTCF, HNRNPU, KCNQ3, ZBTB18, TCF12, SPEN, and LEO1) based on this large-scale targeted sequencing effort.


2018 ◽  
Vol 9 (1) ◽  
Author(s):  
Hui Guo ◽  
Tianyun Wang ◽  
Huidan Wu ◽  
Min Long ◽  
Bradley P. Coe ◽  
...  

2021 ◽  
Author(s):  
Hanmin Guo ◽  
Lin Hou ◽  
Yu Shi ◽  
Sheng Chih Jin ◽  
Xue Zeng ◽  
...  

Abstract: Exome sequencing on tens of thousands of parent-proband trios has identified numerous deleterious de novo mutations (DNMs) and implicated risk genes for many disorders. Recent studies have suggested shared genes and pathways are enriched for DNMs across multiple disorders. However, existing analytic strategies only focus on genes that reach statistical significance for multiple disorders and require large trio samples in each study. As a result, these methods are not able to characterize the full landscape of genetic sharing due to polygenicity and incomplete penetrance. In this work, we introduce EncoreDNM, a novel statistical framework to quantify shared genetic effects between two disorders characterized by concordant enrichment of DNMs in the exome. EncoreDNM makes use of exome-wide, summary-level DNM data, including genes that do not reach statistical significance in single-disorder analysis, to evaluate the overall and annotation-partitioned genetic sharing between two disorders. Applying EncoreDNM to DNM data of nine disorders, we identified abundant pairwise enrichment correlations, especially in genes intolerant to pathogenic mutations and genes highly expressed in fetal tissues. These results suggest that EncoreDNM improves current analytic approaches and may have broad applications in DNM studies.


2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Pieter W. M. Bonnemaijer ◽  
Elisabeth M. van Leeuwen ◽  
Adriana I. Iglesias ◽  
Puya Gharahkhani ◽  
...  

Abstract: A new avenue of mining published genome-wide association studies includes the joint analysis of related traits. The power of this approach depends on the genetic correlation of traits, which reflects the number of pleiotropic loci, i.e. genetic loci influencing multiple traits. Here, we applied new meta-analyses of optic nerve head (ONH) related traits implicated in primary open-angle glaucoma (POAG); intraocular pressure and central corneal thickness using Haplotype Reference Consortium imputations. We performed a multi-trait analysis of ONH parameters cup area, disc area and vertical cup-disc ratio. We uncover new variants, rs11158547 in PPP1R36-PLEKHG3 and rs1028727 near SERPINE3, at genome-wide significance that replicate in independent Asian cohorts imputed to 1000 Genomes. At this point, validation of these variants in POAG cohorts is hampered by the high degree of heterogeneity. Our results show that multi-trait analysis is a valid approach to identify novel pleiotropic variants for ONH.


2017 ◽  
Author(s):  
Hon-Cheong So ◽  
Yui-Hang Wong

Abstract: Recent studies have suggested an important role of de novo mutations (DNMs) in neuropsychiatric disorders. As DNMs are not subject to elimination due to evolutionary pressure, they are likely to have greater disruptions on biological functions. While a number of sequencing studies have been performed on neuropsychiatric disorders, the implications of DNMs for drug discovery remain to be explored. In this study, we employed a gene-set analysis approach to address this issue. Four neuropsychiatric disorders were studied, including schizophrenia (SCZ), autistic spectrum disorders (ASD), intellectual disability (ID) and epilepsy. We first identified gene-sets associated with different drugs, and analyzed whether the gene-set pertaining to each drug overlaps with DNMs more than expected by chance. We also assessed which medication classes are enriched among the prioritized drugs. We discovered that neuropsychiatric drug classes were indeed significantly enriched for DNMs of all four disorders; in particular, antipsychotics and antiepileptics were the most strongly enriched drug classes for SCZ and epilepsy respectively. Interestingly, we revealed enrichment of several unexpected drug classes, such as lipid-lowering agents for SCZ and anti-neoplastic agents. By inspecting individual hits, we also uncovered other interesting drug candidates or mechanisms (e.g. histone deacetylase inhibition and retinoid signaling) that might warrant further investigations. Taken together, this study provided evidence for the usefulness of DNMs in guiding drug discovery or repositioning.
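
Testing whether a drug's gene set overlaps DNM-carrying genes "more than expected by chance" can be sketched with a one-sided hypergeometric test. The abstract does not say which gene-set test the authors used, so the hypergeometric test is a stand-in chosen for illustration, and the function name and all counts below are hypothetical.

```python
from math import comb


def overlap_p_value(n_background, n_drug_set, n_dnm_genes, n_overlap):
    """One-sided hypergeometric p-value for gene-set overlap.

    Returns the probability of seeing at least n_overlap shared genes when
    n_dnm_genes genes are drawn at random, without replacement, from a
    background of n_background genes that contains n_drug_set drug-related
    genes.
    """
    max_overlap = min(n_drug_set, n_dnm_genes)
    total = comb(n_background, n_dnm_genes)
    tail = sum(
        comb(n_drug_set, k) * comb(n_background - n_drug_set, n_dnm_genes - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 18,000 background genes, a 50-gene drug set,
# 300 genes carrying DNMs, 6 of which fall in the drug set.
p = overlap_p_value(n_background=18000, n_drug_set=50,
                    n_dnm_genes=300, n_overlap=6)
```

With many drugs tested, the resulting p-values would still need a multiple-testing correction before any drug is called enriched.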


Author(s):  
Kuokuo Li ◽  
Zhenghuan Fang ◽  
Guihu Zhao ◽  
Bin Li ◽  
Chao Chen ◽  
...  

Abstract: The clinical similarity among different neuropsychiatric disorders (NPDs) suggests a shared genetic basis. We catalogued 23,109 coding de novo mutations (DNMs) from 6,511 patients with autism spectrum disorder (ASD), 4,293 with undiagnosed developmental disorder (UDD), 933 with epileptic encephalopathy (EE), 1,022 with intellectual disability (ID), 1,094 with schizophrenia (SCZ), and 3,391 controls. We estimated that putative functional DNMs contribute to 38.11%, 34.40%, 33.31%, 10.98% and 6.91% of patients with ID, EE, UDD, ASD and SCZ, respectively. Consistent with the phenotypic similarity and heterogeneity among NPDs, the disorders show different degrees of genetic association. Cross-disorder analysis of DNMs prioritized 321 candidate genes (FDR < 0.05) and showed that genes shared across more disorders were more likely to exhibit specific expression patterns, functional pathways, genetic convergence, and genetic intolerance.


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 1020
Author(s):  
Nicholas S. Diab ◽  
Syndi Barish ◽  
Weilai Dong ◽  
Shujuan Zhao ◽  
Garrett Allington ◽  
...  

Congenital heart disease (CHD) is the most common congenital malformation and the leading cause of mortality therein. Genetic etiologies contribute to an estimated 90% of CHD cases, but so far, a molecular diagnosis remains unsolved in up to 55% of patients. Copy number variations and aneuploidy account for ~23% of cases overall, and high-throughput genomic technologies have revealed additional types of genetic variation in CHD. The first CHD risk genotypes identified through high-throughput sequencing were de novo mutations, many of which occur in chromatin modifying genes. Murine models of cardiogenesis further support the damaging nature of chromatin modifying CHD mutations. Transmitted mutations have also been identified through sequencing of population scale CHD cohorts, and many transmitted mutations are enriched in cilia genes and Notch or VEGF pathway genes. While we have come a long way in identifying the causes of CHD, more work is required to end the diagnostic odyssey for all CHD families. Complex genetic explanations of CHD are emerging but will require increasingly sophisticated analysis strategies applied to very large CHD cohorts before they can come to fruition in providing molecular diagnoses to genetically unsolved patients. In this review, we discuss the genetic architecture of CHD and biological pathways involved in its pathogenesis.

