Genes and pathways implicated in tetralogy of Fallot revealed by ultra-rare variant burden analysis in 231 genome sequences

2020 ◽  
Author(s):  
Roozbeh Manshaei ◽  
Daniele Merico ◽  
Miriam S. Reuter ◽  
Worrawat Engchuan ◽  
Bahareh A. Mojarad ◽  
...  

Abstract: Recent genome-wide studies of rare genetic variants have begun to implicate novel mechanisms for tetralogy of Fallot (TOF), a severe congenital heart defect (CHD). To provide statistical support for case-only data without parental genomes, we re-analyzed genome sequences of 231 individuals with TOF or related CHD. We adapted a burden test originally developed for de novo variants to assess singleton variant burden in individual genes, and in gene-sets corresponding to functional pathways and mouse phenotypes, accounting for highly correlated gene-sets, and for multiple testing. The gene burden test identified a significant burden of deleterious missense variants in NOTCH1 (Bonferroni-corrected p-value <0.01). These NOTCH1 variants showed significant enrichment for those affecting the extracellular domain, and especially for disruption of cysteine residues forming disulfide bonds (OR 39.8 vs gnomAD). Individuals with NOTCH1 variants, all with TOF, were enriched for positive family history of CHD. Other genes not previously implicated in TOF had more modest statistical support, and singleton missense variant results were non-significant for gene-set burden. For singleton truncating variants, the gene burden test confirmed significant burden in FLT4. Gene-set burden tests identified a cluster of pathways corresponding to VEGF signaling (FDR=0%), and of mouse phenotypes corresponding to abnormal vasculature (FDR=0.8%), that suggested additional candidate genes not previously identified (e.g., WNT5A and ZFAND5). Analyses using unrelated sequencing datasets supported specificity of the findings for CHD. The findings support the importance of ultra-rare variants disrupting genes involved in VEGF and NOTCH signaling in the genetic architecture of TOF. 
These proof-of-principle data indicate that this statistical methodology could assist in analyzing case-only sequencing data in which ultra-rare variants, whether de novo or inherited, contribute to the genetic etiopathogenesis of a complex disorder. Author summary: We analyzed the ultra-rare nonsynonymous variant burden for genome sequencing data from 231 individuals with congenital heart defects, most with tetralogy of Fallot. We adapted a burden test originally developed for de novo variants. In line with other studies, we identified a significant truncating variant burden for FLT4 and deleterious missense burden for NOTCH1, both passing a stringent Bonferroni multiple-test correction. For NOTCH1, we observed frequent disruption of cysteine residues establishing disulfide bonds in the extracellular domain. We also identified genes with BH-FDR <10% that were not previously implicated. To overcome limited power for individual genes, we tested gene-sets corresponding to functional pathways and mouse phenotypes. Gene-set burden of truncating variants was significant for vascular endothelial growth factor signaling and abnormal vasculature phenotypes. These results confirmed previous findings and suggested additional candidate genes for experimental validation in future studies. This methodology can be extended to other case-only sequencing data in which ultra-rare variants make a substantial contribution to genetic etiology.
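The abstract above relies on two multiple-testing corrections, Bonferroni and Benjamini-Hochberg FDR (BH-FDR). As a rough illustration of how the two differ, the following is a minimal Python sketch. It is not the authors' pipeline: the function names, gene labels, and p-values are all hypothetical and chosen only for demonstration.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene burden p-values (illustrative only, not study results).
gene_p = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.03, "GENE_D": 0.5}
names = list(gene_p)
p = [gene_p[g] for g in names]
bonf = dict(zip(names, bonferroni(p)))
bh = dict(zip(names, benjamini_hochberg(p)))

for g in names:
    print(f"{g}: p={gene_p[g]:.4g} bonferroni={bonf[g]:.4g} bh_fdr={bh[g]:.4g}")
```

Bonferroni controls the chance of any false positive and is therefore the stricter of the two, which is consistent with the abstract reporting only FLT4 and NOTCH1 under Bonferroni while listing further genes at BH-FDR <10%.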

2018 ◽  
Author(s):  
Hoang T. Nguyen ◽  
Amanda Dobbyn ◽  
Alexander W. Charney ◽  
Julien Bryois ◽  
April Kim ◽  
...  

Abstract: Trio family and case-control studies of next-generation sequencing data have proven integral to understanding the contribution of rare inherited and de novo single-nucleotide variants to the genetic architecture of complex disease. Ideally, such studies should identify individual risk genes of moderate to large effect size to generate novel treatment hypotheses for further follow-up. However, due to insufficient power, gene set enrichment analyses have come to be relied upon for detecting differences between cases and controls, implicating sets of hundreds of genes rather than specific targets for further investigation. Here, we present a Bayesian statistical framework, termed gTADA, that integrates gene-set membership information with gene-level de novo and rare inherited case-control counts, to prioritize risk genes with excess rare variant burden within enriched gene sets. Applying gTADA to available whole-exome sequencing datasets for several neuropsychiatric conditions, we replicated previously reported gene set enrichments and identified novel risk genes. For epilepsy (EPI), gTADA prioritized 40 risk genes (posterior probabilities > 0.95), 6 of which replicate in an independent whole-genome sequencing study. In addition, 30 of the 40 genes are novel. We found that epilepsy genes had high protein-protein interaction (PPI) network connectivity and showed specific expression during human brain development. Some of the top prioritized EPI genes were connected to a PPI subnetwork of immune genes and showed specific expression in prenatal microglia. We also identified multiple enriched drug-target gene sets for EPI, which included immunostimulants as well as known antiepileptics. Immune biology was supported specifically by case-control variants from familial epilepsies rather than de novo mutations in generalized encephalitic epilepsy.


2019 ◽  
Author(s):  
Soeren Lukassen ◽  
Foo Wei Ten ◽  
Roland Eils ◽  
Christian Conrad

Abstract: Recent advances in single-cell RNA sequencing (scRNA-Seq) have driven the simultaneous measurement of the expression of 1,000s of genes in 1,000s of single cells. These growing data sets allow us to model gene sets in biological networks at an unprecedented level of detail, in spite of heterogeneous cell populations. Here, we propose an unsupervised deep neural network model that is a hybrid of matrix factorization and conditional variational autoencoders (CVA), which utilizes weights as matrix factorizations to obtain gene sets, while class-specific inputs to the latent variable space facilitate a plausible identification of cell types. This artificial neural network model seamlessly integrates functional gene set inference, experimental batch effect correction, and static gene identification, which we conceptually prove here for three single-cell RNA-Seq datasets and suggest for future single-cell-gene analytics.


2020 ◽  
Author(s):  
Todd Lencz ◽  
Jin Yu ◽  
Raiyan Rashid Khan ◽  
Shai Carmi ◽  
Max Lam ◽  
...  

Abstract: IMPORTANCE: Schizophrenia is a serious mental illness with high heritability. While common genetic variants account for a portion of the heritability, identification of rare variants associated with the disorder has proven challenging. OBJECTIVE: To identify genes and gene sets associated with schizophrenia in a founder population (Ashkenazi Jewish), and to determine the relative power of this population for rare variant discovery. DESIGN, SETTING, AND PARTICIPANTS: Data on exonic variants were extracted from whole genome sequences drawn from 786 patients with schizophrenia and 463 healthy control subjects, all drawn from the Ashkenazi Jewish population. Variants observed in two large publicly available datasets (total n≈153,000, excluding neuropsychiatric patients) were filtered out, and novel ultra-rare variants (URVs) were compared in cases and controls. MAIN OUTCOMES AND MEASURES: The number of novel URVs and genes carrying them were compared across cases and controls. Genes in which only cases or only controls carried novel, functional URVs were examined using gene set analyses. RESULTS: Cases had a higher frequency of novel missense or loss of function (MisLoF) variants compared to controls, as well as a greater number of genes impacted by MisLoF variants. Characterizing 141 “case-only” genes (in which ≥ 3 AJ cases in our dataset had MisLoF URVs with none found in our AJ controls), we replicated prior findings of enrichment for synaptic gene sets, as well as specific genes such as SETD1A and TRIO. Additionally, we identified cadherins as a novel gene set associated with schizophrenia, including a recurrent mutation in PCDHA3. Several genes associated with autism and other neurodevelopmental disorders, including CACNA1E, ASXL3, SETBP1, and WDFY3, were also identified in our case-only gene list, as was TSC2, which is linked to tuberous sclerosis. 
Modeling the effects of purifying selection demonstrated that deleterious rare variants are greatly over-represented in a founder population with a tight bottleneck and rapidly expanding census, resulting in enhanced power for rare variant association studies. CONCLUSIONS AND RELEVANCE: Identification of cell adhesion genes in the cadherin/protocadherin family is consistent with evidence from large-scale GWAS in schizophrenia, helps specify the synaptic abnormalities that may be central to the disorder, and suggests novel potential treatment strategies (e.g., inhibition of protein kinase C). Study of founder populations may serve as a cost-effective way to rapidly increase gene discovery in schizophrenia and other complex disorders.
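The "case-only" gene definition in this abstract (at least 3 cases carrying a qualifying URV, and no controls carrying one) can be expressed as a simple filter. The sketch below is only an illustration of that rule under assumed inputs: the record layout, the function name `case_only_genes`, and all sample and gene labels are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def case_only_genes(carriers, min_cases=3):
    """Return genes carried by >= min_cases distinct cases and by no controls.

    carriers: iterable of (sample_id, status, gene) tuples, where status is
    "case" or "control" and each tuple marks one qualifying variant carrier.
    """
    cases = defaultdict(set)
    controls = defaultdict(set)
    for sample_id, status, gene in carriers:
        target = cases if status == "case" else controls
        target[gene].add(sample_id)  # sets count each individual once per gene
    return sorted(
        gene
        for gene, samples in cases.items()
        if len(samples) >= min_cases and not controls.get(gene)
    )


# Hypothetical carrier records (illustrative only).
records = [
    ("s1", "case", "GENE_X"), ("s2", "case", "GENE_X"), ("s3", "case", "GENE_X"),
    ("s1", "case", "GENE_Y"), ("s4", "case", "GENE_Y"), ("s5", "case", "GENE_Y"),
    ("c1", "control", "GENE_Y"),
    ("s6", "case", "GENE_Z"), ("s6", "case", "GENE_Z"), ("s7", "case", "GENE_Z"),
]
print(case_only_genes(records))
```

Counting distinct individuals rather than variant records matters here: in the toy data, GENE_Z has three records but only two carriers, so it does not meet the threshold.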


2020 ◽  
Vol 295 (42) ◽  
pp. 14510-14521 ◽  
Author(s):  
Mark F. Fisher ◽  
Colton D. Payne ◽  
Thaveshini Chetty ◽  
Darren Crayn ◽  
Oliver Berkowitz ◽  
...  

Cyclic peptides are reported to have antibacterial, antifungal, and other bioactivities. Orbitides are a class of cyclic peptides that are small, head-to-tail cyclized, composed of proteinogenic amino acids and lack disulfide bonds; they are also known in several genera of the plant family Rutaceae. Melicope xanthoxyloides is the Australian rain forest tree of the Rutaceae family in which evolidine, the first plant cyclic peptide, was discovered. Evolidine (cyclo-SFLPVNL) has subsequently been all but forgotten in the academic literature, so to redress this we used tandem MS and de novo transcriptomics to rediscover evolidine and decipher its biosynthetic origin from a short precursor just 48 residues in length. We also identified another six M. xanthoxyloides orbitides using the same techniques. These peptides have atypically diverse C termini consisting of residues not recognized by either of the known proteases plants use to macrocyclize peptides, suggesting new cyclizing enzymes await discovery. We examined the structure of two of the novel orbitides by NMR, finding one had a definable structure, whereas the other did not. Mining RNA-seq and whole genome sequencing data from other species of the Rutaceae family revealed that a large and diverse family of peptides is encoded by similar sequences across the family and demonstrates how powerful de novo transcriptomics can be at accelerating the discovery of new peptide families.


2020 ◽  
Author(s):  
Mark F. Fisher ◽  
Colton Payne ◽  
Thaveshini Chetty ◽  
Darren Crayn ◽  
Oliver Berkowitz ◽  
...  

Abstract: Cyclic peptides are reported to have antibacterial, antifungal and other bioactivities. Several genera of the Rutaceae family are known to produce orbitides, which are small head-to-tail cyclic peptides composed of proteinogenic amino acids and lacking disulfide bonds. Melicope xanthoxyloides is an Australian rain forest tree of the Rutaceae family in which evolidine, the first plant cyclic peptide, was discovered. Evolidine (cyclo-SFLPVNL) has subsequently been all but forgotten in the academic literature, but here we use tandem mass spectrometry to rediscover evolidine and, using de novo transcriptomics, we show its biosynthetic origin to be from a short precursor just 48 residues in length. In all, seven M. xanthoxyloides orbitides were found, and they had atypically diverse C-termini consisting of residues not recognized by either of the known proteases plants use to macrocyclize peptides. Two of the novel orbitides were studied by nuclear magnetic resonance spectroscopy, and although one had a definable structure, the other did not. By mining RNA-seq and whole genome sequencing data from other species, it was apparent that a large and diverse family of peptides is encoded by sequences like these across the Rutaceae.


2021 ◽  
Author(s):  
Mahmoud Koko ◽  
Roland Krause ◽  
Thomas Sander ◽  
Dheeraj Reddi Bobbili ◽  
Michael Nothnagel ◽  
...  

Background: Burden analysis in epilepsy has shown an excess of deleterious ultra-rare variants (URVs) in a few gene-sets, such as known epilepsy genes, constrained genes, ion channel or GABAA receptor genes. We set out to investigate the burden of URVs in a comprehensive range of gene-sets presumed to be implicated in epileptogenesis. Methods: We investigated several constraint and conservation-based strategies to study whole exome sequencing data from European individuals with developmental and epileptic encephalopathies (DEE, n = 1,003), genetic generalized epilepsy (GGE, n = 3,064), and non-acquired focal epilepsy (NAFE, n = 3,522), collected by the Epi25 Collaborative, compared to 3,962 ancestry-matched controls. The burden of 12 URV types in 92 gene-sets was compared between epilepsy cases (DEE, GGE, NAFE) and controls using logistic regression analysis. Results: Burden analysis of brain-expressed genes revealed an excess of different URV types in all three epilepsy categories, which was largest for constrained missense variants. The URV burden was prominent in neuron-specific, synaptic and developmental genes as well as genes encoding ion channels and receptors, and it was generally higher for DEE and GGE compared to NAFE. The patterns of URV burden in gene-sets expressed in inhibitory vs. excitatory neurons or receptors suggested a high burden in both for DEE but a differential involvement of inhibitory genes in GGE, while excitatory genes were predominantly affected in NAFE. Top ranking susceptibility genes from a recent genome-wide association study (GWAS) of generalized and focal epilepsies displayed a higher URV burden in constrained coding regions in GGE and NAFE, respectively. Conclusions: Using exome-based gene-set burden analysis, we demonstrate that missense URVs affecting mainly constrained sites are enriched in neuronal genes in both common and rare severe epilepsy syndromes. 
Our results indicate a differential impact of these URVs in genes expressed in inhibitory vs. excitatory neurons and receptors in generalized vs. focal epilepsies. The excess of URVs in top-ranking GWAS risk-genes suggests a convergence of rare deleterious and common risk-variants in the pathogenesis of generalized and focal epilepsies.
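The study above compares URV burden between cases and controls with logistic regression. As a much simpler stand-in, and explicitly not the method used in the abstract, a carrier-versus-non-carrier odds ratio with a Woolf (log-scale) confidence interval gives a feel for what an "excess burden" in a gene-set means. In this sketch the function name and the carrier counts are hypothetical; only the cohort sizes (1,003 DEE cases and 3,962 controls) are taken from the abstract.

```python
import math


def carrier_odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Odds ratio and approximate 95% CI for carrying >= 1 qualifying variant.

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that zero counts
    do not produce a division by zero.
    """
    a = case_carriers + 0.5
    b = n_cases - case_carriers + 0.5
    c = control_carriers + 0.5
    d = n_controls - control_carriers + 0.5
    odds_ratio = (a * d) / (b * c)
    # Woolf standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)


# Hypothetical carrier counts in one gene-set; cohort sizes from the abstract.
or_, lower, upper = carrier_odds_ratio(
    case_carriers=60, n_cases=1003, control_carriers=120, n_controls=3962
)
print(f"OR={or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Unlike this 2×2 summary, a regression model can adjust for covariates such as ancestry components or per-sample variant counts, which is one plausible reason a study of this kind would prefer it.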


Author(s):  
Tan-Hoang Nguyen ◽  
Xin He ◽  
Ruth C Brown ◽  
Bradley T Webb ◽  
Kenneth S Kendler ◽  
...  

Abstract: Motivation: Rare variant-based analyses are beginning to identify risk genes for neuropsychiatric disorders and other diseases. However, the identified genes only account for a fraction of predicted causal genes. Recent studies have shown that rare damaging variants are significantly enriched in specific gene-sets. Methods that jointly model rare variants and gene-sets to identify enriched gene-sets, and that use these enriched gene-sets to prioritize additional risk genes, could improve understanding of the genetic architecture of diseases. Results: We propose DECO (Integrated analysis of de novo mutations, rare case/control variants and omics information via gene-sets), an integrated method for rare-variant and gene-set analysis. The method can (i) test the enrichment of gene-sets directly within the statistical model, and (ii) use enriched gene-sets to rank existing genes and prioritize additional risk genes for tested disorders. In simulations, DECO performs better than a homologous method that uses only variant data. To demonstrate the application of the proposed protocol, we have applied this approach to rare-variant datasets of schizophrenia. Compared with a method which only uses variant information, DECO is able to prioritize additional risk genes. Availability: DECO can be used to analyze rare variants and biological pathways or cell types for any disease. The package is available on GitHub: https://github.com/hoangtn/DECO.


2016 ◽  
Author(s):  
Andrea Ganna ◽  
Giulio Genovese ◽  
Daniel P. Howrigan ◽  
Andrea Byrnes ◽  
Mitja Kurki ◽  
...  

Ultra-rare inherited and de novo disruptive variants in highly constrained (HC) genes are enriched in neurodevelopmental disorders [1–5]. However, their impact on cognition in the general population has not been explored. We hypothesize that disruptive and damaging ultra-rare variants (URVs) in HC genes not only confer risk to neurodevelopmental disorders, but also influence general cognitive abilities measured indirectly by years of education (YOE). We tested this hypothesis in 14,133 individuals with whole exome or genome sequencing data. The presence of one or more URVs was associated with a decrease in YOE (3.1 months less for each additional mutation; P-value=3.3×10⁻⁸) and the effect was stronger in HC genes enriched for brain expression (6.5 months less, P-value=3.4×10⁻⁵). The effect of these variants was more pronounced than the estimated effects of runs of homozygosity and pathogenic copy number variation [6–9]. Our findings suggest that effects of URVs in HC genes are not confined to severe neurodevelopmental disorders, but influence the cognitive spectrum in the general population.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Shan Jiang ◽  
Daizhan Zhou ◽  
Yin-Ying Wang ◽  
Peilin Jia ◽  
Chunling Wan ◽  
...  

Abstract: Schizophrenia (SCZ) is a severe psychiatric disorder with a strong genetic component. High heritability of SCZ suggests a major role for transmitted genetic variants. Furthermore, SCZ is also associated with a marked reduction in fecundity, leading to the hypothesis that alleles with large effects on risk might often occur de novo. In this study, we conducted whole-genome sequencing for 23 families from two cohorts with unaffected siblings and parents. Two nonsense de novo mutations (DNMs) in GJC1 and HIST1H2AD were identified in SCZ patients. Ten genes (DPYSL2, NBPF1, SDK1, ZNF595, ZNF718, GCNT2, SNX9, AACS, KCNQ1, and MSI2) were found to carry more DNMs in SCZ patients than their unaffected siblings by burden test. Expression analyses indicated that these DNM-implicated genes showed significantly higher expression in the prefrontal cortex at the prenatal stage. The DNM in the GJC1 gene is highly likely a loss-of-function mutation (pLI = 0.94), leading to the dysregulation of ion channels in the glutamatergic excitatory neurons. Analysis of rare variants in an independent exome sequencing dataset indicates that GJC1 has significantly more rare variants in SCZ patients than in unaffected controls. Data from genome-wide association studies suggested that common variants in the GJC1 gene may be associated with SCZ and SCZ-related traits. Genes co-expressed with GJC1 are involved in SCZ, SCZ-associated pathways, and drug targets. This evidence suggests that GJC1 may be a risk gene for SCZ and its function may be involved in prenatal and early neurodevelopment, a vulnerable period for developmental disorders such as SCZ.


2019 ◽  
Author(s):  
Shan Jiang ◽  
Daizhan Zhou ◽  
Yin-Ying Wang ◽  
Peilin Jia ◽  
Chunling Wan ◽  
...  

Abstract: Schizophrenia (SCZ) is a severe psychiatric disorder with a strong genetic component. High heritability of SCZ suggests a major role for transmitted genetic variants. Furthermore, SCZ is also associated with a marked reduction in fecundity, leading to the hypothesis that alleles with large effects on risk might often occur de novo. In this study, we conducted whole-genome sequencing for 23 families from two cohorts with matched unaffected siblings and parents. Two nonsense de novo mutations (DNMs) in GJC1 and HIST1H2AD were identified in SCZ patients. Ten genes (DPYSL2, NBPF1, SDK1, ZNF595, ZNF718, GCNT2, SNX9, AACS, KCNQ1 and MSI2) were found to carry more DNMs in SCZ patients than their unaffected siblings by burden test. Expression analyses indicated that these DNM-implicated genes showed significantly higher expression in the prefrontal cortex at the prenatal stage. The DNM in the GJC1 gene is highly likely a loss-of-function mutation (pLI = 0.94), leading to the dysregulation of ion channels in the glutamatergic excitatory neurons. Analysis of rare variants in an independent exome sequencing dataset indicates that GJC1 has significantly more rare variants in SCZ patients than in unaffected controls. Data from genome-wide association studies suggested that common variants in the GJC1 gene may be associated with SCZ and SCZ-related traits. Genes co-expressed with GJC1 are involved in SCZ, SCZ-associated pathways and drug targets. This evidence suggests that GJC1 may be a risk gene for SCZ and its function may be involved in prenatal and early neurodevelopment, a vulnerable period for developmental disorders such as SCZ.

