Genetic Dissection of Hypertrophic Cardiomyopathy with Myocardial RNA-Seq

2020 ◽  
Vol 21 (9) ◽  
pp. 3040 ◽  
Author(s):  
Jun Gao ◽  
John Collyer ◽  
Maochun Wang ◽  
Fengping Sun ◽  
Fuyi Xu

Hypertrophic cardiomyopathy (HCM) is an inherited disorder of the myocardium, and pathogenic mutations in the sarcomere genes myosin heavy chain 7 (MYH7) and myosin-binding protein C (MYBPC3) explain 60%–70% of observed clinical cases. The heterogeneity of phenotypes observed in HCM patients, however, suggests that novel causative genes or genetic modifiers likely exist. Here, we systematically evaluated RNA-seq data from 28 HCM patients and 9 healthy controls with pathogenic variant identification, differential expression analysis, and gene co-expression and protein–protein interaction network analyses. We identified 43 potential pathogenic variants in 19 genes in 24 HCM patients. Genes with more than one variant included MYBPC3, TTN, MYH7, PSEN2, and LDB3. A total of 2538 protein-coding genes, six microRNAs (miRNAs), and 1617 long noncoding RNAs (lncRNAs) were identified as differentially expressed between the groups, including several well-characterized cardiomyopathy-related genes (ANKRD1, FHL2, TGFB3, miR-30d, and miR-154). Gene enrichment analysis revealed that those genes are significantly involved in heart development and physiology. Furthermore, we highlighted four subnetworks (mtDNA-subnetwork, DSP-subnetwork, MYH7-subnetwork, and MYBPC3-subnetwork) that could play significant roles in the progression of HCM. Our findings further illustrate that HCM is a complex disease, which results from mutations in multiple protein-coding genes, modulation by non-coding RNAs, and perturbations in gene networks.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mikhail Pomaznoy ◽  
Ashu Sethi ◽  
Jason Greenbaum ◽  
Bjoern Peters

Abstract RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, this technology has known caveats that can skew gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, then some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information was not available. We found that about 10% of all genes and 2.5% of protein-coding genes have a two-fold or higher difference in estimated expression when strand information of the reads was ignored. We used parameters of read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which ones do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those which span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
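The two-fold criterion mentioned in this abstract can be illustrated with a minimal sketch. This is not the uslcount implementation, which uses a machine learning model on alignment parameters; it only shows one plausible way to flag genes whose estimate shifts by two-fold or more when strand is ignored. The function name `flag_strand_sensitive`, the pseudocount, and the toy values are assumptions made for illustration.

```python
import math


def flag_strand_sensitive(stranded, unstranded, min_fold=2.0, pseudocount=1.0):
    """Return gene IDs whose expression estimate changes by at least
    `min_fold` (in either direction) when strand information is ignored.

    `stranded` and `unstranded` are hypothetical dicts mapping gene ID to an
    expression estimate; the pseudocount avoids division by zero.
    """
    threshold = math.log2(min_fold)
    flagged = []
    for gene, s in stranded.items():
        u = unstranded.get(gene)
        if u is None:
            continue  # gene not quantified in the unstranded run
        log_fc = math.log2((u + pseudocount) / (s + pseudocount))
        if abs(log_fc) >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Toy values, invented for illustration only.
stranded = {"geneA": 100.0, "geneB": 10.0, "geneC": 50.0}
unstranded = {"geneA": 105.0, "geneB": 45.0, "geneC": 52.0}
print(flag_strand_sensitive(stranded, unstranded))
```

In this toy input only `geneB` crosses the threshold, as would be expected for a lowly expressed gene overlapped by a highly expressed antisense neighbor.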


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
J. Pei ◽  
M. Schuldt ◽  
E. Nagyova ◽  
Z. Gu ◽  
S. el Bouhaddani ◽  
...  

Abstract Background Hypertrophic cardiomyopathy (HCM) is the most common genetic disease of the cardiac muscle, frequently caused by mutations in MYBPC3. However, little is known about the upstream pathways and key regulators causing the disease. Therefore, we employed a multi-omics approach to study the pathomechanisms underlying HCM comparing patient hearts harboring MYBPC3 mutations to control hearts. Results Using H3K27ac ChIP-seq and RNA-seq we obtained 9310 differentially acetylated regions and 2033 differentially expressed genes, respectively, between 13 HCM and 10 control hearts. We obtained 441 differentially expressed proteins between 11 HCM and 8 control hearts using proteomics. By integrating multi-omics datasets, we identified a set of DNA regions and genes that differentiate HCM from control hearts and 53 protein-coding genes as the major contributors. This comprehensive analysis consistently points toward altered extracellular matrix formation, muscle contraction, and metabolism. Therefore, we studied enriched transcription factor (TF) binding motifs and identified 9 motif-encoded TFs, including KLF15, ETV4, AR, CLOCK, ETS2, GATA5, MEIS1, RXRA, and ZFX. Selected candidates were examined in stem cell-derived cardiomyocytes with and without mutated MYBPC3. Furthermore, we observed an abundance of acetylation signals and transcripts derived from cardiomyocytes compared to non-myocyte populations. Conclusions By integrating histone acetylome, transcriptome, and proteome profiles, we identified major effector genes and protein networks that drive the pathological changes in HCM with mutated MYBPC3. Our work identifies 38 highly affected protein-coding genes as potential plasma HCM biomarkers and 9 TFs as potential upstream regulators of these pathomechanisms that may serve as possible therapeutic targets.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Martin Bilbao-Arribas ◽  
Endika Varela-Martínez ◽  
Naiara Abendaño ◽  
Damián de Andrés ◽  
Lluís Luján ◽  
...  

Abstract Background Long non-coding RNAs (lncRNAs) are involved in several immune processes, including the immune response to vaccination, but most of them remain uncharacterised in livestock species. The mechanism of action of aluminium adjuvants as vaccine components is also not fully understood. Results We built a transcriptome from sheep PBMC RNA-seq data in order to identify unannotated lncRNAs and analysed their expression patterns alongside protein-coding genes. We found 2284 novel lncRNAs and assessed their conservation in terms of sequence and synteny. Differential expression analysis between animals inoculated with commercial vaccines or aluminium adjuvant alone, together with co-expression analysis, revealed lncRNAs related to the immune response to vaccines and adjuvants. A group of co-expressed genes enriched in cytokine signalling and production highlighted the differences between treatments. A number of differentially expressed lncRNAs were correlated with a divergently located protein-coding gene, such as the OSM cytokine. Other lncRNAs were predicted to act as sponges of miRNAs involved in immune response regulation. Conclusions This work enlarges the lncRNA catalogue in sheep and emphasises their involvement in the immune response to repetitive vaccination, providing a basis for further characterisation of the non-coding sheep transcriptome within different immune cells.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Lars Gabriel ◽  
Katharina J. Hoff ◽  
Tomáš Brůna ◽  
Mark Borodovsky ◽  
Mario Stanke

Abstract Background BRAKER is a suite of automatic pipelines, BRAKER1 and BRAKER2, for the accurate annotation of protein-coding genes in eukaryotic genomes. Each pipeline trains statistical models of protein-coding genes based on provided evidence and then predicts protein-coding genes in genomic sequences using both the extrinsic evidence and statistical models. For training and prediction, BRAKER1 and BRAKER2 incorporate complementary extrinsic evidence: BRAKER1 uses only RNA-seq data while BRAKER2 uses only a database of cross-species proteins. The BRAKER suite has so far not been able to reliably exceed the accuracy of BRAKER1 and BRAKER2 when incorporating both types of evidence simultaneously. Currently, for a novel genome project where both RNA-seq and protein data are available, the best option is to run both pipelines independently and to pick the one output that is likely better. Therefore, one or the other type of extrinsic evidence would remain unexploited. Results We present TSEBRA, a software tool that selects gene predictions (transcripts) from the sets generated by BRAKER1 and BRAKER2. TSEBRA uses a set of rules to compare scores of overlapping transcripts based on their support by RNA-seq and homologous protein evidence. We show in computational experiments on genomes of 11 species that TSEBRA achieves higher accuracy than either BRAKER1 or BRAKER2 running alone and that TSEBRA compares favorably with the combiner tool EVidenceModeler. Conclusion TSEBRA is an easy-to-use and fast software tool. It can be used in concert with the BRAKER pipeline to generate a gene prediction set supported by both RNA-seq and homologous protein evidence.


2018 ◽  
Author(s):  
Matthew A. Reyna ◽  
David Haan ◽  
Marta Paczkowska ◽  
Lieven P.C. Verbeke ◽  
Miguel Vazquez ◽  
...  

Abstract The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes, we performed multi-faceted pathway and network analyses of non-coding mutations across 2,583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. While few non-coding genomic elements were recurrently mutated in this cohort, we identified 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We found that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, were altered by both coding and non-coding mutations. RNA splicing was primarily targeted by non-coding mutations in this cohort, with samples containing non-coding mutations exhibiting similar gene expression signatures as coding mutations in well-known RNA splicing factors. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.


Blood ◽  
2016 ◽  
Vol 128 (22) ◽  
pp. 2705-2705 ◽  
Author(s):  
Lara Rizzotto ◽  
Arianna Bottoni ◽  
Tzung-Huei Lai ◽  
Chaomei Liu ◽  
Pearlly S Yan ◽  
...  

Abstract Chronic lymphocytic leukemia (CLL) follows a variable clinical course mostly dependent upon genomic factors, with a subset of patients having low risk disease and others displaying rapid progression associated with clonal evolution. Epigenetic mechanisms such as DNA promoter hypermethylation were shown to have a role in CLL evolution where the acquisition of increasingly heterogeneous DNA methylation patterns occurred in conjunction with clonal evolution of genetic aberrations and was associated with disease progression. However, the role of epigenetic mechanisms regulated by the histone deacetylase group of transcriptional repressors in the progression of CLL has not been well characterized. The histone deacetylases (HDACs) 1 and 2 are recruited onto gene promoters and form a complex with the histone demethylase KDM1. Once recruited, the complex mediates the removal of acetyl groups from specific lysines on histones (H3K9 and H3K14), thus triggering the demethylation of lysine 4 (H3K4me3) and the silencing of gene expression. CLL is characterized by the dysregulation of numerous coding and non-coding genes, many of which have key roles in regulating the survival or progression of CLL. For instance, our group showed that the levels of HDAC1 were elevated in high risk as compared to low risk CLL or normal lymphocytes and this over-expression was responsible for the silencing of miR-106b, miR-15, miR-16, and miR-29b, which affected CLL survival by modulating the expression of key anti-apoptotic proteins Bcl-2 and Mcl-1. To characterize the HDAC-repressed gene signature in high risk CLL, we conducted chromatin immunoprecipitation (ChIP) of the nuclear lysates from 3 high risk and 3 low risk CLL patients using antibodies against HDAC1, HDAC2 and KDM1 or non-specific IgG, sequenced and aligned the eluted DNA to a reference genome and determined the binding of HDAC1, HDAC2 and KDM1 at the promoters for all protein coding and microRNA genes.
Preliminary results from this ChIP-seq showed a strong recruitment of HDAC1, HDAC2 and KDM1 to the promoters of several microRNA as well as protein coding genes in high risk CLL. To further corroborate these data we performed ChIP-Seq in the same 6 CLL samples to analyze the levels of H3K4me2 and H3K4me3 around gene promoters before and after 6h exposure to the HDACi panobinostat. Our goal was to demonstrate that HDAC inhibition elicited an increase in the levels of acetylation on histones and triggered the accrual of H3K4me2 at the repressed promoter, events likely to facilitate the recruitment of RNA polymerase II to this promoter. Initial analysis confirmed a robust accumulation of H3K4me2 and H3K4me3 marks at the gene promoters of representative genes that recruited HDAC1 and its co-repressors in the previous ChIP-Seq analysis in high risk CLL patients. Finally, 5 aggressive CLL samples were treated with the HDACi abexinostat for 48h and RNA before and after treatment was subjected to RNA-seq for small and large RNA to confirm that the regions of chromatin uncoiled by HDACi treatment were actively transcribed. HDAC inhibition induced the expression of a large number of miRNA genes as well as key protein coding genes, such as miR-29b, miR-210, miR-182, miR-183, miR-95, miR-940, FOXO3, EBF1 and BCL2L11. Of note, some of the predicted or validated targets of the induced miRNAs were key facilitators in the progression of CLL, such as BTK, SYK, MCL-1, BCL-2, TCL1, and ROR1. Moreover, RNA-seq showed that the expression of these protein coding genes was reduced by 2- to 33-fold upon HDAC inhibition. We plan to extend the RNA-seq to 5 CLL samples with indolent disease and combine all the data to identify a common signature of protein coding and miRNA genes that recruited the HDAC1 complex, accumulated activating histone modifications upon treatment with HDACi and altered gene and miRNA expression after HDAC inhibition in high risk CLL versus low risk CLL.
The signature will then be validated on a large cohort of indolent and aggressive CLL patients. Our final goal is to define a signature of coding and non-coding genes silenced by HDACs in high risk CLL and its role in facilitating disease progression. Disclosures Woyach: Acerta: Research Funding; Karyopharm: Research Funding; Morphosys: Research Funding.


Blood ◽  
2012 ◽  
Vol 120 (21) ◽  
pp. 3298-3298 ◽  
Author(s):  
Eric R. Londin ◽  
Eleftheria Hatzimichael ◽  
Phillipe Loher ◽  
Yue Zhao ◽  
Yi Jing ◽  
...  

Abstract 3298 The anucleate platelets play a critical role in the formation of thrombi and prevention of bleeding. While the repertoire of platelet transcripts is a reflection of the megakaryocyte at the time of platelet differentiation, post-transcriptional events are known to occur. Furthermore, a strong correlation between the expressed mRNAs and proteome has been identified. Having a complete understanding of the platelet transcriptome is important for generating insights into the genetic basis of platelet disease traits. To capture the complexity of the platelet transcriptome, we performed RNA sequencing (RNA-seq) in leukocyte-depleted platelets from 10 males, with median age of 24.5 yrs and unremarkable medical history. Their short and long RNA platelet transcriptomes were analyzed on the SOLiD 5500xl sequencing platform. We generated ∼3.5 billion sequence reads, ∼40% of which could be mapped uniquely to the human genome. Our analysis revealed that ∼9,000 distinct protein-coding mRNAs and ∼800 microRNAs (miRNAs) were present in the transcriptome of each of the 10 sequenced individuals. Comparison of the levels of mRNA expression across the 10 individuals showed an exceptional level of consistency with pair-wise Pearson correlation values ≥0.98. The miRNA expression profiles across the 10 individuals showed a similar consistency with pair-wise Pearson correlation values ≥0.98. Surprisingly, we found that these mRNAs and miRNAs accounted for a little over 1/2 of all of the uniquely mapped sequence reads, suggesting the abundant presence of additional non-protein coding RNA (ncRNA) transcripts. Using the annotated entries of the latest release of the ENSEMBL database, we investigated the genetic make-up of these other transcripts. We found that ∼25% of each individual's uniquely mapped reads corresponded to non-protein coding transcripts from mRNA-coding loci. These reads accounted for more than 10,000 distinct such transcripts.
In addition, each of the individuals in our cohort expressed an average of ∼1,500 pseudogenes and ∼200 long intergenic non-coding RNAs (lincRNAs). The short RNA profiles of the ten individuals revealed an abundance of diverse categories of ncRNAs including the signal recognition particle RNA (srpRNA), small nuclear RNA (snRNA) and small cytoplasmic RNAs (scRNA). These ncRNAs are involved in the processing of pre-mRNAs and their presence and prevalence in the anucleate platelet suggests the existence of a complex network of mRNA processing that persists after the megakaryocyte fragmentation. We also investigated the RNA-omes of the ten individuals for evidence of transcription of the pyknon category of ncRNAs. Pyknons are of particular interest because each has numerous intergenic and intronic copies whereas nearly all known human protein-coding genes contain one or more pyknons in their mRNA. Recent experimental work has shown that intergenic instances of the pyknons are transcribed in a tissue- and cell-state specific manner. An average of ∼100,000 pyknons are transcribed in each of the 10 sequenced individuals, suggesting the possibility of a far-reaching network of interactions that link exonic space to distant non-exonic regions and are active in platelets. Lastly, we found that a large variety of distinct repeat element categories are expressed in the RNA-omes (both short and long) of these individuals. Among the most abundantly represented categories of repeat elements were DNA transposons, long terminal repeat (LTR) retrotransposons, and non-LTR retrotransposons such as long interspersed elements (LINEs) and short interspersed elements (SINEs). In summary, our RNA-seq analyses have revealed a spectrum of platelet transcripts that transcends protein-coding genes and miRNAs.
Indeed, the transcripts that have their source in genomic features not previously discussed or analyzed in the platelet context represent a very significant portion of all platelet transcripts. This in turn suggests an unanticipated richness, and presumably commensurate complexity, for the platelet transcriptome. While the role of these novel non-protein coding RNAs is currently unknown it is expected that at least some of them may be of functional significance which will in turn permit a better understanding of the molecular mechanisms that regulate platelet physiology and may contribute to processes beyond thrombosis and hemostasis. Disclosures: No relevant conflicts of interest to declare.
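The pair-wise Pearson correlation reported in this abstract (values ≥0.98 across the 10 individuals) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names `pearson` and `min_pairwise_correlation`, the sample labels, and the toy expression values are all assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def min_pairwise_correlation(profiles):
    """Smallest Pearson correlation over all pairs of samples.

    `profiles` is a hypothetical dict mapping sample ID to a list of
    expression values, with genes in the same order for every sample.
    """
    return min(
        pearson(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    )


# Toy log-scale expression values for three samples, invented for illustration.
profiles = {
    "donor1": [5.1, 8.3, 2.0, 11.4],
    "donor2": [5.0, 8.5, 2.2, 11.1],
    "donor3": [4.8, 8.1, 1.9, 11.6],
}
print(min_pairwise_correlation(profiles) >= 0.98)
```

Reporting the minimum over all pairs is one way to express a statement of the form "all pair-wise correlations are at least 0.98" as a single check.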


mBio ◽  
2015 ◽  
Vol 6 (6) ◽  
Author(s):  
Vojtěch David ◽  
Pavel Flegontov ◽  
Evgeny Gerasimov ◽  
Goro Tanifuji ◽  
Hassan Hashimi ◽  
...  

ABSTRACT Perkinsela is an enigmatic early-branching kinetoplastid protist that lives as an obligate endosymbiont inside Paramoeba (Amoebozoa). We have sequenced the highly reduced mitochondrial genome of Perkinsela, which possesses only six protein-coding genes (cox1, cox2, cox3, cob, atp6, and rps12), despite the fact that the organelle itself contains more DNA than is present in either the host or endosymbiont nuclear genomes. An in silico analysis of two Perkinsela strains showed that mitochondrial RNA editing and processing machineries typical of kinetoplastid flagellates are generally conserved, and all mitochondrial transcripts undergo U-insertion/deletion editing. Canonical kinetoplastid mitochondrial ribosomes are also present. We have developed software tools for accurate and exhaustive mapping of transcriptome sequencing (RNA-seq) reads with extensive U-insertions/deletions, which allows detailed investigation of RNA editing via deep sequencing. With these methods, we show that up to 50% of reads for a given edited region contain errors of the editing system or, less likely, correspond to alternatively edited transcripts. IMPORTANCE Uridine insertion/deletion-type RNA editing, which occurs in the mitochondrion of kinetoplastid protists, has been well-studied in the model parasite genera Trypanosoma, Leishmania, and Crithidia. Perkinsela provides a unique opportunity to broaden our knowledge of RNA editing machinery from an evolutionary perspective, as it represents the earliest kinetoplastid branch and is an obligatory endosymbiont with extensive reductive trends. Interestingly, up to 50% of mitochondrial transcripts in Perkinsela contain errors. Our study was complemented by use of newly developed software designed for accurate mapping of extensively edited RNA-seq reads obtained by deep sequencing.


2018 ◽  
Vol 31 (10) ◽  
pp. 1083-1094 ◽  
Author(s):  
Chantal E. McCabe ◽  
Silvia R. Cianzio ◽  
Jamie A. O’Rourke ◽  
Michelle A. Graham

Brown stem rot, caused by the fungus Phialophora gregata, reduces soybean yield by up to 38%. Although three dominant resistance loci have been identified (Rbs1 to Rbs3), the gene networks responsible for pathogen recognition and defense remain unknown. Further, identification and characterization of resistant and susceptible germplasm remains difficult. We conducted RNA-Seq of infected and mock-infected leaf, stem, and root tissues of a resistant (PI 437970, Rbs3) and susceptible (Corsoy 79) genotype. Combining historical mapping data with genotype expression differences allowed us to identify a cluster of receptor-like proteins that are candidates for the Rbs3 resistance gene. Reads mapping to the Rbs3 locus were used to identify potential novel single-nucleotide polymorphisms within candidate genes that could improve phenotyping and breeding efficiency. Comparing responses to infection revealed little overlap in differential gene expression between genotypes or tissues. Gene networks associated with defense, DNA replication, and iron homeostasis are hallmarks of resistance to P. gregata. This novel research demonstrates the utility of combining contrasting genotypes, gene expression, and classical genetic studies to characterize complex disease resistance loci.

