Systematic detection of functional proteoform groups from bottom-up proteomic datasets

2020 ◽  
Author(s):  
Isabell Bludau ◽  
Max Frank ◽  
Christian Dörig ◽  
Yujia Cai ◽  
Moritz Heusel ◽  
...  

Abstract The cellular proteome, the ensemble of proteins derived from a genome, catalyzes and controls thousands of biochemical functions that are the basis of living cells. Whereas the protein coding regions of the genome of the human and many other species are well known, the complexity and composition of proteomes largely remain to be explored. This task is challenging because mechanisms including alternative splicing and post-translational modifications generally give rise to multiple distinct, but related proteins – proteoforms – per coding gene that expand the functional capacity of a cell. Bottom-up proteomics is a mass spectrometric method that infers the identity and quantity of proteins from the measurement of peptides derived from these proteins by proteolytic digestion. Whereas bottom-up proteomics has become the method of choice for the detection of translation products from essentially any gene, the inherent missing link between measured peptides and their parental proteins has so far precluded the systematic assessment of proteoforms and thus limited the resolution of proteome maps. Here we present a novel, data-driven strategy to assign peptides to unique functional proteoform groups based on peptide correlation patterns across large bottom-up proteomic datasets. Our strategy does not fully characterize specific proteoforms, as is achievable in top-down approaches. Rather, it clusters peptides into functional proteoform groups that are directly linked to the biological context of the study. This allows the detection of tens to hundreds of proteoform groups in an untargeted fashion from bottom-up proteomics experiments. We applied the strategy to two types of bottom-up proteomic datasets. The first is a protein complex co-fractionation dataset where native complexes across two different cell cycle stages were resolved and analyzed. Here, our approach enabled the systematic detection and evaluation of assembly-specific proteoforms at an unprecedented scale. The second is a protein abundance vs. sample data matrix typical for bottom-up cohort studies consisting of tissue samples from the mouse BXD genetic reference panel. In either data type the method detected state-specific proteoform groups that could be linked to distinct molecular mechanisms including proteolytic cleavage, alternative splicing and phosphorylation. We envision that the presented approach lays the foundation for a systematic assessment of proteoforms and their functional implications directly from bottom-up proteomic datasets.
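To make the idea of grouping peptides by their correlation patterns more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each peptide of one protein has an intensity profile across the same samples or fractions, links two peptides when their Pearson correlation reaches a hypothetical threshold (`min_corr`), and reports the connected groups. The function names, the threshold, and the grouping rule are all illustrative assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def group_peptides(profiles, min_corr=0.8):
    """Group the peptides of one protein by profile similarity.

    profiles: dict mapping peptide name -> list of intensities.
    min_corr: hypothetical cutoff, not a value taken from the abstract.
    Returns a list of sets; each set is one candidate peptide group.
    """
    parent = {p: p for p in profiles}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Link every peptide pair whose profiles correlate strongly.
    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= min_corr:
            parent[find(a)] = find(b)

    groups = {}
    for p in profiles:
        groups.setdefault(find(p), set()).add(p)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

With four made-up peptides, two rising and two falling across the samples, this sketch returns two groups. A protein whose peptides all land in one group shows no evidence of distinct proteoform groups under this simplified rule.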

Circulation ◽  
2014 ◽  
Vol 130 (suppl_2) ◽  
Author(s):  
Jennifer Davis ◽  
Michelle Sargent ◽  
Jianjian Shi ◽  
Lei Wei ◽  
Maurice S Swanson ◽  
...  

Rationale: During the cardiac injury response fibroblasts differentiate into myofibroblasts, a cell type that enhances extracellular matrix production and facilitates ventricular remodeling. To better understand the molecular mechanisms whereby myofibroblasts are generated in the heart we performed a genome-wide screen with 18,000 cDNAs, which identified the RNA-binding protein muscleblind-like splicing regulator 1 (MBNL1), suggesting a novel association between mRNA alternative splicing and the regulation of myofibroblast differentiation. Objective: To determine the mechanism whereby MBNL1 regulates myofibroblast differentiation and the cardiac fibrotic response. Methods and Results: Confirming the results from our genome-wide screen, adenoviral-mediated overexpression of MBNL1 promoted transformation of rat cardiac fibroblasts and mouse embryonic fibroblasts (MEFs) into myofibroblasts, similar to the level of conversion obtained by the profibrotic agonist transforming growth factor β (TGFβ). Antithetically, Mbnl1 -/- MEFs were refractory to TGFβ-induced myofibroblast differentiation. MBNL1 expression is induced in transforming fibroblasts in response to TGFβ and angiotensin II. These results were extended in vivo by analysis of dermal wound healing, a process dependent on myofibroblast differentiation and proper myofibroblast activity. By day 6 control mice had achieved 82% skin wound closure compared with only 40% in Mbnl1 -/- mice. Moreover, Mbnl1 -/- mice had reduced survival following myocardial infarction injury due to defective fibrotic scar formation and healing. High throughput RNA sequencing (RNAseq) and RNA immunoprecipitation revealed that MBNL1 directly regulates the alternative splicing of transcripts for myofibroblast signaling factors and cytoskeletal-assembly elements. Functional analysis of these factors as mediators of MBNL1 activity is also described here.
Conclusions: Collectively, our data suggest that MBNL1 coordinates myofibroblast transformation by directly mediating the alternative splicing of an array of mRNAs encoding differentiation-specific signaling transcripts, which then alter the fibroblast proteome for myofibroblast structure and function.


2021 ◽  
Author(s):  
Samantha C Chomyshen ◽  
Cheng-Wei Wu

Splicing of pre-mRNA is an essential process for dividing cells, and splicing defects have been linked to aging and various chronic diseases. Environmental stress has recently been shown to alter splicing fidelity, but the molecular mechanisms that protect against splicing disruption remain unclear. Using an in vivo RNA splicing reporter, we performed a genome-wide RNAi screen in Caenorhabditis elegans and found that protein translation suppression via silencing of the conserved initiation factor 4G (IFG-1/eIF4G) protects against cadmium-induced splicing disruption. Transcriptome analysis of an ifg-1 deficient mutant revealed an overall increase in splicing fidelity and resistance towards cadmium-induced alternative splicing compared to the wild-type. We found that the ifg-1 mutant up-regulates >80 RNA splicing regulatory genes that are controlled by the TGF-β transcription factor SMA-2. The extended lifespan of the ifg-1 mutant is partially reduced upon sma-2 depletion and completely nullified when core spliceosome genes including snr-1, snr-2, and uaf-2 are knocked down. Together, these data describe a molecular mechanism that provides resistance towards stress-induced alternative splicing and demonstrate an essential role for RNA homeostasis in promoting longevity in a translation-compromised mutant.


2019 ◽  
Author(s):  
Xiaoyun Huang ◽  
Yue Song ◽  
Suyu Zhang ◽  
A Yunga ◽  
Mengqi Zhang ◽  
...  

Abstract Chelmon rostratus (Teleostei, Perciformes, Chaetodontidae) is the copperband butterflyfish. It is an ornamental fish, and genome information for this species might help in understanding the genome evolution of Chaetodontidae and the adaptation and evolution of coral reef fish. In this study, using stLFR co-barcode read data, we assembled a genome of 638.70 Mb in size with contig and scaffold N50 sizes of 294.41 kb and 2.61 Mb, respectively. 94.40% of scaffold sequences were assigned to 24 chromosomes using Hi-C data, and BUSCO analysis showed that 97.3% (2,579) of core genes were found in our assembly. Up to 21.47% of the genome was found to be repetitive sequences and 21,375 protein-coding genes were annotated. Among these annotated protein-coding genes, 20,163 (94.33%) proteins were assigned possible functions. As the first genome for the Chaetodontidae family, these data should help improve the understanding and exploration of marine ecological symbiosis with coral, as well as of the genomic innovations and molecular mechanisms contributing to the species' unique morphology and physiological features.
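For readers unfamiliar with the N50 statistic quoted above, here is a short sketch of the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the toy lengths are illustrative assumptions, and this is not the pipeline used in the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

For the toy lengths 2 through 10 (total 54), the three longest sequences (10, 9, 8) sum to 27, which is half the total, so the N50 is 8.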


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Daowei Li ◽  
Yue Tan

Abstract Background Although numerous risk loci for ulcerative colitis (UC) have been identified in the human genome, the pathogenesis of UC remains unclear. Recently, multiple transcriptomic analyses have shown that aberrant gene expression in the colon tissues of UC patients is associated with disease progression. A pioneering study also demonstrated that altered post-transcriptional regulation is involved in the progression of UC. Here, we provide a genome-wide analysis of alternative splicing (AS) signatures in UC patients. We analyzed three datasets containing 74 tissue samples from UC patients and identified over 2000 significant AS events. Results Skipped exon and alternative first exon were the two most significantly altered AS events in UC patients. The immune response-related pathways were remarkably enriched in the UC-related AS events. Genes with significant AS events were more likely to be dysregulated at the expression level. Conclusions We present a genomic landscape of AS events in UC patients based on a combined analysis of two cohorts. Our results indicate that dysregulation of AS may have a pivotal role in determining the pathogenesis of UC. In addition, our study uncovers genes with potential therapeutic implications for UC treatment.


2020 ◽  
Vol 13 (S11) ◽  
Author(s):  
Young-Joo Jin ◽  
Habtamu Minassie Aycheh ◽  
Seonggyun Han ◽  
John Chamberlin ◽  
Jaehang Shin ◽  
...  

Abstract Background Serum alpha-fetoprotein (AFP) is the approved serum marker for hepatocellular carcinoma (HCC) screening. However, not all HCC patients show high (≥ 20 ng/mL) serum AFP, and the molecular mechanisms of HCCs with normal (< 20 ng/mL) serum AFP remain to be elucidated. Therefore, we aimed to identify biological features of HCCs with normal serum AFP by investigating differential alternative splicing (AS) between HCCs with normal and high serum AFP. Methods We performed a genome-wide survey of AS events in 249 HCCs with normal (n = 131) and high (n = 118) serum AFP levels using RNA-sequencing data obtained from The Cancer Genome Atlas. Results In group comparisons of RNA-seq profiles from HCCs with normal and high serum AFP levels, 161 differential AS events (125 genes; ΔPSI > 0.05, FDR < 0.05) were identified between the two groups. Those genes were enriched in cell migration or proliferation terms such as “the cell migration and growth-cone collapse” and “regulation of insulin-like growth factor (IGF) transport and uptake by IGF binding proteins”. Most notably, two AS genes (FN1 and FAM20A) directly interact with AFP; these relate to the regulation of IGF transport and post-translational protein phosphorylation. Interestingly, 42 genes and 27 genes were associated with gender and vascular invasion (VI), respectively, but only eighteen genes were significant in survival analysis. We especially highlight that FN1 exhibited increased differential expression of AS events (ΔPSI > 0.05), in which exons 25 and 33 were more frequently skipped in HCCs with normal (low) serum AFP compared to those with high serum AFP. Moreover, these events were gender and VI dependent. Conclusion We found that AS may contribute to the transcriptional differences underlying HCCs that maintain normal rather than elevated serum AFP levels.
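The ΔPSI and FDR thresholds quoted above can be illustrated with a minimal sketch. It assumes the common simplified definition of percent spliced in (PSI) as inclusion reads divided by inclusion plus exclusion reads, without the length normalization that dedicated splicing tools usually apply. Treating the ΔPSI cutoff as an absolute value is also an assumption, and the function names and read counts are hypothetical rather than taken from the study.

```python
def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in for one event in one sample (simplified)."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # event not covered in this sample
    return inclusion_reads / total


def mean_psi(samples):
    """Mean PSI over (inclusion, exclusion) read pairs, skipping uncovered samples."""
    values = [v for v in (psi(i, e) for i, e in samples) if v is not None]
    if not values:
        raise ValueError("no sample covers this event")
    return sum(values) / len(values)


def delta_psi(group_a, group_b):
    """Difference in mean PSI between two sample groups (A minus B)."""
    return mean_psi(group_a) - mean_psi(group_b)


def is_differential(dpsi, fdr, min_dpsi=0.05, max_fdr=0.05):
    """Apply the cutoffs quoted in the abstract; the absolute value is an assumption."""
    return abs(dpsi) > min_dpsi and fdr < max_fdr
```

For example, an event with mean PSI 0.7 in one group and 0.4 in the other has a ΔPSI of 0.3. It passes the filter only if its FDR, computed separately by whatever statistical test is used, is also below 0.05.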


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. e15672-e15672
Author(s):  
Claudia Escher ◽  
Jakob Vowinckel ◽  
Karel Novy ◽  
Thomas Corwin ◽  
Tobias Treiber ◽  
...  

e15672 Background: The rise of precision oncology therapeutics requires deep understanding of all molecular mechanisms involved in cancer biology. IndivuType offers the world’s first multi-omics database for individualized cancer therapy, analyzing the highest quality cancer biospecimens to generate the most comprehensive dataset, including genomics (WGS), transcriptomics, proteomics, and clinical outcome information. Indivumed is committed to the quality of the IndivuType ecosystem starting with stringent SOP-driven sample collection combined with thorough validation of clinical information and data integrity. The availability of multi-omics data from the same tumor can provide a comprehensive molecular picture of cancer for a given patient. Protein expression and activation are directly related to cellular function and hence provide actionable information about druggable targets. Until recently, proteomics technology could not match the scale of next-gen sequencing and consequently precision medicine has almost exclusively been based on gene level data. Here we present the first large-scale data set for protein expression and phosphorylation. Enabled by the data independent acquisition (DIA) workflow, a mass spectrometric method provided by Biognosys that obtains peptide fragmentation data in a highly parallelized way with high sensitivity, more than 7,000 proteins in the whole proteome (WP) and 20,000 phospho-peptides in the phospho-proteome (PP) workflow were profiled. Methods: Sample processing from 5 mg of tissue per sample was performed using a liquid handling robot. Phospho-peptide enrichment was carried out with a Kingfisher Flex device and MagReSyn Ti-IMAC magnetic beads. DIA LC-MS/MS was performed on multiple platforms consisting of a Thermo Scientific Q Exactive HF-X mass spectrometer coupled to a Waters M-Class LC. Chromatography was operated at 5 µL/min, and separation was achieved using 45 min (WP) and 60 min (PP) gradients. Results: Several thousands of high-quality patient samples of various cancer types have been analyzed to date. The resulting proteome and phospho-proteome data have been integrated into the IndivuType database, thereby providing a solid foundation to advance our understanding of cancer. Conclusions: With the ongoing addition of more samples and associated deep and rich data, the platform could unravel key molecular events and is expected to transform knowledge into actionable treatments and personalized therapies.


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. e15536-e15536
Author(s):  
Jakob Vowinckel ◽  
Thomas Corwin ◽  
Jonathan Woodsmith ◽  
Tobias Treiber ◽  
Roland Bruderer ◽  
...  

e15536 Background: The rise of precision oncology therapeutics requires deep understanding of the molecular mechanisms implicated in cancer biology. Colorectal cancer (CRC) is one of the first solid tumors to be molecularly characterized by defined genes and pathways. Advances in tumor profiling have revealed a profound molecular heterogeneity in CRC leading to the definition of several consensus molecular subtypes (CMS). However, this molecular heterogeneity is still largely defined on the genomic and transcriptomic level. To complement the understanding of genetically defined molecular subgroups, we performed large-scale deep proteomic and phospho-proteomic profiling of CRC patient biopsies and adjacent healthy control tissue, which has enabled us to explore the phenotype and obtain more functional insights into cancer biology. Methods: Sample processing from 5-10 mg of tissue per sample was performed using a liquid handling robot. Phospho-peptide enrichment was carried out with a Kingfisher Flex device and MagReSyn Ti-IMAC magnetic beads. Data-Independent Acquisition (DIA) LC-MS/MS was performed on multiple platforms consisting of a Thermo Scientific Q Exactive HF-X mass spectrometer coupled to a Waters M-Class LC. Chromatography was operated at 5 µL/min, and separation was achieved using 45 min (whole proteome) and 60 min (phospho-proteome) gradients. Results: Indivumed has built IndivuType, the world’s first multi-omics database for individualized cancer therapy, analyzing the highest quality cancer biospecimens to generate the most comprehensive dataset, including genomics, transcriptomics, proteomics, and clinical outcome information. Enabled by the DIA technology, a mass spectrometric method developed by Biognosys that obtains peptide fragmentation data in a highly parallelized way with high sensitivity, more than 7,000 proteins in the whole proteome and 20,000 phospho-peptides in the phospho-proteome workflow were profiled across more than 900 resected tissue samples of various CMS of CRC. The resulting proteome and phospho-proteome data were integrated into the IndivuType database and cross-analyzed with genomic and transcriptomic markers. Through this combined analysis, novel insights into clinically relevant signaling pathways in CRC subtypes were revealed. Conclusions: The deep phenotypic profiling of cancer samples, using next generation proteomics and phospho-proteomics, has enabled us to go beyond the genomic level in the characterization of tumor molecular heterogeneity. This multi-omics approach provides a solid foundation to advance the understanding of cancer biology, unravel key molecular events, and support the identification of novel therapeutic targets for precision medicine in CRC.


2019 ◽  
Vol 12 (S8) ◽  
Author(s):  
Young-Joo Jin ◽  
Seyoun Byun ◽  
Seonggyun Han ◽  
John Chamberlin ◽  
Dongwook Kim ◽  
...  

Abstract Background Hepatitis B virus (HBV), hepatitis C virus (HCV), and alcohol consumption are predominant causes of hepatocellular carcinoma (HCC). However, the molecular mechanisms by which these causes differentially contribute to HCC development are not fully understood. Therefore, we investigated differential alternative splicing (AS) regulation among HCC patients with these risk factors. Methods We conducted a genome-wide survey of AS events in HCCs associated with HBV (n = 95), HCV (n = 47), or alcohol (n = 76) using RNA-sequencing data obtained from The Cancer Genome Atlas. Results In three group comparisons of RNA-seq data (HBV vs. HCV, HBV vs. alcohol, and HCV vs. alcohol; ΔPSI > 0.05, FDR < 0.05), 133, 93, and 29 differential AS events (143 genes) were identified, respectively. Of the 143 AS genes, eight genes and one gene showed alternative splicing specific to HBV and HCV, respectively. Through functional analysis over the canonical pathways and gene ontologies, we identified significantly enriched pathways in the 143 AS genes including immune system, mRNA splicing-major pathway, and nonsense-mediated decay, which may be important to carcinogenesis associated with HCC risk factors. Among the eight genes with HBV-specific splicing events, HLA-A, HLA-C, and IP6K2 exhibited more pronounced differential AS events (ΔPSI > 0.1). Intron retention of HLA-A was observed more frequently in HBV-associated HCC than in HCV- or alcohol-associated HCC, and intron retention of HLA-C showed the opposite pattern. Exon 3 (based on ENST00000432678) of IP6K2 was skipped less often in HBV-associated HCC than in HCV- or alcohol-associated HCC. Conclusion AS may play an important role in regulating transcription differences implicated in HBV-, HCV-, and alcohol-related HCC development.


Nutrients ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 1771
Author(s):  
Saivageethi Nuthikattu ◽  
Dragan Milenkovic ◽  
John C. Rutledge ◽  
Amparo C. Villablanca

The Western diet (WD) and hyperlipidemia are risk factors for vascular disease, dementia, and cognitive impairment. However, the molecular mechanisms are poorly understood. This pilot study investigated the genomic pathways by which the WD and hyperlipidemia regulate gene expression in brain microvessels. Five-week-old C57BL/6J wild type (WT) control and low-density lipoprotein receptor deficient (LDL-R−/−) male mice were fed the WD for eight weeks. Differential gene expression, gene networks and pathways, transcription factors, and non-protein coding RNAs were evaluated by a genome-wide microarray and bioinformatics analysis of laser-captured hippocampal microvessels. The WD resulted in the differential expression of 1972 genes. Many of the differentially expressed genes (DEGs) were attributable to the differential regulation of cell signaling proteins and their transcription factors, approximately 4% were attributable to the differential expression of miRNAs, and 10% were due to other non-protein coding RNAs, primarily long non-coding RNAs (lncRNAs) and small nucleolar RNAs (snoRNAs) not previously described to be modified by the WD. Lipotoxic injury resulted in complex and multilevel molecular regulation of the hippocampal microvasculature involving transcriptional and post-transcriptional regulation and may provide a molecular basis for a better understanding of hyperlipidemia-associated dementia risk.

