Reinspection of a Clinical Proteomics Tumor Analysis Consortium (CPTAC) Dataset with Cloud Computing Reveals Abundant Post-Translational Modifications and Protein Sequence Variants

Cancers ◽  
2021 ◽  
Vol 13 (20) ◽  
pp. 5034
Author(s):  
Amol Prakash ◽  
Lorne Taylor ◽  
Manu Varkey ◽  
Nate Hoxie ◽  
Yassene Mohammed ◽  
...  

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has provided some of the most in-depth analyses of the phenotypes of human tumors ever constructed. Today, the majority of proteomic data analysis is still performed using software housed on desktop computers, which limits the number of sequence variants and post-translational modifications (PTMs) that can be considered. The original CPTAC studies limited the search for PTMs to samples that were chemically enriched for those modified peptides. Similarly, the only sequence variants considered were those with strong evidence at the exon or transcript level. In this multi-institutional collaborative reanalysis, we utilized unbiased protein databases containing millions of human sequence variants in conjunction with hundreds of common post-translational modifications. Using these tools, we identified tens of thousands of high-confidence PTMs and sequence variants. We identified 4132 phosphorylated peptides in nonenriched samples, 93% of which were confirmed in the samples that were chemically enriched for phosphopeptides. In addition, our results cover 90% of the high-confidence variants reported by the original proteogenomics study, without the need for sample-specific next-generation sequencing. Finally, we report fivefold more somatic and germline variants that have independent evidence at the peptide level, including mutations in ERBB2 and BCAS1. In this reanalysis of CPTAC proteomic data with cloud computing, we present an openly available and searchable web resource of the highest-coverage proteomic profiling of human tumors described to date.

2015 ◽  
Vol 14 (8) ◽  
pp. 3292-3304 ◽  
Author(s):  
Ningning Liu ◽  
Yun Xiong ◽  
Yiran Ren ◽  
Linlin Zhang ◽  
Xianfei He ◽  
...  

2021 ◽  
Author(s):  
Dmitry Tikhonov ◽  
Liudmila Kulikova ◽  
Arthur T. Kopylov ◽  
Vladimir Rudnev ◽  
Alexander Stepanov ◽  
...  

Abstract Post-translational processing leads to conformational changes in protein structure that modulate molecular functions and change the signature of metabolic transformations and immune responses. Some post-translational modifications (PTMs), such as phosphorylation and acetylation, are strongly related to oncogenic processes and malignancy. This study investigated PTM patterns in patients with sex-specific cancers (ovarian or breast cancer). Proteomic profiling and analysis of cancer-specific PTM patterns were performed using high-resolution UPLC-MS/MS. The structure, topology, and stability of PTMs associated with sex-specific cancers were analyzed using molecular dynamics modeling. We identified highly specific PTMs, including 12 modified peptides from eight distinct proteins derived from patients with ovarian cancer and 6 peptides from three proteins that favored patients in the breast cancer group. We found that all defined PTMs were localized in the compact and stable structural motifs exposed outside the solvent environment. PTMs increase the solvent-accessible surface area of the modified moiety and its active environment. The observed conformational changes are still inadequate to activate structural degradation and enhance protein elimination/clearance; however, they are sufficient for significant modulation of protein activity.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7046 ◽  
Author(s):  
Jacob M. Wozniak ◽  
David J. Gonzalez

Background: Mass-spectrometry-based proteomics is a prominent field of study that allows for the unbiased quantification of thousands of proteins from a particular sample. A key advantage of these techniques is the ability to detect protein post-translational modifications (PTMs) and localize them to specific amino acid residues. These approaches have led to many significant findings in a wide range of biological disciplines, from developmental biology to cancer and infectious diseases. However, there is a current lack of tools available to connect raw PTM site information to biologically meaningful results in a high-throughput manner. Furthermore, many of the available tools require significant programming knowledge to implement. Results: The R package PTMphinder was designed to enable researchers, particularly those with minimal programming background, to thoroughly analyze PTMs in proteomic data sets. The package contains three functions: parseDB, phindPTMs and extractBackground. Together, these functions allow users to reformat proteome databases for easier analysis, localize PTMs within full proteins, extract motifs surrounding the identified sites and create proteome-specific motif backgrounds for statistical purposes. Beta-testing of this R package has demonstrated its simplicity and ease of integration with existing tools. Conclusion: PTMphinder empowers researchers to fully analyze and interpret PTMs derived from proteomic data. This package is simple enough for researchers with limited programming experience to understand and implement. The data produced from this package can inform subsequent research by itself and also be used in conjunction with other tools, such as motif-x, for further analysis.
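The two core steps described in this abstract, localizing a PTM within the full protein and extracting the motif around the site, can be illustrated with a minimal Python sketch. This is not PTMphinder's code or API (the package is written in R); the function names `localize_ptm` and `extract_motif`, the padding character, and the example sequence are all hypothetical, and the sketch assumes the peptide occurs only once in the protein.

```python
def localize_ptm(protein_seq, peptide, site_in_peptide):
    """Return the 1-based protein position of a modified residue, or None.

    `site_in_peptide` is the 1-based index of the modified residue within
    the peptide. Only the first occurrence of the peptide is considered.
    """
    start = protein_seq.find(peptide)  # 0-based start of the peptide
    if start == -1:
        return None
    return start + site_in_peptide


def extract_motif(protein_seq, position, flank=7, pad="_"):
    """Return the residues within `flank` of a 1-based site.

    Windows that run past either terminus are padded with `pad` so that
    every motif has length 2 * flank + 1.
    """
    idx = position - 1
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    return left.rjust(flank, pad) + protein_seq[idx] + right.ljust(flank, pad)


# Hypothetical protein and peptide; the serine is the 3rd peptide residue.
protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
position = localize_ptm(protein, "QISFVK", 3)
motif = extract_motif(protein, position, flank=4)
print(position, motif)
```

Fixed-width, padded motifs are convenient because downstream motif-enrichment tools generally expect aligned windows of equal length centered on the modified residue.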


2019 ◽  
Author(s):  
Pierre-Alain Binz ◽  
Jim Shofstahl ◽  
Juan Antonio Vizcaíno ◽  
Harald Barsnes ◽  
Robert J. Chalkley ◽  
...  

Abstract Mass spectrometry-based proteomics enables the high-throughput identification and quantification of proteins, including sequence variants and post-translational modifications (PTMs), in biological samples. However, most workflows require that such variations be included in the search space used to analyze the data, and doing so remains challenging with most analysis tools. In order to facilitate the search for known sequence variants and PTMs, the Proteomics Standards Initiative (PSI) has designed and implemented the PSI Extended FASTA Format (PEFF). PEFF is based on the very popular FASTA format but adds a uniform mechanism for encoding substantially more metadata about the sequence collection as well as individual entries, including support for encoding known sequence variants, PTMs, and proteoforms. The format is very nearly backwards compatible, and as such, existing FASTA parsers will require little or no changes to be able to read PEFF files as FASTA files, although without supporting any of the extra capabilities of PEFF. PEFF is defined by a full specification document, controlled vocabulary terms, a set of example files, software libraries, and a file validator. Popular software and resources are starting to support PEFF, including the sequence search engine Comet and the knowledge bases neXtProt and UniProtKB. Widespread implementation of PEFF is expected to further enable proteogenomics and top-down proteomics applications by providing a standardized mechanism for encoding protein sequences and their known variations. All the related documentation, including the detailed file format specification and example files, is available at http://www.psidev.info/peff.
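To make the "FASTA plus metadata" idea concrete, here is a minimal Python sketch of reading a PEFF-like file. It rests on assumptions that should be checked against the official specification: that file-level header lines start with `#`, that per-entry annotations take the form `\Key=value`, and that variant and modification values are parenthesized, pipe-delimited items such as `(12|A)`. The example entry, its identifier, and the helper names `parse_peff_header`, `parse_items`, and `read_peff` are hypothetical, and this is not one of the PSI software libraries.

```python
import re


def parse_peff_header(line):
    """Split a PEFF-style description line into an identifier and a key dict."""
    if not line.startswith(">"):
        raise ValueError("not a header line")
    ident, _, rest = line[1:].strip().partition(" ")
    fields = {}
    # Assumed layout: "\Key=value", with each value running to the next " \Key=".
    for key, value in re.findall(r"\\(\w+)=(.*?)(?=\s\\\w+=|$)", rest):
        fields[key] = value.strip()
    return ident, fields


def parse_items(value):
    """Turn "(a|b)(c|d)" into [["a", "b"], ["c", "d"]]."""
    return [item.split("|") for item in re.findall(r"\(([^()]*)\)", value)]


def read_peff(lines):
    """Yield (identifier, fields, sequence), skipping '#' file-header lines."""
    ident, fields, seq = None, None, []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(">"):
            if ident is not None:
                yield ident, fields, "".join(seq)
            ident, fields = parse_peff_header(line)
            seq = []
        else:
            seq.append(line)
    if ident is not None:
        yield ident, fields, "".join(seq)


# Hypothetical two-line entry preceded by an assumed file-header line.
example = [
    "# PEFF 1.0",
    r">nxp:NX_EXAMPLE-1 \PName=Example protein \VariantSimple=(12|A)(45|T)"
    r" \ModResPsi=(30|MOD:00046|O-phospho-L-serine)",
    "MKTAYIAKQRQ",
    "ISFVKSHFSRQ",
]
for ident, fields, sequence in read_peff(example):
    print(ident, fields["PName"], parse_items(fields["VariantSimple"]), len(sequence))
```

A plain FASTA reader would treat everything after the identifier as a free-text description, which is why such a reader can still load the sequences while ignoring the variant and modification annotations.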


2022 ◽  
Author(s):  
Ksenia G Kuznetsova ◽  
Sofia S Zvonareva ◽  
Rustam Ziganshin ◽  
Elena S Mekhova ◽  
Polina Yu Dgebuadze ◽  
...  

Venoms of predatory marine cone snails (the family Conidae, order Neogastropoda) are intensely studied because of the broad range of biomedical applications of the neuropeptides that they contain, conotoxins. Meanwhile, anatomy in some other neogastropod lineages strongly suggests that they have evolved similar venoms independently of cone snails; nevertheless, their venom composition remains unstudied. Here we focus on the most diversified of these lineages, the genus Vexillum (the family Costellariidae). We have generated comprehensive multi-specimen, multi-tissue RNA-Seq data sets for three Vexillum species, and supported our findings in two species by proteomic profiling. We show that venoms of Vexillum are dominated by highly diversified short cysteine-rich peptides that in many aspects are very similar to conotoxins. Vexitoxins possess the same precursor organization, display overlapping cysteine frameworks and share several common post-translational modifications with conotoxins. Some vexitoxins show detectable sequence similarity to conotoxins, and are predicted to adopt similar domain conformations, including a pharmacologically relevant inhibitory cysteine-knot motif (ICK). The tubular gL of Vexillum is a notably more recent evolutionary novelty than the conoidean venom gland. Thus, we hypothesize lower divergence between the toxin genes and their somatic counterparts compared to that in conotoxins, and we find support for this hypothesis in the molecular evolution of the vexitoxin cluster V027. We use this example to discuss how future studies on vexitoxins can inform the origin and evolution of conotoxins, and how they may help address standing questions in venom evolution.


2019 ◽  
Vol 48 (D1) ◽  
pp. D1136-D1144 ◽  
Author(s):  
Xinhao Shao ◽  
Isra N Taha ◽  
Karl R Clauser ◽  
Yu (Tom) Gao ◽  
Alexandra Naba

Abstract The extracellular matrix (ECM) is a complex and dynamic meshwork of cross-linked proteins that supports cell polarization and functions and tissue organization and homeostasis. Over the past few decades, mass-spectrometry-based proteomics has emerged as the method of choice to characterize the composition of the ECM of normal and diseased tissues. Here, we present a new release of MatrisomeDB, a searchable collection of curated proteomic data from 17 studies on the ECM of 15 different normal tissue types, six cancer types (different grades of breast cancers, colorectal cancer, melanoma, and insulinoma) and other diseases including vascular defects and lung and liver fibroses. MatrisomeDB (http://www.pepchem.org/matrisomedb) was built by retrieving raw mass spectrometry data files and reprocessing them using the same search parameters and criteria to allow for a more direct comparison between the different studies. The present release of MatrisomeDB includes 847 human and 791 mouse ECM proteoforms and over 350 000 human and 600 000 mouse ECM-derived peptide-to-spectrum matches. For each query, a hierarchically clustered tissue distribution map, a peptide coverage map, and a list of identified post-translational modifications are generated. MatrisomeDB is the most complete collection of ECM proteomic data to date and allows the building of a comprehensive ECM atlas.
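A peptide coverage map of the kind mentioned in this abstract can be approximated by marking which residues of a protein are spanned by at least one identified peptide. The Python sketch below is a generic illustration under that assumption, not MatrisomeDB's implementation; the function names and the example sequence are hypothetical, and peptides are matched by exact substring search only.

```python
def coverage_mask(protein_seq, peptides):
    """Return a list of booleans, True where a residue is covered by a peptide."""
    covered = [False] * len(protein_seq)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein_seq.find(pep)
        while start != -1:  # mark every occurrence, including overlapping ones
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein_seq.find(pep, start + 1)
    return covered


def coverage_fraction(protein_seq, peptides):
    """Return the fraction of residues covered by at least one peptide."""
    mask = coverage_mask(protein_seq, peptides)
    return sum(mask) / len(mask) if mask else 0.0


# Hypothetical 10-residue protein with three identified peptides.
protein = "MKTAYIAKQR"
peptides = ["MKT", "AKQR", "KTA"]
mask = coverage_mask(protein, peptides)
print("".join("#" if c else "-" for c in mask), coverage_fraction(protein, peptides))
```

Working from a per-residue mask, as opposed to summing peptide lengths, avoids double-counting residues shared by overlapping peptides.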


2009 ◽  
Vol 28 (4) ◽  
pp. 223-234 ◽  
Author(s):  
Harald Mischak ◽  
Eric Schiffer ◽  
Petra Zürbig ◽  
Mohammed Dakna ◽  
Jochen Metzger

Urinary Proteome Analysis using Capillary Electrophoresis Coupled to Mass Spectrometry: A Powerful Tool in Clinical Diagnosis, Prognosis and Therapy Evaluation

Proteome analysis has emerged as a powerful tool to decipher (patho)physiological processes, resulting in the establishment of the field of clinical proteomics. One of the main goals is to discover biomarkers for diseases from tissues and body fluids. Due to the enormous complexity of the proteome, a separation step is required for mass spectrometry (MS)-based proteome analysis. In this review, the advantages and limitations of protein separation by two-dimensional gel electrophoresis, liquid chromatography, surface-enhanced laser desorption/ionization and capillary electrophoresis (CE) for proteomic analysis are described, focusing on CE-MS. CE-MS enables separation and detection of the small molecular weight proteome in biological fluids with high reproducibility and accuracy in a single processing step and in a short time. As sensitive and specific single biomarkers generally may not exist, a strategy to overcome this diagnostic void is shifting from single-analyte detection to simultaneous analysis of multiple analytes that together form a disease-specific pattern. Such approaches, however, are accompanied by additional challenges, which we will outline in this review. Besides the choice of adequate technological platforms, a high level of standardization of proteomic measurements and data processing is also necessary to establish proteomic profiling. In this regard, demands concerning study design, choice of specimens, sample preparation, proteomic data mining, and clinical evaluation should be considered before performing a proteomic study.


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii470-iii470
Author(s):  
Francesca Petralia ◽  
Nicole Tignor ◽  
Dmitri Rykunov ◽  
Boris Revas ◽  
Shrabanti Chowdhury ◽  
...  

Abstract We performed a comprehensive proteogenomic analysis across seven childhood brain tumors for a deeper understanding of their functional biology. Whole genome sequencing, RNAseq, quantitative proteomic profiling and phosphoproteomics were performed on 219 fresh frozen tumor samples representing the histologic diagnoses of: low grade astrocytoma (93), ependymoma (32), high grade astrocytoma (26), medulloblastoma (22), ganglioglioma (18), craniopharyngioma (16) and atypical teratoid rhabdoid tumor (12). Unsupervised clustering analysis based on proteomics data reveals eight clusters with distinct protein profiles and pathway activities. While some clusters coincide with histologic diagnoses, a couple of clusters appear to be a mixture of different diagnoses, including one cluster consisting of “aggressive” tumors characterized by poor survival and high stemness scores. By integrating proteomic data with RNAseq and WGS data, we characterize the impact of mutations (H3K27M, BRAFV600E, BRAF fusion) and CNVs upon the proteome across various diagnoses. Multiomics based kinase-substrate association analysis and co-expression network analysis reveal targetable active kinase networks within these tumors. Proteomic data reveals unique biology associated with H3K27M mutation status in HGG and BRAF aberrations in LGG. Characterization of the tumor microenvironment through deconvolution analyses based on multi-omics data reveals 5 distinct tumor clusters associated with different populations of infiltrating immune cells and the relative activity of the immune system based upon the expression of pro-inflammation or immunosuppressive markers. This study reports the first large-scale deep comprehensive proteogenomic analysis crossing traditional histologic boundaries to uncover foundational pediatric brain tumor biology including functional insight that helps drive translational efforts.

