Management and Analysis of Mass Spectrometry Proteomics Data on the Grid

Author(s):  
Mario Cannataro ◽  
Pietro Hiram Guzzi ◽  
Giuseppe Tradigo ◽  
Pierangelo Veltri

Recent advances in high-throughput technologies for analysing biological samples have enabled researchers to collect huge amounts of data. In particular, mass spectrometry-based proteomics uses mass spectrometry to investigate the proteins expressed in an organism or a cell. Manual inspection of spectra is unfeasible, so algorithms, tools and platforms are needed to manage and analyze them. Computational proteomics concerns the computational methods for analyzing spectra data in qualitative proteomics (i.e. peptide/protein identification in tandem mass spectrometry) and quantitative proteomics (i.e. protein expression in samples), as well as in biomarker discovery (i.e. the identification of a molecular signature of a disease directly from spectra). This chapter presents the main standards, tools, and technologies for building scalable, reusable, and portable applications in this field. It surveys available solutions for computational proteomics and includes a detailed description of MS-Analyzer, a Grid-based software platform for the integrated management and analysis of spectra data. MS-Analyzer provides efficient spectra management through a specialized spectra database, and supports the semantic composition of pre-processing and data mining services to analyze spectra on the Grid.

Author(s):  
Rocco J. Rotello ◽  
Timothy D. Veenstra

In the current omics age of research, major developments have been made in technologies that attempt to survey the entire repertoire of genes, transcripts, proteins, and metabolites present within a cell. While genomics has led to a dramatic increase in our understanding of such things as disease morphology and how organisms respond to medications, it is critical to obtain information at the proteome level, since proteins carry out most of the functions within the cell. The primary tool for obtaining proteome-wide information on proteins within the cell is mass spectrometry (MS). While it has historically been associated with protein identification, developments over the past couple of decades have made MS a robust technology for protein quantitation as well. Identifying quantitative changes in proteomes is complicated by their dynamic nature and by the inability of any technique to guarantee complete coverage of every protein within a proteome sample. Fortunately, the combined development of sample preparation and MS methods has made it possible to quantitatively compare many thousands of proteins obtained from cells and organisms.


2018 ◽  
Vol 1 (1) ◽  
pp. 207-234 ◽  
Author(s):  
Pavel Sinitcyn ◽  
Jan Daniel Rudolph ◽  
Jürgen Cox

Computational proteomics is the data science concerned with the identification and quantification of proteins from high-throughput data and the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. Today, these data most often originate from mass spectrometry–based shotgun proteomics experiments. In this review, we survey computational methods for the analysis of such proteomics data, focusing on the explanation of the key concepts. Starting with mass spectrometric feature detection, we then cover methods for the identification of peptides. Subsequently, protein inference and the control of false discovery rates are highly important topics covered. We then discuss methods for the quantification of peptides and proteins. A section on downstream data analysis covers exploratory statistics, network analysis, machine learning, and multiomics data integration. Finally, we discuss current developments and provide an outlook on what the near future of computational proteomics might bear.


2020 ◽  
Vol 21 (8) ◽  
pp. 2873 ◽  
Author(s):  
Chen Chen ◽  
Jie Hou ◽  
John J. Tanner ◽  
Jianlin Cheng

Recent advances in mass spectrometry (MS)-based proteomics have enabled tremendous progress in the understanding of cellular mechanisms, disease progression, and the relationship between genotype and phenotype. Though many popular bioinformatics methods in proteomics are derived from other omics studies, novel analysis strategies are required to deal with the unique characteristics of proteomics data. In this review, we discuss the current developments in the bioinformatics methods used in proteomics and how they facilitate the mechanistic understanding of biological processes. We first introduce bioinformatics software and tools designed for mass spectrometry-based protein identification and quantification, and then we review the different statistical and machine learning methods that have been developed to perform comprehensive analysis in proteomics studies. We conclude with a discussion of how quantitative protein data can be used to reconstruct protein interactions and signaling networks.


2008 ◽  
Vol 33 (1) ◽  
pp. 12-17 ◽  
Author(s):  
Peter Matt ◽  
Zongming Fu ◽  
Qin Fu ◽  
Jennifer E. Van Eyk

Proteomics, analogous with genomics, is the analysis of the protein complement present in a cell, organ, or organism at any given time. While the genome provides information about the theoretical status of the cellular proteins, the proteome describes the actual content, which ultimately determines the phenotype. The broad application of proteomic technologies in basic science and clinical medicine has the potential to accelerate our understanding of the molecular mechanisms underlying disease and may facilitate the discovery of new drug targets and diagnostic disease markers. Proteomics is a rapidly developing and changing scientific discipline, and the last 5 yr have seen major advances in the underlying techniques as well as expansion into new applications. Core technologies for the separation of proteins and/or peptides are one- and two-dimensional gel electrophoresis and one- and two-dimensional liquid chromatography, and these are coupled almost exclusively with mass spectrometry. Proteomic studies have shown that the most effective analysis of even simple biological samples requires subfractionation and/or enrichment before protein identification by mass spectrometry. Selection of the appropriate technology or combination of technologies to match the biological questions is essential for maximum coverage of the selected subproteome and to ensure both the full interpretation and the downstream utility of the data. In this review, we describe the current technologies for proteome fractionation and separation of biological samples, based on our lab workflow for biomarker discovery and validation.


Author(s):  
Vu Anh Le ◽  
Cam Quyen Thi Phan ◽  
Thuy Huong Nguyen

The post-genomic era consists of experimental and computational efforts to meet the challenge of clarifying and understanding the function of genes and their products. Proteomic studies play a key role in this endeavour by complementing other functional genomics approaches. They encompass the large-scale analysis of complex mixtures, including the identification and quantification of proteins expressed under different conditions and the determination of their properties, modifications and functions. Understanding how biological processes are regulated at the protein level is crucial to understanding the molecular basis of diseases and often informs their prevention, diagnosis and treatment. High-throughput technologies are widely used in proteomics to analyze thousands of proteins. Specifically, mass spectrometry (MS) is an analytical technique for characterizing biological samples and is increasingly used in protein studies because of its targeted, nontargeted, and high-performance capabilities. However, as large data sets are created, computational methods such as data mining techniques are required to analyze and interpret the relevant data. More specifically, applying data mining techniques to large proteomic data sets can assist in many interpretations of the data: it can reveal protein-protein interactions, improve protein identification, evaluate the experimental methods used, and facilitate diagnosis and biomarker discovery. With the rapid advances in mass spectrometry devices and experimental methodologies, MS-based proteomics has become a reliable and necessary tool for elucidating biological processes at the protein level. Over the past decade, we have witnessed a great expansion of our knowledge of human diseases with the adoption of MS-based proteomic technologies, which has led to many interesting discoveries. Here, we review recent advances in data mining for MS-based proteomics in biomedical research.
Recent research in many fields shows that proteomics goes beyond the simple classification of proteins in biological systems and finally reaches its initial potential – as an essential tool to aid related disciplines, notably biomedical research. From here, there is great potential for data mining in MS-based proteomics to move beyond basic research, into clinical research and diagnostics.


2008 ◽  
Vol 3 ◽  
pp. BMI.S689 ◽  
Author(s):  
Mamoun Ahram ◽  
Emanuel F. Petricoin

Recent technological developments in proteomics have shown promise in identifying novel biomarkers of various diseases. Such technologies are capable of investigating multiple samples and generating large amounts of data end-points. Two examples of promising proteomics technologies are mass spectrometry, including an instrument based on surface enhanced laser desorption/ionization, and protein microarrays. Proteomics data must, however, undergo analytical processing using bioinformatics. Due to limitations in proteomics tools, including shortcomings in bioinformatics analysis, predictive bioinformatics can be utilized as an alternative strategy prior to performing elaborate, high-throughput proteomics procedures. This review describes mass spectrometry, protein microarrays, and bioinformatics and their roles in biomarker discovery, and highlights the significance of integration between proteomics and bioinformatics.


2009 ◽  
Vol 8 (11) ◽  
pp. 2405-2417 ◽  
Author(s):  
Lukas Reiter ◽  
Manfred Claassen ◽  
Sabine P. Schrimpf ◽  
Marko Jovanovic ◽  
Alexander Schmidt ◽  
...  

2012 ◽  
Vol 26 (1) ◽  
pp. 41-47 ◽  
Author(s):  
Nai-Jun Fan ◽  
Chun-Fang Gao ◽  
Chang-Song Wang ◽  
Jing-Jing Lv ◽  
Guang Zhao ◽  
...  

Despite the wide range of available colorectal cancer (CRC) screening tests, less than 50% of cases are detected at early stages. However, the identification of differentially expressed proteins or novel protein biomarkers in CRC may have some utility and, ultimately, improve patient care and survival. Proteomics combined with mass spectrometry and liquid chromatography is emerging as a powerful tool that has led to the discovery of potential biomarkers in several types of cancer. This article describes a novel technology that uses isotopic reagents to tag selected proteins that show a consistent pattern of differential expression in CRC.
OBJECTIVE: To identify and validate potential biomarkers of colorectal adenocarcinoma using a proteomic approach.
METHODS: Multidimensional liquid chromatography/mass spectrometry was used to analyze biological samples labelled with isobaric mass tags for relative and absolute quantitation to identify differentially expressed proteins in human colorectal adenocarcinoma and paired normal mucosa for the discovery of cancer biomarkers. Cancerous and noncancerous samples were compared using online and offline separation. Protein identification was performed using mass spectrometry. The downregulation of gelsolin protein in colorectal adenocarcinoma samples was confirmed by Western blot analysis and validated using immunohistochemistry.
RESULTS: A total of 802 nonredundant proteins were identified in colorectal adenocarcinoma samples, 82 of which fell outside the expression range of 0.8 to 1.2 and were considered to be potential cancer-specific proteins. Immunohistochemistry revealed a complete absence of gelsolin expression in 86.89% of samples and a reduction of expression in 13.11% of samples, yielding a sensitivity of 86.89% and a specificity of 100% for distinguishing colorectal adenocarcinoma from normal tissue.
CONCLUSIONS: These findings suggest that decreased expression of gelsolin is a potential biomarker of colorectal adenocarcinoma.
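
The two numerical steps in the results above (flagging proteins whose expression ratio lies outside 0.8 to 1.2, and deriving sensitivity and specificity) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the protein names and ratios, and the sample counts are hypothetical. The abstract does not report sample counts; the counts below were chosen only because they reproduce the reported percentages.

```python
def outside_range(ratios, low=0.8, high=1.2):
    """Return names of proteins whose tumour/normal ratio lies outside [low, high]."""
    return [name for name, ratio in ratios.items() if ratio < low or ratio > high]


def sensitivity(true_pos, false_neg):
    """Fraction of diseased samples that the marker flags correctly."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of normal samples that the marker leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical expression ratios, not data from the study.
ratios = {"protein_A": 0.45, "protein_B": 1.05, "protein_C": 1.60, "protein_D": 0.80}
candidates = outside_range(ratios)

# Hypothetical counts: 53 tumour samples with complete loss of staining,
# 8 with only reduced staining, and 61 normal samples that all stain positive.
sens = sensitivity(true_pos=53, false_neg=8)
spec = specificity(true_neg=61, false_pos=0)

print(candidates)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
```

In this sketch a ratio exactly on the boundary (0.80) counts as inside the range; the abstract does not say how boundary values were treated.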


2018 ◽  
Author(s):  
Eric J. Verbeke ◽  
Anna L. Mallam ◽  
Kevin Drew ◽  
Edward M. Marcotte ◽  
David W. Taylor

Summary: Multi-protein complexes are necessary for nearly all cellular processes, and understanding their structure is required for elucidating their function. Current high-resolution strategies in structural biology are effective, but lag behind other fields (e.g. genomics and proteomics) due to their reliance on purified samples rather than characterizing heterogeneous mixtures. Here, we present a method combining single particle analysis by electron microscopy with protein identification by mass spectrometry to structurally characterize macromolecular complexes from extracts of human cells. We obtain three-dimensional structures of native proteasomes directly from ab initio classification of a heterogeneous mixture of protein complexes. In addition, we find an ~1 MDa size structure of unknown composition and reference our proteomics data to suggest possible identities. Our study shows the power of using a shotgun approach to electron microscopy (shotgun EM) when coupled with mass spectrometry as a tool to uncover the structures of macromolecular machines in parallel.

