Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data

2018 ◽  
Vol 1 (1) ◽  
pp. 207-234 ◽  
Author(s):  
Pavel Sinitcyn ◽  
Jan Daniel Rudolph ◽  
Jürgen Cox

Computational proteomics is the data science concerned with the identification and quantification of proteins from high-throughput data and the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. Today, these data most often originate from mass spectrometry–based shotgun proteomics experiments. In this review, we survey computational methods for the analysis of such proteomics data, focusing on the explanation of the key concepts. Starting with mass spectrometric feature detection, we then cover methods for the identification of peptides. Subsequently, protein inference and the control of false discovery rates are highly important topics covered. We then discuss methods for the quantification of peptides and proteins. A section on downstream data analysis covers exploratory statistics, network analysis, machine learning, and multiomics data integration. Finally, we discuss current developments and provide an outlook on what the near future of computational proteomics might bear.
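
As an illustration of the false discovery rate control mentioned in this abstract, the following is a minimal sketch that assumes a target-decoy strategy, which the abstract itself does not name. The function name `target_decoy_qvalues` and the `(score, is_decoy)` input layout are hypothetical choices made for this example, not taken from the review.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    psms: List[Tuple[float, bool]]
) -> List[Tuple[float, bool, float]]:
    """Estimate q-values for peptide-spectrum matches (PSMs).

    Assumption: each PSM is (score, is_decoy), with higher scores being better,
    and decoy hits model the distribution of false target hits.
    Returns the PSMs sorted by descending score, each with its q-value.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys / targets accepted so far.
    fdrs = []
    targets = 0
    decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this threshold or any more permissive one,
    # which makes the values monotone along the ranked list.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvalues)]


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    for score, is_decoy, q in target_decoy_qvalues(example):
        print(score, "decoy" if is_decoy else "target", q)
```

Accepting all target PSMs with a q-value at or below a chosen cutoff (0.01 is a common convention) then gives a list with an estimated FDR at that level under the stated assumption.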

2019 ◽  
Vol 16 (2) ◽  
pp. 516-520
Author(s):  
B. Kalaiselvi ◽  
M. Thangamani

Computational proteomics is the data science used for the identification and quantification of proteins from high-throughput data and for the interpretation of their concentration changes, interactions, posttranslational modifications and cellular localizations. Making full use of high-quality mass spectrometry data requires understanding the different sources of unassigned high-quality spectra. An iterative computational method interrogates mass spectrometry protein data with high efficiency. The approach combines several database searches with different search parameters, spectral library searching, blind searches for modified peptides and genomic database searching. The paper analyzes proteomics data with a focus on explaining the key concepts: mass spectral feature detection, peptide identification, protein inference and control of the false discovery rate. It then discusses the quantification of peptides and proteins and downstream data analysis with machine learning, network analysis and multiomics integration of protein data, and finally discusses the future of computational proteomics.


Author(s):  
Mario Cannataro ◽  
Pietro Hiram Guzzi ◽  
Giuseppe Tradigo ◽  
Pierangelo Veltri

Recent advances in high-throughput technologies for analysing biological samples have enabled researchers to collect huge amounts of data. In particular, mass spectrometry-based proteomics uses mass spectrometry to investigate the proteins expressed in an organism or a cell. Manual inspection of spectra is unfeasible, so a set of algorithms, tools and platforms is needed to manage and analyze them. Computational proteomics concerns the computational methods for analyzing spectra data in qualitative proteomics (i.e. peptide/protein identification in tandem mass spectrometry) and quantitative proteomics (i.e. protein expression in samples), as well as in biomarker discovery (i.e. the identification of a molecular signature of a disease directly from spectra). This chapter presents the main standards, tools, and technologies for building scalable, reusable, and portable applications in this field. The chapter surveys available solutions for computational proteomics and includes a detailed description of MS-Analyzer, a Grid-based software platform for the integrated management and analysis of spectra data. MS-Analyzer provides efficient spectra management through a specialized spectra database, and supports the semantic composition of pre-processing and data mining services to analyze spectra on the Grid.


2019 ◽  
Author(s):  
Yasset Perez-Riverol ◽  
Pablo Moreno

The recent improvements in mass spectrometry instruments and new analytical methods are increasing the intersection between proteomics and big data science. In addition, bioinformatics analysis is becoming an increasingly complex and convoluted process involving multiple algorithms and tools. A wide variety of methods and software tools have been developed for computational proteomics and metabolomics in recent years, and this trend is likely to continue. However, most computational proteomics and metabolomics tools are designed as single desktop applications, which limits the scalability and reproducibility of the data analysis. In this paper we give an overview of the key steps of metabolomics and proteomics data processing, including the main tools and software used to perform the data analysis. We discuss the combination of software containers with workflow environments for large-scale metabolomics and proteomics analysis. Finally, we introduce to the proteomics and metabolomics communities a new approach for reproducible and large-scale data analysis based on BioContainers and two of the most popular workflow environments: Galaxy and Nextflow.


2019 ◽  
Author(s):  
Nikita Prianichnikov ◽  
Heiner Koch ◽  
Scarlet Koch ◽  
Markus Lubeck ◽  
Raphael Heilig ◽  
...  

Ion mobility can add a dimension to LC-MS based shotgun proteomics which has the potential to boost proteome coverage, quantification accuracy and dynamic range. Required for this is suitable software that extracts the information contained in the four-dimensional (4D) data space spanned by m/z, retention time, ion mobility and signal intensity. Here we describe the ion mobility enhanced MaxQuant software, which utilizes the added data dimension. It offers an end-to-end computational workflow for the identification and quantification of peptides, proteins and posttranslational modification sites in LC-IMS-MS/MS shotgun proteomics data. We apply it to trapped ion mobility spectrometry (TIMS) coupled to a quadrupole time-of-flight (QTOF) analyzer. A highly parallelizable 4D feature detection algorithm extracts peaks which are assembled to isotope patterns. Masses are recalibrated with a non-linear m/z, retention time, ion mobility and signal intensity dependent model, based on peptides from the sample. A new matching between runs (MBR) algorithm that utilizes collisional cross section (CCS) values of MS1 features in the matching process significantly gains specificity from the extra dimension. Prerequisite for using CCS values in MBR is a relative alignment of the ion mobility values between the runs. The missing value problem in protein quantification over many samples is greatly reduced by CCS-aware MBR. MS1 level label-free quantification is also implemented which proves to be highly precise and accurate on a benchmark dataset with known ground truth. MaxQuant for LC-IMS-MS/MS is part of the basic MaxQuant release and can be downloaded from http://maxquant.org.


Author(s):  
T. Durai Ananda Kumar ◽  
Sandhya Desai ◽  
Soumya Venkaraddiyavar ◽  
Naraparaju Swathi ◽  
Gurubasavaraj V. Pujar

Drug discovery research focuses on Rational Drug Design (RDD) concepts, and the major obstacles in the drug discovery process are the lack of target specificity and selectivity. The realization that peptide drugs offer higher target selectivity promoted peptide research. Rapid growth in genomics, along with recombinant DNA (rDNA) technology and gene expression studies, further stimulated peptide research. The promising use of peptide therapeutics demands sensitive and selective quantification methods. Protein sequencing and proteomic investigations can be successfully accomplished through mass spectrometry (MS) based methods. The MS-based soft ionization methods, namely electrospray ionization (ESI) and Matrix-Assisted Laser Desorption/Ionization (MALDI), offer high-throughput sequencing and provide the characterization (sequence and structure) of intact proteins/peptides. The advent of tandem mass spectrometry (MS/MS), along with data acquisition methods, is the basis for the evolution of peptide therapeutics research. The evolution of data science helped in developing computational proteomics, which assists in the quantitative determination of protein samples. This review describes the role of mass spectrometry in peptide drug discovery, in particular sequence characterization, along with the latest developments, such as computational proteomics.


2005 ◽  
Vol 4 (6) ◽  
pp. 2273-2282 ◽  
Author(s):  
Manfred Heller ◽  
Mingliang Ye ◽  
Philippe E. Michel ◽  
Patrick Morier ◽  
Daniel Stalder ◽  
...  

2020 ◽  
Vol 19 (6) ◽  
pp. 1058-1069 ◽  
Author(s):  
Nikita Prianichnikov ◽  
Heiner Koch ◽  
Scarlet Koch ◽  
Markus Lubeck ◽  
Raphael Heilig ◽  
...  

Ion mobility can add a dimension to LC-MS based shotgun proteomics which has the potential to boost proteome coverage, quantification accuracy and dynamic range. Required for this is suitable software that extracts the information contained in the four-dimensional (4D) data space spanned by m/z, retention time, ion mobility and signal intensity. Here we describe the ion mobility enhanced MaxQuant software, which utilizes the added data dimension. It offers an end-to-end computational workflow for the identification and quantification of peptides and proteins in LC-IMS-MS/MS shotgun proteomics data. We apply it to trapped ion mobility spectrometry (TIMS) coupled to a quadrupole time-of-flight (QTOF) analyzer. A highly parallelizable 4D feature detection algorithm extracts peaks which are assembled to isotope patterns. Masses are recalibrated with a non-linear m/z, retention time, ion mobility and signal intensity dependent model, based on peptides from the sample. A new matching between runs (MBR) algorithm that utilizes collisional cross section (CCS) values of MS1 features in the matching process significantly gains specificity from the extra dimension. Prerequisite for using CCS values in MBR is a relative alignment of the ion mobility values between the runs. The missing value problem in protein quantification over many samples is greatly reduced by CCS-aware MBR. MS1 level label-free quantification is also implemented which proves to be highly precise and accurate on a benchmark dataset with known ground truth. MaxQuant for LC-IMS-MS/MS is part of the basic MaxQuant release and can be downloaded from http://maxquant.org.
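
The idea of using CCS values as an extra matching dimension in MBR can be sketched as follows. This is a simplified illustration under stated assumptions and is not the MaxQuant algorithm: the `Feature` class, the `match_between_runs` function, the nearest-neighbor rule and all tolerance values are hypothetical, and retention times and CCS values are assumed to be already aligned between runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Feature:
    """An MS1 feature; all fields are illustrative, not a MaxQuant data model."""
    mz: float                      # mass-to-charge ratio
    rt: float                      # aligned retention time (minutes)
    ccs: float                     # aligned collisional cross section
    peptide: Optional[str] = None  # set only if identified by MS/MS


def match_between_runs(
    identified: List[Feature],
    unidentified: List[Feature],
    ppm_tol: float = 10.0,      # hypothetical m/z tolerance in ppm
    rt_tol: float = 0.5,        # hypothetical retention time tolerance
    ccs_rel_tol: float = 0.02,  # hypothetical relative CCS tolerance
) -> Dict[int, str]:
    """Transfer identifications to unidentified features of another run.

    A donor must agree in m/z, retention time AND CCS; the closest donor
    in tolerance-scaled distance wins. Returns {index in unidentified: peptide}.
    """
    transfers: Dict[int, str] = {}
    for i, target in enumerate(unidentified):
        best_peptide = None
        best_dist = None
        for donor in identified:
            if donor.peptide is None:
                continue
            ppm = abs(donor.mz - target.mz) / target.mz * 1e6
            d_rt = abs(donor.rt - target.rt)
            d_ccs = abs(donor.ccs - target.ccs) / target.ccs
            # The CCS check is the extra dimension that rejects donors
            # that coincide in m/z and retention time only.
            if ppm > ppm_tol or d_rt > rt_tol or d_ccs > ccs_rel_tol:
                continue
            dist = (ppm / ppm_tol) ** 2 + (d_rt / rt_tol) ** 2 + (d_ccs / ccs_rel_tol) ** 2
            if best_dist is None or dist < best_dist:
                best_peptide, best_dist = donor.peptide, dist
        if best_peptide is not None:
            transfers[i] = best_peptide
    return transfers


if __name__ == "__main__":
    run_a = [
        Feature(500.25, 30.0, 350.0, "PEPTIDEK"),
        Feature(500.25, 30.0, 400.0, "OTHERK"),  # same m/z and RT, different CCS
    ]
    run_b = [Feature(500.251, 30.1, 351.0), Feature(700.0, 10.0, 300.0)]
    print(match_between_runs(run_a, run_b))
```

In this toy input, the two donors are indistinguishable by m/z and retention time, and only the CCS tolerance separates them, which is the kind of specificity gain the abstract attributes to the extra dimension.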


2017 ◽  
Vol 63 (5) ◽  
pp. 405-412
Author(s):  
A.L. Rusanov ◽  
N.A. Petushkova ◽  
E.V. Poverennaya ◽  
K.V. Nakhod ◽  
O.V. Larina ◽  
...  

The effects of sodium dodecyl sulfate (25 mg/ml) and Triton X-100 (12.5 mg/ml and 25 mg/ml) on HaCaT immortalized keratinocytes exposed to these surfactants for 48 h were studied. Using shotgun proteomics, a comparative analysis of the proteomic profiles of control and experimental cells after surfactant exposure was carried out. 260 common proteins were identified in control and experimental cells; 33 proteins were found in cells exposed to all three treatments, but not in control cells. These 33 proteins apparently reflect a nonspecific (universal) response of cells to toxic damage by the surfactants. These proteins are associated with activation of cell proliferation, changes in the functional activity of the ER and mitochondria, increased mRNA stability and activation of protein degradation processes in the cells. The possibility of using these proteins as a nonspecific parameter of cell response to cytotoxic damage is discussed. The mass spectrometry proteomics data (“raw”, “mgf” and “xml” files) have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifiers PXD007789 and PXD007776.

