MaxQuant Software for Ion Mobility Enhanced Shotgun Proteomics

Ion mobility can add a dimension to LC-MS based shotgun proteomics which has the potential to boost proteome coverage, quantification accuracy and dynamic range. Required for this is suitable software that extracts the information contained in the four-dimensional (4D) data space spanned by m/z, retention time, ion mobility and signal intensity. Here we describe the ion mobility enhanced MaxQuant software, which utilizes the added data dimension. It offers an end-to-end computational workflow for the identification and quantification of peptides and proteins in LC-IMS-MS/MS shotgun proteomics data. We apply it to trapped ion mobility spectrometry (TIMS) coupled to a quadrupole time-of-flight (QTOF) analyzer. A highly parallelizable 4D feature detection algorithm extracts peaks which are assembled into isotope patterns. Masses are recalibrated with a non-linear m/z, retention time, ion mobility and signal intensity dependent model, based on peptides from the sample. A new matching between runs (MBR) algorithm that utilizes collisional cross section (CCS) values of MS1 features in the matching process significantly gains specificity from the extra dimension. A prerequisite for using CCS values in MBR is a relative alignment of the ion mobility values between the runs. The missing value problem in protein quantification over many samples is greatly reduced by CCS-aware MBR. MS1-level label-free quantification is also implemented, which proves to be highly precise and accurate on a benchmark dataset with known ground truth. MaxQuant for LC-IMS-MS/MS is part of the basic MaxQuant release and can be downloaded from http://maxquant.org.

MaxQuant software for ion mobility enhanced shotgun proteomics

10.1101/651760 ◽

2019 ◽

Cited By ~ 6

Author(s):

Nikita Prianichnikov ◽

Heiner Koch ◽

Scarlet Koch ◽

Markus Lubeck ◽

Raphael Heilig ◽

...

Keyword(s):

Ion Mobility ◽

Retention Time ◽

Signal Intensity ◽

Feature Detection ◽

Dynamic Range ◽

Shotgun Proteomics ◽

Detection Algorithm ◽

Protein Quantification ◽

Label Free ◽

Proteomics Data

Summary: Ion mobility can add a dimension to LC-MS based shotgun proteomics which has the potential to boost proteome coverage, quantification accuracy and dynamic range. Required for this is suitable software that extracts the information contained in the four-dimensional (4D) data space spanned by m/z, retention time, ion mobility and signal intensity. Here we describe the ion mobility enhanced MaxQuant software, which utilizes the added data dimension. It offers an end-to-end computational workflow for the identification and quantification of peptides, proteins and posttranslational modification sites in LC-IMS-MS/MS shotgun proteomics data. We apply it to trapped ion mobility spectrometry (TIMS) coupled to a quadrupole time-of-flight (QTOF) analyzer. A highly parallelizable 4D feature detection algorithm extracts peaks which are assembled into isotope patterns. Masses are recalibrated with a non-linear m/z, retention time, ion mobility and signal intensity dependent model, based on peptides from the sample. A new matching between runs (MBR) algorithm that utilizes collisional cross section (CCS) values of MS1 features in the matching process significantly gains specificity from the extra dimension. A prerequisite for using CCS values in MBR is a relative alignment of the ion mobility values between the runs. The missing value problem in protein quantification over many samples is greatly reduced by CCS-aware MBR. MS1-level label-free quantification is also implemented, which proves to be highly precise and accurate on a benchmark dataset with known ground truth. MaxQuant for LC-IMS-MS/MS is part of the basic MaxQuant release and can be downloaded from http://maxquant.org.
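
The idea of CCS-aware matching between runs can be illustrated with a minimal Python sketch. This is not MaxQuant's implementation: the `Feature` type, the function `match_between_runs`, and all tolerance values are hypothetical and chosen only for illustration. The sketch assumes that retention times and CCS values have already been aligned between runs, and it transfers an identification only when m/z, retention time and CCS all agree within tolerance.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Feature:
    mz: float                      # mass-to-charge ratio
    rt: float                      # retention time (min), assumed aligned between runs
    ccs: float                     # collisional cross section (square angstroms), assumed aligned
    peptide: Optional[str] = None  # sequence if identified by MS/MS, otherwise None


def match_between_runs(
    donors: List[Feature],
    acceptors: List[Feature],
    mz_ppm: float = 10.0,       # illustrative tolerance, not a MaxQuant default
    rt_tol: float = 0.5,        # illustrative tolerance, not a MaxQuant default
    ccs_rel_tol: float = 0.02,  # illustrative tolerance, not a MaxQuant default
) -> List[Tuple[int, str]]:
    """Return (acceptor_index, peptide) pairs for transferred identifications."""
    matches = []
    for i, acc in enumerate(acceptors):
        if acc.peptide is not None:
            continue  # already identified in its own run
        best = None
        for don in donors:
            if don.peptide is None:
                continue  # only identified features can donate
            mz_err = abs(acc.mz - don.mz) / don.mz * 1e6
            rt_err = abs(acc.rt - don.rt)
            ccs_err = abs(acc.ccs - don.ccs) / don.ccs
            if mz_err <= mz_ppm and rt_err <= rt_tol and ccs_err <= ccs_rel_tol:
                # among acceptable donors, keep the closest in retention time
                if best is None or rt_err < best[0]:
                    best = (rt_err, don.peptide)
        if best is not None:
            matches.append((i, best[1]))
    return matches


# Two donors that are nearly indistinguishable in m/z and retention time
# but differ clearly in CCS; values are made up for the example.
donors = [
    Feature(mz=500.2500, rt=30.0, ccs=350.0, peptide="PEPTIDEK"),
    Feature(mz=500.2501, rt=30.1, ccs=420.0, peptide="AAAGGGK"),
]
acceptors = [Feature(mz=500.2502, rt=30.05, ccs=351.0)]
print(match_between_runs(donors, acceptors))
```

In this toy example both donors pass the m/z and retention time filters, and only the CCS filter separates them, which is the kind of specificity gain the summary attributes to the extra dimension.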

Mass Dynamics 1.0: A streamlined, web-based environment for analyzing, sharing and integrating Label-Free Data

10.1101/2021.03.03.433806 ◽

2021 ◽

Author(s):

Joseph Bloom ◽

Aaron Triantafyllidis ◽

Paula Burton (Ngov) ◽

Giuseppe Infusini ◽

Andrew Webb

Keyword(s):

Dynamic Range ◽

Shotgun Proteomics ◽

Label Free ◽

Proteomics Data ◽

Web Based ◽

Free Data ◽

Benchmark Datasets ◽

Free Quantification ◽

Analysis Environment

Abstract: Label Free Quantification (LFQ) of shotgun proteomics data is a popular and robust method for the characterization of relative protein abundance between samples. Many analytical pipelines exist for the automation of this analysis and some tools exist for the subsequent representation and inspection of the results of these pipelines. Mass Dynamics 1.0 (MD 1.0) is a web-based analysis environment that can analyze and visualize LFQ data produced by software such as MaxQuant. Unlike other tools, MD 1.0 uses a cloud-based architecture that lets researchers store their data, so that they can not only automatically process and visualize their LFQ data but also annotate and share their findings with collaborators and, if they choose, easily publish results to the community. With a view toward increased reproducibility and standardisation in proteomics data analysis and streamlining collaboration between researchers, MD 1.0 requires minimal parameter choices and automatically generates quality control reports to verify experiment integrity. Here, we demonstrate that MD 1.0 provides reliable results for protein expression quantification, emulating Perseus on benchmark datasets over a wide dynamic range. The MD 1.0 platform is available globally via: https://app.massdynamics.com/

Integrating identification and quantification uncertainty for differential protein abundance analysis with Triqler

10.1101/2020.09.24.311605 ◽

2020 ◽

Author(s):

Matthew The ◽

Lukas Käll

Keyword(s):

Missing Values ◽

Shotgun Proteomics ◽

Protein Quantification ◽

Protein Abundance ◽

Posterior Distributions ◽

Label Free ◽

Differential Abundance ◽

Complicated Process ◽

Python Package ◽

Different Parts

Abstract: Protein quantification for shotgun proteomics is a complicated process where errors can be introduced in each of the steps. Triqler is a Python package that estimates and integrates errors of the different parts of the label-free protein quantification pipeline into a single Bayesian model. Specifically, it weighs the quantitative values by the confidence we have in the correctness of the corresponding PSM. Furthermore, it treats missing values in a way that reflects their uncertainty relative to observed values. Finally, it combines these error estimates in a single differential abundance FDR that not only reflects the errors and uncertainties in quantification but also in identification. In this tutorial, we show how to (1) generate input data for Triqler from quantification packages such as MaxQuant and Quandenser, (2) run Triqler and what the different options are, (3) interpret the results, (4) investigate the posterior distributions of a protein of interest in detail and (5) verify that the hyperparameter estimations are sensible.

Computational Methods for Understanding Mass Spectrometry–Based Shotgun Proteomics Data

Annual Review of Biomedical Data Science ◽

10.1146/annurev-biodatasci-080917-013516 ◽

2018 ◽

Vol 1 (1) ◽

pp. 207-234 ◽

Cited By ~ 53

Author(s):

Pavel Sinitcyn ◽

Jan Daniel Rudolph ◽

Jürgen Cox

Keyword(s):

Mass Spectrometry ◽

Computational Methods ◽

Posttranslational Modifications ◽

Data Science ◽

Feature Detection ◽

Shotgun Proteomics ◽

Proteomics Data ◽

Computational Proteomics ◽

Biological Interpretation ◽

Concentration Changes

Computational proteomics is the data science concerned with the identification and quantification of proteins from high-throughput data and the biological interpretation of their concentration changes, posttranslational modifications, interactions, and subcellular localizations. Today, these data most often originate from mass spectrometry–based shotgun proteomics experiments. In this review, we survey computational methods for the analysis of such proteomics data, focusing on the explanation of the key concepts. Starting with mass spectrometric feature detection, we then cover methods for the identification of peptides. Subsequently, protein inference and the control of false discovery rates are highly important topics covered. We then discuss methods for the quantification of peptides and proteins. A section on downstream data analysis covers exploratory statistics, network analysis, machine learning, and multiomics data integration. Finally, we discuss current developments and provide an outlook on what the near future of computational proteomics might bear.

A flexible statistical model for alignment of label-free proteomics data - incorporating ion mobility and product ion information

BMC Bioinformatics ◽

10.1186/1471-2105-14-364 ◽

2013 ◽

Vol 14 (1) ◽

Cited By ~ 6

Author(s):

Ashlee M Benjamin ◽

J Will Thompson ◽

Erik J Soderblom ◽

Scott J Geromanos ◽

Ricardo Henao ◽

...

Keyword(s):

Statistical Model ◽

Ion Mobility ◽

Label Free ◽

Proteomics Data

IceR improves proteome coverage and data completeness in global and single-cell proteomics

Nature Communications ◽

10.1038/s41467-021-25077-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Mathias Kalxdorf ◽

Torsten Müller ◽

Oliver Stegle ◽

Jeroen Krijgsveld

Keyword(s):

Single Cell ◽

Large Scale ◽

Missing Values ◽

Peptide Identification ◽

Protein Quantification ◽

Developmental Trajectory ◽

Ion Current ◽

Label Free ◽

Proteomics Data ◽

Data Completeness

Abstract: Label-free proteomics by data-dependent acquisition enables the unbiased quantification of thousands of proteins; however, it notoriously suffers from high rates of missing values, thus prohibiting consistent protein quantification across large sample cohorts. To solve this, we here present IceR (Ion current extraction Re-quantification), an efficient and user-friendly quantification workflow that combines high identification rates of data-dependent acquisition with low missing value rates similar to data-independent acquisition. Specifically, IceR uses ion current information for a hybrid peptide identification propagation approach with superior quantification precision, accuracy, reliability and data completeness compared to other quantitative workflows. Applied to plasma and single-cell proteomics data, IceR enhanced the number of reliably quantified proteins, improved discriminability between single-cell populations, and allowed reconstruction of a developmental trajectory. IceR will be useful to improve performance of large-scale global as well as low-input proteomics applications, facilitated by its availability as an easy-to-use R package.

Comparison of Different Label-Free Techniques for the Semi-Absolute Quantification of Protein Abundance

Proteomes ◽

10.3390/proteomes10010002 ◽

2022 ◽

Vol 10 (1) ◽

pp. 2

Author(s):

Aarón Millán-Oropeza ◽

Mélisande Blein-Nicolas ◽

Véronique Monnet ◽

Michel Zivy ◽

Céline Henry

Keyword(s):

Shotgun Proteomics ◽

Absolute Quantification ◽

Biological Data ◽

Protein Quantification ◽

Protein Abundance ◽

Label Free ◽

Absolute Abundance ◽

Yeast Saccharomyces Cerevisiae ◽

Genome Scale ◽

Free Quantification

In proteomics, it is essential to quantify proteins in absolute terms if we wish to compare results among studies and integrate high-throughput biological data into genome-scale metabolic models. While labeling target peptides with stable isotopes allows protein abundance to be accurately quantified, the utility of this technique is constrained by the low number of quantifiable proteins that it yields. Recently, label-free shotgun proteomics has become the “gold standard” for carrying out global assessments of biological samples containing thousands of proteins. However, this tool must be further improved if we wish to accurately quantify absolute levels of proteins. Here, we used different label-free quantification techniques to estimate absolute protein abundance in the model yeast Saccharomyces cerevisiae. More specifically, we evaluated the performance of seven different quantification methods, based either on spectral counting (SC) or extracted-ion chromatogram (XIC), which were applied to samples from five different proteome backgrounds. We also compared the accuracy and reproducibility of two strategies for transforming relative abundance into absolute abundance: a UPS2-based strategy and the total protein approach (TPA). This study mentions technical challenges related to UPS2 use and proposes ways of addressing them, including utilizing a smaller, more highly optimized amount of UPS2. Overall, three SC-based methods (PAI, SAF, and NSAF) yielded the best results because they struck a good balance between experimental performance and protein quantification.
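
As an illustration of two of the spectral counting measures named above, the following Python sketch computes SAF and NSAF using their commonly cited definitions: SAF is the spectral count of a protein divided by its length, and NSAF is SAF divided by the sum of SAF over all proteins in the sample. The function names, protein identifiers and numbers are hypothetical, and the study's own implementation may differ in detail.

```python
from typing import Dict


def saf(spectral_counts: Dict[str, float], lengths: Dict[str, int]) -> Dict[str, float]:
    """Spectral abundance factor: spectral count divided by protein length."""
    return {prot: spectral_counts[prot] / lengths[prot] for prot in spectral_counts}


def nsaf(spectral_counts: Dict[str, float], lengths: Dict[str, int]) -> Dict[str, float]:
    """Normalized SAF: each SAF divided by the sum of all SAF values in the sample."""
    saf_values = saf(spectral_counts, lengths)
    total = sum(saf_values.values())
    if total <= 0:
        raise ValueError("at least one protein must have a positive spectral count")
    return {prot: value / total for prot, value in saf_values.items()}


# Made-up example: P2 has more spectra, but it is also a much longer protein.
counts = {"P1": 10, "P2": 30}
lengths = {"P1": 100, "P2": 600}

for prot, value in nsaf(counts, lengths).items():
    print(f"{prot}: NSAF = {value:.3f}")
```

Dividing by length corrects for the fact that longer proteins tend to yield more spectra, and the normalization makes the values sum to one within a sample so that they can be compared across samples.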

Does Filter Aided Sample Preparation (FASP) Provide Sufficient Method Linearity for Quantitative Plant Shotgun Proteomics?

10.26434/chemrxiv.14663448.v1 ◽

2021 ◽

Author(s):

Tatiana Leonova ◽

Christian Ihling ◽

Mohamad Saoud ◽

Robert Rennert ◽

Ludger A. Wessjohann ◽

...

Keyword(s):

Sample Preparation ◽

Dynamic Range ◽

Proteome Analysis ◽

Shotgun Proteomics ◽

Specific Protein ◽

Protein Quantification ◽

Protein Isolation ◽

Isolation Technique ◽

Bottle Neck ◽

Shotgun Approach

Gel-free LC-based shotgun proteomics represents the current gold standard of proteome analysis due to its outstanding throughput, analytical resolution and reproducibility. The efficiency of sample preparation, i.e., protein isolation, solubilization and proteolysis, directly affects the correctness and reliability of quantification and is therefore the bottleneck of shotgun proteomics. The desired performance of the sample preparation protocols can be achieved by application of detergents. However, these ultimately compromise reversed-phase chromatographic separation and disrupt electrospray ionization. Filter aided sample preparation (FASP) represents an elegant approach to overcome these limitations. Although this method is comprehensively validated for cell proteomics, its applicability to plants and compatibility with plant-specific protein isolation protocols is still unknown, i.e., no data on linearity of underlying protein quantification methods for plant matrices is available. To fill this gap, we address here the potential of FASP in combination with two protein isolation protocols for quantitative analysis of pea (Pisum sativum) seed and Arabidopsis thaliana leaf proteomes by the shotgun approach. For this, in comprehensive spiking experiments with bovine serum albumin (BSA), we evaluated the linear dynamic range (LDR) of protein quantification in the presence of plant matrices. Further, we addressed the interference of two different plant matrices in quantitative experiments, accomplished with two alternative sample preparation workflows in comparison to conventional FASP-based digestion of cell lysates, considered here as a reference. Our results indicate very good applicability of FASP to quantitative plant proteomics, with only a limited impact of the protein isolation technique on the method's overall performance.

Benchmarking accuracy and precision of intensity-based absolute quantification of protein abundances in Saccharomyces cerevisiae

10.1101/2020.03.23.998237 ◽

2020 ◽

Cited By ~ 1

Author(s):

Benjamín J. Sánchez ◽

Petri-Jaan Lahtvee ◽

Kate Campbell ◽

Sergo Kasvandik ◽

Rosemary Yu ◽

...

Keyword(s):

Saccharomyces Cerevisiae ◽

Absolute Quantification ◽

Protein Quantification ◽

Label Free ◽

Proteomics Data ◽

Popular Method ◽

The Poor ◽

Genome Wide ◽

Accuracy And Precision ◽

Technical Reproducibility

Abstract: Protein quantification via label-free mass spectrometry (MS) has become an increasingly popular method for determining genome-wide absolute protein abundances. A known caveat of this approach is the poor technical reproducibility, i.e. how consistent the estimations are when the same sample is measured repeatedly. Here, we measured proteomics data for Saccharomyces cerevisiae with both biological and inter-batch technical triplicates, to analyze both accuracy and precision of protein quantification via MS. Moreover, we analyzed how these metrics vary when applying different methods for converting MS intensities to absolute protein abundances. We found that a simple normalization and rescaling approach performs as accurately as, yet more precisely than, methods that rely on external standards. Additionally, we show that inter-batch reproducibility is worse than biological reproducibility for all evaluated methods. These results subsequently serve as a benchmark for assessing MS data quality for protein quantification, whilst also underscoring current limitations in this approach.
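
The abstract does not spell out the normalization and rescaling procedure. One plausible reading, assumed here purely for illustration, is to convert each protein's MS intensity into a fraction of the summed intensity and then scale those fractions by an independently measured total protein amount. The Python sketch below follows that assumption; the function name `rescale_to_absolute`, the protein identifiers and all numbers are hypothetical.

```python
from typing import Dict


def rescale_to_absolute(
    intensities: Dict[str, float],
    total_protein_amount: float,
) -> Dict[str, float]:
    """Normalize intensities to fractions, then rescale to a known total amount.

    Assumes (for illustration only) that intensity is proportional to abundance
    and that total_protein_amount was measured independently of the MS run.
    """
    total_intensity = sum(intensities.values())
    if total_intensity <= 0:
        raise ValueError("summed intensity must be positive")
    return {
        prot: intensity / total_intensity * total_protein_amount
        for prot, intensity in intensities.items()
    }


# Made-up intensities for three proteins and a made-up total amount.
intensities = {"A": 2.0e6, "B": 6.0e6, "C": 2.0e6}
abundances = rescale_to_absolute(intensities, total_protein_amount=100.0)

for prot, amount in abundances.items():
    print(f"{prot}: {amount:.1f}")
```

Because this kind of conversion needs no spiked-in external standard, any imprecision in the standards themselves does not enter the estimate, which is one way to read the precision advantage reported above.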

Precursor Intensity-Based Label-Free Quantification Software Tools for Proteomic and Multi-Omic Analysis within the Galaxy Platform

Proteomes ◽

10.3390/proteomes8030015 ◽

2020 ◽

Vol 8 (3) ◽

pp. 15

Author(s):

Subina Mehta ◽

Caleb W. Easterly ◽

Ray Sajulga ◽

Robert J. Millikin ◽

Andrea Argentini ◽

...

Keyword(s):

Dynamic Range ◽

Software Tools ◽

Protein Quantification ◽

Label Free ◽

Post Translational Modifications ◽

Label Free Quantification ◽

File Formats ◽

Rigorous Testing ◽

The Galaxy ◽

Free Quantification

For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file-formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.
