Group-DIA: analyzing multiple data-independent acquisition mass spectrometry data files

Abstract Arabidopsis is an important model organism and the first plant to have its genome sequenced. Knowledge gained from studying this species has direct or indirect applications to agriculture and human health. Quantitative proteomics by data-independent acquisition (SWATH/DIA-MS) was recently developed and is considered a high-throughput, targeted-like approach for accurate proteome quantitation. In this approach, a high-quality and comprehensive library is a prerequisite. Here, we generated a protein expression atlas of 10 organs of Arabidopsis and created a library consisting of 15,514 protein groups, 187,265 unique peptide sequences, and 278,278 precursors. The identified protein groups correspond to ~56.5% of the predicted proteome. Further proteogenomics analysis identified 28 novel proteins. We subsequently applied DIA-mass spectrometry using this library to quantify the effect of abscisic acid on Arabidopsis. We were able to recover 8,793 protein groups, 1,787 of which were differentially expressed, including 65 proteins known to respond to abscisic acid stress. Mass spectrometry data are available via ProteomeXchange with identifier PXD012710 for data-dependent acquisition and PXD014032 for DIA analyses.

MatrisomeDB: the ECM-protein knowledge database

Nucleic Acids Research ◽

10.1093/nar/gkz849 ◽

2019 ◽

Vol 48 (D1) ◽

pp. D1136-D1144 ◽

Cited By ~ 20

Author(s):

Xinhao Shao ◽

Isra N Taha ◽

Karl R Clauser ◽

Yu (Tom) Gao ◽

Alexandra Naba

Keyword(s):

Mass Spectrometry ◽

Cell Polarization ◽

Mass Spectrometry Data ◽

Breast Cancers ◽

Complete Collection ◽

Post Translational Modifications ◽

Proteomic Data ◽

Tissue Organization ◽

Data Files ◽

Cancer Types

Abstract The extracellular matrix (ECM) is a complex and dynamic meshwork of cross-linked proteins that supports cell polarization and functions and tissue organization and homeostasis. Over the past few decades, mass-spectrometry-based proteomics has emerged as the method of choice to characterize the composition of the ECM of normal and diseased tissues. Here, we present a new release of MatrisomeDB, a searchable collection of curated proteomic data from 17 studies on the ECM of 15 different normal tissue types, six cancer types (different grades of breast cancers, colorectal cancer, melanoma, and insulinoma) and other diseases including vascular defects and lung and liver fibroses. MatrisomeDB (http://www.pepchem.org/matrisomedb) was built by retrieving raw mass spectrometry data files and reprocessing them using the same search parameters and criteria to allow for a more direct comparison between the different studies. The present release of MatrisomeDB includes 847 human and 791 mouse ECM proteoforms and over 350 000 human and 600 000 mouse ECM-derived peptide-to-spectrum matches. For each query, a hierarchically clustered tissue distribution map, a peptide coverage map, and a list of identified post-translational modifications are generated. MatrisomeDB is the most complete collection of ECM proteomic data to date and allows the building of a comprehensive ECM atlas.

PECAN: library-free peptide detection for data-independent acquisition tandem mass spectrometry data

Nature Methods ◽

10.1038/nmeth.4390 ◽

2017 ◽

Vol 14 (9) ◽

pp. 903-908 ◽

Cited By ~ 74

Author(s):

Ying S Ting ◽

Jarrett D Egertson ◽

James G Bollinger ◽

Brian C Searle ◽

Samuel H Payne ◽

...

Keyword(s):

Mass Spectrometry ◽

Tandem Mass Spectrometry ◽

Mass Spectrometry Data ◽

Tandem Mass ◽

Free Peptide ◽

Data Independent Acquisition ◽

Peptide Detection ◽

Tandem Mass Spectrometry Data

LipidMiner: a software for automated identification and quantification of lipids from multiple liquid chromatography/mass spectrometry data files

Rapid Communications in Mass Spectrometry ◽

10.1002/rcm.6865 ◽

2014 ◽

Vol 28 (8) ◽

pp. 981-985 ◽

Cited By ~ 4

Author(s):

Da Meng ◽

Qibin Zhang ◽

Xiaoli Gao ◽

Si Wu ◽

Guang Lin

Keyword(s):

Mass Spectrometry ◽

Liquid Chromatography ◽

Mass Spectrometry Data ◽

Liquid Chromatography Mass Spectrometry ◽

Automated Identification ◽

Chromatography Mass Spectrometry ◽

Data Files ◽

Identification And Quantification

Metaproteomics boosted up by untargeted data-independent acquisition data analysis framework

10.1101/2020.12.21.423800 ◽

2020 ◽

Author(s):

Sami Pietilä ◽

Tomi Suomi ◽

Laura L. Elo

Keyword(s):

Mass Spectrometry ◽

Open Source Software ◽

Previous Method ◽

Mass Spectrometry Data ◽

Analysis Framework ◽

Spectral Library ◽

Microbial Composition ◽

Data Independent Acquisition ◽

Open Source Software Package ◽

First Time

Abstract Mass spectrometry-based metaproteomics is a relatively new field of research that provides the ability to characterize the functionality of microbiota. Recently, we were the first to demonstrate the applicability of data-independent acquisition (DIA) mass spectrometry to the analysis of complex metaproteomic samples. This allowed us to circumvent many of the drawbacks of the conventionally used data-dependent acquisition (DDA) mass spectrometry, mainly the limited reproducibility when analyzing samples with complex microbial composition. However, the previous method still required additional DDA data on the samples to assist the DIA analysis. Here, we introduce, for the first time, a DIA metaproteomics approach that does not require any DDA data, but instead replaces a spectral library generated from DDA data with a pseudospectral library generated directly from the metaproteomics DIA samples. We demonstrate that using the new DIA-only approach, we can achieve higher peptide yields than with the DDA-assisted approach, while the amount of required mass spectrometry data is reduced to a single DIA run per sample. The new DIA-only metaproteomics approach is implemented as the open-source software package DIAtools 2.0, which is freely available from DockerHub.

DIAmeter: Matching peptides to data-independent acquisition mass spectrometry data

10.1101/2021.01.29.428872 ◽

2021 ◽

Author(s):

Yang Young Lu ◽

Jeff Bilmes ◽

Ricard A Rodriguez-Mias ◽

Judit Villén ◽

William Stafford Noble

Keyword(s):

Mass Spectrometry ◽

Complex Structure ◽

Mass Spectrometry Data ◽

Peptide Sequence ◽

Sequence Database ◽

Post Translational Modifications ◽

Data Independent Acquisition ◽

Using Data ◽

Tandem Mass Spectrometry Data ◽

Analyze Data

Abstract Tandem mass spectrometry data acquired using data-independent acquisition (DIA) is challenging to interpret because the data exhibits complex structure along both the mass-to-charge (m/z) and time axes. The most common approach to analyzing this type of data makes use of a library of previously observed DIA data patterns (a “spectral library”), but this approach is expensive because the libraries do not typically generalize well across laboratories. Here we propose DIAmeter, a search engine that detects peptides in DIA data using only a peptide sequence database. Unlike other library-free DIA analysis methods, DIAmeter supports data generated using both wide and narrow isolation windows, can readily detect peptides containing post-translational modifications, can analyze data from a variety of instrument platforms, and is capable of detecting peptides even in the absence of detectable signal in the survey (MS1) scan.

A flexible workflow for building spectral libraries from narrow window data independent acquisition mass spectrometry data

10.1101/2021.11.22.469568 ◽

2021 ◽

Author(s):

Lilian R. Heil ◽

William E. Fondrie ◽

Christopher D. McGann ◽

Alexander J. Federation ◽

William S. Noble ◽

...

Keyword(s):

Mass Spectrometry ◽

Mass Spectrometry Data ◽

Spectral Library ◽

Post Translational Modifications ◽

Data Independent Acquisition ◽

Peptide Detection ◽

Phosphorylated Peptides ◽

Single Mass ◽

Fragmentation Patterns ◽

Spectral Libraries

Advances in library-based methods for peptide detection from data independent acquisition (DIA) mass spectrometry have made it possible to detect and quantify tens of thousands of peptides in a single mass spectrometry run. However, many of these methods rely on a comprehensive, high quality spectral library containing information about the expected retention time and fragmentation patterns of peptides in the sample. Empirical spectral libraries are often generated through data-dependent acquisition and may suffer from biases as a result. Spectral libraries can be generated in silico but these models are not trained to handle all possible post-translational modifications. Here, we propose a false discovery rate controlled spectrum-centric search workflow to generate spectral libraries directly from gas-phase fractionated DIA tandem mass spectrometry data. We demonstrate that this strategy is able to detect phosphorylated peptides and can be used to generate a spectral library for accurate peptide detection and quantitation in wide window DIA data. We compare the results of this search workflow to other library-free approaches and demonstrate that our search is competitive in terms of accuracy and sensitivity. These results demonstrate that the proposed workflow has the capacity to generate spectral libraries while avoiding the limitations of other methods.
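The abstract above does not say how the false discovery rate is estimated. One widely used approach is target-decoy competition, in which matches to reversed or shuffled decoy sequences stand in for false matches. The following is a minimal sketch under that assumption, not the authors' workflow; the function name `target_decoy_qvalues` and the `(score, is_decoy)` input layout are hypothetical.

```python
from typing import List, Tuple


def target_decoy_qvalues(
    psms: List[Tuple[float, bool]]
) -> List[Tuple[float, bool, float]]:
    """Assign q-values to peptide-spectrum matches (hypothetical helper).

    Each input item is (score, is_decoy); a higher score means a better match.
    Returns the matches sorted by descending score, each with its q-value.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each score threshold: decoys / targets accepted so far.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this threshold or any more permissive one,
    # which makes the values monotone along the ranked list.
    qvals = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, False)]
    for score, is_decoy, q in target_decoy_qvalues(example):
        print(score, "decoy" if is_decoy else "target", round(q, 3))
```

In a library-building setting, only target matches whose q-value falls below a chosen cutoff (1% is a common choice) would be kept as library entries.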

Cloud-based DIA data analysis module for signal refinement improves accuracy and throughput of large datasets

10.1101/2021.07.14.452243 ◽

2021 ◽

Author(s):

Karen E. Christianson ◽

Jacob D. Jaffe ◽

Steven A. Carr ◽

Alvaro Sebastian Vaca Jacome

Keyword(s):

Mass Spectrometry ◽

Data Analysis ◽

Large Scale ◽

Mass Spectrometry Data ◽

Avant Garde ◽

Data Independent Acquisition ◽

Large Scale Data ◽

Biological Insight ◽

Computational Resources ◽

User Friendly

Abstract Data-independent acquisition (DIA) is a powerful mass spectrometry method that promises higher coverage, reproducibility, and throughput than traditional quantitative proteomics approaches. However, the complexity of DIA data caused by fragmentation of co-isolating peptides presents significant challenges for confident assignment of identity and quantity, information that is essential for deriving meaningful biological insight from the data. To overcome this problem, we previously developed Avant-garde (AvG), a tool for automated signal refinement of DIA and other targeted mass spectrometry data. AvG is designed to work alongside existing tools for peptide detection to address the reliability and quantitative suitability of signals extracted for the identified peptides. While its use is straightforward and offers efficient refinement for small datasets, the execution of AvG for large DIA datasets is time-consuming, especially if run with limited computational resources. To overcome these limitations, we present here an improved, cloud-based implementation of the AvG algorithm deployed on Terra, a user-friendly cloud-based platform for large-scale data analysis and sharing, as an accessible and standardized resource to the wider community.
