proDA: Probabilistic Dropout Analysis for Identifying Differentially Abundant Proteins in Label-Free Mass Spectrometry

2020 ◽  
Author(s):  
Constantin Ahlmann-Eltze ◽  
Simon Anders

Abstract Protein mass spectrometry with label-free quantification (LFQ) is widely used for quantitative proteomics studies. Nevertheless, well-principled statistical inference procedures are still lacking, and most practitioners adopt methods from transcriptomics. These, however, cannot properly treat the principal complication of label-free proteomics, namely many non-randomly missing values. We present proDA, a method to perform statistical tests for differential abundance of proteins. It models missing values in an intensity-dependent probabilistic manner. proDA is based on linear models and thus suitable for complex experimental designs, and boosts statistical power for small sample sizes by using variance moderation. We show that the currently widely used methods based on ad hoc imputation schemes can report excessive false positives, and that proDA not only overcomes this serious issue but also offers high sensitivity. Thus, proDA fills a crucial gap in the toolbox of quantitative proteomics.
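
The idea of an intensity-dependent missing-value model can be illustrated with a short sketch. It is not the proDA implementation. It assumes that log-intensities of one protein in one condition are normally distributed, and that the probability of a value going unobserved follows a decreasing sigmoidal (inverse probit) curve in the intensity. The parameter names `rho` (intensity at which half of the values drop out) and `zeta` (width of the curve), like all function names, are chosen for illustration only.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_logpdf(y, mu, sigma):
    """Log density of N(mu, sigma^2) at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)


def dropout_prob(y, rho, zeta):
    """Assumed dropout curve: probability that intensity y is NOT observed.

    Decreases with y, equals 0.5 at y == rho (illustrative parameterization).
    """
    return normal_cdf(-(y - rho) / zeta)


def log_likelihood(values, mu, sigma, rho, zeta):
    """Log-likelihood of one protein's values; None marks a missing value."""
    total = 0.0
    for y in values:
        if y is None:
            # Marginal probability of a dropout: the dropout curve averaged
            # over N(mu, sigma^2), which has a closed form for a probit curve.
            p_missing = normal_cdf(-(mu - rho) / math.sqrt(sigma ** 2 + zeta ** 2))
            total += math.log(p_missing)
        else:
            # Observed value: normal density times the probability of being seen.
            total += normal_logpdf(y, mu, sigma) + math.log(1.0 - dropout_prob(y, rho, zeta))
    return total


# One observed and two missing values: under this model the missing values
# are evidence for a low mean, so mu = 18 is favoured over mu = 22.
values = [None, None, 20.0]
low = log_likelihood(values, mu=18.0, sigma=1.0, rho=18.0, zeta=1.0)
high = log_likelihood(values, mu=22.0, sigma=1.0, rho=18.0, zeta=1.0)
print(low > high)
```

In this sketch the missing values contribute information through the likelihood instead of being replaced by imputed numbers, which is the contrast with ad hoc imputation that the abstract draws.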

2019 ◽  
Author(s):  
Constantin Ahlmann-Eltze ◽  
Simon Anders

Abstract Protein mass spectrometry with label-free quantification (LFQ) is widely used for quantitative proteomics studies. Nevertheless, well-principled statistical inference procedures are still lacking, and most practitioners adopt methods from transcriptomics. These, however, cannot properly treat the principal complication of label-free proteomics, namely many non-randomly missing values. We present proDA, a method to perform statistical tests for differential abundance of proteins. It models missing values in an intensity-dependent probabilistic manner. proDA is based on linear models and thus suitable for complex experimental designs, and boosts statistical power for small sample sizes by using variance moderation. We show that the currently widely used methods based on ad hoc imputation schemes can report excessive false positives, and that proDA not only overcomes this serious issue but also offers high sensitivity. Thus, proDA fills a crucial gap in the toolbox of quantitative proteomics.


2020 ◽  
Vol 48 (14) ◽  
pp. e83-e83 ◽  
Author(s):  
Shisheng Wang ◽  
Wenxue Li ◽  
Liqiang Hu ◽  
Jingqiu Cheng ◽  
Hao Yang ◽  
...  

Abstract Mass spectrometry (MS)-based quantitative proteomics experiments frequently generate data with missing values, which may profoundly affect downstream analyses. A wide variety of imputation methods have been established to deal with the missing-value issue. To date, however, there is a scarcity of efficient, systematic, and easy-to-handle tools that are tailored for the proteomics community. Herein, we developed a user-friendly and powerful stand-alone software, NAguideR, to enable the implementation and evaluation of 23 widely used missing-value imputation algorithms. NAguideR further evaluates data imputation results through classic computational criteria and, unprecedentedly, proteomic empirical criteria, such as quantitative consistency between different charge states of the same peptide, different peptides belonging to the same protein, and individual proteins participating in protein complexes and functional interactions. We applied NAguideR to three label-free proteomic datasets featuring peptide-level, protein-level, and phosphoproteomic variables respectively, all generated by data-independent acquisition mass spectrometry (DIA-MS) with substantial biological replicates. The results indicate that NAguideR can distinguish the optimal imputation methods for DIA-MS experiments from sub-optimal and low-performance algorithms. NAguideR further provides downloadable tables and figures supporting flexible data analysis and interpretation. NAguideR is freely available at http://www.omicsolution.org/wukong/NAguideR/ and the source code is available at https://github.com/wangshisheng/NAguideR/.


2011 ◽  
Vol 38 (6) ◽  
pp. 506-518 ◽  
Author(s):  
Wei ZHANG ◽  
Ji-Yang ZHANG ◽  
Hui LIU ◽  
Han-Chang SUN ◽  
Chang-Ming XU ◽  
...  

Molecules ◽  
2020 ◽  
Vol 25 (21) ◽  
pp. 4979
Author(s):  
Marco Giampà ◽  
Elvira Sgobba

Noncovalent interactions are the keys to the structural organization of biomolecules (e.g., proteins, glycans, lipids) in molecular recognition processes (e.g., enzyme-substrate, antigen-antibody). Protein interactions lead to conformational changes, which dictate the functionality of the resulting protein-protein complex. Besides biophysical techniques, noncovalent interactions and conformational dynamics can be studied via mass spectrometry (MS), which is a powerful tool owing to its low sample consumption, high sensitivity, and label-free samples. In this review, the focus will be placed on Matrix-Assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) and its role in the analysis of protein-protein noncovalent assemblies, exploring the relationship among noncovalent interaction, conformation, and biological function.


2019 ◽  
Author(s):  
Veit Schwämmle ◽  
Christina E Hagensen ◽  
Adelina Rogowska-Wrzesinska ◽  
Ole N. Jensen

Abstract Statistical testing remains one of the main challenges for high-confidence detection of differentially regulated proteins or peptides in large-scale quantitative proteomics experiments by mass spectrometry. Statistical tests need to be sufficiently robust to deal with experiment-intrinsic data structures and variations and often also reduced feature coverage across different biological samples due to ubiquitous missing values. A robust statistical test provides accurate confidence scores of large-scale proteomics results, regardless of instrument platform, experimental protocol and software tools. However, the multitude of different combinations of experimental strategies, mass spectrometry techniques and informatics methods complicates the decision of choosing appropriate statistical approaches. We address this challenge by introducing PolySTest, a user-friendly web service for statistical testing, data browsing and data visualization. We introduce a new method, Miss Test, that simultaneously tests for missingness and feature abundance, thereby complementing common statistical tests by rescuing otherwise discarded data features. We demonstrate that PolySTest with integrated Miss Test achieves higher confidence and higher sensitivity for artificial and experimental proteomics data sets with known ground truth. Application of PolySTest to mass spectrometry-based large-scale proteomics data obtained from differentiating muscle cells resulted in the rescue of 10%-20% additional proteins in the identified molecular networks relevant to muscle differentiation. We conclude that PolySTest is a valuable addition to existing tools and instrument enhancements that improve coverage and depth of large-scale proteomics experiments. A fully functional demo version of PolySTest and Miss Test is available via http://computproteomics.bmb.sdu.dk/Apps/PolySTest.


2021 ◽  
Author(s):  
Jian Song ◽  
Changbin Yu

Abstract Label-free mass spectrometry-based proteomics data inevitably suffer from the problem of missing values. Missing values prevent downstream analyses that need a complete data matrix. Our motivation is to use the state-of-the-art machine learning algorithm XGBoost to build an imputation method with improved accuracy. In practice, however, XGBoost has many parameters that need to be tuned to deliver on its potential high performance. Although cross-validation may find the best parameters, it is very time-consuming. Alternatively, we empirically determined the parameters for two kinds of XGBoost base learners. To explore the robustness and performance of XGBoost-based imputation with predetermined parameters, we conducted tests on three benchmark datasets. For comparison, six common imputation methods were also evaluated in terms of normalized root mean squared error and Pearson correlation coefficient. The comparative experimental results indicated that the XGBoost-based imputation method using the linear base learner is competitive with or outperforms its competitors, including random forest-based imputation, by achieving smaller imputation errors and better structure preservation under the empirical parameters for the three benchmark datasets.
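
The two evaluation criteria named in this abstract can be sketched in a few lines. The sketch assumes the usual benchmark setup in which known values are hidden, imputed, and then compared with the truth. NRMSE has several definitions in the literature; the variant below, which normalizes by the variance of the true values, is one common choice and not necessarily the one used by the authors. The function names and example numbers are illustrative.

```python
import math


def nrmse(true_vals, imputed_vals):
    """Normalized root mean squared error: sqrt(MSE / variance of the truth)."""
    n = len(true_vals)
    mean_true = sum(true_vals) / n
    mse = sum((t - i) ** 2 for t, i in zip(true_vals, imputed_vals)) / n
    var_true = sum((t - mean_true) ** 2 for t in true_vals) / n
    return math.sqrt(mse / var_true)


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical hidden values and the estimates of two imputation methods.
truth = [1.0, 2.0, 3.0, 4.0]
method_a = [1.1, 1.9, 3.2, 3.8]   # tracks the truth closely
method_b = [2.0, 2.6, 2.4, 3.0]   # shrinks toward the mean

for name, estimate in (("A", method_a), ("B", method_b)):
    print(name, round(nrmse(truth, estimate), 3), round(pearson(truth, estimate), 3))
```

A smaller NRMSE indicates smaller imputation error, and a higher correlation indicates that the imputed values preserve the ordering and structure of the hidden ones. With this definition, filling every hidden value with the mean of the truth gives an NRMSE of exactly 1, which is a convenient baseline.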


Author(s):  
Daisy Unsihuay ◽  
Daniela Mesa Sanchez ◽  
Julia Laskin

Mass spectrometry imaging (MSI) is a powerful, label-free technique that provides detailed maps of hundreds of molecules in complex samples with high sensitivity and subcellular spatial resolution. Accurate quantification in MSI relies on a detailed understanding of matrix effects associated with the ionization process along with evaluation of the extraction efficiency and mass-dependent ion losses occurring in the analysis step. We present a critical summary of approaches developed for quantitative MSI of metabolites, lipids, and proteins in biological tissues and discuss their current and future applications. Expected final online publication date for the Annual Review of Physical Chemistry, Volume 72 is April 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

