The local-balanced model for improved machine learning outcomes on mass spectrometry data sets and other instrumental data

2021 ◽  
Vol 413 (6) ◽  
pp. 1583-1593
Author(s):  
Heather Desaire ◽  
Milani Wijeweera Patabandige ◽  
David Hua
2021 ◽  
Author(s):  
Boris M. Zühlke ◽  
Ewelina M. Sokolowska ◽  
Marcin Luzarowski ◽  
Denis Schlossarek ◽  
Monika Chodasiewicz ◽  
...  

Abstract: Metabolite-protein interactions affect and shape diverse cellular processes. Yet, despite advances, approaches for identifying metabolite-protein interactions at a genome-wide scale are lacking. Here we present an approach termed SLIMP that predicts metabolite-protein interactions using supervised machine learning on features engineered from metabolic and proteomic profiles obtained with a co-fractionation mass spectrometry-based technique. By applying SLIMP with gold standards assembled from public databases, along with metabolic and proteomic data sets from multiple conditions and growth stages, we predicted over 9,000 and 20,000 metabolite-protein interactions for Saccharomyces cerevisiae and Arabidopsis thaliana, respectively. Extensive comparative analyses corroborated the quality of the predictions from SLIMP with respect to widely used performance measures (e.g. F1-score exceeding 0.8). SLIMP predicted novel targets of 2’,3’-cyclic nucleotides and dipeptides, which we analysed comparatively between the two organisms. Finally, predicted interactions for the dipeptide Tyr-Asp in Arabidopsis and the dipeptide Ser-Leu in yeast were independently validated, opening the possibility for future applications of supervised machine learning approaches in this area of systems biology.


2017 ◽  
Vol 108 ◽  
pp. 51-60 ◽  
Author(s):  
Peter Herzsprung ◽  
Wolf von Tümpling ◽  
Katrin Wendt-Potthoff ◽  
Norbert Hertkorn ◽  
Mourad Harir ◽  
...  

2020 ◽  
Author(s):  
Leonoor E.M. Tideman ◽  
Lukasz G. Migas ◽  
Katerina V. Djambazova ◽  
Nathan Heath Patterson ◽  
Richard M. Caprioli ◽  
...  

Abstract: The search for molecular species that are differentially expressed between biological states is an important step towards discovering promising biomarker candidates. In imaging mass spectrometry (IMS), performing this search manually is often impractical due to the large size and high dimensionality of IMS datasets. Instead, we propose an interpretable machine learning workflow that automatically identifies biomarker candidates by their mass-to-charge ratios, and that quantitatively estimates their relevance to recognizing a given biological class using Shapley additive explanations (SHAP). The task of biomarker candidate discovery is translated into a feature ranking problem: given a classification model that assigns pixels to different biological classes on the basis of their mass spectra, the molecular species that the model uses as features are ranked in descending order of relative predictive importance such that the top-ranking features have a higher likelihood of being useful biomarkers. Besides providing the user with an experiment-wide measure of a molecular species’ biomarker potential, our workflow delivers spatially localized explanations of the classification model’s decision-making process in the form of a novel representation called SHAP maps. SHAP maps deliver insight into the spatial specificity of biomarker candidates by highlighting in which regions of the tissue sample each feature provides discriminative information and in which regions it does not. SHAP maps also enable one to determine whether the relationship between a biomarker candidate and a biological state of interest is correlative or anticorrelative.
Our automated approach to estimating a molecular species’ potential for characterizing a user-provided biological class, combined with the untargeted and multiplexed nature of IMS, allows for the rapid screening of thousands of molecular species and yields a broader biomarker candidate shortlist than would be possible through targeted manual assessment. Our biomarker candidate discovery workflow is demonstrated on mouse-pup and rat kidney case studies.

Highlights:
Our workflow automates the discovery of biomarker candidates in imaging mass spectrometry data by using state-of-the-art machine learning methodology to produce a shortlist of molecular species that are differentially expressed with regard to a user-provided biological class.
A model interpretability method called Shapley additive explanations (SHAP), with observational Shapley values, enables us to quantify the local and global predictive importance of molecular species with respect to recognizing a user-provided biological class.
By providing spatially localized explanations for a classification model’s decision-making process, SHAP maps deliver insight into the spatial specificity of biomarker candidates and enable one to determine whether (and where) the relationship between a biomarker candidate and the class of interest is correlative or anticorrelative.
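The feature-ranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring function `toy_score`, the helper `shapley_values`, and the four three-channel "pixels" are all hypothetical, and absent features are replaced by a mean-spectrum baseline, which is a simpler value function than the observational Shapley values the highlights mention. The sketch computes exact Shapley values per pixel, averages their magnitudes into a global ranking, and keeps the signed per-pixel values, which is the kind of information a SHAP map would display spatially.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Assumption: a feature that is 'absent' from a coalition takes its
    baseline value (a simplification, not observational Shapley values).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(spectrum):
    """Hypothetical class score: channel 0 raises it, channel 1 lowers it,
    channel 2 is ignored."""
    return 2.0 * spectrum[0] - 1.0 * spectrum[1] + 0.0 * spectrum[2]


# Hypothetical pixels, each with three intensity channels (made-up values).
pixels = [
    [1.0, 0.0, 5.0],
    [0.0, 1.0, 3.0],
    [1.0, 1.0, 4.0],
    [0.0, 0.0, 4.0],
]
n_features = len(pixels[0])

# Baseline: the mean spectrum over all pixels.
baseline = [sum(p[i] for p in pixels) / len(pixels) for i in range(n_features)]

# Local explanations: one signed value per pixel and feature.
per_pixel = [shapley_values(toy_score, p, baseline) for p in pixels]

# Global importance: mean absolute Shapley value of each feature.
global_importance = [
    sum(abs(row[i]) for row in per_pixel) / len(per_pixel)
    for i in range(n_features)
]

# Feature indices in descending order of global importance.
ranking = sorted(range(n_features), key=lambda i: global_importance[i], reverse=True)

print("ranking:", ranking)
print("global importance:", global_importance)
```

Exact enumeration grows exponentially with the number of features, so it only suits toy cases like this one; data with thousands of mass-to-charge features would need an approximate or model-specific estimator. The sign of `per_pixel[k][i]` indicates whether feature `i` pushes the score up or down at pixel `k`.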


2011 ◽  
Vol 11 (1) ◽  
pp. O111.011379 ◽  
Author(s):  
Mathias Wilhelm ◽  
Marc Kirchner ◽  
Judith A. J. Steen ◽  
Hanno Steen
