Analysis of mass spectrometry data for serum biomarker discovery

Author(s):  
Habtom W. Ressom ◽  
Rency S Varghese ◽  
Lenka Goldman ◽  
Christopher A. Loffredo ◽  
Mohamed Abdel-Hamid ◽  
...  
Author(s):  
Alexia Kakourou ◽  
Werner Vach ◽  
Simone Nicolardi ◽  
Yuri van der Burgt ◽  
Bart Mertens

Abstract: Mass spectrometry based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growing field of research. In this paper we present simple, yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data, collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation will be based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e6699
Author(s):  
So Young Ryu ◽  
George A. Wendt

Mass spectrometry-based proteomics facilitates disease understanding by providing protein abundance information about disease progression. For the same type of disease studies, multiple mass spectrometry datasets may be generated. Integrating multiple mass spectrometry datasets can provide valuable information that a single dataset analysis cannot provide. In this article, we introduce a meta-analysis software tool, MetaMSD (Meta Analysis for Mass Spectrometry Data), that is specifically designed for mass spectrometry data. Using Stouffer’s or Pearson’s test, MetaMSD detects significantly more differential proteins than the analysis based on the single best experiment. We demonstrate the performance of MetaMSD using simulated data, urinary proteomic data of kidney transplant patients, and breast cancer proteomic data. Noting the common practice of performing a pilot study prior to a main study, this software will help proteomics researchers fully utilize the benefit of multiple studies (or datasets), thus optimizing biomarker discovery. MetaMSD is a command line tool that automatically outputs various graphs and differential proteins with confidence scores. It is implemented in R and is freely available for public use at https://github.com/soyoungryu/MetaMSD. The user manual and data are available at the site. The user manual is written in such a way that scientists who are not familiar with R software can use MetaMSD.
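
As an illustration of the kind of p-value combination named above, the following is a minimal sketch of Stouffer's Z-score method for one protein measured in several datasets. It is not MetaMSD's code (MetaMSD is implemented in R); the function name `stouffer_combine`, the optional `weights` argument, and the use of one-sided p-values are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stouffer_combine(p_values, weights=None):
    """Combine one-sided p-values for a single protein across datasets.

    Hypothetical helper for illustration. Each p-value must lie strictly
    between 0 and 1, otherwise NormalDist.inv_cdf raises StatisticsError.
    Returns (combined_z, combined_p).
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * len(p_values)  # unweighted Stouffer
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")

    # Map each p-value to a z-score: z_i = Phi^{-1}(1 - p_i)
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]

    # Weighted sum of z-scores, rescaled so it is again standard normal
    # under the null hypothesis.
    combined_z = sum(w * z for w, z in zip(weights, z_scores)) / sqrt(
        sum(w * w for w in weights)
    )
    combined_p = 1.0 - _STD_NORMAL.cdf(combined_z)
    return combined_z, combined_p


if __name__ == "__main__":
    # Two studies that are each only borderline significant on their own.
    z, p = stouffer_combine([0.05, 0.05])
    print(f"combined z = {z:.3f}, combined p = {p:.4f}")
```

The example shows why combining datasets can flag proteins that no single experiment flags: two borderline p-values yield a combined p-value smaller than either one.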


2007 ◽  
Vol 3 ◽  
pp. 117693510700300 ◽  
Author(s):  
Masaru Ushijima ◽  
Satoshi Miyata ◽  
Shinto Eguchi ◽  
Masanori Kawakita ◽  
Masataka Yoshimoto ◽  
...  

We propose a method for biomarker discovery from mass spectrometry data, improving the common peak approach developed by Fushiki et al. (BMC Bioinformatics, 7:358, 2006). The common peak method is a simple way to select the sensible peaks that are shared by many subjects among all detected peaks by combining a standard spectrum alignment and kernel density estimates. The key idea of our proposed method is to apply the common peak approach to each class label separately. Hence, the proposed method gains more informative peaks for predicting class labels, while minor peaks associated with specific subjects are deleted correctly. We used a SELDI-TOF MS data set from laser microdissected cancer tissues for predicting the treatment effects of neoadjuvant therapy using an anticancer drug on breast cancer patients. The AdaBoost algorithm is adopted for pattern recognition, based on the set of candidate peaks selected by the proposed method. The analysis gives good performance in the sense of test errors for classifying the class labels for a given feature vector of selected peak values.
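
The idea of running a common-peak selection separately per class can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes spectra are already aligned, pools the detected peak positions of one class, evaluates a Gaussian kernel density on a regular m/z grid, and keeps density maxima that are supported by a minimum fraction of subjects. All names and parameters (`common_peaks`, `common_peaks_by_class`, `bandwidth`, `step`, `min_fraction`) are hypothetical.

```python
from math import exp


def gaussian_kde(points, grid, bandwidth):
    """Unnormalised Gaussian kernel density of `points` evaluated on `grid`."""
    return [
        sum(exp(-0.5 * ((g - p) / bandwidth) ** 2) for p in points)
        for g in grid
    ]


def common_peaks(subject_peaks, bandwidth=0.5, step=0.1, min_fraction=0.5):
    """Select m/z positions shared by many subjects (illustrative only).

    subject_peaks: one list of detected peak m/z values per subject.
    A grid point is kept if it is a local maximum of the pooled density
    and at least `min_fraction` of subjects have a peak within `bandwidth`.
    """
    pooled = [p for peaks in subject_peaks for p in peaks]
    if not pooled:
        return []
    lo = min(pooled) - 3 * bandwidth
    hi = max(pooled) + 3 * bandwidth
    n = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n)]
    dens = gaussian_kde(pooled, grid, bandwidth)

    selected = []
    for i in range(1, n - 1):
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]:
            # Count subjects that have at least one peak near this maximum.
            support = sum(
                any(abs(p - grid[i]) <= bandwidth for p in peaks)
                for peaks in subject_peaks
            )
            if support / len(subject_peaks) >= min_fraction:
                selected.append(round(grid[i], 6))
    return selected


def common_peaks_by_class(subject_peaks, labels, **kwargs):
    """Apply the common-peak selection to each class label separately."""
    return {
        c: common_peaks(
            [pk for pk, lab in zip(subject_peaks, labels) if lab == c], **kwargs
        )
        for c in sorted(set(labels))
    }


if __name__ == "__main__":
    peaks = [[100.0, 200.0], [100.1, 200.1], [99.9, 350.0],
             [100.0, 300.0], [100.05, 300.1], [300.0]]
    labels = [0, 0, 0, 1, 1, 1]
    print(common_peaks_by_class(peaks, labels))
```

In this toy input, the peak near 200 is common only within class 0 and the peak near 300 only within class 1, so a per-class selection retains both, while the isolated peak at 350 from a single subject is dropped.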


2009 ◽  
Vol 29 (1) ◽  
pp. 57-69 ◽  
Author(s):  
Gordon R. Whiteley ◽  
Simona Colantonio ◽  
Andrea Sacconi ◽  
Richard G. Saul

2020 ◽  
Author(s):  
Walid M. Abdelmoula ◽  
Begona Gimenez-Cassina Lopez ◽  
Elizabeth C. Randall ◽  
Tina Kapur ◽  
Jann N. Sarkaria ◽  
...  

Abstract: Mass spectrometry imaging (MSI) is an emerging technology that holds potential for improving clinical diagnosis, biomarker discovery, metabolomics research and pharmaceutical applications. The large data size and high dimensional nature of MSI pose computational and memory complexities that hinder accurate identification of biologically-relevant molecular patterns. We propose msiPL, a robust and generic probabilistic generative model based on a fully-connected variational autoencoder for unsupervised analysis and peak learning of MSI data. The method can efficiently learn and visualize the underlying non-linear spectral manifold, reveal biologically-relevant clusters of tumor heterogeneity and identify underlying informative m/z peaks. The method provides a probabilistic parametric mapping to allow a trained model to rapidly analyze a new unseen MSI dataset in a few seconds. The computational model features a memory-efficient implementation using a minibatch processing strategy to enable the analyses of big MSI data (encompassing more than 1 million high-dimensional datapoints) with significantly less memory. We demonstrate the robustness and generic applicability of the application on MSI data of large size from different biological systems and acquired using different mass spectrometers at different centers, namely: 2D Matrix-Assisted Laser Desorption Ionization (MALDI) Fourier Transform Ion Cyclotron Resonance (FT ICR) MSI data of human prostate cancer, 3D MALDI Time-of-Flight (TOF) MSI data of human oral squamous cell carcinoma, 3D Desorption Electrospray Ionization (DESI) Orbitrap MSI data of human colorectal adenocarcinoma, 3D MALDI TOF MSI data of mouse kidney, and 3D MALDI FT ICR MSI data of a patient-derived xenograft (PDX) mouse brain model of glioblastoma.

Significance: Mass spectrometry imaging (MSI) provides detailed molecular characterization of a tissue specimen while preserving spatial distributions. However, the complex nature of MSI data slows down the processing time and poses computational and memory challenges that hinder the analysis of multiple specimens required to extract biologically relevant patterns. Moreover, the subjectivity in the selection of parameters for conventional pre-processing approaches can lead to bias. Here, we present a generative probabilistic deep-learning model that can analyze and non-linearly visualize MSI data independent of the nature of the specimen and of the MSI platform. We demonstrate robustness of the method with application to different tissue types, and envision it as a new generation of rapid and robust analysis for mass spectrometry data.


2012 ◽  
Vol 8 (11) ◽  
pp. 2845 ◽  
Author(s):  
Marco Chierici ◽  
Davide Albanese ◽  
Pietro Franceschi ◽  
Cesare Furlanello
