Accounting for isotopic clustering in Fourier transform mass spectrometry data analysis for clinical diagnostic studies

Author(s):  
Alexia Kakourou ◽  
Werner Vach ◽  
Simone Nicolardi ◽  
Yuri van der Burgt ◽  
Bart Mertens

Abstract: Mass spectrometry-based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growing field of research. In this paper we present simple, yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data, collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation will be based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.
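As a rough illustration of the two kinds of summary measure mentioned in this abstract (intensity level and shape of an isotopic cluster), the sketch below reduces one cluster to a log total intensity and a vector of proportions. The function name `summarize_cluster`, the log transform, and the example intensities are assumptions made for illustration; the abstract does not specify the authors' exact summaries.

```python
import math

def summarize_cluster(intensities):
    """Return (log total intensity, shape) for one isotopic cluster.

    `intensities` holds the heights of the consecutive isotope peaks.
    The shape is the vector of proportions, so it sums to 1 and does
    not depend on the overall intensity level (assumed summary).
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("cluster must have positive total intensity")
    shape = [x / total for x in intensities]
    return math.log(total), shape

# Invented example cluster with three isotope peaks.
level, shape = summarize_cluster([120.0, 80.0, 40.0])
```

Separating level from shape in this way lets each be used on its own as input to a diagnostic rule, which matches the abstract's statement that both carry class information.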

2007 ◽  
Vol 3 ◽  
pp. 117693510700300 ◽  
Author(s):  
Masaru Ushijima ◽  
Satoshi Miyata ◽  
Shinto Eguchi ◽  
Masanori Kawakita ◽  
Masataka Yoshimoto ◽  
...  

We propose a method for biomarker discovery from mass spectrometry data, improving the common peak approach developed by Fushiki et al. (BMC Bioinformatics, 7:358, 2006). The common peak method is a simple way to select, from all detected peaks, the sensible peaks that are shared by many subjects, by combining a standard spectrum alignment with kernel density estimates. The key idea of our proposed method is to apply the common peak approach to each class label separately. Hence, the proposed method retains more peaks that are informative for predicting class labels, while minor peaks associated with specific subjects are correctly deleted. We used a SELDI-TOF MS data set from laser microdissected cancer tissues to predict the treatment effects of neoadjuvant therapy using an anticancer drug on breast cancer patients. The AdaBoost algorithm is adopted for pattern recognition, based on the set of candidate peaks selected by the proposed method. The analysis gives good performance in terms of test error when classifying the class label for a given feature vector of selected peak values.
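The class-wise idea can be sketched as follows: pool the detected peak positions of all subjects in one class, estimate their density with a kernel, and keep the local maxima that are dense enough. This is a minimal sketch, not the authors' implementation; it assumes the spectra are already aligned, and the Gaussian kernel, the bandwidth, the density threshold, the function names and the toy data are all illustrative assumptions.

```python
import math

def kde(points, x, bandwidth):
    """Gaussian kernel density estimate of `points`, evaluated at x."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points) / norm

def common_peaks(peak_lists, grid, bandwidth, min_density):
    """Grid positions where the pooled peak density has a local maximum
    of at least `min_density`."""
    pooled = [p for peaks in peak_lists for p in peaks]
    dens = [kde(pooled, x, bandwidth) for x in grid]
    return [
        grid[i]
        for i in range(1, len(grid) - 1)
        if dens[i] > dens[i - 1]
        and dens[i] >= dens[i + 1]
        and dens[i] >= min_density
    ]

def classwise_common_peaks(peaks_by_class, grid, bandwidth, min_density):
    """Apply the common-peak selection to each class label separately."""
    return {
        label: common_peaks(lists, grid, bandwidth, min_density)
        for label, lists in peaks_by_class.items()
    }

# Toy data (invented): detected peak positions (m/z) per subject, per class.
peaks_by_class = {
    "responder": [[1000.1, 2000.0], [999.9, 2000.2], [1000.0, 1500.0]],
    "non_responder": [[1000.0, 3000.1], [1000.2, 2999.9], [999.8, 3000.0]],
}
grid = list(range(900, 3101))  # evaluation grid with unit spacing
selected = classwise_common_peaks(peaks_by_class, grid, bandwidth=2.0, min_density=0.05)
```

In this toy example the peak near 1500 occurs in a single subject and falls below the threshold, while the peak near 3000 is kept for the class in which it is common even though it never appears in the other class.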


2007 ◽  
Vol 177 (4S) ◽  
pp. 52-53
Author(s):  
Stefano Ongarello ◽  
Eberhard Steiner ◽  
Regina Achleitner ◽  
Isabel Feuerstein ◽  
Birgit Stenzel ◽  
...  

2007 ◽  
Vol 3 ◽  
pp. 117693510700300 ◽  
Author(s):  
Nadège Dossat ◽  
Alain Mangé ◽  
Jérôme Solassol ◽  
William Jacot ◽  
Ludovic Lhermitte ◽  
...  

A key challenge in clinical proteomics of cancer is the identification of biomarkers that could allow detection, diagnosis and prognosis of the disease. Recent advances in mass spectrometry and proteomic instrumentation offer a unique chance to rapidly identify these markers. These advances pose considerable challenges, similar to those created by microarray-based investigation, for the discovery of patterns of markers from high-dimensional data that are specific to each pathologic state (e.g. normal vs cancer). We propose a three-step strategy to select important markers from high-dimensional mass spectrometry data using surface enhanced laser desorption/ionization (SELDI) technology. The first two steps are the selection of the most discriminating biomarkers with a construction of different classifiers. Finally, we compare and validate their performance and robustness using different supervised classification methods such as Support Vector Machine, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Neural Networks, Classification Trees and Boosting Trees. We show that the proposed method is suitable for analysing high-throughput proteomics data and that the combination of logistic regression and Linear Discriminant Analysis outperforms the other methods tested.


Author(s):  
Habtom W. Ressom ◽  
Rency S Varghese ◽  
Lenka Goldman ◽  
Christopher A. Loffredo ◽  
Mohamed Abdel-Hamid ◽  
...  

2011 ◽  
Vol 21 (9-10) ◽  
pp. 656
Author(s):  
F.C. Martin ◽  
S. Oonk ◽  
P.A.C. ’t Hoen ◽  
V.D. Nadarajah ◽  
A. Chaouch ◽  
...  

Author(s):  
Michael Kiehntopf ◽  
Robert Siegmund ◽  
Thomas Deufel

Abstract: Surface-enhanced laser desorption time of flight mass spectrometry (SELDI-TOF-MS) is an important proteomic technology that is immediately available for the high-throughput analysis of complex protein samples. Over the last few years, several studies have demonstrated that comparative protein profiling using SELDI-TOF-MS breaks new ground in diagnostic protein analysis, particularly with regard to the identification of novel biomarkers. Importantly, researchers have also acquired a better understanding of the limitations of this technology and of various pitfalls in biomarker discovery. Bearing these in mind, great emphasis must be placed on the development of rigorous standards and quality control procedures for the pre-analytical as well as the analytical phase and for the subsequent bioinformatics applied to analysis of the data. To avoid the risk of false-significant results, studies must be designed carefully and control groups accurately selected. In addition, appropriate tools, already established for analysis of highly complex microarray data, need to be applied to protein profiling data. Validating the significance of any candidate biomarker derived from pilot studies in appropriately designed prospective multi-center studies is mandatory; reproducibility of the clinical results must be shown over time and in different diagnostic settings. SELDI-TOF-MS-based studies that are in compliance with these requirements are now required; only a few have been published so far. In the meantime, further evaluation and optimization of both technique and marker validation strategies are called for before MS-based proteomic algorithms can be translated into routine laboratory testing.

Clin Chem Lab Med 2007;45:1435–49.


2007 ◽  
Vol 05 (05) ◽  
pp. 1023-1045 ◽  
Author(s):  
WAYNE G. FISHER ◽  
KEVIN P. ROSENBLATT ◽  
DAVID A. FISHMAN ◽  
GORDON R. WHITELEY ◽  
ALVYDAS MIKULSKIS ◽  
...  

A high-throughput software pipeline for analyzing high-performance mass spectral data sets has been developed to facilitate rapid and accurate biomarker determination. The software exploits the mass precision and resolution of high-performance instrumentation, bypasses peak-finding steps, and instead uses discrete m/z data points to identify putative biomarkers. The technique is insensitive to peak shape, and works on overlapping and non-Gaussian peaks which can confound peak-finding algorithms. Methods are presented to assess data set quality and the suitability of groups of m/z values that map to peaks as potential biomarkers. The algorithm is demonstrated with serum mass spectra from patients with and without ovarian cancer. Biomarker candidates are identified and ranked by their ability to discriminate between cancer and noncancer conditions. Their discriminating power is tested by classifying unknowns using a simple distance calculation, and a sensitivity of 95.6% and a specificity of 97.1% are obtained. In contrast, the sensitivity of the ovarian cancer blood marker CA125 is ~50% for stage I/II and ~80% for stage III/IV cancers. While the generalizability of these markers is currently unknown, we have demonstrated the ability of our analytical package to extract biomarker candidates from high-performance mass spectral data.
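To make the evaluation step concrete, the sketch below classifies unknowns by distance and then computes sensitivity and specificity. The abstract only says "a simple distance calculation", so the nearest-centroid rule with Euclidean distance used here is an assumption, as are the function names and the toy intensities; it is not the authors' pipeline and does not reproduce their reported figures.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def classify(x, centroids):
    """Assign x to the class whose centroid is nearest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(x, centroids[label]))

def sensitivity_specificity(y_true, y_pred, positive="cancer"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy training data (invented): intensities at two candidate m/z values.
train = {
    "cancer": [[5.0, 1.0], [6.0, 1.2]],
    "noncancer": [[1.0, 4.0], [1.2, 5.0]],
}
centroids = {label: centroid(rows) for label, rows in train.items()}

# Toy unknowns with their true labels.
unknowns = [[5.5, 1.0], [1.0, 4.5], [3.0, 3.0], [1.5, 4.0]]
y_true = ["cancer", "noncancer", "cancer", "noncancer"]
y_pred = [classify(x, centroids) for x in unknowns]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here the third unknown lies closer to the noncancer centroid and is missed, so the toy sensitivity is 0.5 while the specificity is 1.0.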

