Protein Identification by Tandem Mass Spectrometry and Sequence Database Searching

Author(s):  
Alexey I. Nesvizhskii
2007 ◽  
Vol 05 (02a) ◽  
pp. 297-311 ◽  
Author(s):  
CHUNGONG YU ◽  
YU LIN ◽  
SHIWEI SUN ◽  
JINJIN CAI ◽  
JINGFEN ZHANG ◽  
...  

In protein identification by tandem mass spectrometry, it is critical to accurately predict the theoretical spectrum for a peptide sequence. To date, the widely-used database searching methods adopted simple statistical models for predicting. For some peptide, these models usually yield a theoretical spectrum with a significant deviation from the experimental one. In this paper, in order to derive an improved predicting model, we utilized a non-linear programming model to quantify the factors impacting peptide fragmentation. Then, an iterative algorithm was proposed to solve this optimization problem. Upon a training set of 1803 spectra, the experimental result showed a good agreement with some known principles about peptide fragmentation, such as the tendency to cleave at the middle of peptide, and Pro's preference of the N-terminal cleavage. Moreover, upon a testing set of 941 spectra, comparison of the predicted spectra against the experimental ones showed that this method can generate reasonable predictions. The results in this paper can offer help to both database searching and de novo methods.


Author(s):  
Haipeng Wang

Protein identification (sequencing) by tandem mass spectrometry is a fundamental technique for proteomics which studies structures and functions of proteins in large scale and acts as a complement to genomics. Analysis and interpretation of vast amounts of spectral data generated in proteomics experiments present unprecedented challenges and opportunities for data mining in areas such as data preprocessing, peptide-spectrum matching, results validation, peptide fragmentation pattern discovery and modeling, and post-translational modification (PTM) analysis. This article introduces the basic concepts and terms of protein identification and briefly reviews the state-of-the-art relevant data mining applications. It also outlines challenges and future potential hot spots in this field.


Sign in / Sign up

Export Citation Format

Share Document