NovoPair: De novo peptide sequencing for tandem mass spectra pair

Author(s): Yan Yan, Anthony J. Kusalik, Fang-Xiang Wu
2012, Vol 12 (2), pp. 615-625
Author(s): Hao Chi, Haifeng Chen, Kun He, Long Wu, Bing Yang, ...

2008, Vol 06 (03), pp. 467-492
Author(s): Kang Ning, Nan Ye, Hon Wai Leong

Peptide sequencing plays a fundamental role in proteomics. Tandem mass spectrometry, being sensitive and efficient, is one of the most commonly used techniques in peptide sequencing. Many computational models and algorithms have been developed for peptide sequencing using tandem mass spectrometry. In this paper, we investigate general issues in de novo sequencing, and present results that can be used to improve current de novo sequencing algorithms. We propose a general preprocessing scheme that performs binning, pseudo-peak introduction, and noise removal, and present theoretical and experimental analyses on each of the components. Then, we study the antisymmetry problem and current assumptions related to it, and propose a more realistic way to handle the antisymmetry problem based on analysis of some datasets. We integrate our findings on preprocessing and the antisymmetry problem with some current models for peptide sequencing. Experimental results show that our findings help to improve accuracies for de novo sequencing.
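
The abstract above names binning and noise removal as preprocessing steps but does not give their details. The following is a minimal Python sketch of one common way to do both, not the authors' scheme: the function names, the 1.0 bin width, and the "keep the top k peaks per m/z window" rule are all assumptions made for illustration. Pseudo-peak introduction is left out because the abstract does not define it.

```python
from collections import defaultdict


def bin_peaks(peaks, bin_width=1.0):
    """Merge peaks whose m/z falls in the same bin by summing intensities.

    `peaks` is a list of (mz, intensity) tuples. The bin width is an
    illustrative assumption, not a value taken from the paper.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    # Report each bin by its centre m/z, in ascending order.
    return sorted(((idx + 0.5) * bin_width, total) for idx, total in binned.items())


def remove_noise(peaks, window=100.0, top_k=2):
    """Keep only the top_k most intense peaks in each m/z window.

    This windowed filter is a hypothetical stand-in for the paper's
    noise-removal step.
    """
    by_window = defaultdict(list)
    for mz, intensity in peaks:
        by_window[int(mz // window)].append((mz, intensity))
    kept = []
    for group in by_window.values():
        group.sort(key=lambda peak: peak[1], reverse=True)
        kept.extend(group[:top_k])
    return sorted(kept)


if __name__ == "__main__":
    raw = [(100.2, 5.0), (100.7, 3.0), (150.1, 1.0), (160.4, 9.0), (250.3, 2.0)]
    print(remove_noise(bin_peaks(raw)))
```

Binning before filtering means that two nearby peaks from the same fragment are merged first and compete as one peak in the noise filter.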


2019, Vol 35 (14), pp. i183-i190
Author(s): Hao Yang, Hao Chi, Wen-Feng Zeng, Wen-Jing Zhou, Si-Min He

Abstract

Motivation: De novo peptide sequencing based on tandem mass spectrometry data is the key technology of shotgun proteomics for identifying peptides without any database and assembling unknown proteins. However, owing to the low ion coverage in tandem mass spectra, the order of certain consecutive amino acids cannot be determined if all of their supporting fragment ions are missing, which results in the low precision of de novo sequencing.

Results: To solve this problem, we developed pNovo 3, which uses a learning-to-rank framework to distinguish similar peptide candidates for each spectrum. Three metrics for measuring the similarity between each experimental spectrum and its corresponding theoretical spectrum are used as important features; the theoretical spectra can be precisely predicted by the pDeep algorithm using deep learning. On seven benchmark datasets from six diverse species, pNovo 3 recalled 29–102% more correct spectra, and its precision was 11–89% higher than that of three other state-of-the-art de novo sequencing algorithms. Furthermore, compared with the newly developed DeepNovo, which also uses a deep learning approach, pNovo 3 still identified 21–50% more spectra on the nine datasets used in the DeepNovo study. In summary, the deep learning and learning-to-rank techniques implemented in pNovo 3 significantly improve the precision of de novo sequencing, and such a machine learning framework is worth extending to other related research fields to distinguish similar sequences.

Availability and implementation: pNovo 3 can be freely downloaded from http://pfind.ict.ac.cn/software/pNovo/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
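
The abstract above refers to similarity metrics between an experimental spectrum and a predicted theoretical spectrum but does not name them. As a rough illustration of what such a feature can look like, here is a cosine similarity over binned intensities in Python. Cosine similarity is an assumed example and is not claimed to be one of the three metrics used by pNovo 3; the dictionary representation and the function name are likewise hypothetical.

```python
import math


def cosine_similarity(experimental, theoretical):
    """Cosine similarity between two spectra.

    Each spectrum is a dict mapping a bin index to an intensity. This
    representation is an assumption made for the sketch.
    """
    shared_bins = set(experimental) & set(theoretical)
    dot = sum(experimental[b] * theoretical[b] for b in shared_bins)
    norm_exp = math.sqrt(sum(v * v for v in experimental.values()))
    norm_theo = math.sqrt(sum(v * v for v in theoretical.values()))
    if norm_exp == 0.0 or norm_theo == 0.0:
        # An empty or all-zero spectrum has no defined direction.
        return 0.0
    return dot / (norm_exp * norm_theo)


if __name__ == "__main__":
    observed = {175: 3.0, 262: 4.0}
    predicted = {175: 3.0, 262: 4.0}
    print(cosine_similarity(observed, predicted))
```

In a learning-to-rank setting, a score of this kind would be one feature among several for each candidate peptide, rather than the final ranking criterion on its own.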

