Using cross-correlation normalized for peptide length to optimize peptide identification in shotgun proteomics

2005 ◽  
Vol 19 (20) ◽  
pp. 2983-2985 ◽  
Author(s):  
Bing Yang ◽  
Wantao Ying ◽  
Yan Gong ◽  
Yangjun Zhang ◽  
Yun Cai ◽  
...  
2008 ◽  
Vol 8 (3) ◽  
pp. 547-557 ◽  
Author(s):  
Jiyang Zhang ◽  
Jie Ma ◽  
Lei Dou ◽  
Songfeng Wu ◽  
Xiaohong Qian ◽  
...  

2020 ◽  
Author(s):  
John T. Halloran ◽  
Gregor Urban ◽  
David Rocke ◽  
Pierre Baldi

Abstract: Semi-supervised machine learning post-processors critically improve peptide identification of shotgun proteomics data. Such post-processors accept the peptide-spectrum matches (PSMs) and feature vectors resulting from a database search, train a machine learning classifier, and recalibrate PSMs using the trained parameters, often yielding significantly more identified peptides across q-value thresholds. However, current state-of-the-art post-processors rely on shallow machine learning methods, such as support vector machines. In contrast, the powerful training capabilities of deep learning models have displayed superior performance to shallow models in an ever-growing number of other fields. In this work, we show that deep models significantly improve the recalibration of PSMs compared to the most accurate and widely used post-processors, such as Percolator and PeptideProphet. Furthermore, we show that deep learning is able to adaptively analyze complex datasets and features for more accurate universal post-processing, leading to both improved Prosit analysis and markedly better recalibration of recently developed database-search functions.
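
The abstract above compares post-processors by how many peptides they identify at a given q-value threshold. As a rough illustration of that metric, the sketch below estimates q-values from recalibrated PSM scores. It assumes a target-decoy setup, which the abstract does not spell out, and the names `q_values` and `count_accepted` are hypothetical helpers, not part of Percolator, PeptideProphet, or any other tool named here.

```python
from typing import List


def q_values(scores: List[float], is_decoy: List[bool]) -> List[float]:
    """Return one estimated q-value per PSM, in the input order.

    Assumption: higher score means a better match, and decoy PSMs
    approximate the number of incorrect target PSMs at each score cutoff.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR at each rank: decoys seen so far / targets seen so far.
    fdrs = []
    targets = 0
    decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR at this rank or any lower-scoring rank.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def count_accepted(scores: List[float], is_decoy: List[bool],
                   threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is at or below the threshold."""
    q = q_values(scores, is_decoy)
    return sum(1 for qi, d in zip(q, is_decoy) if not d and qi <= threshold)


# Toy data (made up for illustration): five PSMs, two of them decoys.
toy_scores = [10.0, 9.0, 8.0, 7.0, 6.0]
toy_decoys = [False, False, True, False, True]
toy_q = q_values(toy_scores, toy_decoys)
```

Under these assumptions, a post-processor that recalibrates scores more accurately pushes decoys further down the ranking, so `count_accepted` returns a larger number at the same threshold.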


2013 ◽  
Vol 12 (3) ◽  
pp. 1108-1119 ◽  
Author(s):  
Ling Jian ◽  
Xinnan Niu ◽  
Zhonghang Xia ◽  
Parimal Samir ◽  
Chiranthani Sumanasekera ◽  
...  

2007 ◽  
Vol 4 (11) ◽  
pp. 923-925 ◽  
Author(s):  
Lukas Käll ◽  
Jesse D Canterbury ◽  
Jason Weston ◽  
William Stafford Noble ◽  
Michael J MacCoss

2018 ◽  
Author(s):  
Hao Chi ◽  
Chao Liu ◽  
Hao Yang ◽  
Wen-Feng Zeng ◽  
Long Wu ◽  
...  

Abstract: Shotgun proteomics has grown rapidly in recent decades, but a large fraction of tandem mass spectrometry (MS/MS) data in shotgun proteomics are not successfully identified. We have developed a novel database search algorithm, Open-pFind, to efficiently identify peptides even in an ultra-large search space that takes into account unexpected modifications, amino acid mutations, semi- or non-specific digestion, and co-eluting peptides. Tested on two metabolically labeled MS/MS datasets, Open-pFind reported 50.5‒117.0% more peptide-spectrum matches (PSMs) than the seven other advanced algorithms. More importantly, the Open-pFind results were more credible, as judged by verification experiments using stable isotopic labeling. Tested on four additional large-scale datasets, 70‒85% of the spectra were confidently identified, and high-quality spectra were nearly completely interpreted by Open-pFind. Further, Open-pFind was over 40 times faster than the other three open search algorithms and 2‒3 times faster than three restricted search algorithms. Re-analysis of an entire human proteome dataset consisting of ∼25 million spectra using Open-pFind identified a total of 14,064 proteins encoded by 12,723 genes when at least two uniquely identified peptides were required. In these search results, Open-pFind also excelled in an independent test for false positives based on the presence or absence of olfactory receptors. Thus, Open-pFind makes the open search strategy practical for the truly global-scale proteomics experiments of today and the future.


2017 ◽  
Vol 163 ◽  
pp. 118-125 ◽  
Author(s):  
Shu-Rong Zhang ◽  
Yi-Chu Shan ◽  
Hao Jiang ◽  
Jian-Hui Liu ◽  
Yuan Zhou ◽  
...  
