PIPP: Improving peptide identity propagation using neural networks

2021
Author(s): Soroor Hediyeh-zadeh, Jarryd Martin, Melissa J. Davis, Andrew I. Webb

Abstract: Peptide identity propagation (PIP) can substantially reduce missing values in label-free mass spectrometry quantification by transferring peptides identified by tandem mass (MS/MS) spectra in one run to experimentally related runs where the peptides are not identified by MS/MS. Existing frameworks for matching identifications between runs perform peak tracing and propagation based on the similarity of precursor features, using only a limited number of the dimensions available in MS1 data. These approaches do not produce accompanying confidence estimates and hence cannot filter probable false-positive identity transfers. We introduce an embedding-based PIP that uses a higher-dimensional representation of MS1 measurements, optimized with deep neural networks to capture peptide identities. We developed a propagation framework that works entirely on MaxQuant results. Current PIP workflows typically perform propagation using mainly two feature dimensions and rely on deterministic tolerances for identification transfer. Our framework overcomes both of these limitations while additionally assigning a probability to each transferred identity. The proposed embedding approach enables quantification of the empirical false discovery rate (FDR) for peptide identification, while also increasing depth of coverage through co-embedding the runs from the experiment with experimental libraries. In published datasets with technical and biological variability, we demonstrate that our method reduces missing values in MaxQuant results while maintaining high quantification precision and accuracy and a low false transfer rate.
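The idea of transferring an identity in embedding space and attaching a probability to it can be illustrated with a small sketch. This is not the PIPP implementation: the function name `transfer_identity`, the nearest-neighbour vote, the `exp(-distance)` weighting, and the `min_prob` cutoff are all assumptions chosen for illustration, and the embeddings are taken as given rather than learned by a network.

```python
import math


def transfer_identity(query, library, k=3, min_prob=0.5):
    """Propose a peptide label for an unidentified feature (hypothetical helper).

    query:   embedding of a feature that has no MS/MS identification.
    library: list of (peptide_label, embedding) pairs from identified features.
    Returns (label, probability); label is None when the probability of the
    best candidate falls below min_prob, so the transfer is rejected.
    """
    # Euclidean distance from the query to every identified feature,
    # keeping only the k closest neighbours.
    neighbours = sorted(
        (math.dist(query, emb), label) for label, emb in library
    )[:k]

    # Distance-weighted vote: closer neighbours contribute more (assumed scheme).
    weights = {}
    for distance, label in neighbours:
        weights[label] = weights.get(label, 0.0) + math.exp(-distance)

    best = max(weights, key=weights.get)
    prob = weights[best] / sum(weights.values())
    return (best if prob >= min_prob else None), prob


if __name__ == "__main__":
    # Toy 2-D embeddings; real embeddings would be higher dimensional.
    toy_library = [
        ("PEPTIDEA", [0.0, 0.0]),
        ("PEPTIDEA", [0.1, 0.0]),
        ("PEPTIDEB", [5.0, 5.0]),
    ]
    print(transfer_identity([0.05, 0.0], toy_library))
```

Unlike a fixed tolerance window, a probabilistic score of this kind lets low-confidence transfers be filtered out, which is the property the abstract emphasises.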

Author(s): Todd C Hollon, Balaji Pandian, Esteban Urias, Akshay V Save, Arjun R Adapa, ...

Abstract: Background: Detection of glioma recurrence remains a challenge in modern neuro-oncology. Noninvasive radiographic imaging is unable to definitively differentiate true recurrence from pseudoprogression. Even in biopsied tissue, it can be challenging to differentiate recurrent tumor from treatment effect. We hypothesized that intraoperative stimulated Raman histology (SRH) and deep neural networks can be used to improve the intraoperative detection of glioma recurrence. Methods: We used fiber laser–based SRH, a label-free, nonconsumptive, high-resolution microscopy method (<60 sec per 1 × 1 mm²), to image a cohort of patients (n = 35) with suspected recurrent gliomas who underwent biopsy or resection. The SRH images were then used to train a convolutional neural network (CNN) and develop an inference algorithm to detect viable recurrent glioma. Following network training, the performance of the CNN was tested for diagnostic accuracy in a retrospective cohort (n = 48). Results: Using patch-level CNN predictions, the inference algorithm returns a single Bernoulli distribution for the probability of tumor recurrence for each surgical specimen or patient. The external SRH validation dataset consisted of 48 patients (recurrent, 30; pseudoprogression, 18), and we achieved a diagnostic accuracy of 95.8%. Conclusion: SRH with CNN-based diagnosis can be used to improve the intraoperative detection of glioma recurrence in near-real time. Our results provide insight into how optical imaging and computer vision can be combined to augment conventional diagnostic methods and improve the quality of specimen sampling at glioma recurrence.
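The abstract does not say how patch-level predictions are pooled into one Bernoulli distribution per specimen. As a minimal sketch, assuming simple averaging of patch probabilities and a 0.5 decision threshold (both assumptions, as are the function names), the aggregation step could look like this:

```python
def specimen_recurrence_probability(patch_probs):
    """Pool patch-level tumor probabilities into one Bernoulli parameter p.

    Mean pooling is an assumption made for this sketch; the abstract does
    not specify the pooling rule used by the inference algorithm.
    """
    if not patch_probs:
        raise ValueError("need at least one patch prediction")
    return sum(patch_probs) / len(patch_probs)


def classify_specimen(patch_probs, threshold=0.5):
    """Return (diagnosis, p) for one specimen; threshold is illustrative."""
    p = specimen_recurrence_probability(patch_probs)
    diagnosis = "recurrence" if p >= threshold else "pseudoprogression"
    return diagnosis, p


if __name__ == "__main__":
    # Hypothetical CNN outputs for three patches of one specimen.
    print(classify_specimen([0.9, 0.8, 0.7]))
```

Returning the pooled probability alongside the label keeps the uncertainty visible, rather than reducing each specimen to a hard call.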


2021
Vol 12 (1)
Author(s): Mathias Kalxdorf, Torsten Müller, Oliver Stegle, Jeroen Krijgsveld

Abstract: Label-free proteomics by data-dependent acquisition enables the unbiased quantification of thousands of proteins; however, it notoriously suffers from high rates of missing values, which prohibits consistent protein quantification across large sample cohorts. To solve this, we present IceR (Ion current extraction Re-quantification), an efficient and user-friendly quantification workflow that combines the high identification rates of data-dependent acquisition with low missing value rates similar to those of data-independent acquisition. Specifically, IceR uses ion current information for a hybrid peptide identification propagation approach with superior quantification precision, accuracy, reliability and data completeness compared to other quantitative workflows. Applied to plasma and single-cell proteomics data, IceR enhanced the number of reliably quantified proteins, improved discriminability between single-cell populations, and allowed reconstruction of a developmental trajectory. IceR will be useful for improving the performance of large-scale global as well as low-input proteomics applications, facilitated by its availability as an easy-to-use R package.


Author(s): Boyang Liu, Ding Wang, Kaixiang Lin, Pang-Ning Tan, Jiayu Zhou

Unsupervised anomaly detection plays a crucial role in many critical applications. Driven by the success of deep learning, recent years have witnessed growing interest in applying deep neural networks (DNNs) to anomaly detection problems. A common approach is to use autoencoders to learn a feature representation for the normal observations in the data. The reconstruction error of the autoencoder is then used as an outlier score to detect the anomalies. However, due to the high complexity brought about by the over-parameterization of DNNs, the reconstruction error of the anomalies can also be small, which hampers the effectiveness of these methods. To alleviate this problem, we propose a robust framework that uses collaborative autoencoders to jointly identify normal observations in the data while learning their feature representation. We investigate the theoretical properties of the framework and empirically show its outstanding performance compared to other DNN-based methods. Our experimental results also show the resilience of the framework to missing values compared to other baseline methods.
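The baseline that this abstract builds on, scoring each observation by its reconstruction error, can be sketched as follows. The `reconstruct` function here is a hand-written stand-in for a trained autoencoder (a one-dimensional linear bottleneck along the line y = x), not a learned model and not the collaborative framework the authors propose; all names are illustrative.

```python
def reconstruct(point):
    """Stand-in for a trained autoencoder on 2-D data (assumption).

    Encodes a point to a single number and decodes it back onto the line
    y = x, so points near that line are reconstructed almost perfectly.
    """
    code = (point[0] + point[1]) / 2.0  # encode: 2-D -> 1-D bottleneck
    return [code, code]                 # decode: 1-D -> 2-D


def outlier_score(point, model=reconstruct):
    """Squared reconstruction error; larger means more anomalous."""
    recon = model(point)
    return sum((a - b) ** 2 for a, b in zip(point, recon))


def rank_anomalies(data, model=reconstruct):
    """Return indices of `data` ordered from most to least anomalous."""
    return sorted(
        range(len(data)),
        key=lambda i: outlier_score(data[i], model),
        reverse=True,
    )


if __name__ == "__main__":
    data = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.0], [1.0, -1.0]]
    print(rank_anomalies(data))
```

The weakness the abstract describes arises when the model is flexible enough to reconstruct anomalies as well as normal points, in which case scores of this kind stop separating the two.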


2020
Author(s): Mathias Kalxdorf, Torsten Müller, Oliver Stegle, Jeroen Krijgsveld

Abstract: Label-free proteomics by data-dependent acquisition (DDA) enables the unbiased quantification of thousands of proteins; however, it notoriously suffers from high rates of missing values, which prohibits consistent protein quantification across large sample cohorts. To solve this, we present IceR, an efficient and user-friendly quantification workflow that combines the high identification rates of DDA with low missing value rates similar to those of data-independent acquisition (DIA). Specifically, IceR uses ion current information in DDA data for a hybrid peptide identification propagation (PIP) approach with superior quantification precision, accuracy, reliability and data completeness compared to other quantitative workflows. We demonstrate greatly improved quantification sensitivity on published plasma and single-cell proteomics data, enhancing the number of reliably quantified proteins, improving discriminability between single-cell populations, and allowing reconstruction of a developmental trajectory. IceR will be useful for improving the performance of large-scale global as well as low-input proteomics applications, facilitated by its availability as an easy-to-use R package.


Author(s): Alex Hernández-García, Johannes Mehrer, Nikolaus Kriegeskorte, Peter König, Tim C. Kietzmann

2018
Author(s): Chi Zhang, Xiaohan Duan, Ruyuan Zhang, Li Tong
