DeepPhospho accelerates DIA phosphoproteome profiling through in silico library generation

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ronghui Lou ◽  
Weizhen Liu ◽  
Rongjie Li ◽  
Shanshan Li ◽  
Xuming He ◽  
...  

Abstract Phosphoproteomics integrating data-independent acquisition (DIA) enables deep phosphoproteome profiling with improved quantification reproducibility and accuracy compared to data-dependent acquisition (DDA)-based phosphoproteomics. DIA data mining heavily relies on a spectral library that in most cases is built on DDA analysis of the same sample. Construction of this project-specific DDA library impairs the analytical throughput, limits the proteome coverage, and increases the sample size for DIA phosphoproteomics. Herein we introduce a deep neural network, DeepPhospho, which conceptually differs from previous deep learning models to achieve accurate predictions of LC-MS/MS data for phosphopeptides. By leveraging in silico libraries generated by DeepPhospho, we establish a DIA workflow for phosphoproteome profiling which involves DIA data acquisition and data mining with DeepPhospho predicted libraries, thus circumventing the need for DDA library construction. Our DeepPhospho-empowered workflow substantially expands the phosphoproteome coverage while maintaining high quantification performance, which leads to the discovery of more signaling pathways and regulated kinases in an EGF signaling study than the DDA library-based approach. DeepPhospho is provided as a web server as well as an offline app to facilitate user access to model training, predictions and library generation.

2021 ◽  
Author(s):  
Wenqing Shui ◽  
Ronghui Lou ◽  
Weizhen Liu ◽  
Rongjie Li ◽  
Shanshan Li ◽  
...  

Abstract Phosphoproteomics integrating data-independent acquisition (DIA) has enabled deep phosphoproteome profiling with improved quantification reproducibility and accuracy compared to data-dependent acquisition (DDA)-based phosphoproteomics. DIA data mining heavily relies on a spectral library that in most cases is built on DDA analysis of the same sample. Construction of this project-specific DDA library impairs the analytical throughput, limits the proteome coverage, and increases the sample size for DIA phosphoproteomics. Herein we introduce a novel deep neural network, DeepPhospho, which conceptually differs from previous deep learning models to achieve accurate predictions of LC-MS/MS data for phosphopeptides. By leveraging in silico libraries generated by DeepPhospho, we established a new DIA workflow for phosphoproteome profiling which involves DIA data acquisition and data mining with DeepPhospho predicted libraries, thus circumventing the need for DDA library construction. Our DeepPhospho-empowered workflow substantially expanded the phosphoproteome coverage while maintaining high quantification performance, which led to the discovery of more signaling pathways and regulated kinases in an EGF signaling study than the DDA library-based approach. DeepPhospho is provided as a web server to facilitate user access to predictions and library generation.


2020 ◽  
Author(s):  
Weigang Ge ◽  
Xiao Liang ◽  
Fangfei Zhang ◽  
Luang Xu ◽  
Nan Xiang ◽  
...  

Abstract Efficient peptide and protein identification from data-independent acquisition mass spectrometric (DIA-MS) data typically relies on an experiment-specific spectral library of a suitable size. Here, we report a computational strategy for optimizing the spectral library for a specific DIA dataset based on a comprehensive spectral library, which is accomplished by a priori analysis of the DIA dataset. This strategy achieved up to a 44.7% increase in peptide identification and a 38.1% increase in protein identification in the test dataset of six colorectal tumor samples compared with the comprehensive pan-human library strategy. We further applied this strategy to 389 carcinoma samples from 15 tumor datasets and observed up to a 39.2% increase in peptide identification and a 19.0% increase in protein identification. In summary, we present a computational strategy for spectral library size optimization to achieve deeper proteome coverage of DIA-MS data.


Author(s):  
Asad Ali Siyal ◽  
Eric Sheng-Wen Chen ◽  
Hsin-Ju Chan ◽  
Reta Birhanu Kitata ◽  
Jhih-Ci Yang ◽  
...  

Author(s):  
Darren R Allen ◽  
Christopher Warnholtz ◽  
Brett C McWhinney

Abstract An interference resulting in the false-positive detection of the synthetic cathinone 4-MePPP in urine was suspected following the recent addition of 4-MePPP spectral data to an LC-QTOF-MS drug library. Although positive detection criteria were achieved, it was noted that all urine samples suspected of containing 4-MePPP also concurrently contained high levels of tramadol and its associated metabolites. Using QTOF-MS software elucidation tools, candidate compounds for the suspected interference were proposed. To provide further confidence in the identity of the interference, in silico fragmentation tools were used to match product ions generated in the analysis with product ions predicted from the theoretical fragmentation of candidate compounds. The ability of the suspected interference to subsequently produce the required product ions for spectral library identification of 4-MePPP was also tested. This information was used to provide a high preliminary confidence in the compound identity prior to purchase and subsequent confirmation with certified reference material. A co-eluting isobaric interference was identified and confirmed as an in-source fragment of the tramadol metabolite, N,N-bisdesmethyltramadol. Proposed resolutions for this interference are also described and subsequently validated by retrospective interrogation of previous cases of suspected interference.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Stevan D. Stojanović ◽  
Maximilian Fuchs ◽  
Chunguang Liang ◽  
Kevin Schmidt ◽  
Ke Xiao ◽  
...  

Abstract The family of RNA-binding proteins (RBP) functions as a crucial regulator of multiple biological processes and diseases. However, RBP function in the clinical setting of idiopathic pulmonary fibrosis (IPF) is still unknown. We developed a practical in silico screening approach for the characterization of RBPs using multi-source data and comparative molecular network bioinformatics followed by wet-lab validation studies. Data mining of bulk RNA-Sequencing data of tissues of patients with IPF identified Quaking (QKI) as a significantly downregulated RBP. Cell-type specific expression was confirmed by single-cell RNA-Sequencing analysis of IPF patient data. We systematically analyzed the molecular interaction network around QKI and its functional interplay with microRNAs (miRs) in human lung fibroblasts and discovered a novel regulatory miR-506-QKI axis contributing to the pathogenesis of IPF. The in silico results were validated by in-house experiments applying model systems of miR and lung biology. This study supports an understanding of the intrinsic molecular mechanisms of IPF regulated by the miR-506-QKI axis. Initially applied to human lung disease, the integrative in silico data mining approach presented here can be adapted to other disease entities, underlining its practical relevance in RBP research.


2021 ◽  
Vol 11 (15) ◽  
pp. 7050
Author(s):  
Zeeshan Ahmad ◽  
Adnan Shahid Khan ◽  
Kashif Nisar ◽  
Iram Haider ◽  
Rosilah Hassan ◽  
...  

The revolutionary idea of the internet of things (IoT) architecture has gained enormous popularity over the last decade, resulting in exponential growth in IoT networks, connected devices, and the data processed therein. Since IoT devices generate and exchange sensitive data over the traditional internet, security has become a prime concern due to the emergence of zero-day cyberattacks. A network-based intrusion detection system (NIDS) can provide the much-needed efficient security solution to the IoT network by protecting the network entry points through constant network traffic monitoring. Recent NIDS have a high false alarm rate (FAR) in detecting anomalies, including novel and zero-day anomalies. This paper proposes an efficient anomaly detection mechanism using mutual information (MI), considering a deep neural network (DNN) for an IoT network. A comparative analysis of different deep-learning models, such as DNN, Convolutional Neural Network, Recurrent Neural Network, and its variants Gated Recurrent Unit and Long Short-Term Memory, is performed on the IoT-Botnet 2020 dataset. Experimental results show an improvement of 0.57–2.6% in model accuracy and a reduction in FAR of 0.23–7.98%, demonstrating the effectiveness of the DNN-based NIDS model compared to well-known deep learning models. Using only the 16–35 best numerical features selected by MI, instead of all 80 features of the dataset, resulted in almost negligible degradation in model performance while reducing overall model complexity. In addition, the detection accuracy of the DL-based models improved further by almost 0.99–3.45% when only the top five categorical and numerical features were considered.
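
The feature-selection step described above can be illustrated with a minimal sketch. It assumes features that are already discrete (or binned), which the abstract does not state, and the names `mutual_information` and `top_k_features` are hypothetical helpers rather than anything from the paper. It scores each feature by its mutual information with the label and keeps the k highest-scoring ones.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # I(X;Y) = sum over (x, y) of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest MI against the labels."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (made up for illustration): "a" mirrors the label, "b" is independent of it.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1], "c": [0, 0, 0, 1]}
selected = top_k_features(columns, labels, k=2)
```

Continuous traffic features would need binning or a dedicated MI estimator before a scheme like this applies.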


Talanta ◽  
2021 ◽  
pp. 122740
Author(s):  
Annagiulia Di Trana ◽  
Pietro Brunetti ◽  
Raffaele Giorgetti ◽  
Enrico Marinelli ◽  
Simona Zaami ◽  
...  

2014 ◽  
Vol 687-691 ◽  
pp. 1592-1595
Author(s):  
Yun Peng Duan ◽  
Chun Xi Zhao ◽  
Ying Shi

With the wide application of the WWW and the emergence of Web technologies, data mining research has entered a new stage. Web log mining applies the ideas of data mining to the analysis of server logs. This paper addresses the early stage of data mining by proposing a log data preprocessing method whose purpose is to divide server logs into multiple unique user access sequences, and it presents an algorithm for doing so.
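
As a rough illustration of dividing server logs into per-user access sequences, the sketch below groups hits by a user key and starts a new sequence whenever the gap between consecutive hits exceeds a timeout. The 30-minute timeout, the `(user_key, timestamp, url)` record shape, and the function name `split_sessions` are all assumptions made for this example; the abstract does not describe the paper's actual algorithm.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed threshold, not taken from the paper


def split_sessions(entries, timeout=SESSION_TIMEOUT):
    """Group (user_key, unix_timestamp, url) records into per-user access sequences."""
    by_user = defaultdict(list)
    for user, ts, url in entries:
        by_user[user].append((ts, url))

    sessions = {}
    for user, hits in by_user.items():
        hits.sort()  # order each user's hits by timestamp
        user_sessions = []
        last_ts = None
        for ts, url in hits:
            # Start a new sequence on the first hit or after a long gap.
            if last_ts is None or ts - last_ts > timeout:
                user_sessions.append([])
            user_sessions[-1].append(url)
            last_ts = ts
        sessions[user] = user_sessions
    return sessions


# Made-up log records for illustration.
log_entries = [("a", 0, "/"), ("b", 50, "/"), ("a", 100, "/x"), ("a", 5000, "/y")]
result = split_sessions(log_entries)
```

In practice the user key is often derived from fields such as the client address and user agent, which is itself a heuristic.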


2021 ◽  
Vol 6 ◽  
pp. 309
Author(s):  
Paul Mwaniki ◽  
Timothy Kamanu ◽  
Samuel Akech ◽  
M. J. C Eijkemans

Introduction: Epidemiological studies that involve interpretation of chest radiographs (CXRs) suffer from inter-reader and intra-reader variability. Inter-reader and intra-reader variability hinder comparison of results from different studies or centres, which negatively affects efforts to track the burden of chest diseases or evaluate the efficacy of interventions such as vaccines. This study explores machine learning models that could standardize interpretation of CXR across studies and the utility of incorporating individual reader annotations when training models using CXR data sets annotated by multiple readers. Methods: Convolutional neural networks were used to classify CXRs from seven low to middle-income countries into five categories according to the World Health Organization's standardized methodology for interpreting paediatric CXRs. We compared models trained to predict the final/aggregate classification with models trained to predict how each reader would classify an image and then aggregate predictions for all readers using unweighted mean. Results: Incorporating individual reader's annotations during model training improved classification accuracy by 3.4% (multi-class accuracy 61% vs 59%). Model accuracy was higher for children above 12 months of age (68% vs 58%). The accuracy of the models in different countries ranged between 45% and 71%. Conclusions: Machine learning models can annotate CXRs in epidemiological studies reducing inter-reader and intra-reader variability. In addition, incorporating individual reader annotations can improve the performance of machine learning models trained using CXRs annotated by multiple readers.
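
The aggregation step mentioned in the Methods (predict how each reader would classify an image, then combine with an unweighted mean) can be sketched as follows. The function name `aggregate_reader_predictions` and the example probabilities are hypothetical; this shows only the averaging, not the authors' model.

```python
from statistics import fmean


def aggregate_reader_predictions(per_reader_probs):
    """Average per-reader class probabilities with equal weight and pick the top class.

    per_reader_probs: one list of class probabilities per reader, all the same length.
    Returns (mean_probs, index_of_best_class).
    """
    n_classes = len(per_reader_probs[0])
    mean_probs = [
        fmean(reader[c] for reader in per_reader_probs) for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=mean_probs.__getitem__)
    return mean_probs, best_class


# Made-up outputs for three readers over five categories.
reader_probs = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.1, 0.1, 0.1],
    [0.5, 0.2, 0.1, 0.1, 0.1],
]
mean_probs, best_class = aggregate_reader_predictions(reader_probs)
```

An unweighted mean treats every reader as equally reliable; a weighted variant would be a different design choice from the one the abstract describes.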


2021 ◽  
Author(s):  
Noor Ahmad ◽  
Muhammad Aminu ◽  
Mohd Halim Mohd Noor

Deep learning approaches have attracted a lot of attention in the automatic detection of Covid-19, and transfer learning is the most common approach. However, the majority of pre-trained models are trained on color images, which can cause inefficiencies when fine-tuning the models on Covid-19 images, which are often grayscale. To address this issue, we propose a deep learning architecture called CovidNet, which requires a relatively small number of parameters. CovidNet accepts grayscale images as inputs and is suitable for training with a limited training dataset. Experimental results show that CovidNet outperforms other state-of-the-art deep learning models for Covid-19 detection.

