Prediction of protein-protein interactions using one-class classification methods and integrating diverse biological data

2007 ◽  
Vol 4 (3) ◽  
pp. 208-223 ◽  
Author(s):  
José A. Reyes ◽  
David Gilbert

Summary: This research addresses the problem of predicting protein-protein interactions (PPI) when integrating diverse kinds of biological information. This task has commonly been viewed as a binary classification problem (whether any two proteins do or do not interact), and several different machine learning techniques have been employed to solve it. However, the nature of the data creates two major problems that can affect results. First, the classes are imbalanced, because the number of positive examples (pairs of proteins which really interact) is much smaller than the number of negative ones. Second, the selection of negative examples can rest on unreliable assumptions, which could introduce bias into the classification results. Here we propose the use of one-class classification (OCC) methods for the task of PPI prediction. OCC methods utilise examples of just one class to generate a predictive model, which is consequently independent of the kind of negative examples selected; additionally, these approaches are known to cope with imbalanced class problems. We have designed and carried out a performance evaluation study of several OCC methods for this task, and have found that the Parzen density estimation approach outperforms the rest. We also undertook a comparative performance evaluation between the Parzen OCC method and several conventional learning techniques, considering different scenarios, for example varying the number of negative examples used for training. We found that the Parzen OCC method in general performs competitively with traditional approaches and in many situations outperforms them. Finally, we evaluated the ability of the Parzen OCC approach to predict new potential PPI targets, and validated these results by searching for biological evidence in the literature.
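As a rough illustration of the Parzen density estimation idea named in the summary above, the following minimal sketch fits a one-class model to positive examples only and accepts a new point when its estimated density reaches a threshold. It is not the authors' implementation: the Gaussian kernel, the bandwidth `h`, the `reject_fraction` rule for choosing the threshold, and the two-dimensional synthetic "feature vectors" are all assumptions made for this example.

```python
import math
import random


def parzen_density(x, train, h):
    """Parzen estimate at x: mean of isotropic Gaussian kernels (width h)
    centred on every training point."""
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (-d / 2.0)
    total = 0.0
    for t in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return norm * total / len(train)


def fit_threshold(train, h, reject_fraction=0.05):
    """Choose the density threshold so that roughly `reject_fraction` of the
    training positives fall below it (an assumed, commonly used heuristic)."""
    scores = sorted(parzen_density(x, train, h) for x in train)
    k = int(reject_fraction * len(scores))
    return scores[k]


def predict(x, train, h, threshold):
    """True means 'looks like the positive class' (here: a plausible interaction)."""
    return parzen_density(x, train, h) >= threshold


# Hypothetical data: 200 synthetic positive examples in a 2-D feature space.
random.seed(0)
positives = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(200)]

h = 0.5  # assumed bandwidth; in practice it would be tuned
threshold = fit_threshold(positives, h)

near = predict((0.0, 0.0), positives, h, threshold)  # inside the dense region
far = predict((8.0, 8.0), positives, h, threshold)   # far from every positive
accepted = sum(predict(x, positives, h, threshold) for x in positives)
```

No negative examples are used at any point, which is the property the summary highlights: the model cannot inherit bias from how negatives were chosen, although the bandwidth and threshold still have to be selected with care.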

2017 ◽  
Author(s):  
Khalid Raza

Abstract: The long-awaited challenge of the post-genomic era and systems biology research is the computational prediction of protein-protein interactions (PPIs), which ultimately leads to the prediction of protein function. The important research questions are how protein complexes with known sequence and structure can be used to identify and classify protein binding sites, and how to infer knowledge from these classifications, such as predicting PPIs of proteins with unknown sequence and structure. Several machine learning techniques have been applied to the prediction of PPIs, but the accuracy of their predictions depends wholly on the number of features used for training. In this paper, we survey the protein features used for the prediction of PPIs. Open research challenges and opportunities in the area are also discussed.


2019 ◽  
Vol 19 (4) ◽  
pp. 232-241 ◽  
Author(s):  
Xuegong Chen ◽  
Wanwan Shi ◽  
Lei Deng

Background: Accumulating experimental studies have indicated that, compared with patients who have a single disease, disease comorbidity causes additional pain to patients and leads to the failure of standard treatments. Therefore, accurate prediction of potential comorbidity is essential for designing more efficient treatment strategies. However, only a few disease comorbidities have been discovered in the clinic. Objective: In this work, we propose PCHS, an effective computational method for predicting disease comorbidity. Materials and Methods: We utilized the HeteSim measure to calculate the relatedness score for different disease pairs in a global heterogeneous network, which integrates six networks based on biological information, including disease-disease associations, drug-drug interactions, protein-protein interactions and the associations among them. We built the prediction model using a Support Vector Machine (SVM) based on the HeteSim scores. Results and Conclusion: The results showed that PCHS performed significantly better than previous state-of-the-art approaches and achieved an AUC score of 0.90 in 10-fold cross-validation. Furthermore, some of our predictions have been verified in the literature, indicating the effectiveness of our method.
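The evaluation protocol mentioned above (an AUC score obtained under k-fold cross-validation) can be sketched as follows. This is only an illustration of the metric and the fold split, not the PCHS pipeline: the helper names `auc` and `k_fold_indices` are hypothetical, the labels and scores are made-up toy values, and no SVM or HeteSim computation is included.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of (positive, negative)
    pairs in which the positive example gets the higher score; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k disjoint test folds (simple interleaved split).
    Each fold serves once as the test set; the remaining indices form the training set."""
    return [list(range(i, n, k)) for i in range(k)]


# Toy example with invented values: 1 = related pair, 0 = unrelated pair.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc(toy_labels, toy_scores)  # 8 of the 9 positive/negative pairs are ordered correctly

folds = k_fold_indices(10, 5)
```

In a cross-validated evaluation, a model would be trained on the indices outside each fold, scored on the fold itself, and the per-fold AUC values (or the pooled scores) summarised into a single figure.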


F1000Research ◽  
2017 ◽  
Vol 6 ◽  
pp. 2012 ◽  
Author(s):  
Hashem Koohy

In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades.


2019 ◽  
Vol 20 (3) ◽  
pp. 177-184 ◽  
Author(s):  
Nantao Zheng ◽  
Kairou Wang ◽  
Weihua Zhan ◽  
Lei Deng

Background: Targeting critical virus-host Protein-Protein Interactions (PPIs) has enormous application prospects for therapeutics. Using experimental methods to evaluate all possible virus-host PPIs is labor-intensive and time-consuming. Recent growth in the computational identification of virus-host PPIs provides new opportunities for gaining biological insights, including applications in disease control. We provide an overview of recent computational approaches for studying virus-host PPIs. Methods: In this review, a variety of computational methods for virus-host PPI prediction are surveyed. These methods are categorized by the features they utilize and by the machine learning algorithms they employ, including classical and novel methods. Results: We describe the pivotal and representative features extracted from relevant sources of biological data, which mainly include sequence signatures, known domain interactions, protein motifs and protein structure information. We focus on state-of-the-art machine learning algorithms that are used to build binary prediction models for the classification of virus-host protein pairs, and discuss their abilities, weaknesses and future directions. Conclusion: The findings of this review confirm the importance of computational methods for finding potential protein-protein interactions between virus and host. Although there has been significant progress in the prediction of virus-host PPIs in recent years, there is still considerable room for improvement.


Author(s):  
Byung-Hoon Park ◽  
Phuongan Dam ◽  
Chongle Pan ◽  
Ying Xu ◽  
Al Geist ◽  
...  

Protein-protein interactions are fundamental to cellular processes. They are responsible for phenomena such as DNA replication, gene transcription, protein translation, regulation of metabolic pathways, immunologic recognition and signal transduction. The identification of interacting proteins is therefore an important prerequisite for understanding their physiological functions. Because of their importance to various biophysical activities, reliable computational methods to infer protein-protein interactions from either structures or genome sequences have recently been in heavy demand. Successful predictions, for instance, will facilitate the drug design process and the reconstruction of metabolic or regulatory networks. In this chapter, we review: (a) high-throughput experimental methods for identification of protein-protein interactions, (b) existing databases of protein-protein interactions, (c) computational approaches to predicting protein-protein interactions at both residue and protein levels, (d) various statistical and machine learning techniques to model protein-protein interactions, and (e) applications of protein-protein interactions in predicting protein functions. We also discuss intrinsic drawbacks of the existing approaches and future research directions.


2019 ◽  
Vol 26 (8) ◽  
pp. 601-619 ◽  
Author(s):  
Amit Sagar ◽  
Bin Xue

The interactions between RNAs and proteins play critical roles in many biological processes. Therefore, characterizing these interactions is critical for mechanistic, biomedical, and clinical studies. Many experimental methods can be used to determine various aspects of RNA-protein interactions. However, because RNA-protein interactions are tissue-specific and condition-specific, and because these interactions are weak and frequently compete with each other, those experimental techniques cannot be fully exploited to discover the complete spectrum of RNA-protein interactions. To mitigate these issues, continuous efforts have been devoted to developing high-quality computational techniques to study the interactions between RNAs and proteins. Much important progress has been achieved through the application of novel techniques and strategies, such as machine learning. In particular, with the development and application of CLIP techniques, more and more experimental data on RNA-protein interactions under specific biological conditions are becoming available. Together, these CLIP data provide a rich source for developing advanced machine learning predictors. In this review, recent progress on computational predictors for RNA-protein interactions is summarized in terms of the following aspects: datasets, prediction strategies, and input features. Possible future developments are also discussed at the end of the review.


Parasitology ◽  
2012 ◽  
Vol 139 (9) ◽  
pp. 1103-1118 ◽  
Author(s):  
J. M. WASTLING ◽  
S. D. ARMSTRONG ◽  
R. KRISHNA ◽  
D. XIA

SUMMARY: Systems biology aims to integrate multiple biological data types such as genomics, transcriptomics and proteomics across different levels of structure and scale; it represents an emerging paradigm in the scientific process which challenges the reductionism that has dominated biomedical research for hundreds of years. Systems biology will nevertheless only be successful if the technologies on which it is based are able to deliver the required type and quality of data. In this review we discuss how well positioned proteomics is to deliver the data necessary to support meaningful systems modelling in parasite biology. We summarise the current state of identification proteomics in parasites, but argue that a new generation of quantitative proteomics data is now needed to underpin effective systems modelling. We discuss the challenges faced in acquiring more complete knowledge of protein post-translational modifications, protein turnover and protein-protein interactions in parasites. Finally, we highlight the central role of proteome-informatics in ensuring that proteomics data are readily accessible to the user community and can be translated and integrated with other relevant data types.

