Evolution of Sequence-based Bioinformatics Tools for Protein-protein Interaction Prediction

2020 ◽  
Vol 21 (6) ◽  
pp. 454-463 ◽  
Author(s):  
Mst. Shamima Khatun ◽  
Watshara Shoombuatong ◽  
Md. Mehedi Hasan ◽  
Hiroyuki Kurata

Protein-protein interactions (PPIs) are the physical connections between two or more proteins via electrostatic forces or hydrophobic effects. Identifying PPIs is pivotal to understanding many biological processes, including protein function, disease incidence, and therapy design. The experimental identification of PPIs via high-throughput technology is time-consuming and expensive. Bioinformatics approaches are expected to overcome these restrictions. In this review, our main goal is to provide an inclusive view of the existing sequence-based computational prediction of PPIs. First, we briefly introduce the currently available PPI databases and then review the state-of-the-art bioinformatics approaches, their working principles, and their performance. Finally, we discuss the caveats and future perspectives of next-generation algorithms for the prediction of PPIs.

2006 ◽  
Vol 11 (7) ◽  
pp. 854-863 ◽  
Author(s):  
Maxwell D. Cummings ◽  
Michael A. Farnum ◽  
Marina I. Nelen

The genomics revolution has unveiled a wealth of poorly characterized proteins. Scientists are often able to produce milligram quantities of proteins for which function is unknown or hypothetical, based only on very distant sequence homology. Broadly applicable tools for functional characterization are essential to the illumination of these orphan proteins. An additional challenge is the direct detection of inhibitors of protein-protein interactions (and allosteric effectors). Both of these research problems are relevant to, among other things, the challenge of finding and validating new protein targets for drug action. Screening collections of small molecules has long been used in the pharmaceutical industry as one method of discovering drug leads. Screening in this context typically involves a function-based assay. Given a sufficient quantity of a protein of interest, significant effort may still be required for functional characterization, assay development, and assay configuration for screening. Increasingly, techniques are being reported that facilitate screening for specific ligands for a protein of unknown function. Such techniques also allow for function-independent screening with better characterized proteins. ThermoFluor®, a screening instrument based on monitoring ligand effects on temperature-dependent protein unfolding, can be applied when protein function is unknown. This technology has proven useful in the decryption of an essential bacterial enzyme and in the discovery of a series of inhibitors of a cancer-related protein-protein interaction. The authors review some of the tools relevant to these research problems in drug discovery and describe their experiences with two different proteins.


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 1919 ◽  
Author(s):  
Madhavi K. Ganapathiraju

After the first reported case of Zika virus in Brazil, in 2015, a significant increase in the reported cases of microcephaly was observed. Microcephaly is a neurological condition in which the infant’s head is significantly smaller, with complications in brain development. Recently, two small membrane-associated interferon-inducible transmembrane proteins (IFITM1 and IFITM3) have been shown to repress members of the Flaviviridae family, which includes the Zika virus. However, the exact mechanisms leading to the inhibition of the virus are as yet unknown. Here, we assembled an interactome of IFITM1 and IFITM3 with known protein-protein interactions (PPIs) collected from publicly available databases and novel PPIs predicted using the High-confidence Protein-Protein Interaction Prediction (HiPPIP) model. We analyzed the functional and pathway associations of the interacting proteins and found that several immunity pathways (interferon signaling, CD28 signaling in T-helper cells, crosstalk between dendritic cells and natural killer cells), neuronal pathways (axonal guidance signaling, neural tube closure and actin cytoskeleton signaling) and developmental pathways are associated with these interactors. These results could help direct future research in elucidating the mechanisms underlying the viral immunity to Zika virus and other flaviviruses.


2021 ◽  
Author(s):  
Joseph Szymborski ◽  
Amin Emad

Motivation: Computational methods for the prediction of protein-protein interactions, while important tools for researchers, are plagued by challenges in generalising to unseen proteins. Datasets used for modelling protein-protein predictions are particularly predisposed to information leakage and sampling biases. Results: In this study, we introduce RAPPPID, a method for the Regularised Automatic Prediction of Protein-Protein Interactions using Deep Learning. RAPPPID is a twin AWD-LSTM network which employs multiple regularisation methods during training time to learn generalised weights. Testing on stringent interaction datasets composed of proteins not seen during training, RAPPPID outperforms state-of-the-art methods. Further experiments show that RAPPPID's performance holds regardless of the particular proteins in the testing set and its performance is higher for biologically supported edges. This study serves to demonstrate that appropriate regularisation is an important component of overcoming the challenges of creating models for protein-protein interaction prediction that generalise to unseen proteins. Availability and Implementation: Code and datasets are freely available at https://github.com/jszym/rapppid. Contact: [email protected] Supplementary Information: Online-only supplementary data is available at the journal's website.
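The idea of testing on "proteins not seen during training" can be illustrated with a small sketch. This is a generic protein-disjoint split, not the procedure used by RAPPPID; the function name `protein_disjoint_split`, its parameters, and the protein identifiers are hypothetical and chosen only for illustration. Pairs that mix a training protein with a held-out protein are discarded, so that no held-out protein leaks into training.

```python
import random


def protein_disjoint_split(pairs, test_fraction=0.2, seed=0):
    """Split interaction pairs so that test proteins never appear in training.

    Illustrative sketch only: proteins (not pairs) are sampled for the test
    set, and pairs that straddle the two sets are dropped.
    """
    proteins = sorted({p for pair in pairs for p in pair})
    rng = random.Random(seed)
    rng.shuffle(proteins)
    n_test = max(1, int(len(proteins) * test_fraction))
    test_proteins = set(proteins[:n_test])

    # Training pairs contain no held-out protein at all.
    train = [(a, b) for a, b in pairs
             if a not in test_proteins and b not in test_proteins]
    # Test pairs consist only of held-out proteins.
    test = [(a, b) for a, b in pairs
            if a in test_proteins and b in test_proteins]
    return train, test, test_proteins


# Hypothetical protein identifiers, for demonstration only.
pairs = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"),
         ("P4", "P5"), ("P5", "P6"), ("P1", "P6")]
train, test, test_proteins = protein_disjoint_split(pairs, test_fraction=0.34)
```

Dropping the straddling pairs shrinks the dataset, which is the price of a split in which the model cannot rely on having seen either interaction partner before.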


Molecules ◽  
2021 ◽  
Vol 27 (1) ◽  
pp. 41
Author(s):  
Brandan Dunham ◽  
Madhavi K. Ganapathiraju

Protein–protein interactions (PPIs) perform various functions and regulate processes throughout cells. Knowledge of the full network of PPIs is vital to biomedical research, but most of the PPIs are still unknown. As it is infeasible to discover all of them experimentally due to technical and resource limitations, computational prediction of PPIs is essential, and accurately assessing the performance of algorithms is required before further application or translation. However, many published methods compose their evaluation datasets incorrectly, using a higher proportion of positive class data than occurs naturally, leading to exaggerated performance. We re-implemented various published algorithms and evaluated them on datasets with realistic data compositions and found that their performance is overstated in original publications, with several methods outperformed by our control models built on ‘illogical’ and random number features. We conclude that these methods are influenced by an over-characterization of some proteins in the literature and by the scale-free nature of the PPI network, and that they fail when tested on all possible protein pairs. Additionally, we found that sequence-only-based algorithms performed worse than those that employ functional and expression features. We present a benchmark evaluation of many published algorithms for PPI prediction. The source code of our implementations and the benchmark datasets created here are made available in open source.
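Why an inflated share of positives exaggerates performance can be seen from a short calculation. The sketch below assumes a classifier with a fixed true-positive rate and false-positive rate and computes the precision expected at different positive-class fractions. The rates (0.9 and 0.1) and the prevalence of 0.001 are illustrative assumptions, not figures from the paper, and `precision_at_prevalence` is a hypothetical helper name.

```python
def precision_at_prevalence(tpr: float, fpr: float, prevalence: float) -> float:
    """Expected precision of a classifier with fixed TPR and FPR when a
    fraction `prevalence` of all evaluated pairs is truly positive."""
    true_pos = tpr * prevalence            # expected share of true positives
    false_pos = fpr * (1.0 - prevalence)   # expected share of false positives
    if true_pos + false_pos == 0.0:
        return 0.0
    return true_pos / (true_pos + false_pos)


# Same classifier, two evaluation sets (all numbers are assumptions).
balanced = precision_at_prevalence(tpr=0.9, fpr=0.1, prevalence=0.5)
realistic = precision_at_prevalence(tpr=0.9, fpr=0.1, prevalence=0.001)
```

Under these assumed rates, precision is 0.9 on the balanced set but falls below 0.01 when only one pair in a thousand is a true interaction, even though the classifier itself is unchanged.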


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Thi Ngan Dong ◽  
Graham Brogden ◽  
Gisa Gerold ◽  
Megha Khosla

Background: Viral infections are causing significant morbidity and mortality worldwide. Understanding the interaction patterns between a particular virus and human proteins plays a crucial role in unveiling the underlying mechanism of viral infection and pathogenesis. This could further help in prevention and treatment of virus-related diseases. However, the task of predicting protein–protein interactions between a new virus and human cells is extremely challenging due to scarce data on virus-human interactions and fast mutation rates of most viruses. Results: We developed a multitask transfer learning approach that exploits the information of around 24 million protein sequences and the interaction patterns from the human interactome to counter the problem of small training datasets. Instead of using hand-crafted protein features, we utilize statistically rich protein representations learned by a deep language modeling approach from a massive source of protein sequences. Additionally, we employ a second objective which aims to maximize the probability of observing human protein–protein interactions. This additional task objective acts as a regularizer and also allows domain knowledge to be incorporated into the virus-human protein–protein interaction prediction model. Conclusions: Our approach achieved competitive results on 13 benchmark datasets and the case study for the SARS-CoV-2 virus receptor. Experimental results show that our proposed model works effectively for both virus-human and bacteria-human protein–protein interaction prediction tasks. We share our code for reproducibility and future research at https://git.l3s.uni-hannover.de/dong/multitask-transfer.


2018 ◽  
Vol 31 (9) ◽  
pp. 899-902 ◽  
Author(s):  
Cleverson Carlos Matiolli ◽  
Maeli Melotto

Yeast-two-hybrid (Y2H) cDNA library screening is a valuable tool to uncover protein-protein interactions and represents a widely used method to investigate protein function. However, low transcript representation in cDNA libraries limits the depth of the screening. We have developed a Y2H library with cDNA made from Arabidopsis leaves exposed to several stressors as well as untreated leaves. The library was built using pooled mRNA extracted from plants challenged with plant and human bacterial pathogens, the flg22 elicitor, the phytotoxin coronatine, and several hormones associated with environmental stress responses. The purpose of such a library is to maximize the discovery of protein-protein interactions that occur under optimum conditions as well as during biotic and abiotic stresses.


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Shengchen Wang ◽  
Faying Zhang ◽  
Meng Mei ◽  
Ting Wang ◽  
Yueli Yun ◽  
...  

Characterizing protein–protein interactions (PPIs) is an effective method to help explore protein function. Here, by integrating a newly identified split human rhinovirus 3C (HRV 3C) protease, super-folder GFP (sfGFP), and the ClpXP-SsrA protein degradation machinery, we developed a fluorescence-assisted single-cell methodology (split protease-E. coli ClpXP (SPEC)) to explore protein–protein interactions for both eukaryotic and prokaryotic species in E. coli cells. We first identified a highly efficient split HRV 3C protease with high re-assembly ability and then incorporated it into the SPEC method. The SPEC method could convert the cellular protein–protein interaction to quantitative fluorescence signals through a split HRV 3C protease-mediated proteolytic reaction with high efficiency and broad temperature adaptability. Using the SPEC method, we explored the interactions among effectors of representative type I-E and I-F CRISPR/Cas complexes, which, combined with subsequent studies of Cas3 mutations, provided further understanding of the functions and structures of CRISPR/Cas complexes.


2019 ◽  
Author(s):  
Hassan Kané ◽  
Mohamed Coulibali ◽  
Ali Abdalla ◽  
Pelkins Ajanoh

Computational methods that infer the function of proteins are key to understanding life at the molecular level. In recent years, representation learning has emerged as a powerful paradigm to discover new patterns among entities as varied as images, words, speech, and molecules. In typical representation learning, there is only one source of data or one level of abstraction at which the learned representation occurs. However, proteins can be described by their primary, secondary, tertiary, and quaternary structure or even as nodes in protein-protein interaction networks. Given that protein function is an emergent property of all these levels of interactions, in this work we learn joint representations from both amino acid sequence and multilayer networks representing tissue-specific protein-protein interactions. Using these hybrid representations, we show that simple machine learning models outperform existing network-based methods on the task of tissue-specific protein function prediction on 13 out of 13 tissues. Furthermore, these representations outperform existing ones by 14% on average.


2005 ◽  
Vol 34 (2) ◽  
pp. 263-280 ◽  
Author(s):  
Arnaud Droit ◽  
Guy G Poirier ◽  
Joanna M Hunter

An ambitious goal of proteomics is to elucidate the structure, interactions and functions of all proteins within cells and organisms. One strategy to determine protein function is to identify the protein–protein interactions. The increasing use of high-throughput and large-scale bioinformatics-based studies has generated a massive amount of data stored in a number of different databases. A challenge for bioinformatics is to explore this disparate data and to uncover biologically relevant interactions and pathways. In parallel, there is clearly a need for the development of approaches that can predict novel protein–protein interaction networks in silico. Here, we present an overview of different experimental and bioinformatic methods to elucidate protein–protein interactions.


2016 ◽  
Author(s):  
Claudio Mirabello ◽  
Björn Wallner

Protein-protein interactions (PPIs) are crucial for protein function. There exist many techniques to identify PPIs experimentally, but determining the interactions in molecular detail is still difficult and very time-consuming. The fact that the number of PPIs is vastly larger than the number of individual proteins makes it practically impossible to characterize all interactions experimentally. Computational approaches that can bridge this gap and predict PPIs and model the interactions in molecular detail are greatly needed. Here we present InterPred, a fully automated pipeline that predicts and models PPIs from sequence using structural modelling combined with massive structural comparisons and molecular docking. A key component of the method is the use of a novel random forest classifier that integrates several structural features to distinguish correct from incorrect protein-protein interaction models. We show that InterPred represents a major improvement in protein-protein interaction detection, with a performance comparable to or better than experimental high-throughput techniques. We also show that our full-atom protein-protein complex modelling pipeline performs better than state-of-the-art protein docking methods on a standard benchmark set. In addition, InterPred was also one of the top predictors in the latest CAPRI37 experiment. InterPred source code can be downloaded from http://wallnerlab.org/InterPred

