An integration of deep learning with feature embedding for protein–protein interaction prediction

PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7126 ◽  
Author(s):  
Yu Yao ◽  
Xiuquan Du ◽  
Yanyu Diao ◽  
Huaixu Zhu

Protein–protein interactions are closely related to protein function and drug discovery. Hence, accurately identifying protein–protein interactions will help us to understand the underlying molecular mechanisms and significantly facilitate drug discovery. However, the majority of existing computational methods for protein–protein interaction prediction focus on feature extraction and feature combination, and state-of-the-art models have delivered limited gains. In this work, a new residue representation method named Res2vec is designed for protein sequence representation. Residue representations obtained by Res2vec describe residue-residue interactions from raw sequence more precisely and supply more effective inputs for the downstream deep learning model. Combining effective feature embedding with powerful deep learning techniques, our method provides a general computational pipeline to infer protein–protein interactions, even when protein structure knowledge is entirely unknown. The proposed method, DeepFE-PPI, is evaluated on the S. cerevisiae and human datasets. The experimental results show that DeepFE-PPI achieves 94.78% (accuracy), 92.99% (recall), 96.45% (precision) and 89.62% (Matthews correlation coefficient, MCC) on the S. cerevisiae dataset, and 98.71% (accuracy), 98.54% (recall), 98.77% (precision) and 97.43% (MCC) on the human dataset. In addition, we also evaluate the performance of DeepFE-PPI on five independent species datasets, and all the results are superior to those of existing methods. The comparisons show that DeepFE-PPI is capable of predicting protein–protein interactions at an acceptable level of accuracy using a novel residue representation method and a deep learning classification framework. The code, along with instructions to reproduce this work, is available from https://github.com/xal2019/DeepFE-PPI.

2021 ◽  
Author(s):  
Yunzhuo Zhou ◽  
Raghad Al-Jarf ◽  
Azadeh Alavi ◽  
Thanh Binh Nguyen ◽  
Carlos H. M. Rodrigues ◽  
...  

Protein phosphorylation acts as an essential on/off switch in many cellular signalling pathways, regulating protein function. This has led to ongoing interest in targeting kinases for therapeutic intervention. Computer-aided drug discovery has proven a useful and cost-effective approach for facilitating prioritisation and enrichment of screening libraries. Limited effort, however, has been devoted to developing and tailoring in silico tools to assist the development of kinase inhibitors and to providing relevant insights into what makes potent inhibitors. To fill this gap, here we developed kinCSM, an integrative computational tool capable of accurately identifying potent cyclin-dependent kinase 2 (CDK2) inhibitors, quantitatively predicting CDK2 ligand-kinase inhibition constants (pKi) and classifying inhibition modes without kinase information. kinCSM predictive models were built using supervised learning and leveraged the concept of graph-based signatures to capture both physicochemical and geometric properties of small molecules. CDK2 inhibitors were accurately identified with Matthews correlation coefficients of up to 0.74, and inhibition constants were predicted with Pearson's correlations of up to 0.76, with consistent performances of 0.66 and 0.68, respectively, on non-redundant blind tests. kinCSM was also able to identify the potential type of inhibition for a given molecule, achieving a Matthews correlation coefficient of up to 0.80 on cross-validation and 0.73 on a blind test. Analysing the molecular composition of kinase inhibitors revealed chemical fragments enriched in potent CDK2 inhibitors and in different types of inhibitors, which provides insights into the molecular mechanisms behind ligand-kinase interactions. We believe kinCSM will be an invaluable tool to guide future kinase drug discovery.
To aid the fast and accurate screening of potent CDK2 kinase inhibitors, we made kinCSM freely available online at http://biosig.unimelb.edu.au/kin_csm/.
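The regression performance quoted above is a Pearson correlation between experimental and predicted pKi values. As a minimal sketch of how that statistic is computed (this is the textbook formula, not kinCSM code), the helper `pearson_r` and the pKi values below are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products and centred squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical experimental and predicted pKi values (not data from the paper).
measured = [6.1, 7.4, 5.2, 8.0, 6.8]
predicted = [5.9, 7.0, 5.8, 7.6, 6.5]
r = pearson_r(measured, predicted)
print(f"Pearson r = {r:.3f}")
```

A value near 1 means predicted and measured pKi rise and fall together; it does not by itself show that the predictions are unbiased, so it is usually read alongside an error measure.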


2020 ◽  
Vol 27 (5) ◽  
pp. 359-369 ◽  
Author(s):  
Cheng Shi ◽  
Jiaxing Chen ◽  
Xinyue Kang ◽  
Guiling Zhao ◽  
Xingzhen Lao ◽  
...  

Protein-related interaction prediction is critical to understanding life processes, biological functions, and mechanisms of drug action. Experimental methods used to determine protein-related interactions have always been costly and inefficient. In recent years, advances in biological and medical technology have provided us with an explosion of biological and physiological data, and deep learning-based algorithms have shown great promise in extracting features and learning patterns from complex data. Deep learning has now emerged in protein research. In this review, we provide an introductory overview of deep neural network theory and its unique properties. We mainly focus on the application of this technology to protein-related interaction prediction over the past five years, including protein-protein interaction prediction, protein-RNA/DNA interaction prediction, protein-drug interaction prediction, and others. Finally, we discuss some of the challenges that deep learning currently faces.


2019 ◽  
Vol 26 (21) ◽  
pp. 3890-3910 ◽  
Author(s):  
Branislava Gemovic ◽  
Neven Sumonja ◽  
Radoslav Davidovic ◽  
Vladimir Perovic ◽  
Nevena Veljkovic

Background: The significant number of protein-protein interactions (PPIs) discovered by harnessing concomitant advances in the fields of sequencing, crystallography, spectrometry and two-hybrid screening suggests astonishing prospects for remodelling drug discovery. The PPI space, which includes up to 650 000 entities, is a remarkable reservoir of potential therapeutic targets for every human disease. In order to allow modern drug discovery programs to leverage this, we should be able to discern complete PPI maps associated with a specific disorder and the corresponding normal physiology. Objective: Here, we review community-available computational programs for predicting PPIs and web-based resources for storing experimentally annotated interactions. Methods: We compared the capacities of the prediction tools iLoops, Struct2Net, HOMCOS, COTH, PrePPI, InterPreTS and PRISM to predict recently discovered protein interactions. Results: We described sequence-based and structure-based PPI prediction tools and addressed their peculiarities. Additionally, since the usefulness of prediction algorithms critically depends on the quality and quantity of the experimental data they are built on, we extensively discussed community resources for protein interactions. We focused on the active and recently updated primary and secondary PPI databases, repositories specialized by subject or species, as well as databases that include both experimental and predicted PPIs. Conclusion: PPI complexes are the basis of important physiological processes and are therefore possible targets for cell-penetrating ligands. Reliable computational PPI predictions can speed up new target discoveries through prioritization of therapeutically relevant protein–protein complexes for experimental studies.


2020 ◽  
Vol 27 (37) ◽  
pp. 6306-6355 ◽  
Author(s):  
Marian Vincenzi ◽  
Flavia Anna Mercurio ◽  
Marilisa Leone

Background: Many pathways in healthy cells and/or linked to disease onset and progression depend on large assemblies, including multi-protein complexes. Protein-protein interactions may occur through a vast array of modules known as protein interaction domains (PIDs). Objective: This review concerns PIDs that recognize post-translationally modified peptide sequences and intends to provide the scientific community with state-of-the-art knowledge on their 3D structures, binding topologies and potential applications in the drug discovery field. Method: Several databases, such as Pfam (Protein family), SMART (Simple Modular Architecture Research Tool) and the PDB (Protein Data Bank), were searched to look for different domain families and gain structural information on protein complexes in which particular PIDs are involved. Recent literature on PIDs and related drug discovery campaigns was retrieved through PubMed and analyzed. Results and Conclusion: PIDs are rather versatile in their binding preferences. Many of them specifically recognize only certain amino acid stretches with post-translational modifications, while a few others are able to interact with several post-translationally modified sequences or with unmodified ones. Many PIDs can be linked to different diseases, including cancer. The tremendous amount of available structural data has led to the structure-based design of several molecules targeting protein-protein interactions mediated by PIDs, including peptides, peptidomimetics and small compounds. More studies are needed to fully single out, among different families, the PIDs that can be considered reliable therapeutic targets. However, attacking PIDs rather than the catalytic domains of a particular protein may represent a route to obtaining selective inhibitors.


2020 ◽  
Vol 20 (10) ◽  
pp. 855-882
Author(s):  
Olivia Slater ◽  
Bethany Miller ◽  
Maria Kontoyianni

Drug discovery has focused on the paradigm “one drug, one target” for a long time. However, small molecules can act at multiple macromolecular targets, which serves as the basis for drug repurposing. In an effort to expand the target space, and given advances in X-ray crystallography, protein-protein interactions have become an emerging focus area of drug discovery enterprises. Proteins interact with other biomolecules, and it is this intricate network of interactions that determines the behavior of the system and its biological processes. In this review, we briefly discuss networks in disease, followed by computational methods for protein-protein complex prediction. Computational methodologies and techniques employed towards objectives such as protein-protein docking, protein-protein interaction prediction, and interface prediction are described extensively. Docking aims at producing a complex between proteins, interface predictions identify a subset of residues on one protein that could interact with a partner, and protein-protein interaction predictions address whether two proteins interact. In addition, approaches to predict hot spots and binding sites are presented, along with a representative example from our internal project on the chemokine CXC receptor 3 B-isoform and predictive modeling with IP10 and PF4.


2018 ◽  
Vol 25 (1) ◽  
pp. 5-21 ◽  
Author(s):  
Ylenia Cau ◽  
Daniela Valensin ◽  
Mattia Mori ◽  
Sara Draghi ◽  
Maurizio Botta

14-3-3 is a class of proteins able to interact with a multitude of targets by establishing protein-protein interactions (PPIs). They are found in all eukaryotes, with a conserved secondary structure and high sequence homology among species. 14-3-3 proteins are involved in many physiological and pathological cellular processes, either by triggering or by interfering with the activity of specific protein partners. In recent years, the scientific community has collected much evidence on the role played by the seven human 14-3-3 isoforms in cancer and neurodegenerative diseases. Indeed, these proteins regulate the molecular mechanisms associated with these diseases by interacting with (i) oncogenic proteins, (ii) pro-apoptotic proteins and (iii) proteins involved in Parkinson's and Alzheimer's diseases. The discovery of small-molecule modulators of 14-3-3 PPIs could facilitate a complete understanding of the physiological role of these proteins and might offer valuable therapeutic approaches for these critical pathological states.


2006 ◽  
Vol 11 (7) ◽  
pp. 854-863 ◽  
Author(s):  
Maxwell D. Cummings ◽  
Michael A. Farnum ◽  
Marina I. Nelen

The genomics revolution has unveiled a wealth of poorly characterized proteins. Scientists are often able to produce milligram quantities of proteins for which function is unknown or hypothetical, based only on very distant sequence homology. Broadly applicable tools for functional characterization are essential to the illumination of these orphan proteins. An additional challenge is the direct detection of inhibitors of protein-protein interactions (and allosteric effectors). Both of these research problems are relevant to, among other things, the challenge of finding and validating new protein targets for drug action. Screening collections of small molecules has long been used in the pharmaceutical industry as one method of discovering drug leads. Screening in this context typically involves a function-based assay. Even given a sufficient quantity of a protein of interest, significant effort may still be required for functional characterization, assay development, and assay configuration for screening. Increasingly, techniques are being reported that facilitate screening for specific ligands for a protein of unknown function. Such techniques also allow for function-independent screening with better characterized proteins. ThermoFluor®, a screening instrument based on monitoring ligand effects on temperature-dependent protein unfolding, can be applied when protein function is unknown. This technology has proven useful in the decryption of an essential bacterial enzyme and in the discovery of a series of inhibitors of a cancer-related protein-protein interaction. The authors review some of the tools relevant to these research problems in drug discovery and describe their experiences with two different proteins.


2008 ◽  
Vol 412 (1) ◽  
pp. 163-170 ◽  
Author(s):  
Alon Herschhorn ◽  
Iris Oz-Gleenberg ◽  
Amnon Hizi

The RT (reverse transcriptase) of HIV-1 interacts with HIV-1 IN (integrase) and inhibits its enzymatic activities. However, the molecular mechanisms underlying these interactions are not well understood. In order to study these mechanisms, we have analysed the interactions of HIV-1 IN with HIV-1 RT and with two other related RTs: those of HIV-2 and MLV (murine-leukaemia virus). All three RTs inhibited HIV-1 IN, albeit to different extents, suggesting a common site of binding that could be slightly modified for each one of the studied RTs. Using surface plasmon resonance technology, which monitors direct protein–protein interactions, we performed kinetic analyses of the binding of HIV-1 IN to these three RTs and observed interesting binding patterns. The interaction of HIV-1 RT with HIV-1 IN was unique and followed a two-state reaction model. According to this model, the initial IN–RT complex formation was followed by a conformational change in the complex that led to an elevation of the total affinity between these two proteins. In contrast, HIV-2 and MLV RTs interacted with IN in a simple bimolecular manner, without any apparent secondary conformational changes. Interestingly, HIV-1 and HIV-2 RTs were the most efficient inhibitors of HIV-1 IN activity, whereas HIV-1 and MLV RTs showed the highest affinity towards HIV-1 IN. These modes of direct protein interactions, along with the apparent rate constants calculated and the correlations of the interaction kinetics with the capacity of the RTs to inhibit IN activities, are all discussed.
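The contrast between the simple bimolecular model and the two-state model can be made concrete with the equilibrium dissociation constant each one implies. The sketch below uses the standard textbook expressions for A + B ⇌ AB and A + B ⇌ AB ⇌ AB*, which the abstract itself does not spell out; the function names and all rate constants are hypothetical and are not values from the study.

```python
def simple_kd(ka, kd):
    """K_D (in M) for one-step binding A + B <-> AB.

    ka: association rate constant in 1/(M*s); kd: dissociation rate in 1/s.
    """
    return kd / ka


def two_state_kd(ka1, kd1, ka2, kd2):
    """Overall K_D (in M) for A + B <-> AB <-> AB*.

    ka1, kd1 describe the initial binding step; ka2, kd2 (both in 1/s)
    describe the forward and reverse conformational change of the complex.
    At equilibrium the total complex is [AB] * (1 + ka2 / kd2), which gives
    K_D = (kd1 / ka1) * kd2 / (kd2 + ka2).
    """
    return (kd1 / ka1) * (kd2 / (kd2 + ka2))


# Hypothetical rate constants, for illustration only.
ka1, kd1 = 1.0e5, 1.0e-2   # initial complex formation
ka2, kd2 = 3.0e-3, 1.0e-3  # conformational change and its reversal

initial_kd = simple_kd(ka1, kd1)
overall_kd = two_state_kd(ka1, kd1, ka2, kd2)
print(f"K_D of initial step: {initial_kd:.2e} M")
print(f"Overall two-state K_D: {overall_kd:.2e} M")
```

Because the factor kd2 / (kd2 + ka2) is always below 1 when ka2 is positive, the second step can only lower the overall K_D, which matches the abstract's statement that the conformational change elevates the total affinity.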

