PubMed-Scale Chemical Concept Embeddings Reconstruct Physical Protein Interaction Networks

Author(s):  
Blaž Škrlj ◽  
Enja Kokalj ◽  
Nada Lavrač

PubMed is the largest resource of curated biomedical knowledge to date, containing more than 25 million documents. Large quantities of novel literature prevent a single expert from keeping track of all potentially relevant papers, resulting in knowledge gaps. In this article, we present CHEMMESHNET, a newly developed PubMed-based network comprising more than 10,000,000 associations, constructed from expert-curated MeSH annotations of chemicals based on all currently available PubMed articles. By learning latent representations of concepts in the obtained network, we demonstrate in a proof-of-concept study that purely literature-based representations are sufficient for the reconstruction of a large part of the currently known network of physical, empirically determined protein–protein interactions. We demonstrate that simple linear embeddings of node pairs, when coupled with a neural network–based classifier, reliably reconstruct the existing collection of empirically confirmed protein–protein interactions. Furthermore, we demonstrate how pairs of learned representations can be used to prioritize potentially interesting novel interactions based on the common chemical context. Highly ranked interactions are qualitatively inspected in terms of potential complex formation at the structural level and represent potentially interesting new knowledge. We demonstrate that two protein–protein interactions, prioritized by structure-based approaches, also emerge as probable with regard to the trained machine-learning model.
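
The idea of scoring node pairs from learned node representations can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy embedding values, the protein labels `P1` to `P4`, the element-wise product used to combine two node vectors, and the plain logistic-regression scorer standing in for the neural network–based classifier mentioned in the abstract. It is not the authors' pipeline.

```python
import math

def pair_features(emb_a, emb_b):
    """Combine two node embeddings into one pair vector (element-wise product).
    The combination operator is an assumption; the abstract does not specify it."""
    return [a * b for a, b in zip(emb_a, emb_b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, x):
    """Probability-like score that the pair described by x interacts."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_logistic(X, y, epochs=2000, lr=0.5):
    """Fit a logistic-regression scorer with per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = score(w, b, x) - t          # gradient of log-loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Hypothetical toy embeddings (not taken from CHEMMESHNET).
embeddings = {
    "P1": [1.0, 0.9, 0.0],
    "P2": [0.9, 1.0, 0.1],
    "P3": [0.0, 0.1, 1.0],
    "P4": [0.1, 0.0, 0.9],
}

# Hypothetical labelled pairs: 1 = known interaction, 0 = no known interaction.
known = [("P1", "P2", 1), ("P3", "P4", 1), ("P1", "P3", 0), ("P2", "P4", 0)]

X = [pair_features(embeddings[a], embeddings[b]) for a, b, _ in known]
y = [label for _, _, label in known]
w, b = train_logistic(X, y)

# Rank pairs that were not seen during training by their predicted score.
candidates = [("P1", "P4"), ("P2", "P3")]
ranked = sorted(
    candidates,
    key=lambda p: score(w, b, pair_features(embeddings[p[0]], embeddings[p[1]])),
    reverse=True,
)
for a, c in ranked:
    s = score(w, b, pair_features(embeddings[a], embeddings[c]))
    print(f"{a}-{c}: {s:.3f}")
```

Sorting unseen pairs by classifier score mirrors the prioritization step described in the abstract; in a realistic setting the embeddings would come from a network-embedding method and the scorer would be evaluated on held-out interactions.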

2017 ◽  
Vol 114 (40) ◽  
pp. E8333-E8342 ◽  
Author(s):  
Maximilian G. Plach ◽  
Florian Semmelmann ◽  
Florian Busch ◽  
Markus Busch ◽  
Leonhard Heizinger ◽  
...  

Cells contain a multitude of protein complexes whose subunits interact with high specificity. However, the number of different protein folds and interface geometries found in nature is limited. This raises the question of how protein–protein interaction specificity is achieved on the structural level and how the formation of nonphysiological complexes is avoided. Here, we describe structural elements called interface add-ons that fulfill this function and elucidate their role for the diversification of protein–protein interactions during evolution. We identified interface add-ons in 10% of a representative set of bacterial, heteromeric protein complexes. The importance of interface add-ons for protein–protein interaction specificity is demonstrated by an exemplary experimental characterization of over 30 cognate and hybrid glutamine amidotransferase complexes in combination with comprehensive genetic profiling and protein design. Moreover, growth experiments showed that the lack of interface add-ons can lead to physiologically harmful cross-talk between essential biosynthetic pathways. In sum, our complementary in silico, in vitro, and in vivo analysis argues that interface add-ons are a practical and widespread evolutionary strategy to prevent the formation of nonphysiological complexes by specializing protein–protein interactions.


2015 ◽  
Vol 2 (9) ◽  
pp. 150156 ◽  
Author(s):  
Georgia Tsagkogeorga ◽  
Michael R. McGowen ◽  
Kalina T. J. Davies ◽  
Simon Jarman ◽  
Andrea Polanowski ◽  
...  

Recent studies have reported multiple cases of molecular adaptation in cetaceans related to their aquatic abilities. However, none of these has included the hippopotamus, precluding an understanding of whether molecular adaptations in cetaceans occurred before or after they split from their semi-aquatic sister taxa. Here, we obtained new transcriptomes from the hippopotamus and humpback whale, and analysed these together with available data from eight other cetaceans. We identified more than 11 000 orthologous genes and compiled a genome-wide dataset of 6845 coding DNA sequences among 23 mammals, to our knowledge the largest phylogenomic dataset to date for cetaceans. We found positive selection in nine genes on the branch leading to the common ancestor of hippopotamus and whales, and 461 genes in cetaceans compared to 64 in hippopotamus. Functional annotation revealed adaptations in diverse processes, including lipid metabolism, hypoxia, muscle and brain function. By combining these findings with data on protein–protein interactions, we found evidence suggesting clustering among gene products relating to nervous and muscular systems in cetaceans. We found little support for shared ancestral adaptations in the two taxa; most molecular adaptations in extant cetaceans occurred after their split with hippopotamids.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Sven H. Giese ◽  
Ludwig R. Sinn ◽  
Fritz Wegner ◽  
Juri Rappsilber

Crosslinking mass spectrometry has developed into a robust technique that is increasingly used to investigate the interactomes of organelles and cells. However, the incomplete and noisy information in the mass spectra of crosslinked peptides limits the numbers of protein–protein interactions that can be confidently identified. Here, we leverage chromatographic retention time information to aid the identification of crosslinked peptides from mass spectra. Our Siamese machine learning model xiRT achieves highly accurate retention time predictions of crosslinked peptides in a multi-dimensional separation of crosslinked E. coli lysate. Importantly, supplementing the search engine score with retention time features leads to a substantial increase in protein–protein interactions without affecting confidence. This approach is not limited to cell lysates and multi-dimensional separation but also considerably improves the analysis of crosslinked multiprotein complexes with a single chromatographic dimension. Retention times are a powerful complement to mass spectrometric information to increase the sensitivity of crosslinking mass spectrometry analyses.


Author(s):  
Mamta Sagar ◽  
Padma Saxena ◽  
Suruchi Singh ◽  
Ravindra Nath ◽  
Pramod W. Ramteke

Molecular docking is an efficient way to study protein-protein and protein-ligand interactions in virtual mode; it provides structural annotations of molecular interactions, which are required in the drug discovery process. The Cartesian FFT approach in ‘Hex’ spherical polar Fourier (SPF) uses rotational correlations; this method is used here to study protein-protein interactions. Hepatitis B virus (HBV) X protein (HBx) is essential for virus infection and has been used in the development of therapeutics for liver cancer. It can interact with many cellular proteins. It interferes with cell viability and stimulates HBV replication. The von Hippel-Lindau binding protein 1 (VBP1) has an important role in HBx-mediated nuclear factor kappa B (NFκB) stimulation. VBP1 and HBx function as coactivators in the activation of NFκB binding. Docking results revealed that HBx and NFκB bind with VBP1 at a common site at amino acid positions Arg 161, Glu 92, and Arg 82, which may have a role in HBx-mediated NFκB activation. The lowest-energy complex, VBP1-NFκB1, was obtained at -883.70 kcal/mol. The amino acids involved in the interactions among HBx, VBP1, and NFκB may be involved in transcriptional regulation and have significance in normal and abnormal regulation. These amino acid interactions may be associated with the manifestation of liver cancer.


2019 ◽  
Vol 116 (7) ◽  
pp. 2545-2550 ◽  
Author(s):  
Abimael Cruz-Migoni ◽  
Peter Canning ◽  
Camilo E. Quevedo ◽  
Carole J. R. Bataille ◽  
Nicolas Bery ◽  
...  

The RAS gene family is frequently mutated in human cancers, and the quest for compounds that bind to mutant RAS remains a major goal, as it also does for inhibitors of protein–protein interactions. We have refined crystallization conditions for KRAS169Q61H-yielding crystals suitable for soaking with compounds and exploited this to assess new RAS-binding compounds selected by screening a protein–protein interaction-focused compound library using surface plasmon resonance. Two compounds, referred to as PPIN-1 and PPIN-2, with related structures from 30 initial RAS binders showed binding to a pocket where compounds had been previously developed, including RAS effector protein–protein interaction inhibitors selected using an intracellular antibody fragment (called Abd compounds). Unlike the Abd series of RAS binders, PPIN-1 and PPIN-2 compounds were not competed by the inhibitory anti-RAS intracellular antibody fragment and did not show any RAS-effector inhibition properties. By fusing the common, anchoring part from the two new compounds with the inhibitory substituents of the Abd series, we have created a set of compounds that inhibit RAS-effector interactions with increased potency. These fused compounds add to the growing catalog of RAS protein–protein inhibitors and show that building a chemical series by crossing over two chemical series is a strategy to create RAS-binding small molecules.


2012 ◽  
Vol 2012 ◽  
pp. 1-10 ◽  
Author(s):  
Ricardo A. Cifuentes ◽  
Daniel Restrepo-Montoya ◽  
Juan-Manuel Anaya

There is genetic evidence of similarities and differences among autoimmune diseases (AIDs) that warrants looking at a general panorama of what has been published. Thus, our aim was to determine the main shared genes and to what extent they contribute to building clusters of AIDs. We combined a text-mining approach to build clusters of genetic concept profiles (GCPs) from the literature in MedLine with knowledge of protein-protein interactions to confirm if genes in GCP encode proteins that truly interact. We found three clusters in which the genes with the highest contribution encoded proteins that showed strong and specific interactions. After projecting the AIDs on a plane, two clusters could be discerned: Sjögren’s syndrome and systemic lupus erythematosus; and autoimmune thyroid disease, type 1 diabetes, and rheumatoid arthritis. Our results support the common origin of AIDs and the role of genes involved in apoptosis such as CTLA4, FASLG, and IL10.


2020 ◽  
Author(s):  
Moumita Ghosh ◽  
Pritam Sil ◽  
Anirban Roy ◽  
Rohmatul Fajriyah ◽  
Kartick Chandra Mondal

The COVID-19 pandemic has turned a worldwide health crisis into a humanitarian crisis. Amid this global emergency, human civilization is under enormous strain since no proper therapeutic method has been discovered yet. A wave of research effort has been put towards the invention of therapeutics and vaccines against COVID-19. Meanwhile, this fatal virus has already infected millions of people and claimed many lives all over the world. Computational biology can attempt to understand the protein-protein interactions between viral and host proteins. Potential viral-host protein interactions can thereby be identified, which is crucial information for the discovery of drugs. In this paper, we present an approach for predicting novel interactions from maximal biclusters. Additionally, the predicted interactions are verified from biological perspectives. For this, we conduct a study on the gene ontology and KEGG pathway in relation to the newly predicted interactions.


2021 ◽  
Author(s):  
Amro Safadi ◽  
Simon C Lovell ◽  
Andrew James Doig

The identification of genes that may be linked to cancer is of great importance for the discovery of new drug targets. The rate at which cancer genes are being found experimentally is slow, however, due to the complexity of the identification and confirmation process, giving a narrow range of therapeutic targets to investigate and develop. One solution to this problem is to use predictive analysis techniques that can accurately identify cancer gene candidates in a timely fashion. Furthermore, identifying characteristics that are linked to cancer genes is crucial to furthering our understanding of this disease. These characteristics can be employed in recognising therapeutic drug targets. Here, we investigated whether certain properties of a gene can indicate the likelihood of its being involved in the initiation or progression of cancer. We found that essentiality scores tend to be higher for cancer genes than for all protein-coding human genes. A machine-learning model was developed, and we found that essentiality-related properties and properties arising from protein-protein interaction networks or evolution are particularly effective in predicting cancer-associated genes. We were also able to identify potential drug targets that have not been previously linked with cancer, but have the characteristics of cancer-related genes.

