The Coronavirus Network Explorer: mining a large-scale knowledge graph for effects of SARS-CoV-2 on host cell function

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andreas Krämer ◽  
Jean-Noël Billaud ◽  
Stuart Tugendreich ◽  
Dan Shiffman ◽  
Martin Jones ◽  
...  

Abstract Background Leveraging previously identified viral interactions with human host proteins, we apply a machine learning-based approach to connect SARS-CoV-2 viral proteins to relevant host biological functions, diseases, and pathways in a large-scale knowledge graph derived from the biomedical literature. Our goal is to explore how SARS-CoV-2 could interfere with various host cell functions, and to identify drug targets amongst the host genes that could potentially be modulated against COVID-19 by repurposing existing drugs. The machine learning model employed here involves gene embeddings that leverage causal gene expression signatures curated from literature. In contrast to other network-based approaches for drug repurposing, our approach explicitly takes the direction of effects into account, distinguishing between activation and inhibition. Results We have constructed 70 networks connecting SARS-CoV-2 viral proteins to various biological functions, diseases, and pathways reflecting viral biology, clinical observations, and co-morbidities in the context of COVID-19. Results are presented in the form of interactive network visualizations through a web interface, the Coronavirus Network Explorer (CNE), that allows exploration of underlying experimental evidence. We find that existing drugs targeting genes in those networks are strongly enriched in the set of drugs that are already in clinical trials against COVID-19. Conclusions The approach presented here can identify biologically plausible hypotheses for COVID-19 pathogenesis, explicitly connected to the immunological, virological and pathological observations seen in SARS-CoV-2 infected patients. The discovery of repurposable drugs is driven by prior knowledge of relevant functional endpoints that reflect known viral biology or clinical observations, therefore suggesting potential mechanisms of action. 
We believe that the CNE offers relevant insights that go beyond more conventional network approaches, and can be a valuable tool for drug repurposing. The CNE is available at https://digitalinsights.qiagen.com/coronavirus-network-explorer.
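
The enrichment claim in the Results can be illustrated with a one-sided hypergeometric test. This is only a minimal sketch: the abstract does not say which statistical test the authors used, and the function name and all counts below are hypothetical placeholders, not values from the paper.

```python
from math import comb


def hypergeom_enrichment_p(total, in_trials, targeting, overlap):
    """One-sided p-value P(X >= overlap).

    `targeting` drugs are drawn without replacement from `total` drugs,
    of which `in_trials` are already in clinical trials.
    """
    upper = min(in_trials, targeting)
    tail = sum(
        comb(in_trials, k) * comb(total - in_trials, targeting - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, targeting)


# Hypothetical counts, for illustration only (not taken from the paper).
p_value = hypergeom_enrichment_p(total=2000, in_trials=100, targeting=150, overlap=25)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

A small p-value here would indicate that drugs targeting network genes overlap with the trial set more often than expected by chance under this simple null model.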

2020 ◽  
Author(s):  
Andreas Krämer ◽  
Jean-Noël Billaud ◽  
Stuart Tugendreich ◽  
Dan Shiffman ◽  
Martin Jones ◽  
...  

Building on recent work that identified human host proteins that interact with SARS-CoV-2 viral proteins in the context of an affinity-purification mass spectrometry screen, we use a machine learning-based approach to connect the viral proteins to relevant biological functions and diseases in a large-scale knowledge graph derived from the biomedical literature. Our aim is to explore how SARS-CoV-2 could interfere with various host cell functions, and also to identify additional drug targets amongst the host genes that could potentially be modulated against COVID-19. Results are presented in the form of interactive network visualizations that allow exploration of the underlying experimental evidence. A selection of networks is discussed in the context of recent clinical observations.


Author(s):  
Payman Samavarchi-Tehrani ◽  
Hala Abdouni ◽  
James D.R. Knight ◽  
Audrey Astori ◽  
Reuben Samson ◽  
...  

Abstract Viral replication is dependent on interactions between viral polypeptides and host proteins. Identifying virus-host protein interactions can thus uncover unique opportunities for interfering with the virus life cycle via novel drug compounds or drug repurposing. Importantly, many viral-host protein interactions take place at intracellular membranes and poorly soluble organelles, which are difficult to profile using classical biochemical purification approaches. Applying proximity-dependent biotinylation (BioID) with the fast-acting miniTurbo enzyme to 27 SARS-CoV-2 proteins in a lung adenocarcinoma cell line (A549), we detected 7810 proximity interactions (7382 of which are new for SARS-CoV-2) with 2242 host proteins (results available at covid19interactome.org). These results complement and dramatically expand upon recent affinity purification-based studies identifying stable host-virus protein complexes, and offer an unparalleled view of membrane-associated processes critical for viral production. Host cell organellar markers were also subjected to BioID in parallel, allowing us to propose modes of action for several viral proteins in the context of host proteome remodelling. In summary, our dataset identifies numerous high confidence proximity partners for SARS-CoV-2 viral proteins, and describes potential mechanisms for their effects on specific host cell functions.


2021 ◽  
Vol 11 ◽  
Author(s):  
Meng Wang ◽  
Xinyu Ma ◽  
Jingwen Si ◽  
Hongjia Tang ◽  
Haofen Wang ◽  
...  

Adverse drug reactions (ADRs) are a major public health concern, and early detection is crucial for drug development and patient safety. Together with the increasing availability of large-scale literature data, machine learning has the potential to predict unknown ADRs from current knowledge. Using machine learning methods, we constructed a Tumor-Biomarker Knowledge Graph (TBKG) from the biomedical literature; it contains four types of node: Tumor, Biomarker, Drug, and ADR. Based on this knowledge graph, we not only discovered potential ADRs of antitumor drugs but also provided explanations for them. Experiments on real-world data show that our model achieves an accuracy of 0.81 under three-fold cross-validation, and ADR discovery for Osimertinib was chosen for clinical validation. The ADRs of Osimertinib calculated by our model comprised known ADRs, consistent with the official manual, and some unreported rare ADRs observed in clinical cases. Results also showed that our model outperformed traditional co-occurrence methods. Moreover, each calculated ADR was attached to the corresponding “tumor-biomarker-drug” paths in the knowledge graph, which could help to obtain in-depth insights into the underlying mechanisms. In conclusion, the tumor-biomarker knowledge-graph-based approach is an explainable, biomarker-based method for discovering potential ADRs; it might be valuable to the community working in the emerging field of biomedical literature mining and provide impetus for research into the mechanisms of ADRs.
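
The “tumor-biomarker-drug” explanation paths can be pictured as two-hop lookups in the graph. The sketch below is a hedged illustration only: the relation names (`has_biomarker`, `targeted_by`, `causes`), the entity names, and the function itself are hypothetical and do not come from the TBKG described in the abstract.

```python
from collections import defaultdict

# Hypothetical toy edges; names are placeholders, not data from the paper.
EDGES = [
    ("TumorA", "has_biomarker", "BiomarkerX"),
    ("BiomarkerX", "targeted_by", "DrugD"),
    ("TumorA", "has_biomarker", "BiomarkerY"),
    ("BiomarkerY", "targeted_by", "DrugE"),
    ("DrugD", "causes", "ADR1"),
]


def tumor_biomarker_drug_paths(edges, drug):
    """Return every (tumor, biomarker, drug) path that ends at `drug`."""
    tumors_by_biomarker = defaultdict(set)
    biomarkers_for_drug = set()
    for head, relation, tail in edges:
        if relation == "has_biomarker":
            tumors_by_biomarker[tail].add(head)
        elif relation == "targeted_by" and tail == drug:
            biomarkers_for_drug.add(head)
    return sorted(
        (tumor, biomarker, drug)
        for biomarker in biomarkers_for_drug
        for tumor in tumors_by_biomarker[biomarker]
    )


paths = tumor_biomarker_drug_paths(EDGES, "DrugD")
print(paths)
```

Each returned path could then be shown alongside a predicted ADR for the drug as supporting context.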


10.2196/18323 ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. e18323
Author(s):  
Jian Du ◽  
Xiaoying Li

Background Combination therapy plays an important role in the effective treatment of malignant neoplasms and precision medicine. Numerous clinical studies have been carried out to investigate combination drug therapies. Automated knowledge discovery of these combinations and their graphic representation in knowledge graphs will enable pattern recognition and identification of drug combinations used to treat a specific type of cancer, improve drug efficacy and treatment of human disorders. Objective This paper aims to develop an automated, visual approach to discover knowledge about combination therapies from biomedical literature, especially from those studies with high-level evidence such as clinical trial reports and clinical practice guidelines. Methods Based on semantic predications, which consist of a triple structure of subject-predicate-object (SPO), we proposed an automated algorithm to discover knowledge of combination drug therapies using the following rules: 1) two or more semantic predications (S1-P-O and Si-P-O, i = 2, 3…) can be extracted from one conclusive claim (sentence) in the abstract of a given publication, and 2) these predications have an identical predicate (that closely relates to human disease treatment, eg, “treat”) and object (eg, disease name) but different subjects (eg, drug names). A customized knowledge graph organizes and visualizes these combinations, improving the traditional semantic triples. After automatic filtering of broad concepts such as “pharmacologic actions” and generic disease names, a set of combination drug therapies were identified and characterized through manual interpretation. Results We retrieved 22,263 clinical trial reports and 31 clinical practice guidelines from PubMed abstracts by searching “antineoplastic agents” for drug restriction (published between Jan 2009 and Oct 2019). 
There were 15,603 conclusive claims locally parsed using the search terms “conclusion*” and “conclude*” ready for semantic predications extraction by SemRep, and 325 candidate groups of semantic predications about combined medications were automatically discovered within 316 conclusive claims. Based on manual analysis, we determined that 255/316 claims (78.46%) were accurately identified as describing combination therapies and adopted these to construct the customized knowledge graph. We also identified two categories (and 4 subcategories) to characterize the inaccurate results: limitations of SemRep and limitations of proposal. We further learned the predominant patterns of drug combinations based on mechanism of action for new combined medication studies and discovered 4 obvious markers (“combin*,” “coadministration,” “co-administered,” and “regimen”) to identify potential combination therapies to enable development of a machine learning algorithm. Conclusions Semantic predications from conclusive claims in the biomedical literature can be used to support automated knowledge discovery and knowledge graph construction for combination therapies. A machine learning approach is warranted to take full advantage of the identified markers and other contextual features.
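
The two discovery rules in the Methods can be sketched as a grouping step over extracted triples. This is a minimal illustration under stated assumptions: the tuple layout, the `TREATS` predicate label, the function name, and the sample sentences are hypothetical, and the code does not call SemRep or reproduce the authors' filtering of broad concepts.

```python
from collections import defaultdict


def find_combinations(predications, treatment_predicates=("TREATS",)):
    """Group predications from one sentence that share predicate and object.

    `predications` is an iterable of (sentence_id, subject, predicate, object).
    A group is kept only if it has two or more different subjects (drugs).
    """
    groups = defaultdict(set)
    for sentence_id, subject, predicate, obj in predications:
        if predicate in treatment_predicates:
            groups[(sentence_id, predicate, obj)].add(subject)
    return {
        key: sorted(subjects)
        for key, subjects in groups.items()
        if len(subjects) >= 2
    }


# Hypothetical predications, for illustration only.
sample = [
    ("s1", "drug_a", "TREATS", "disease_x"),
    ("s1", "drug_b", "TREATS", "disease_x"),
    ("s2", "drug_c", "TREATS", "disease_y"),
    ("s3", "drug_d", "AFFECTS", "disease_x"),
    ("s3", "drug_e", "AFFECTS", "disease_x"),
]
combinations = find_combinations(sample)
print(combinations)
```

Only sentence `s1` yields a candidate combination here, because `s2` has a single subject and `s3` uses a predicate outside the treatment set.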


2016 ◽  
Author(s):  
Fupan Yao ◽  
Seyed Ali Madani Tonekaboni ◽  
Zhaleh Safikhani ◽  
Petr Smirnov ◽  
Nehme El-Hachem ◽  
...  

Abstract Research in oncology traditionally focuses on the specific tissue type from which the cancer develops. However, advances in high-throughput molecular profiling technologies have enabled the comprehensive characterization of molecular aberrations in multiple cancer types. It was hoped that these large-scale datasets would provide the foundation for a paradigm shift in oncology which would see tumors being classified by their molecular profiles rather than tissue types, but tumors with similar genomic aberrations may respond differently to targeted therapies depending on their tissue of origin. There is therefore a need to reassess the potential association between pharmacological response and tissue of origin for therapeutic drugs, and to test how these associations translate from preclinical to clinical settings. In this paper, we investigate the tissue specificity of drug sensitivities in large-scale pharmacological studies and compare these associations to those found in clinical trial descriptions. Our meta-analysis of the four largest in vitro drug screening datasets indicates that tissue of origin is strongly associated with drug response. We identify novel tissue-drug associations, which may present exciting new avenues for drug repurposing. One caveat is that the vast majority of the significant associations found in preclinical settings do not concur with clinical observations. Accordingly, our results call for more testing to find the root cause of the discrepancies between preclinical and clinical observations.


2021 ◽  
Vol 12 ◽  
Author(s):  
Daniel P. Smith ◽  
Olly Oechsle ◽  
Michael J. Rawling ◽  
Ed Savory ◽  
Alix M.B. Lacoste ◽  
...  

The onset of the 2019 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic necessitated the identification of approved drugs to treat the disease, before the development, approval and widespread administration of suitable vaccines. To identify such a drug, we used a visual analytics workflow where computational tools applied over an AI-enhanced biomedical knowledge graph were combined with human expertise. The workflow comprised rapid augmentation of knowledge graph information from recent literature using machine learning (ML) based extraction, with human-guided iterative queries of the graph. Using this workflow, we identified the rheumatoid arthritis drug baricitinib as both an antiviral and anti-inflammatory therapy. The effectiveness of baricitinib was substantiated by the recent publication of the data from the ACTT-2 randomised Phase 3 trial, followed by emergency approval for use by the FDA, and a report from the CoV-BARRIER trial confirming significant reductions in mortality with baricitinib compared to standard of care. Such methods that iteratively combine computational tools with human expertise hold promise for the identification of treatments for rare and neglected diseases and, beyond drug repurposing, in areas of biological research where relevant data may be lacking or hidden in the mass of available biomedical literature.


Author(s):  
Galia Nordon ◽  
Gideon Koren ◽  
Varda Shalev ◽  
Eric Horvitz ◽  
Kira Radinsky

We present a system that jointly harnesses large-scale electronic health records data and a concept graph mined from the medical literature to guide drug repurposing—the process of applying known drugs in new ways to treat diseases. Our study is unique in methods and scope, per the scale of the concept graph and the quantity of data. We harness 10 years of nation-wide medical records of more than 1.5 million people and extract medical knowledge from all of PubMed, the world’s largest corpus of online biomedical literature. We employ links on the concept graph to provide causal signals to prioritize candidate influences between medications and target diseases. We show results of the system on studies of drug repurposing for hypertension and diabetes. In both cases, we present drug families identified by the algorithm which were previously unknown. We verify the results via clinical expert opinion and by prospective clinical trials on hypertension.


2019 ◽  
Vol 35 (19) ◽  
pp. 3672-3678 ◽  
Author(s):  
Nafiseh Saberian ◽  
Azam Peyvandipour ◽  
Michele Donato ◽  
Sahar Ansari ◽  
Sorin Draghici

Abstract Motivation Drug repurposing is a potential alternative to the classical drug discovery pipeline. Repurposing involves finding novel indications for already approved drugs. In this work, we present a novel machine learning-based method for drug repurposing. This method explores the anti-similarity between drugs and a disease to uncover new uses for the drugs. More specifically, our proposed method takes into account three sources of information: (i) large-scale gene expression profiles corresponding to human cell lines treated with small molecules, (ii) gene expression profile of a human disease and (iii) the known relationship between Food and Drug Administration (FDA)-approved drugs and diseases. Using these data, our proposed method learns a similarity metric through a supervised machine learning-based algorithm such that a disease and its associated FDA-approved drugs have smaller distance than the other disease-drug pairs. Results We validated our framework by showing that the proposed method incorporating distance metric learning technique can retrieve FDA-approved drugs for their approved indications. Once validated, we used our approach to identify a few strong candidates for repurposing. Availability and implementation The R scripts are available on demand from the authors. Supplementary information Supplementary data are available at Bioinformatics online.
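
The notion of anti-similarity between a drug and a disease can be illustrated with a plain correlation score. This sketch is a stand-in, not the learned distance metric the abstract describes: it ranks drugs by Pearson correlation between expression profiles, most negative first, and all profile values and names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_by_anti_similarity(disease_profile, drug_profiles):
    """Order drug names from most anti-correlated to most correlated."""
    scores = {
        name: pearson(disease_profile, profile)
        for name, profile in drug_profiles.items()
    }
    return sorted(scores, key=scores.get)


# Hypothetical log fold-change profiles over four genes.
disease = [2.0, -1.0, 0.5, -1.5]
drugs = {
    "drug_a": [-2.0, 1.0, -0.5, 1.5],  # reverses the disease signature
    "drug_b": [1.0, -0.5, 0.2, -1.0],  # mimics the disease signature
}
ranking = rank_by_anti_similarity(disease, drugs)
print(ranking)
```

A supervised metric-learning method would replace the fixed correlation with a distance fitted so that known drug-disease pairs score closer than other pairs.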


Viruses ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1479
Author(s):  
Akatsuki Saito ◽  
Maya Shofa ◽  
Hirotaka Ode ◽  
Maho Yumiya ◽  
Junki Hirano ◽  
...  

Viral proteins interact with different sets of host cell components throughout the viral life cycle and are known to localize to the intracellular membraneless organelles (MLOs) of the host cell, where formation/dissolution is regulated by phase separation of intrinsically disordered proteins and regions (IDPs/IDRs). Viral proteins are rich in IDRs, implying that viruses utilize IDRs to regulate phase separation of the host cell organelles and augment replication by commandeering the functions of the organelles and/or sneaking into the organelles to evade the host immune response. This review aims to integrate current knowledge of the structural properties and intracellular localizations of viral IDPs to understand viral strategies in the host cell. First, the properties of viral IDRs are reviewed and similarities and differences with those of eukaryotes are described. The higher IDR content in viruses with smaller genomes suggests that IDRs are essential characteristics of viral proteins. Then, the interactions of the IDRs of flaviviruses with the MLOs of the host cell are investigated with emphasis on the viral proteins localized in the nucleoli and stress granules. Finally, the possible roles of viral IDRs in regulation of the phase separation of organelles and future possibilities for antiviral drug development are discussed.

