A Framework To Build A Causal Knowledge Graph for Chronic Diseases and Cancers By Discovering Semantic Associations from Biomedical Literature

A Knowledge Graph of Combined Drug Therapies Using Semantic Predications From Biomedical Literature: Algorithm Development

JMIR Medical Informatics ◽ 10.2196/18323 ◽ 2020 ◽ Vol 8 (4) ◽ pp. e18323

Author(s): Jian Du ◽ Xiaoying Li

Keyword(s): Machine Learning ◽ Clinical Trial ◽ Clinical Practice ◽ Practice Guidelines ◽ Biomedical Literature ◽ Knowledge Graph ◽ Combination Therapies ◽ Combination Drug ◽ Automated Knowledge ◽ Semantic Predications

Background: Combination therapy plays an important role in the effective treatment of malignant neoplasms and precision medicine. Numerous clinical studies have been carried out to investigate combination drug therapies. Automated knowledge discovery of these combinations and their graphic representation in knowledge graphs will enable pattern recognition and identification of drug combinations used to treat a specific type of cancer, and will improve drug efficacy and the treatment of human disorders.

Objective: This paper aims to develop an automated, visual approach to discover knowledge about combination therapies from biomedical literature, especially from those studies with high-level evidence such as clinical trial reports and clinical practice guidelines.

Methods: Based on semantic predications, which consist of a triple structure of subject-predicate-object (SPO), we proposed an automated algorithm to discover knowledge of combination drug therapies using the following rules: 1) two or more semantic predications (S1-P-O and Si-P-O, i = 2, 3…) can be extracted from one conclusive claim (sentence) in the abstract of a given publication, and 2) these predications have an identical predicate (that closely relates to human disease treatment, eg, “treat”) and object (eg, disease name) but different subjects (eg, drug names). A customized knowledge graph organizes and visualizes these combinations, improving on the traditional semantic triples. After automatic filtering of broad concepts such as “pharmacologic actions” and generic disease names, a set of combination drug therapies was identified and characterized through manual interpretation.

Results: We retrieved 22,263 clinical trial reports and 31 clinical practice guidelines from PubMed abstracts by searching “antineoplastic agents” for drug restriction (published between Jan 2009 and Oct 2019). A total of 15,603 conclusive claims were parsed locally using the search terms “conclusion*” and “conclude*” and prepared for semantic predication extraction by SemRep, and 325 candidate groups of semantic predications about combined medications were automatically discovered within 316 conclusive claims. Based on manual analysis, we determined that 255/316 claims (78.46%) were accurately identified as describing combination therapies and adopted these to construct the customized knowledge graph. We also identified two categories (and 4 subcategories) to characterize the inaccurate results: limitations of SemRep and limitations of the proposed algorithm. We further learned the predominant patterns of drug combinations based on mechanism of action for new combined medication studies and discovered 4 obvious markers (“combin*,” “coadministration,” “co-administered,” and “regimen”) to identify potential combination therapies to enable development of a machine learning algorithm.

Conclusions: Semantic predications from conclusive claims in the biomedical literature can be used to support automated knowledge discovery and knowledge graph construction for combination therapies. A machine learning approach is warranted to take full advantage of the identified markers and other contextual features.
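
The two rules in the Methods paragraph above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Predication` record, the `TREATS` label, the broad-concept list, and the placeholder drug and disease names are all assumptions made for illustration. The sketch groups triples that come from the same conclusive claim and share a predicate and object, keeps groups with two or more distinct subjects, and checks a sentence for the four surface markers reported in the Results.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    sentence_id: str  # hypothetical ID of the conclusive claim the triple came from
    subject: str
    predicate: str
    obj: str


# Assumed values for illustration; the abstract only names "treat" and
# "pharmacologic actions" as examples.
TREATMENT_PREDICATES = {"TREATS"}
BROAD_CONCEPTS = {"pharmacologic actions"}

# The four markers listed in the abstract, combined into one pattern.
MARKER_PATTERN = re.compile(
    r"\b(combin\w*|coadministration|co-administered|regimen)\b", re.IGNORECASE
)


def find_combinations(predications):
    """Group triples by (sentence, predicate, object); keep groups with 2+ subjects."""
    groups = defaultdict(set)
    for p in predications:
        if p.predicate.upper() not in TREATMENT_PREDICATES:
            continue  # rule 2: predicate must relate to treatment
        if p.subject.lower() in BROAD_CONCEPTS:
            continue  # drop overly broad subject concepts
        groups[(p.sentence_id, p.predicate.upper(), p.obj)].add(p.subject)
    # rule 1: two or more predications (distinct subjects) from one claim
    return {key: sorted(subs) for key, subs in groups.items() if len(subs) >= 2}


def has_marker(sentence):
    """Return True if the sentence contains one of the four markers."""
    return MARKER_PATTERN.search(sentence) is not None


# Placeholder data, not taken from the paper.
example = [
    Predication("s1", "DrugA", "TREATS", "DiseaseX"),
    Predication("s1", "DrugB", "TREATS", "DiseaseX"),
    Predication("s1", "pharmacologic actions", "TREATS", "DiseaseX"),
    Predication("s2", "DrugC", "TREATS", "DiseaseY"),
]
combinations = find_combinations(example)
```

With the placeholder data, only sentence `s1` yields a candidate combination, because `s2` contributes a single subject and the broad concept is filtered out before grouping.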

Counterfactual inference to predict causal knowledge graph for relational transfer learning by assimilating expert knowledge --Relational feature transfer learning algorithm

Advanced Engineering Informatics ◽ 10.1016/j.aei.2021.101516 ◽ 2022 ◽ Vol 51 ◽ pp. 101516

Author(s): Jiarui Li ◽ Yukio Horiguchi ◽ Tetsuo Sawaragi

Keyword(s): Transfer Learning ◽ Expert Knowledge ◽ Learning Algorithm ◽ Knowledge Graph ◽ Causal Knowledge

The Coronavirus Network Explorer: Mining a large-scale knowledge graph for effects of SARS-CoV-2 on host cell function

10.1101/2020.09.14.296327 ◽ 2020

Author(s): Andreas Krämer ◽ Jean-Noël Billaud ◽ Stuart Tugendreich ◽ Dan Shiftman ◽ Martin Jones ◽ ...

Keyword(s): Host Cell ◽ Drug Targets ◽ Large Scale ◽ Cell Function ◽ Affinity Purification ◽ Viral Proteins ◽ Biomedical Literature ◽ Knowledge Graph ◽ Cell Functions ◽ Interactive Network

Building on recent work that identified human host proteins that interact with SARS-CoV-2 viral proteins in the context of an affinity-purification mass spectrometry screen, we use a machine learning-based approach to connect the viral proteins to relevant biological functions and diseases in a large-scale knowledge graph derived from the biomedical literature. Our aim is to explore how SARS-CoV-2 could interfere with various host cell functions, and also to identify additional drug targets amongst the host genes that could potentially be modulated against COVID-19. Results are presented in the form of interactive network visualizations that allow exploration of the underlying experimental evidence. A selection of networks is discussed in the context of recent clinical observations.

Expert-Augmented Computational Drug Repurposing Identified Baricitinib as a Treatment for COVID-19

Frontiers in Pharmacology ◽ 10.3389/fphar.2021.709856 ◽ 2021 ◽ Vol 12

Author(s): Daniel P. Smith ◽ Olly Oechsle ◽ Michael J. Rawling ◽ Ed Savory ◽ Alix M.B. Lacoste ◽ ...

Keyword(s): Visual Analytics ◽ Drug Repurposing ◽ Standard Of Care ◽ Biomedical Literature ◽ Biological Research ◽ Knowledge Graph ◽ Neglected Diseases ◽ Biomedical Knowledge ◽ Computational Tools ◽ Approved Drugs

The onset of the 2019 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic necessitated the identification of approved drugs to treat the disease, before the development, approval and widespread administration of suitable vaccines. To identify such a drug, we used a visual analytics workflow where computational tools applied over an AI-enhanced biomedical knowledge graph were combined with human expertise. The workflow comprised rapid augmentation of knowledge graph information from recent literature using machine learning (ML) based extraction, with human-guided iterative queries of the graph. Using this workflow, we identified the rheumatoid arthritis drug baricitinib as both an antiviral and anti-inflammatory therapy. The effectiveness of baricitinib was substantiated by the recent publication of the data from the ACTT-2 randomised Phase 3 trial, followed by emergency approval for use by the FDA, and a report from the CoV-BARRIER trial confirming significant reductions in mortality with baricitinib compared to standard of care. Such methods that iteratively combine computational tools with human expertise hold promise for the identification of treatments for rare and neglected diseases and, beyond drug repurposing, in areas of biological research where relevant data may be lacking or hidden in the mass of available biomedical literature.

Re-curation and Rational Enrichment of Knowledge Graphs in Biological Expression Language

10.1101/536409 ◽ 2019

Author(s): Charles Tapley Hoyt ◽ Daniel Domingo-Fernández ◽ Rana Aldisi ◽ Lingling Xu ◽ Kristian Kolpeja ◽ ...

Keyword(s): Text Mining ◽ Full Text ◽ Biomedical Literature ◽ Knowledge Graph ◽ Pubmed Central ◽ Link Type ◽ Information Density ◽ Manual Curation ◽ Rapid Accumulation ◽ Knowledge Graphs

Abstract: The rapid accumulation of new biomedical literature not only causes curated knowledge graphs to become outdated and incomplete, but also makes manual curation an impractical and unsustainable solution. Automated or semi-automated workflows are necessary to assist in prioritizing and curating the literature to update and enrich knowledge graphs. We have developed two workflows: one for re-curating a given knowledge graph to assure its syntactic and semantic quality and another for rationally enriching it by manually revising automatically extracted relations for nodes with low information density. We applied these workflows to the knowledge graphs encoded in Biological Expression Language from the NeuroMMSig database using content that was pre-extracted from MEDLINE abstracts and PubMed Central full text articles using text mining output integrated by INDRA. We have made this workflow freely available at https://github.com/bel-enrichment/bel-enrichment. Database URL: https://github.com/bel-enrichment/results
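
The enrichment workflow targets nodes with low information density, which the abstract does not define. As a rough sketch only, the Python below assumes that information density can be approximated by node degree (the number of relations a node takes part in) and ranks nodes at or below a hypothetical threshold as candidates for curation. The function name, the threshold, and the edge format are illustrative assumptions and are not taken from the bel-enrichment code.

```python
from collections import Counter


def low_density_nodes(edges, threshold=2):
    """Return nodes whose degree is at most `threshold`, sparsest first.

    `edges` is an iterable of (source, relation, target) triples.
    Degree is used here as an assumed stand-in for information density.
    """
    degree = Counter()
    for source, _relation, target in edges:
        degree[source] += 1
        degree[target] += 1
    candidates = [node for node, count in degree.items() if count <= threshold]
    # Sort by degree, then by name so the order is deterministic.
    return sorted(candidates, key=lambda node: (degree[node], node))


# Placeholder graph, not drawn from NeuroMMSig.
example_edges = [
    ("A", "increases", "B"),
    ("A", "decreases", "C"),
    ("A", "association", "D"),
    ("B", "increases", "C"),
]
to_enrich = low_density_nodes(example_edges)
```

In this placeholder graph, node `A` has three relations and is skipped, while `D`, `B`, and `C` are returned in order of increasing degree as the nodes to prioritize.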

Construction of Depression Knowledge Graph Based on Biomedical Literature

10.1109/bibm52615.2021.9669447 ◽ 2021

Author(s): Zepeng Li ◽ Yufeng Zhang ◽ Rikui Huang ◽ Zhenwen Zhang ◽ Jianghong Zhu ◽ ...

Keyword(s): Biomedical Literature ◽ Knowledge Graph

Research on Application of Chinese Natural Language Processing in Constructing Knowledge Graph of Chronic Diseases

2021 International Conference on Communications, Information System and Computer Engineering (CISCE) ◽ 10.1109/cisce52179.2021.9445976 ◽ 2021

Author(s): Shuangling Qin ◽ Chaozhi Xu ◽ Fang Zhang ◽ Tao Jiang ◽ Wei Ge ◽ ...

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Chronic Diseases ◽ Language Processing ◽ Knowledge Graph ◽ Research On Application ◽ Constructing Knowledge

Strategies for Assembling the Biodiversity Knowledge Graph

Biodiversity Information Science and Standards ◽ 10.3897/biss.4.59126 ◽ 2020 ◽ Vol 4

Author(s): Roderic Page

Keyword(s): Life Sciences ◽ Biomedical Literature ◽ Knowledge Graph ◽ Supporting Evidence ◽ Domain Specific ◽ Biodiversity Knowledge ◽ Comprehensive Knowledge ◽ International Image ◽ Knowledge Graphs ◽ Semantic Publishing

This talk explores different strategies for assembling the “biodiversity knowledge graph” (Page 2016). The first is a centralised, crowd-sourced approach using Wikidata as the foundation. Wikidata is becoming increasingly attractive as a knowledge graph for the life sciences (Waagmeester et al. 2020), and I will discuss some of its strengths and limitations, particularly as a source of bibliographic and taxonomic information. For example, Wikidata’s handling of taxonomy is somewhat problematic given the lack of clear separation of taxa and their names. A second approach is to build biodiversity knowledge graphs from scratch, such as OpenBioDiv (Penev et al. 2019) and my own Ozymandias (Page 2019). These approaches use either generalised vocabularies such as schema.org, or domain specific ones such as TaxPub (Catapano 2010) and the Semantic Publishing and Referencing Ontologies (SPAR) (Peroni and Shotton 2018), and to date tend to have restricted focus, whether geographic (e.g., Australian animals in Ozymandias) or temporal (recent taxonomic literature, OpenBioDiv). A growing number of data sources are now using schema.org to describe their data, including ORCID and Zenodo, and efforts to extend schema.org into biology (Bioschemas) suggest we may soon be able to build comprehensive knowledge graphs using just schema.org and its derivatives. A third approach is not to build an entire knowledge graph, but instead focus on constructing small pieces of the graph tightly linked to supporting evidence, for example via annotations. Annotations are increasingly used to mark up both the biomedical literature (e.g., Kim et al. 2015, Venkatesan et al. 2017) and the biodiversity literature (Batista-Navarro et al. 2017). 
One could argue that taxonomic databases are essentially lists of annotations (“this name appears in this publication on this page”), which suggests we could link literature projects such as the Biodiversity Heritage Library (BHL) to taxonomic databases via annotations. Given that the International Image Interoperability Framework (IIIF) provides a framework for treating publications themselves as a set of annotations (e.g., page images) upon which other annotations can be added (Zundert 2018), this suggests ways that knowledge graphs could lead directly to visualising the links between taxonomy and the taxonomic literature. All three approaches will be discussed, accompanied by working examples.

COVID-KOP: Integrating Emerging COVID-19 Data with the ROBOKOP Database

10.26434/chemrxiv.12462623.v1 ◽ 2020 ◽ Cited By ~ 1

Author(s): Daniel Korn ◽ Tesia Bobrowski ◽ Michael Li ◽ Yaphet Kebede ◽ Patrick Wang ◽ ...

Keyword(s): Biomedical Literature ◽ Knowledge Graph ◽ Biomedical Knowledge ◽ Drug Candidates ◽ Clinical Drug

In response to the COVID-19 pandemic, we established COVID-KOP, a new knowledgebase integrating the existing ROBOKOP biomedical knowledge graph with information from recent biomedical literature on COVID-19 annotated in the CORD-19 collection. COVID-KOP can be used effectively to test new hypotheses concerning repurposing of known drugs and clinical drug candidates against COVID-19. COVID-KOP is freely accessible at https://covidkop.renci.org/. For code and instructions for the original ROBOKOP, see: https://github.com/NCATS-Gamma/robokop.
