KnowlyBERT - Hybrid Query Answering over Language Models and Knowledge Graphs

Author(s):  
Jan-Christoph Kalo ◽  
Leandra Fichtel ◽  
Philipp Ehler ◽  
Wolf-Tilo Balke
2021 ◽  
Vol 21 (S9) ◽  
Author(s):  
Yinyu Lan ◽  
Shizhu He ◽  
Kang Liu ◽  
Xiangrong Zeng ◽  
Shengping Liu ◽  
...  

Abstract
Background: Knowledge graphs (KGs), especially medical knowledge graphs, are often significantly incomplete, which creates a demand for medical knowledge graph completion (MedKGC). MedKGC infers new facts from the existing knowledge in a KG. Path-based knowledge reasoning is one of the most important approaches to this task and has received great attention in recent years because of its high performance and interpretability. Traditional methods such as the path ranking algorithm treat the paths between an entity pair as atomic features. However, medical KGs are very sparse, which makes it difficult to learn effective semantic representations for extremely sparse path features. The sparsity of medical KGs is mainly reflected in the long-tailed distribution of entities and paths. Previous methods consider only the contextual structure of paths in the knowledge graph and ignore the textual semantics of the symbols along the path. Their performance therefore cannot be further improved because of entity sparseness and path sparseness.
Methods: To address these issues, this paper proposes two novel path-based reasoning methods that tackle the entity and path sparsity problems respectively by exploiting the textual semantic information of entities and paths for MedKGC. Using the pre-trained model BERT to combine the textual semantic representations of entities and relations, we cast symbolic reasoning in the medical KG as a numerical computation over textual semantic representations.
Results: Experimental results on a publicly available, authoritative Chinese symptom knowledge graph show that the proposed method is significantly better than state-of-the-art path-based knowledge graph reasoning methods, improving average performance by 5.83% across all relations.
Conclusions: We propose two new knowledge graph reasoning algorithms that adopt the textual semantic information of entities and paths and effectively alleviate the entity and path sparsity problems in MedKGC. To the best of our knowledge, this is the first method to use pre-trained language models and textual path representations for medical knowledge reasoning. Our method can complete the incomplete symptom knowledge graph in an interpretable way, and it outperforms state-of-the-art path-based reasoning methods.
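The core idea of the Methods paragraph above (verbalizing a relation path and comparing it with a candidate relation in BERT's embedding space) can be sketched as follows. This is not the authors' implementation; the model name, the path verbalization, mean pooling, and cosine scoring are all illustrative assumptions.

```python
# Hedged sketch: scoring a textual relation path against a candidate relation
# with a pre-trained BERT encoder. Model choice, verbalization, and cosine
# scoring are illustrative assumptions, not the paper's exact method.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

def embed(text: str) -> torch.Tensor:
    """Mean-pool BERT's last hidden states as a simple text embedding."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # (768,)

def score_path(path_tokens: list, candidate_relation: str) -> float:
    """Verbalize an entity/relation path as text and compare it with a candidate relation."""
    path_text = " ".join(path_tokens)
    return torch.cosine_similarity(embed(path_text), embed(candidate_relation), dim=0).item()

# Hypothetical symptom-KG path: does it support "influenza has symptom fever"?
print(score_path(["influenza", "related disease", "cold", "has symptom", "fever"],
                 "influenza has symptom fever"))
```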


2021 ◽  
Author(s):  
Michihiro Yasunaga ◽  
Hongyu Ren ◽  
Antoine Bosselut ◽  
Percy Liang ◽  
Jure Leskovec

2021 ◽  
Author(s):  
Alessandro Oltramari ◽  
Jonathan Francis ◽  
Filip Ilievski ◽  
Kaixin Ma ◽  
Roshanak Mirzaee

This chapter illustrates how suitable neuro-symbolic models for language understanding can enable domain generalizability and robustness in downstream tasks. Different methods for integrating neural language models and knowledge graphs are discussed, and the situations in which this combination is most appropriate are characterized through quantitative evaluation and qualitative error analysis on a variety of commonsense question answering benchmark datasets.


2021 ◽  
Vol 14 (6) ◽  
pp. 943-956
Author(s):  
Efthymia Tsamoura ◽  
David Carral ◽  
Enrico Malizia ◽  
Jacopo Urbani

The chase is a well-established family of algorithms used to materialize Knowledge Bases (KBs) for tasks like query answering under dependencies or data cleaning. A general problem of chase algorithms is that they might perform redundant computations. To counter this problem, we introduce the notion of Trigger Graphs (TGs), which guide the execution of the rules, avoiding redundant computations. We present the results of an extensive theoretical and empirical study that seeks to answer when and how TGs can be computed and what the benefits of TGs are when applied over real-world KBs. Our results include algorithms that compute (minimal) TGs. We implemented our approach in a new engine, called GLog, and our experiments show that it can be significantly more efficient than the chase, enabling us to materialize Knowledge Graphs with 17B facts in less than 40 minutes on a single machine with commodity hardware.
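To make concrete what "redundant computations" means for the chase, here is a minimal, purely illustrative sketch of naive Datalog-style materialization in Python: every fixpoint round re-evaluates the rule over all facts and re-derives conclusions that are already known, which is the kind of wasted work Trigger Graphs are designed to avoid. The rule, data, and representation are assumptions for illustration, not GLog's API.

```python
# Hedged sketch of naive Datalog-style materialization (a restricted chase).
# Rule: ancestor(x, z) <- parent(x, y), ancestor(y, z)
facts = {("parent", "a", "b"), ("parent", "b", "c"), ("ancestor", "b", "c")}

def apply_rule(facts):
    """One round of rule application over the *entire* fact set."""
    derived = set()
    for p, x, y in facts:
        if p != "parent":
            continue
        for q, y2, z in facts:
            if q == "ancestor" and y2 == y:
                derived.add(("ancestor", x, z))
    return derived

# Naive fixpoint: each round re-derives facts that are already present --
# exactly the redundancy that Trigger Graphs aim to eliminate.
while True:
    new = apply_rule(facts) - facts
    if not new:
        break
    facts |= new

print(sorted(facts))  # includes the derived fact ('ancestor', 'a', 'c')
```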


Author(s):  
Tiange Xu ◽  
Fu Zhang

Relation extraction extracts the semantic relation between entity pairs in text, and it is a key step in building Knowledge Graphs and in information extraction. The rapid development of deep learning in recent years has produced a rich body of research on relation extraction. At present, relation extraction based on pre-trained language models such as BERT achieves higher accuracy than methods based on Convolutional or Recurrent Neural Networks. This review summarizes the research progress of pre-trained language models such as BERT in supervised and distantly supervised relation extraction. In addition, comparisons, analyses, and directions for future research are discussed throughout the survey. The survey may help readers understand key techniques in this area and identify future research directions.
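As a concrete illustration of the BERT-based setup this survey covers, the sketch below frames sentence-level relation extraction as sequence classification with entity markers. The model name, marker tokens, and relation inventory are assumptions for illustration, not any specific surveyed system.

```python
# Hedged sketch: relation classification with a BERT encoder and entity markers.
# Model, marker tokens, and label set are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

RELATIONS = ["founded_by", "born_in", "no_relation"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=len(RELATIONS)
)

def classify(sentence: str, head: str, tail: str) -> str:
    """Wrap the entity mentions in marker tokens and predict a relation label."""
    marked = sentence.replace(head, f"[E1] {head} [/E1]").replace(tail, f"[E2] {tail} [/E2]")
    inputs = tokenizer(marked, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits            # (1, num_labels)
    return RELATIONS[int(logits.argmax(dim=-1))]

# The classification head is untrained here, so the output is arbitrary; after
# fine-tuning on labeled pairs this returns the predicted relation.
print(classify("Apple was founded by Steve Jobs.", "Apple", "Steve Jobs"))
```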


2020 ◽  
Vol 34 (03) ◽  
pp. 2925-2933 ◽  
Author(s):  
Chaitanya Malaviya ◽  
Chandra Bhagavatula ◽  
Antoine Bosselut ◽  
Yejin Choi

Automatic KB completion for commonsense knowledge graphs (e.g., ATOMIC and ConceptNet) poses unique challenges compared to the much-studied conventional knowledge bases (e.g., Freebase). Commonsense knowledge graphs use free-form text to represent nodes, resulting in orders of magnitude more nodes than conventional KBs (~18x more nodes in ATOMIC than in Freebase (FB15K-237)). Importantly, this implies significantly sparser graph structures, a major challenge for existing KB completion methods that assume densely connected graphs over a relatively smaller set of nodes. In this paper, we present novel KB completion models that address these challenges by exploiting the structural and semantic context of nodes. Specifically, we investigate two key ideas: (1) learning from local graph structure, using graph convolutional networks and automatic graph densification, and (2) transfer learning from pre-trained language models to knowledge graphs for enhanced contextual representation of knowledge. We describe our method to incorporate information from both these sources in a joint model and provide the first empirical results for KB completion on ATOMIC and evaluation with ranking metrics on ConceptNet. Our results demonstrate the effectiveness of language model representations in boosting link prediction performance and the advantages of learning from local graph structure (+1.5 points in MRR for ConceptNet) when training on subgraphs for computational efficiency. Further analysis of model predictions sheds light on the types of commonsense knowledge that language models capture well.
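A minimal sketch of the general recipe described above: fuse a node representation obtained from local graph structure (one GCN-style propagation step) with a text embedding of the node's free-form phrase, then score triples with a DistMult-style decoder. The dimensions, fusion layer, and decoder are illustrative assumptions, and random tensors stand in for real BERT embeddings; this is not the paper's exact architecture.

```python
# Hedged sketch: combining local graph structure with text embeddings for
# link prediction. All shapes and components are illustrative assumptions.
import torch
import torch.nn as nn

num_nodes, text_dim, graph_dim, num_rels = 5, 16, 16, 3

adj = torch.eye(num_nodes)                     # placeholder adjacency (self-loops only)
text_emb = torch.randn(num_nodes, text_dim)    # stand-in for BERT phrase embeddings
node_feat = torch.randn(num_nodes, graph_dim)  # initial node features

gcn_weight = nn.Linear(graph_dim, graph_dim, bias=False)
fuse = nn.Linear(graph_dim + text_dim, graph_dim)
rel_emb = nn.Embedding(num_rels, graph_dim)

def node_repr() -> torch.Tensor:
    """One GCN propagation step, then concatenate with the text embedding."""
    graph_repr = torch.relu(gcn_weight(adj @ node_feat))    # (num_nodes, graph_dim)
    return fuse(torch.cat([graph_repr, text_emb], dim=-1))  # (num_nodes, graph_dim)

def score(head: int, rel: int, tail: int) -> float:
    """DistMult-style triple score: higher means more plausible."""
    h = node_repr()
    return float((h[head] * rel_emb(torch.tensor(rel)) * h[tail]).sum())

print(score(0, 1, 2))
```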


Author(s):  
Ruud van Bakel ◽  
Teodor Aleksiev ◽  
Daniel Daza ◽  
Dimitrios Alivanistos ◽  
Michael Cochez

Abstract
Large, heterogeneous datasets are characterized by missing or even erroneous information. This is more evident when they are the product of community effort or of automatic fact extraction methods from external sources, such as text. A special case of this phenomenon can be seen in knowledge graphs, where it mostly appears in the form of missing or incorrect edges and nodes.
Structured querying on such incomplete graphs will result in incomplete sets of answers, even if the correct entities exist in the graph, since one or more edges needed to match the pattern are missing. To overcome this problem, several algorithms for approximate structured query answering have been proposed. Inspired by modern Information Retrieval metrics, these algorithms produce a ranking of all entities in the graph, and their performance is evaluated based on how high in this ranking the correct answers appear.
In this work we take a critical look at this way of evaluation. We argue that a ranking-based evaluation is not sufficient to assess methods for complex query answering. To address this, we introduce Message Passing Query Boxes (MPQB), which brings binary classification metrics back into use, and we show the effect this has on the recently proposed query embedding method MPQE.
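The contrast drawn above between ranking-based and binary-classification-based evaluation can be illustrated with a small sketch: given per-entity scores for one query, ranking metrics (MRR, Hits@k) reward only the positions of correct answers, while classification metrics require committing to a decision threshold. The scores, answer set, and threshold below are made-up values for illustration.

```python
# Hedged sketch: ranking metrics vs. binary classification metrics for one query.

def ranking_metrics(scores: dict, answers: set, k: int = 3):
    """MRR and Hits@k over the ranking induced by the scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    ranks = [ranked.index(a) + 1 for a in answers]
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits_at_k = sum(r <= k for r in ranks) / len(ranks)
    return mrr, hits_at_k

def classification_metrics(scores: dict, answers: set, threshold: float):
    """Precision, recall, and F1 after thresholding the scores."""
    predicted = {e for e, s in scores.items() if s >= threshold}
    tp = len(predicted & answers)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(answers)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

scores = {"e1": 0.9, "e2": 0.8, "e3": 0.4, "e4": 0.1}  # model scores for one query
answers = {"e1", "e3"}                                  # ground-truth answer set

print(ranking_metrics(scores, answers))                       # approx (0.667, 1.0)
print(classification_metrics(scores, answers, threshold=0.5)) # (0.5, 0.5, 0.5)
```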

