NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model

Understanding mathematical topics is important for both educators and students: it helps capture the latent concepts of questions, evaluate study performance, and recommend content in online learning systems. Compared to traditional text classification, mathematical topic classification has several main challenges: (1) mathematical questions are relatively short; (2) the same mathematical concept can be represented in various ways (e.g., calculations and applications); (3) the content of questions is complex, spanning algebra, geometry, and calculus. To overcome these problems, we propose a framework that combines content tokens and mathematical knowledge concepts throughout the whole procedure. We embed entities from mathematics knowledge graphs, integrate entities into tokens in a masked language model, set up semantic similarity-based tasks for next-sentence prediction, and fuse knowledge vectors and token vectors during the fine-tuning procedure. We also build a Chinese mathematical topic prediction dataset consisting of more than 70,000 mathematical questions with topics. Our experiments using real data demonstrate that our knowledge graph-based mathematical topic prediction model outperforms other state-of-the-art methods.

Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training

10.18653/v1/2021.naacl-main.278 ◽ 2021

Author(s): Oshin Agarwal ◽ Heming Ge ◽ Siamak Shakeri ◽ Rami Al-Rfou

Keyword(s): Language Model ◽ Knowledge Graph ◽ Corpus Generation

Knowledge graph mining for realty domain using dependency parsing and QAT models

Procedia Computer Science ◽ 10.1016/j.procs.2021.10.004 ◽ 2021 ◽ Vol 193 ◽ pp. 32-41

Author(s): Alexander Zamiralov ◽ Timur Sohin ◽ Nikolay Butakov

Keyword(s): Graph Mining ◽ Knowledge Graph ◽ Dependency Parsing

AKMiner: Domain-Specific Knowledge Graph Mining from Academic Literatures

Lecture Notes in Computer Science - Web Information Systems Engineering – WISE 2013 ◽ 10.1007/978-3-642-41154-0_18 ◽ 2013 ◽ pp. 241-255 ◽ Cited By ~ 2

Author(s): Shanshan Huang ◽ Xiaojun Wan

Keyword(s): Graph Mining ◽ Knowledge Graph ◽ Specific Knowledge ◽ Domain Specific ◽ Domain Specific Knowledge

DeNERT-KG: Named Entity and Relation Extraction Model Using DQN, Knowledge Graph, and BERT

Applied Sciences ◽ 10.3390/app10186429 ◽ 2020 ◽ Vol 10 (18) ◽ pp. 6429

Author(s): SungMin Yang ◽ SoYeop Yoo ◽ OkRan Jeong

Keyword(s): Natural Language Processing ◽ Natural Language ◽ Language Processing ◽ Language Model ◽ Named Entity Recognition ◽ Relation Extraction ◽ Entity Recognition ◽ Knowledge Graph ◽ Named Entity ◽ Artificial Intelligence Technology

Along with studies on artificial intelligence technology, research is also being actively carried out in the field of natural language processing to understand and process people's language, in other words, natural language. For computers to learn on their own, the ability to understand natural language is very important. The field of natural language processing involves a wide variety of tasks, but we focus on named entity recognition and relation extraction, which are considered the most important tasks in understanding sentences. We propose DeNERT-KG, a model that can extract subjects, objects, and relationships in order to grasp the meaning inherent in a sentence. A named entity recognition (NER) model for extracting the subject and object is built on the BERT language model and a Deep Q-Network, and a knowledge graph is applied for relation extraction. Using the DeNERT-KG model, it is possible to extract the subject, type of subject, object, type of object, and relationship from a sentence, and we verify the model through experiments.

Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding

Data Intelligence ◽ 10.1162/dint_a_00013 ◽ 2019 ◽ Vol 1 (3) ◽ pp. 238-270 ◽ Cited By ~ 3

Author(s): Lei Ji ◽ Yujing Wang ◽ Botian Shi ◽ Dawei Zhang ◽ Zhongyuan Wang ◽ ...

Keyword(s): Graph Mining ◽ Knowledge Graph ◽ Web Pages ◽ Search Query ◽ Short Text ◽ Text Understanding ◽ Concept Space ◽ Semantic Concepts ◽ Table Understanding ◽ The Web

Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and Ads relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 pageviews, 2 million API calls and 3,000 registered downloads from 50,000 visitors across 64 countries.
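
As a rough illustration of what representing text in a concept space can look like, the sketch below maps the instances found in a short text to a distribution over concepts. It is not the paper's conceptualization model and does not call the Microsoft Concept Graph API: the isA counts are made-up toy values, and the names ISA_COUNTS, concept_distribution, and conceptualize are hypothetical. It simply averages P(concept | instance) over the known instances.

```python
from collections import defaultdict

# Toy, made-up isA co-occurrence counts n(instance, concept).
# These are illustrative values only, not Probase data.
ISA_COUNTS = {
    "apple": {"fruit": 60, "company": 40},
    "microsoft": {"company": 95, "brand": 5},
    "banana": {"fruit": 100},
}


def concept_distribution(instance):
    """Return P(concept | instance) estimated from the toy counts."""
    counts = ISA_COUNTS.get(instance, {})
    total = sum(counts.values())
    if total == 0:
        return {}
    return {concept: n / total for concept, n in counts.items()}


def conceptualize(instances):
    """Average P(concept | instance) over the known instances of a short text.

    Assumption: a plain average stands in for a real conceptualization model.
    """
    known = [inst for inst in instances if inst in ISA_COUNTS]
    scores = defaultdict(float)
    for inst in known:
        for concept, p in concept_distribution(inst).items():
            scores[concept] += p / len(known)
    # Sort concepts from most to least likely.
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))


tech = conceptualize(["apple", "microsoft"])
food = conceptualize(["apple", "banana"])
```

With these toy counts, the ambiguous instance "apple" leans toward "company" next to "microsoft" and toward "fruit" next to "banana", which is the kind of context-dependent disambiguation that short text understanding relies on.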

Building Knowledge Graph using Pre-trained Language Model for Learning Entity-aware Relationships

2020 IEEE International Conference on Computing, Power and Communication Technologies (GUCON) ◽ 10.1109/gucon48875.2020.9231227 ◽ 2020

Author(s): Abhijeet Kumar ◽ Abhishek Pandey ◽ Rohit Gadia ◽ Mridul Mishra

Keyword(s): Language Model ◽ Knowledge Graph

Deep Learning-Based Knowledge Graph Generation for COVID-19

Sustainability ◽ 10.3390/su13042276 ◽ 2021 ◽ Vol 13 (4) ◽ pp. 2276

Author(s): Taejin Kim ◽ Yeoil Yun ◽ Namgyu Kim

Keyword(s): Knowledge Base ◽ Language Model ◽ Fine Tuning ◽ Knowledge Graph ◽ Specific Information ◽ Specific Knowledge ◽ Text Documents ◽ Domain Specific ◽ Open Information Extraction ◽ Domain Specific Knowledge

Many attempts have been made to construct new domain-specific knowledge graphs using the existing knowledge bases of various domains. However, traditional "dictionary-based" or "supervised" knowledge graph building methods rely on predefined human-annotated resources of entities and their relationships. The cost of creating human-annotated resources is high in terms of both time and effort. Relying on them therefore prevents rapid adaptation to new knowledge when domain-specific information is added or updated very frequently, as with the recent coronavirus disease-19 (COVID-19) pandemic. Therefore, in this study, we propose an Open Information Extraction (OpenIE) system based on unsupervised learning without a pre-built dataset. The proposed method obtains knowledge from a vast amount of text documents about COVID-19 rather than from a general knowledge base and adds it to the existing knowledge graph. First, we constructed a COVID-19 entity dictionary, and then we scraped a large text dataset related to COVID-19. Next, we constructed a COVID-19 perspective language model by fine-tuning the bidirectional encoder representations from transformers (BERT) pre-trained language model. Finally, we defined a new COVID-19-specific knowledge base by extracting connecting words between COVID-19 entities using the BERT self-attention weights from COVID-19 sentences. Experimental results demonstrated that the proposed Co-BERT model outperforms the original BERT in terms of mask prediction accuracy and metric for evaluation of translation with explicit ordering (METEOR) score.

DOMAIN SPECIFIC KEY FEATURE EXTRACTION USING KNOWLEDGE GRAPH MINING

Multiple Criteria Decision Making ◽ 10.22367/mcdm.2020.15.01 ◽ 2020 ◽ Vol 15 ◽ pp. 1-22

Author(s): Mohit Kumar Barai ◽ Subhasis Sanyal

Keyword(s): Feature Extraction ◽ Language Processing ◽ Graph Mining ◽ Text Processing ◽ Weighted Graph ◽ Knowledge Graph ◽ Specific Data ◽ Domain Specific ◽ Extraction Algorithm ◽ Processing Product

In the field of text mining, many novel feature extraction approaches have been proposed. This paper presents a novel feature extraction algorithm based on weighted graph mining. To ensure both the effectiveness of the feature extraction and computational efficiency, only the most effective graphs, those representing the maximum number of triangles under a predefined relational criterion, are considered. The proposed technique combines the relation between the words surrounding an aspect of the product with the lexicon-based connection among those words, which together create a relational triangle. The maximum number of triangles covering an element is treated as a prime feature. In an analysis on a limited set of domain-specific data, the proposed algorithm performs more than three times better than TF-IDF. Keywords: feature extraction, natural language processing, product review, text processing, knowledge graph.
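
The triangle-based idea can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: the graph, its edge weights, the min_weight threshold (a stand-in for the "predefined relational criterion"), and the function name triangles_per_node are all assumptions made for the example. It counts, for each word, how many triangles it belongs to after low-weight edges are dropped.

```python
from collections import defaultdict
from itertools import combinations


def triangles_per_node(edges, min_weight=0.0):
    """Count the triangles each node takes part in.

    edges: iterable of (u, v, weight) for an undirected graph.
    Edges with weight below min_weight are ignored (assumed criterion).
    """
    adj = defaultdict(set)
    for u, v, w in edges:
        if u != v and w >= min_weight:
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)  # freeze so lookups cannot create new keys

    counts = {node: 0 for node in adj}
    for node, neighbours in adj.items():
        # Two neighbours that are themselves connected close a triangle.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj[a]:
                counts[node] += 1
    return counts


# Toy, made-up word graph around the product aspect "battery".
edges = [
    ("battery", "life", 0.9),
    ("battery", "long", 0.8),
    ("life", "long", 0.7),
    ("battery", "charge", 0.6),
    ("charge", "fast", 0.5),
    ("battery", "fast", 0.5),
    ("screen", "bright", 0.2),  # below the threshold, so dropped
]
counts = triangles_per_node(edges, min_weight=0.4)
top = max(counts, key=counts.get)  # word covered by the most triangles
```

In this toy graph "battery" sits in two triangles and every other retained word in one, so it would be ranked as the strongest feature candidate under this simplified criterion.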
