Knowledge Graph Construction Framework in the Securities Domain Based on FinBERT-CRF Named Entity Recognition Model

Knowledge Graph has gradually become one of core drivers advancing the Internet and AI in recent years, while there is currently no normal knowledge graph in the field of agriculture. Named Entity Recognition (NER), one important step in constructing knowledge graphs, has become a hot topic in both academia and industry. With the help of the Bidirectional Long Short-Term Memory Network (Bi-LSTM) and Conditional Random Field (CRF) model, we introduce a method of ensemble learning, and implement a named entity recognition model ELER. Our model achieves good results for the CoNLL2003 data set, the accuracy and F1 value in the best experimental results are respectively improved by 1.37% and 0.7% when compared with the BiLSTM-CRF model. In addition, our model achieves an F1 score of 91% for the agricultural data set AgriNER2018, which proves the validity of ELER model for small agriculture sample data sets and lays a foundation for the construction of agricultural knowledge graphs.

Download Full-text

Multilingual Named Entity Recognition Model for Indonesian Health Insurance Question Answering System

2020 3rd International Conference on Information and Communications Technology (ICOIACT) ◽

10.1109/icoiact50329.2020.9332027 ◽

2020 ◽

Author(s):

Budi Sulistiyo Jati ◽

ST Widyawan ◽

S.T. Muhammad Nur Rizal

Keyword(s):

Health Insurance ◽

Question Answering ◽

Named Entity Recognition ◽

Entity Recognition ◽

Question Answering System ◽

Recognition Model ◽

Named Entity

Download Full-text

CyberEyes: Cybersecurity Entity Recognition Model Based on Graph Convolutional Network

The Computer Journal ◽

10.1093/comjnl/bxaa141 ◽

2020 ◽

Author(s):

Yong Fang ◽

Yuchi Zhang ◽

Cheng Huang

Keyword(s):

Complex Structure ◽

Named Entity Recognition ◽

Internet Technology ◽

Entity Recognition ◽

Local Context ◽

Knowledge Graph ◽

Convolutional Network ◽

Evaluation Standard ◽

Recognition Model ◽

Non Local

Abstract Cybersecurity has gradually become the public focus between common people and countries with the high development of Internet technology in daily life. The cybersecurity knowledge analysis methods have achieved high evolution with the help of knowledge graph technology, especially a lot of threat intelligence information could be extracted with fine granularity. But named entity recognition (NER) is the primary task for constructing security knowledge graph. Traditional NER models are difficult to determine entities that have a complex structure in the field of cybersecurity, and it is difficult to capture non-local and non-sequential dependencies. In this paper, we propose a cybersecurity entity recognition model CyberEyes that uses non-local dependencies extracted by graph convolutional neural networks. The model can capture both local context and graph-level non-local dependencies. In the evaluation experiments, our model reached an F1 score of 90.28% on the cybersecurity corpus under the gold evaluation standard for NER, which performed better than the 86.49% obtained by the classic CNN-BiLSTM-CRF model.

Download Full-text

Building the Knowledge Graph for Zakat (KGZ) in Indonesian Language

ASM Science Journal ◽

10.32802/asmscj.2021.758 ◽

2021 ◽

Vol 16 ◽

pp. 1-10

Author(s):

Husni Teja Sukmana ◽

JM Muslimin ◽

Asep Fajar Firmansyah ◽

Lee Kyung Oh

Keyword(s):

Named Entity Recognition ◽

General Purpose ◽

Entity Recognition ◽

Basic Knowledge ◽

Knowledge Graph ◽

Specific Domain ◽

Named Entity ◽

Description Framework ◽

General Source ◽

Domain Information

In Indonesia, philanthropy is identical to Zakat. Zakat belongs to a specific domain because it has its characteristics of knowledge. This research studied knowledge graph in the Zakat domain called KGZ which is conducted in Indonesia. This area is still rarely performed, thus it becomes the first knowledge graph for Zakat in Indonesia. It is designed to provide basic knowledge on Zakat and managing the Zakat in Indonesia. There are some issues with building KGZ, firstly, the existing Indonesian named entity recognition (NER) is non-restricted and general-purpose based which data is obtained from a general source like news. Second, there is no dataset for NER in the Zakat domain. We define four steps to build KGZ, involving data acquisition, extracting entities and their relationship, mapping to ontology, and deploying knowledge graphs and visualizations. This research contributed a knowledge graph for Zakat (KGZ) and a building NER model for Zakat, called KGZ-NER. We defined 17 new named entity classes related to Zakat with 272 entities, 169 relationships and provided labelled datasets for KGZ-NER that are publicly accessible. We applied the Indonesian-Open Domain Information Extractor framework to process identifying entities’ relationships. Then designed modeling of information using resources description framework (RDF) to build the knowledge base for KGZ and store it to GraphDB, a product from Ontotext. This NER model has a precision 0.7641, recall 0.4544, and F1-score 0.5655. The increasing data size of KGZ is required to discover all of the knowledge of Zakat and managing Zakat in Indonesia. Moreover, sufficient resources are required in future works.

Download Full-text

Deep Learning-Based Named Entity Recognition and Knowledge Graph Construction for Geological Hazards

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9010015 ◽

2019 ◽

Vol 9 (1) ◽

pp. 15 ◽

Cited By ~ 2

Author(s):

Runyu Fan ◽

Lizhe Wang ◽

Jining Yan ◽

Weijing Song ◽

Yingqian Zhu ◽

...

Keyword(s):

Deep Learning ◽

Large Scale ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Entity Recognition ◽

Knowledge Graph ◽

Geological Hazard ◽

Geological Hazards ◽

Named Entity ◽

Corpus Construction

Constructing a knowledge graph of geological hazards literature can facilitate the reuse of geological hazards literature and provide a reference for geological hazard governance. Named entity recognition (NER), as a core technology for constructing a geological hazard knowledge graph, has to face the challenges that named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context. This can introduce difficulties in designing practical features during the NER classification. To address the above problem, this paper proposes a deep learning-based NER model; namely, the deep, multi-branch BiGRU-CRF model, which combines a multi-branch bidirectional gated recurrent unit (BiGRU) layer and a conditional random field (CRF) model. In an end-to-end and supervised process, the proposed model automatically learns and transforms features by a multi-branch bidirectional GRU layer and enhances the output with a CRF layer. Besides the deep, multi-branch BiGRU-CRF model, we also proposed a pattern-based corpus construction method to construct the corpus needed for the deep, multi-branch BiGRU-CRF model. Experimental results indicated the proposed deep, multi-branch BiGRU-CRF model outperformed state-of-the-art models. The proposed deep, multi-branch BiGRU-CRF model constructed a large-scale geological hazard literature knowledge graph containing 34,457 entities nodes and 84,561 relations.

Download Full-text

Clinical Named Entity Recognition from Chinese Electronic Medical Records Based on Deep Learning Pretraining

Journal of Healthcare Engineering ◽

10.1155/2020/8829219 ◽

2020 ◽

Vol 2020 ◽

pp. 1-8

Author(s):

Lejun Gong ◽

Zhifei Zhang ◽

Shiqi Chen

Keyword(s):

Deep Learning ◽

Electronic Medical Records ◽

Medical Records ◽

Named Entity Recognition ◽

Clinical Entity ◽

Fine Tuning ◽

Entity Recognition ◽

Recognition Model ◽

Named Entity ◽

Model Based

Background. Clinical named entity recognition is the basic task of mining electronic medical records text, which are with some challenges containing the language features of Chinese electronic medical records text with many compound entities, serious missing sentence components, and unclear entity boundary. Moreover, the corpus of Chinese electronic medical records is difficult to obtain. Methods. Aiming at these characteristics of Chinese electronic medical records, this study proposed a Chinese clinical entity recognition model based on deep learning pretraining. The model used word embedding from domain corpus and fine-tuning of entity recognition model pretrained by relevant corpus. Then BiLSTM and Transformer are, respectively, used as feature extractors to identify four types of clinical entities including diseases, symptoms, drugs, and operations from the text of Chinese electronic medical records. Results. 75.06% Macro-P, 76.40% Macro-R, and 75.72% Macro-F1 aiming at test dataset could be achieved. These experiments show that the Chinese clinical entity recognition model based on deep learning pretraining can effectively improve the recognition effect. Conclusions. These experiments show that the proposed Chinese clinical entity recognition model based on deep learning pretraining can effectively improve the recognition performance.

Download Full-text

SatelliteNER: An Effective Named Entity Recognition Model for the Satellite Domain

Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management ◽

10.5220/0010147401000107 ◽

2020 ◽

Author(s):

Omid Jafari ◽

Parth Nagarkar ◽

Bhagwan Thatte ◽

Carl Ingram

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Recognition Model ◽

Named Entity

Download Full-text

Design and Evaluation of a Prescription Drug Monitoring Program for Chinese Patent Medicine based on Knowledge Graph

Evidence-based Complementary and Alternative Medicine ◽

10.1155/2021/9970063 ◽

2021 ◽

Vol 2021 ◽

pp. 1-8

Author(s):

Wangping Xiong ◽

Jun Cao ◽

Xian Zhou ◽

Jianqiang Du ◽

Bin Nie ◽

...

Keyword(s):

Prescription Drug ◽

Drug Monitoring ◽

Named Entity Recognition ◽

Monitoring Program ◽

Entity Recognition ◽

Knowledge Graph ◽

Named Entity ◽

Patent Medicines ◽

Prescription Drug Monitoring Program ◽

Chinese Patent

Background. Chinese patent medicines are increasingly used clinically, and the prescription drug monitoring program is an effective tool to promote drug safety and maintain health. Methods. We constructed a prescription drug monitoring program for Chinese patent medicines based on knowledge graphs. First, we extracted the key information of Chinese patent medicines, diseases, and symptoms from the domain-specific corpus by the information extraction. Second, based on the extracted entities and relationships, a knowledge graph was constructed to form a rule base for the monitoring of data. Then, the named entity recognition model extracted the key information from the electronic medical record to be monitored and matched the knowledge graph to realize the monitoring of the Chinese patent medicines in the prescription. Results. Named entity recognition based on the pretrained model achieved an F1 value of 83.3% on the Chinese patent medicines dataset. On the basis of entity recognition technology and knowledge graph, we implemented a prescription drug monitoring program for Chinese patent medicines. The accuracy rate of combined medication monitoring of three or more drugs of the program increased from 68% to 86.4%. The accuracy rate of drug control monitoring increased from 70% to 97%. The response time for conflicting prescriptions with two drugs was shortened from 1.3S to 0.8S. The response time for conflicting prescriptions with three or more drugs was shortened from 5.2S to 1.4S. Conclusions. The program constructed in this study can respond quickly and improve the efficiency of monitoring prescriptions. It is of great significance to ensure the safety of patients’ medication.

Download Full-text

DeNERT-KG: Named Entity and Relation Extraction Model Using DQN, Knowledge Graph, and BERT

Applied Sciences ◽

10.3390/app10186429 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6429

Author(s):

SungMin Yang ◽

SoYeop Yoo ◽

OkRan Jeong

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Language Model ◽

Named Entity Recognition ◽

Relation Extraction ◽

Entity Recognition ◽

Knowledge Graph ◽

Named Entity ◽

Artificial Intelligence Technology

Along with studies on artificial intelligence technology, research is also being carried out actively in the field of natural language processing to understand and process people’s language, in other words, natural language. For computers to learn on their own, the skill of understanding natural language is very important. There are a wide variety of tasks involved in the field of natural language processing, but we would like to focus on the named entity registration and relation extraction task, which is considered to be the most important in understanding sentences. We propose DeNERT-KG, a model that can extract subject, object, and relationships, to grasp the meaning inherent in a sentence. Based on the BERT language model and Deep Q-Network, the named entity recognition (NER) model for extracting subject and object is established, and a knowledge graph is applied for relation extraction. Using the DeNERT-KG model, it is possible to extract the subject, type of subject, object, type of object, and relationship from a sentence, and verify this model through experiments.

Download Full-text

Named Entity Recognition Model of Chinese Clinical Electronic Medical Record Based on XLNet-BiLSTM

10.21203/rs.3.rs-218833/v1 ◽

2021 ◽

Author(s):

Shen Zhou Feng ◽

Su Qian Min ◽

Guo Jing Lei

Keyword(s):

Medical Record ◽

Electronic Medical Record ◽

Named Entity Recognition ◽

Entity Recognition ◽

Vector Model ◽

Global Maximum ◽

Data Set ◽

Recognition Model ◽

Named Entity ◽

Feature Sequence

Abstract The recognition of named entities in Chinese clinical electronic medical records is one of the basic tasks to realize smart medical care. Aiming at the insufficient text semantic representation of the traditional word vector model and the inability of the recurrent neural network (RNN) model to solve the problems of long-term dependence, a Chinese clinical electronic medical record named entity recognition model XLNet-BiLSTM-MHA-CRF based on XLNet is proposed. Use the XLNet pre-training language model as the embedding layer to vectorize the medical record text to solve the problem of ambiguity; use the bidirectional long and short-term memory network (BiLSTM) gate control unit to obtain the forward and backward semantic feature information of the sentence; Then input the feature sequence to the multi-head attention layer (multi-head attention, MHA), use MHA to obtain information represented by different subspaces of the feature sequence, enhance the relevance of context semantics and eliminate noise; finally, input the conditional random field CRF to identify the global maximum 优 sequence. The experimental results show that the XLNet-BiLSTM-Attention-CRF model has achieved good results on the CCKS-2017 named entity recognition data set.

Download Full-text