Named Entity Recognition for Terahertz Domain Knowledge Graph based on Albert-BiLSTM-CRF

Author(s):  
Xiao Zhang ◽  
Chuanzhen Li ◽  
Huaichang Du
2021 ◽  
Vol 16 ◽  
pp. 1-10
Author(s):  
Husni Teja Sukmana ◽  
JM Muslimin ◽  
Asep Fajar Firmansyah ◽  
Lee Kyung Oh

In Indonesia, philanthropy is identical to Zakat. Zakat belongs to a specific domain because it has its characteristics of knowledge. This research studied knowledge graph in the Zakat domain called KGZ which is conducted in Indonesia. This area is still rarely performed, thus it becomes the first knowledge graph for Zakat in Indonesia. It is designed to provide basic knowledge on Zakat and managing the Zakat in Indonesia. There are some issues with building KGZ, firstly, the existing Indonesian named entity recognition (NER) is non-restricted and general-purpose based which data is obtained from a general source like news. Second, there is no dataset for NER in the Zakat domain. We define four steps to build KGZ, involving data acquisition, extracting entities and their relationship, mapping to ontology, and deploying knowledge graphs and visualizations. This research contributed a knowledge graph for Zakat (KGZ) and a building NER model for Zakat, called KGZ-NER. We defined 17 new named entity classes related to Zakat with 272 entities, 169 relationships and provided labelled datasets for KGZ-NER that are publicly accessible. We applied the Indonesian-Open Domain Information Extractor framework to process identifying entities’ relationships. Then designed modeling of information using resources description framework (RDF) to build the knowledge base for KGZ and store it to GraphDB, a product from Ontotext. This NER model has a precision 0.7641, recall 0.4544, and F1-score 0.5655. The increasing data size of KGZ is required to discover all of the knowledge of Zakat and managing Zakat in Indonesia. Moreover, sufficient resources are required in future works.


2019 ◽  
Vol 9 (1) ◽  
pp. 15 ◽  
Author(s):  
Runyu Fan ◽  
Lizhe Wang ◽  
Jining Yan ◽  
Weijing Song ◽  
Yingqian Zhu ◽  
...  

Constructing a knowledge graph of geological hazards literature can facilitate the reuse of geological hazards literature and provide a reference for geological hazard governance. Named entity recognition (NER), as a core technology for constructing a geological hazard knowledge graph, has to face the challenges that named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context. This can introduce difficulties in designing practical features during the NER classification. To address the above problem, this paper proposes a deep learning-based NER model; namely, the deep, multi-branch BiGRU-CRF model, which combines a multi-branch bidirectional gated recurrent unit (BiGRU) layer and a conditional random field (CRF) model. In an end-to-end and supervised process, the proposed model automatically learns and transforms features by a multi-branch bidirectional GRU layer and enhances the output with a CRF layer. Besides the deep, multi-branch BiGRU-CRF model, we also proposed a pattern-based corpus construction method to construct the corpus needed for the deep, multi-branch BiGRU-CRF model. Experimental results indicated the proposed deep, multi-branch BiGRU-CRF model outperformed state-of-the-art models. The proposed deep, multi-branch BiGRU-CRF model constructed a large-scale geological hazard literature knowledge graph containing 34,457 entities nodes and 84,561 relations.


2019 ◽  
Vol 11 (8) ◽  
pp. 180
Author(s):  
Fei Liao ◽  
Liangli Ma ◽  
Jingjing Pei ◽  
Linshan Tan

Military named entity recognition (MNER) is one of the key technologies in military information extraction. Traditional methods for the MNER task rely on cumbersome feature engineering and specialized domain knowledge. In order to solve this problem, we propose a method employing a bidirectional long short-term memory (BiLSTM) neural network with a self-attention mechanism to identify the military entities automatically. We obtain distributed vector representations of the military corpus by unsupervised learning and the BiLSTM model combined with the self-attention mechanism is adopted to capture contextual information fully carried by the character vector sequence. The experimental results show that the self-attention mechanism can improve effectively the performance of MNER task. The F-score of the military documents and network military texts identification was 90.15% and 89.34%, respectively, which was better than other models.


2015 ◽  
Vol 7 (S1) ◽  
Author(s):  
Tsendsuren Munkhdalai ◽  
Meijing Li ◽  
Khuyagbaatar Batsuren ◽  
Hyeon Ah Park ◽  
Nak Hyeon Choi ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Wangping Xiong ◽  
Jun Cao ◽  
Xian Zhou ◽  
Jianqiang Du ◽  
Bin Nie ◽  
...  

Background. Chinese patent medicines are increasingly used clinically, and the prescription drug monitoring program is an effective tool to promote drug safety and maintain health. Methods. We constructed a prescription drug monitoring program for Chinese patent medicines based on knowledge graphs. First, we extracted the key information of Chinese patent medicines, diseases, and symptoms from the domain-specific corpus by the information extraction. Second, based on the extracted entities and relationships, a knowledge graph was constructed to form a rule base for the monitoring of data. Then, the named entity recognition model extracted the key information from the electronic medical record to be monitored and matched the knowledge graph to realize the monitoring of the Chinese patent medicines in the prescription. Results. Named entity recognition based on the pretrained model achieved an F1 value of 83.3% on the Chinese patent medicines dataset. On the basis of entity recognition technology and knowledge graph, we implemented a prescription drug monitoring program for Chinese patent medicines. The accuracy rate of combined medication monitoring of three or more drugs of the program increased from 68% to 86.4%. The accuracy rate of drug control monitoring increased from 70% to 97%. The response time for conflicting prescriptions with two drugs was shortened from 1.3S to 0.8S. The response time for conflicting prescriptions with three or more drugs was shortened from 5.2S to 1.4S. Conclusions. The program constructed in this study can respond quickly and improve the efficiency of monitoring prescriptions. It is of great significance to ensure the safety of patients’ medication.


2012 ◽  
Vol 3 (1) ◽  
pp. 55-71 ◽  
Author(s):  
O. Isaac Osesina ◽  
John Talburt

Over the past decade, huge volumes of valuable information have become available to organizations. However, the existence of a substantial part of the information in unstructured form makes the automated extraction of business intelligence and decision support information from it difficult. By identifying the entities and their roles within unstructured text in a process known as semantic named entity recognition, unstructured text can be made more readily available for traditional business processes. The authors present a novel NER approach that is independent of the text language and subject domain making it applicable within different organizations. It departs from the natural language and machine learning methods in that it leverages the wide availability of huge amounts of data as well as high-performance computing to provide a data-intensive solution. Also, it does not rely on external resources such as dictionaries and gazettes for the language or domain knowledge.


2020 ◽  
Vol 10 (18) ◽  
pp. 6429
Author(s):  
SungMin Yang ◽  
SoYeop Yoo ◽  
OkRan Jeong

Along with studies on artificial intelligence technology, research is also being carried out actively in the field of natural language processing to understand and process people’s language, in other words, natural language. For computers to learn on their own, the skill of understanding natural language is very important. There are a wide variety of tasks involved in the field of natural language processing, but we would like to focus on the named entity registration and relation extraction task, which is considered to be the most important in understanding sentences. We propose DeNERT-KG, a model that can extract subject, object, and relationships, to grasp the meaning inherent in a sentence. Based on the BERT language model and Deep Q-Network, the named entity recognition (NER) model for extracting subject and object is established, and a knowledge graph is applied for relation extraction. Using the DeNERT-KG model, it is possible to extract the subject, type of subject, object, type of object, and relationship from a sentence, and verify this model through experiments.


Sign in / Sign up

Export Citation Format

Share Document