Domain Specific Entity Recognition with Semantic-based Deep Learning Approach

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Quoc Hung Ngo ◽  
Tahar Kechadi ◽  
Nhien-An Le-Khac
2020 ◽  
Vol 34 (01) ◽  
pp. 598-605
Author(s):  
Chaoran Cheng ◽  
Fei Tan ◽  
Zhi Wei

We consider the problem of Named Entity Recognition (NER) in biomedical scientific literature, and more specifically the recognition of genomic variants. In recent years, significant success has been achieved for NER on canonical tasks, where large data sets are generally available. However, NER remains challenging in many domain-specific areas, especially those where only a small amount of gold-standard annotation can be obtained. In addition, genomic variant entities exhibit considerable linguistic heterogeneity and differ substantially from the entities characterized in existing canonical NER tasks. State-of-the-art machine learning approaches rely heavily on arduous feature engineering to characterize those unique patterns. In this work, we present the first successful end-to-end deep learning approach to bridge the gap between generic NER algorithms and low-resource applications, using genomic variant recognition as the test case. Our proposed model achieves promising performance without any hand-crafted features or post-processing rules. Our extensive experiments and results may shed light on other, similar low-resource NER applications.
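To make the task concrete, the following is a minimal sketch of how token-level NER output is often turned into entity spans. It assumes the common BIO tagging scheme, which the abstract does not specify, and the label name `VARIANT`, the function `bio_to_spans`, and the example sentence are hypothetical illustrations rather than anything taken from the paper.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags into (start, end_exclusive, label) spans.

    Assumption: tags look like "O", "B-X" or "I-X". An "I-X" tag that does
    not continue an open span of the same label is treated leniently as the
    start of a new span.
    """
    spans = []
    start, label = None, None
    # The trailing "O" is a sentinel that closes a span ending at the last token.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues and label is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    return spans


# Made-up example sentence and tags, for illustration only.
tokens = ["The", "c.35delG", "deletion", "and", "p.V600E", "were", "found"]
tags = ["O", "B-VARIANT", "I-VARIANT", "O", "B-VARIANT", "O", "O"]
spans = bio_to_spans(tags)
mentions = [" ".join(tokens[s:e]) for s, e, _ in spans]
```

Here `spans` holds the token offsets of each predicted entity, and `mentions` holds the corresponding surface strings.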


2021 ◽  
Vol 14 (39) ◽  
pp. 2998-3006
Author(s):  
Birhanu Gardie ◽  
Smegnew Asemie ◽  
Kassahun Azezew

2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Tatdow Pansombut ◽  
Siripen Wikaisuksakul ◽  
Kittiya Khongkraphan ◽  
Aniruth Phon-on

This paper addresses the recognition of acute lymphoblastic leukaemia (ALL) subtypes under the WHO classification. The two ALL subtypes considered are T-lymphoblastic leukaemia (pre-T) and B-lymphoblastic leukaemia (pre-B). They exhibit various characteristics that make the subtypes difficult to distinguish from their mature counterparts, lymphocytes. In the common approach, handcrafted features must be carefully designed for this complex, domain-specific problem. With a deep learning approach, handcrafted feature engineering can be eliminated, because the multilayer architecture of a convolutional neural network (CNN) automates this task. In this work, we implement a CNN classifier to explore the feasibility of a deep learning approach for identifying lymphocytes and ALL subtypes, and we benchmark it against the dominant approach of support vector machines (SVMs) with handcrafted feature engineering. Additionally, two traditional machine learning classifiers, a multilayer perceptron (MLP) and a random forest, are applied for comparison. The experiments show that our CNN classifier performs better at identifying normal lymphocytes and pre-B cells. This shows great potential for image classification without the multiple preprocessing steps that feature engineering requires.
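As a rough illustration of the building blocks a CNN stacks to extract features from an image, the sketch below implements one convolution, a ReLU, and a 2x2 max-pooling step in plain Python. It is not the authors' network: the abstract gives no architecture details, so the function names, the toy 5x5 image, and the fixed edge-detecting kernel are all assumptions. In a trained CNN the kernel values are learned from data rather than written by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap: Matrix) -> Matrix:
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap: Matrix) -> Matrix:
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]) - 1, 2)]
        for r in range(0, len(fmap) - 1, 2)
    ]


# Toy image: dark left side (0.0), bright right side (1.0).
image = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# Hand-written kernel that responds to dark-to-bright vertical edges.
kernel = [[-1.0, 1.0], [-1.0, 1.0]]

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool_2x2(feature_map)
```

The pooled map is non-zero only where the dark-to-bright edge sits, which is the kind of local pattern detector that replaces a handcrafted feature.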


Author(s):  
Ismail El Bazi ◽  
Nabil Laachfoubi

Most Arabic Named Entity Recognition (NER) systems depend heavily on external resources and handcrafted feature engineering to achieve state-of-the-art results. To overcome these limitations, we propose in this paper a deep learning approach to the Arabic NER task. We introduce a neural network architecture based on bidirectional Long Short-Term Memory (LSTM) and Conditional Random Fields (CRF), and we experiment with various commonly used hyperparameters to assess their effect on the overall performance of our system. Our model takes two sources of information about words as input, pre-trained word embeddings and character-based representations, and eliminates the need for any task-specific knowledge or feature engineering. We obtained a state-of-the-art result on the standard ANERcorp corpus, with an F1 score of 90.6%.
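In a BiLSTM-CRF tagger, the CRF layer typically selects the best tag sequence with Viterbi decoding over per-token scores from the BiLSTM and a learned tag-to-tag transition matrix. The sketch below shows that decoding step in plain Python. It is a generic illustration, not the authors' implementation: the function name `viterbi_decode`, the three-tag set, and all scores are assumptions made up for the example.

```python
from typing import List


def viterbi_decode(emissions: List[List[float]],
                   transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]   : score of tag j at position t (e.g. from a BiLSTM).
    transitions[i][j] : score of moving from tag i to tag j.
    """
    n_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(n_tags):
            # Best previous tag for reaching tag j at position t.
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptrs.append(best_i)
        score = new_score
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final tag.
    best = max(range(n_tags), key=lambda j: score[j])
    path = [best]
    for ptrs in reversed(backptrs):
        best = ptrs[best]
        path.append(best)
    return path[::-1]


# Hypothetical tag set: 0 = O, 1 = B-PER, 2 = I-PER.
# The large negative score forbids the invalid transition O -> I-PER.
transitions = [
    [0.0, 0.0, -10000.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
# Made-up emission scores for a two-token sentence.
emissions = [
    [2.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
]

greedy = [max(range(3), key=lambda j: row[j]) for row in emissions]
decoded = viterbi_decode(emissions, transitions)
```

Picking the best tag per token independently (`greedy`) yields the invalid sequence O, I-PER, whereas the Viterbi path respects the transition scores and returns B-PER, I-PER. This is the usual motivation for placing a CRF on top of the BiLSTM outputs.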

