entity annotation
Recently Published Documents



2022 ◽  
Vol 59 (1) ◽  
pp. 102794
Author(s):  
Fernando A. Correia ◽  
Alexandre A.A. Almeida ◽  
José Luiz Nunes ◽  
Kaline G. Santos ◽  
Ivar A. Hartmann ◽  
...  


2021 ◽  
Author(s):  
Dao-Ling Huang ◽  
Quanlei Zeng ◽  
Yun Xiong ◽  
Shuixia Liu ◽  
Chaoqun Pang ◽  
...  

This study combines high-quality manual annotation with deep-learning natural language processing to achieve accurate named entity recognition (NER) on biomedical literature. We constructed in-house entity annotation guidelines for biomedical literature. Our manual annotations show over 92% overall consistency with publicly available corpora previously annotated by other experts, across all four entity types: gene, variant, disease and species. A total of 400 full-text biomedical articles from PubMed were annotated following these guidelines. A BERT-based large model and a DistilBERT-based simplified model were constructed, trained and optimized for offline and online inference, respectively. The F1 scores for gene, variant, disease and species are 97.28%, 93.52%, 92.54% and 95.76% for the BERT-based model, and 95.14%, 86.26%, 91.37% and 89.92% for the DistilBERT-based model. The DistilBERT-based model thus retains 97.8%, 92.2%, 98.7% and 93.9% of the BERT-based model's F1 scores for gene, variant, disease and species, respectively. Moreover, both models outperform the state-of-the-art model BioBERT, indicating the importance of training an NER model on biomedical-domain literature together with high-quality annotated datasets.
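The retention figures quoted in this abstract follow directly from the two sets of F1 scores. The short Python check below reproduces that arithmetic; the dictionary and function names are illustrative and do not come from the paper.

```python
# F1 scores (in percent) as reported in the abstract above.
bert = {"gene": 97.28, "variant": 93.52, "disease": 92.54, "species": 95.76}
distilbert = {"gene": 95.14, "variant": 86.26, "disease": 91.37, "species": 89.92}


def retention(small, large):
    """Percentage of the large model's F1 that the small model retains, per entity type."""
    return {entity: round(100.0 * small[entity] / large[entity], 1) for entity in large}


retained = retention(distilbert, bert)
print(retained)
# {'gene': 97.8, 'variant': 92.2, 'disease': 98.7, 'species': 93.9}
```

The variant and species types lose the most under distillation, which matches the larger gaps visible in the raw scores.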



Author(s):  
Elena Álvarez-Mellado ◽  
María Luisa Díez-Platas ◽  
Pablo Ruiz-Fabo ◽  
Helena Bermúdez ◽  
Salvador Ros ◽  
...  

Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with valuable historical evidence. However, traditional NER categories and schemes are usually designed with modern documents in mind (e.g. journalistic text), and general-domain NER annotation schemes fail to capture the nature of medieval entities. In this paper, we explore the challenges of performing named-entity annotation on a corpus of Spanish medieval documents: we discuss the mismatches that arise when applying traditional NER categories to such a corpus, and we propose a novel humanist-friendly, TEI-compliant annotation scheme and guidelines intended to capture the particular nature of medieval entities.



Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1673
Author(s):  
Jiabao Sheng ◽  
Aishan Wumaier ◽  
Zhe Li

To improve the performance of deep learning methods when labeled data for entity annotation is scarce in entity recognition tasks, this study proposes transfer learning schemes that combine characters with words to convert low-resource data symmetrically into high-resource data. We combine character embeddings, word embeddings, and embeddings of the label features, using high- and low-resource data with a BiLSTM-CRF model, and perform feature transfer and parameter sharing across two domains of the BiLSTM network in order to annotate with zero resources. Before transfer learning, we first calculate the label similarity between the two domains and select the label features with high similarity for feature-transfer mapping. All training parameters of the source domain are shared across the BiLSTM network and the CRF layer. In addition, we combine characters and words to reduce word segmentation problems across domains and to lower the error rate in label mapping. Experimental results show that, in terms of overall F1 score, the proposed model, without supervision, outperformed the general parameter-sharing transfer learning method by 9.76 percentage points, and two recent high–low resource learning methods by 9.08 and 12.38 percentage points, respectively. The proposed scheme improves transfer learning performance between high- and low-resource data and can identify the predicted data in the target domain.
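The label-similarity step described in this abstract can be pictured with a small Python sketch. The abstract does not say how similarity is computed, so everything here is an assumption made for illustration: cosine similarity over label feature vectors, the 0.8 threshold, the toy vectors, and the function names.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_label_mapping(source, target, threshold=0.8):
    """Map each target-domain label to its most similar source-domain label.

    Pairs whose best similarity falls below `threshold` are dropped, so only
    labels with high similarity take part in feature transfer.
    Assumes `source` is non-empty.
    """
    mapping = {}
    for t_label, t_vec in target.items():
        best_label, best_sim = max(
            ((s_label, cosine(s_vec, t_vec)) for s_label, s_vec in source.items()),
            key=lambda pair: pair[1],
        )
        if best_sim >= threshold:
            mapping[t_label] = best_label
    return mapping


# Toy label feature vectors (hypothetical, not taken from the paper).
source = {"PER": [1.0, 0.0, 0.0], "LOC": [0.0, 1.0, 0.0], "ORG": [0.0, 0.0, 1.0]}
target = {"Person": [0.9, 0.1, 0.0], "Place": [0.1, 0.95, 0.05], "Drug": [0.5, 0.5, 0.5]}

mapping = select_label_mapping(source, target)
print(mapping)
```

In this toy setup, "Drug" has no close counterpart in the source domain and is therefore left out of the mapping, while "Person" and "Place" are paired with "PER" and "LOC".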



Semantic Web ◽  
2018 ◽  
Vol 9 (3) ◽  
pp. 355-379
Author(s):  
Oluwaseyi Feyisetan ◽  
Elena Simperl ◽  
Markus Luczak-Roesch ◽  
Ramine Tinati ◽  
Nigel Shadbolt


Author(s):  
Takenobu Tokunaga ◽  
Hitoshi Nishikawa ◽  
Tomoya Iwakura ◽  
...  


Author(s):  
William Aprilius ◽  
Seng Hansun ◽  
Dennis Gunawan

