name entity Latest Research Papers

This study combines high-quality manual annotation with deep-learning natural language processing to achieve accurate name entity recognition (NER) for biomedical literature. We constructed an in-house set of entity annotation guidelines for biomedical literature. For all four entity types (gene, variant, disease and species), our manual annotations are over 92% consistent with publicly available corpora previously annotated by other experts. A total of 400 full-text biomedical articles from PubMed were annotated following these guidelines. Both a BERT-based large model and a DistilBERT-based simplified model were constructed, trained and optimized for offline and online inference, respectively. The NER F1-scores for gene, variant, disease and species are 97.28%, 93.52%, 92.54% and 95.76%, respectively, for the BERT-based model, and 95.14%, 86.26%, 91.37% and 89.92%, respectively, for the DistilBERT-based model. The DistilBERT-based model thus retains 97.8%, 92.2%, 98.7% and 93.9% of the BERT-based model's F1-scores for gene, variant, disease and species, respectively. Moreover, both models outperform the state-of-the-art model, BioBERT, indicating the value of training an NER model on biomedical-domain literature jointly with high-quality annotated datasets.
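
The retention figures quoted in the abstract above can be reproduced from the reported F1-scores. The following is a minimal sketch of that arithmetic; the function name `retention` and the dictionary layout are illustrative assumptions and do not come from the paper.

```python
# F1-scores (%) as reported in the abstract, per entity type.
bert_f1 = {"gene": 97.28, "variant": 93.52, "disease": 92.54, "species": 95.76}
distilbert_f1 = {"gene": 95.14, "variant": 86.26, "disease": 91.37, "species": 89.92}


def retention(small: dict, large: dict) -> dict:
    """Express the smaller model's F1 as a percentage of the larger model's F1."""
    return {entity: round(100 * small[entity] / large[entity], 1) for entity in large}


# Hypothetical helper name; the ratios themselves follow from the reported scores.
retained = retention(distilbert_f1, bert_f1)
print(retained)
```

The computed ratios agree with the 97.8%, 92.2%, 98.7% and 93.9% stated in the abstract, and show that the largest relative loss from distillation is on the variant entity type.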

A Graph Database Representation of Portuguese Criminal-Related Documents

Informatics ◽ 10.3390/informatics8020037 ◽ 2021 ◽ Vol 8 (2) ◽ pp. 37

Author(s): Gonçalo Carnaz ◽ Vitor Beires Nogueira ◽ Mário Antunes

Keyword(s): Information Extraction ◽ Name Entity Recognition ◽ Entity Recognition ◽ Automatic Extraction ◽ Graph Database ◽ Named Entities ◽ Vast Number ◽ Name Entity ◽ Manual Analysis ◽ F Measure

Organizations are challenged by the need to process an increasing amount of data, both structured and unstructured, retrieved from heterogeneous sources. Criminal investigation police are among these organizations, as they have to manually process a vast number of criminal reports, crime-related news articles, occurrence and evidence reports, and other unstructured documents. Automatic extraction and representation of the data and knowledge in such documents is essential to reduce the burden of manual analysis and to automate the discovery of relationships between the names and entities that may appear in a case. This paper presents SEMCrime, a framework that extracts and classifies named entities and relations in Portuguese criminal reports and documents and represents the retrieved data in a graph database. A 5W1H (Who, What, Why, Where, When, and How) information extraction method was applied, and a graph database representation was used to store and visualize the relations extracted from the documents. A prototype developed to evaluate the framework produced promising results, namely name-entity recognition with an F-Measure of 0.73 and 5W1H information extraction with an F-Measure of 0.65.

BIDIRECTIONAL LSTM-CNNs UNTUK EKSTRAKSI ENTITY LOKASI KEBAKARAN PADA BERITA ONLINE BERBAHASA INDONESIA

Seminar Nasional Official Statistics ◽ 10.34123/semnasoffstat.v2020i1.601 ◽ 2021 ◽ Vol 2020 (1) ◽ pp. 319-327

Author(s): Alif Andika Putra ◽ Robert Kurniawan

Keyword(s): Deep Learning ◽ Network Model ◽ Name Entity Recognition ◽ Entity Recognition ◽ Hybrid Network ◽ Name Entity ◽ Bidirectional Lstm

DKI Jakarta Province is one of the regions prone to fires. BPBD DKI Jakarta, as the disaster management agency, counts among its missions increasing the preparedness of Jakarta's residents for disasters, including fires. Preparedness for fires can be improved by presenting information on fire-prone locations. To this end, BPBD DKI Jakarta can take advantage of developments in information and communication technology, such as the internet as an information resource. One way information spreads over the internet is through online news websites. The information in online news articles can serve as a source of data. A series of processes is needed to extract the information contained in online news articles. In this study, information is extracted from online news articles by classifying entities into particular classes using Name Entity Recognition (NER) with a deep learning hybrid network model, Bidirectional LSTM-CNNs (BLSTM-CNNs). The study shows that the NER model with BLSTM-CNNs performs well in terms of F1-score, precision and recall. A mapping is then produced from the location entities found in the online news articles, as classified by the NER model with BLSTM-CNNs.

Information Extraction Tasks based on BERT and SpaCy on Tourism Domain

ECTI Transactions on Computer and Information Technology (ECTI-CIT) ◽ 10.37936/ecti-cit.2021151.228621 ◽ 2021 ◽ Vol 15 (1) ◽ pp. 108-122

Author(s): Chantana Chantrapornchai ◽ Aphisit Tunsakul

Keyword(s): Name Entity Recognition ◽ Text Summarization ◽ Training Data ◽ Entity Recognition ◽ Entity Extraction ◽ Data Set ◽ Name Entity ◽ Sentence Extraction ◽ Relation Type ◽ Proper Training

In this paper, we present two methodologies for extracting particular information from the full text returned by a search engine, to assist users. The approaches are based on three tasks: name entity recognition (NER), text classification and text summarization. The first step is building the training data and cleansing the data. We consider the tourism domain, with data sets on restaurants, hotels, shopping and tourism crawled from websites. First, the tourism data are gathered and the vocabularies are built. Several minor steps follow, including sentence extraction and relation and name entity extraction for tagging purposes. These steps are needed to create proper training data. A recognition model for a given entity type can then be built. In the experiments, given review texts, we demonstrate how to build a model that extracts the desired entities (i.e., name, location and facility) as well as the relation type, classifies the reviews, or summarizes the reviews. Two tools, SpaCy and BERT, are used to compare performance on these tasks.

Do Judge an Entity by Its Name! Entity Typing Using Language Models

The Semantic Web: ESWC 2021 Satellite Events - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-80418-3_12 ◽ 2021 ◽ pp. 65-70

Author(s): Russa Biswas ◽ Radina Sofronova ◽ Mehwish Alam ◽ Nicolas Heist ◽ Heiko Paulheim ◽ ...

Keyword(s): Language Models ◽ Name Entity

Automatic Arabic Named Entity Extraction and Classification for Information Retrieval

International Journal on Natural Language Computing ◽ 10.5121/ijnlc.2020.9601 ◽ 2020 ◽ Vol 9 (6) ◽ pp. 1-22

Author(s): Omar ASBAYOU

Keyword(s): Information Retrieval ◽ Named Entity Recognition ◽ Entity Recognition ◽ Entity Extraction ◽ Rule Based ◽ Named Entity ◽ Named Entity Extraction ◽ Name Entity ◽ System Output

This article describes our rule-based Arabic named entity recognition (NER) and classification system. It is based on lists of classified proper names (PN) and, in particular, on syntactico-semantic patterns that yield a fine-grained classification of Arabic named entities (NE). These patterns combine morpho-syntactic and syntactic entities. The system also uses a lexical classification of trigger words and NE extensions. These linguistic data are essential not only for name entity extraction but also for taxonomic classification and for determining NE boundaries. Our method also relies on contextualisation and on the notion of NE class attributes and values. Inspired by X-bar theory and immediate constituent analysis, we built a rule-based NER system composed of five levels of syntactico-semantic combination. We also show how the fine-grained NE annotations in our system output (an XML database) are exploited in information retrieval and information extraction.

MASK: A Success Story for An International Collaboration

International Journal for Population Data Science ◽ 10.23889/ijpds.v5i5.1621 ◽ 2020 ◽ Vol 5 (5)

Author(s): Mahmoud Azimaee ◽ Gangamma Kalappa ◽ Nikola Milosevic ◽ Goran Nenadic ◽ Hesam Dadafarin ◽ ...

Keyword(s): Personal Information ◽ Name Entity Recognition ◽ Entity Recognition ◽ Free Text ◽ Rule Based ◽ Data Annotation ◽ Name Entity ◽ Laboratory Test Results ◽ Computer Scientists ◽ The University

Introduction: A significant amount of valuable information in Electronic Health Records (EHR), such as laboratory test results or echocardiogram interpretations, is embedded in lengthy free-text fields. Patients' personal information is often included in these narratives as well. Privacy legislation in different jurisdictions requires de-identification of this information before it is made available for research. This process can be challenging and time-consuming. In particular, rule-based algorithms may over-mask essential medical terms, conditions, or devices that are named after individuals.

Objectives and Approach: We aimed to enhance ICES' existing rule-based application to make it contextually driven by applying Artificial Intelligence (AI). The ICES team collaborated with computer scientists at the University of Manchester, who had already published work in this area, and with Evenset, a Toronto-based software company. Based on the Manchester University de-identification framework for name entity recognition, three machine learning-based algorithms for name entity recognition were implemented: CRF, and BiLSTM recurrent neural networks with GloVe and ELMo word embeddings. The models were trained on three different types of ICES data: laboratory results, Electronic Medical Record (EMR) data and echocardiogram data. Evenset developed the user interface and the masking modules.

Results: Preliminary tests have generated very promising results. To improve the accuracy of the models, additional data annotation to expand the training datasets is currently being undertaken at ICES. The final framework will be available to the public as an open-source tool.

Conclusion / Implications: A collaborative approach to solving complex problems such as the de-identification of text-based medical data is highly efficient, especially where stakeholders bring unique sets of expertise, resources, data and clinical knowledge.

An Ontology-based Name Entity Recognition NER and NLP Systems in Arabic Storytelling

Al-Azhar Bulletin of Science ◽ 10.21608/absb.2020.44367.1088 ◽ 2020 ◽ Vol 31 (2) ◽ pp. 31-38

Author(s): Marwa Elgamal ◽ Mohamed Abou-Kreisha ◽ Reda Abo Elezz ◽ Salwa Hamada

Keyword(s): Name Entity Recognition ◽ Entity Recognition ◽ Name Entity

Developing Name Entity Recognition for Structured and Unstructured Text Formatting Dataset

2020 Fifth International Conference on Informatics and Computing (ICIC) ◽ 10.1109/icic50835.2020.9288566 ◽ 2020

Author(s): Nadhia Salsabila Azzahra ◽ Muhammad Okky Ibrohim ◽ Junaedi Fahmi ◽ Bagus Fajar Apriyanto ◽ Oskar Riandi

Keyword(s): Name Entity Recognition ◽ Entity Recognition ◽ Unstructured Text ◽ Name Entity

name entity
Recently Published Documents

Referent graph embedding model for name entity recognition of Chinese car reviews

Accurate Name Entity Recognition for Biomedical Literatures: A Combined High-quality Manual Annotation and Deep-learning Natural Language Processing Study

A Graph Database Representation of Portuguese Criminal-Related Documents

BIDIRECTIONAL LSTM-CNNs UNTUK EKSTRAKSI ENTITY LOKASI KEBAKARAN PADA BERITA ONLINE BERBAHASA INDONESIA

Information Extraction Tasks based on BERT and SpaCy on Tourism Domain

Do Judge an Entity by Its Name! Entity Typing Using Language Models

Automatic Arabic Named Entity Extraction and Classification for Information Retrieval

MASK: A Success Story for An International Collaboration

An Ontology-based Name Entity Recognition NER and NLP Systems in Arabic Storytelling

Developing Name Entity Recognition for Structured and Unstructured Text Formatting Dataset
