Integrating Rule-Based System with Classification for Arabic Named Entity Recognition

ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition

BioMed Research International ◽

10.1155/2016/4248026 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 5

Author(s):

Abbas Akkasi ◽

Ekrem Varoğlu ◽

Nazife Dimililer

Keyword(s):

Conditional Random Fields ◽

Named Entity Recognition ◽

Classification Performance ◽

Entity Recognition ◽

Support Vector ◽

Learning Approaches ◽

Data Set ◽

Rule Based ◽

Named Entity ◽

Vector Machines

Named Entity Recognition (NER) from text constitutes the first step in many text mining applications. The most important preliminary step for NER systems using machine learning approaches is tokenization, where raw text is segmented into tokens. This study proposes an enhanced rule-based tokenizer, ChemTok, which utilizes rules extracted mainly from the training data set. The main novelty of ChemTok is the use of the extracted rules to merge the tokens split in the previous steps, thus producing longer and more discriminative tokens. ChemTok is compared to the tokenization methods utilized by ChemSpot and tmChem. Support Vector Machines and Conditional Random Fields are employed as the learning algorithms. The experimental results show that the classifiers trained on the output of ChemTok outperform all classifiers trained on the output of the other two tokenizers in terms of classification performance and the number of incorrectly segmented entities.
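The split-then-merge idea described in this abstract can be illustrated with a minimal sketch. It is not ChemTok's actual rule set, which the abstract does not list: the function names `fine_split` and `merge_hyphenated` and the single hyphen-merging rule are assumptions chosen only for illustration.

```python
import re

def fine_split(text):
    """Over-segment text: alphanumeric runs and single punctuation marks.

    Returns (token, start, end) triples so that later steps can tell
    whether two tokens were adjacent in the original text.
    """
    pattern = r"[A-Za-z0-9]+|[^\sA-Za-z0-9]"
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]

def merge_hyphenated(tokens):
    """Hypothetical merge rule: rejoin 'a', '-', 'b' into 'a-b'.

    The merge only fires when the three pieces touch each other in the
    source text (no whitespace), which yields longer tokens such as
    '2-propanol' instead of '2', '-', 'propanol'.
    """
    merged = []
    i = 0
    while i < len(tokens):
        tok, _, end = tokens[i]
        while (
            i + 2 < len(tokens)
            and tok[-1].isalnum()
            and tokens[i + 1][0] == "-"
            and tokens[i + 1][1] == end                 # hyphen touches left token
            and tokens[i + 2][1] == tokens[i + 1][2]    # right token touches hyphen
            and tokens[i + 2][0].isalnum()
        ):
            tok = tok + "-" + tokens[i + 2][0]
            end = tokens[i + 2][2]
            i += 2
        merged.append(tok)
        i += 1
    return merged

if __name__ == "__main__":
    print(merge_hyphenated(fine_split("Add 2-propanol (99%) slowly.")))
```

A real system would apply many such rules, extracted from annotated training data rather than written by hand as here.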

Using machine learning to maintain rule-based named-entity recognition and classification systems

10.3115/1073012.1073067 ◽

2001 ◽

Cited By ~ 17

Author(s):

Georgios Petasis ◽

Frantz Vichot ◽

Francis Wolinski ◽

Georgios Paliouras ◽

Vangelis Karkaletsis ◽

...

Keyword(s):

Machine Learning ◽

Named Entity Recognition ◽

Classification Systems ◽

Entity Recognition ◽

Rule Based ◽

Named Entity

CustNER

International Journal on Semantic Web and Information Systems ◽

10.4018/ijswis.2020070107 ◽

2020 ◽

Vol 16 (3) ◽

pp. 110-127

Author(s):

Raabia Mumtaz ◽

Muhammad Abdul Qadir

Keyword(s):

Language Learning ◽

Named Entity Recognition ◽

Entity Recognition ◽

False Negatives ◽

Rule Based System ◽

Named Entity ◽

Evaluation Dataset ◽

Natural Language Learning ◽

Person Location ◽

Better Than

This article describes CustNER, a system for named-entity recognition (NER) of persons, locations, and organizations. Based on the incorrect annotations produced by existing NER systems, four categories of false negatives have been identified. The NEs that are not annotated contain nationalities, have a corresponding resource in DBpedia, or are acronyms of other NEs. A rule-based system, CustNER, has been proposed that utilizes existing NER systems and the DBpedia knowledge base. CustNER has been trained on the Open Knowledge Extraction (OKE) challenge 2017 dataset and evaluated on the OKE and CoNLL03 (Conference on Natural Language Learning) datasets. The OKE dataset has also been annotated with the three types. Evaluation results show that CustNER outperforms existing NER systems, with an F score 12.4% better than Stanford NER and 3.1% better than Illinois NER. On CoNLL03, another standard evaluation dataset on which the system was not trained, CustNER gives results comparable to existing systems, with an F score 3.9% better than Stanford NER, though the Illinois NER F score is 1.3% better than that of CustNER.

Named Entity Recognition in Telugu language using Language Dependent Features and Rule based Approach

International Journal of Computer Applications ◽

10.5120/2602-3628 ◽

2011 ◽

Vol 22 (8) ◽

pp. 30-34 ◽

Cited By ~ 2

Author(s):

B. Sasidhar ◽

P. M. Yohan ◽

A. Vinaya Babu ◽

A. Govardhan

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Rule Based ◽

Named Entity ◽

Rule Based Approach

Named Entity Recognition for a Low Resource Language

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b2085.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 587-590

Keyword(s):

Machine Learning ◽

Named Entity Recognition ◽

Training Data ◽

Entity Recognition ◽

Linguistic Knowledge ◽

Rule Based ◽

Low Resource ◽

Named Entity ◽

The North ◽

Rule Based Approach

This paper studies Kokborok named entity recognition using a rule-based approach. Named entity recognition is an application of natural language processing and is considered a subtask of information extraction. It is the means of identifying the named entities for some specific task. We have studied a named entity recognition system for the Kokborok language. Kokborok is the official language of the state of Tripura, situated in the north-eastern part of India. It is also widely spoken in other parts of north-eastern India and adjoining areas of Bangladesh. Named entities include the names of persons, organizations, locations, etc. Named entity recognition is studied using a machine learning approach, a rule-based approach, or a hybrid approach that combines the two. Rule-based named entity recognition is influenced by linguistic knowledge of the language. The machine learning approach requires a large amount of training data. Kokborok, being a low-resource language, has very limited training data. The rule-based approach requires linguistic rules, and its results do not depend on the size of the available data. We have framed heuristic rules for identifying named entities based on linguistic knowledge of the language. An encouraging result is obtained when we test our data with the rule-based approach. We also study and frame rules for the counting system in Kokborok. The rule-based approach to named entity recognition is found suitable for low-resource languages with limited digital work and no named-entity-tagged data. We have framed a suitable algorithm using the rules to solve the named entity recognition task and obtain a desirable result.

Bio-NER: Biomedical Named Entity Recognition using Rule-Based and Statistical Learners

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2017.081220 ◽

2017 ◽

Vol 8 (12) ◽

Cited By ~ 1

Author(s):

Pir Dino ◽

Sanotsh Kumar ◽

Banbhrani ◽

Arsalan Ali ◽

Hans Raj

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Rule Based ◽

Named Entity ◽

Biomedical Named Entity Recognition

Artificial intelligence to organize patient portal messages: a journey from an ensemble deep learning text classification to rule-based named entity recognition

2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm47256.2019.8982942 ◽

2019 ◽

Author(s):

Ahmad P. Tafti ◽

Sunyang Fu ◽

Aditya Khurana ◽

George M. Mastorakos ◽

Kenneth G. Poole ◽

...

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Text Classification ◽

Named Entity Recognition ◽

Entity Recognition ◽

Patient Portal ◽

Rule Based ◽

Named Entity

Rule-Based Person Named Entity Recognition for Bulgarian

Slavic Languages in the Perspective of Formal Grammar ◽

10.3726/978-3-653-05348-7/16 ◽

2016 ◽

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Rule Based ◽

Named Entity

Person Named Entity Recognition in Balinese

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2021.v10.i01.p13 ◽

2021 ◽

Vol 10 (1) ◽

pp. 99

Author(s):

Kenny Kurniadi ◽

Ngurah Agus Sanjaya ER

Keyword(s):

Information Extraction ◽

Named Entity Recognition ◽

Morphological Structure ◽

Entity Recognition ◽

Linguistic Meaning ◽

Rule Based ◽

Named Entity

Named Entity Recognition (NER) is the part of information extraction whose task is to classify text into several classes such as names of people (figures), organizations, and locations. In this study, the authors propose a NER system that identifies the names of characters in Balinese-language documents. The study uses a rule-based method. Rules are built based on the morphological structure and linguistic meaning of Balinese names. The results show that the system has an accuracy of 67.41%, precision of 83.42%, recall of 77.83%, and F-Score of 80.53%.
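The reported F-Score is consistent with the standard harmonic mean of the reported precision and recall. The following sketch checks the arithmetic; the function name `f_score` and the `beta` parameter are illustrative choices, and the abstract does not state which F-measure variant the authors computed, so the balanced F1 is assumed here.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; with beta=1 this is the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

if __name__ == "__main__":
    # Precision and recall (in percent) as reported in the abstract.
    print(round(f_score(83.42, 77.83), 2))  # 80.53
```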

Combining rule-based and statistical mechanisms for low-resource named entity recognition

Machine Translation ◽

10.1007/s10590-017-9208-0 ◽

2017 ◽

Vol 32 (1-2) ◽

pp. 31-43 ◽

Cited By ~ 2

Author(s):

Ryan Gabbard ◽

Jay DeYoung ◽

Constantine Lignos ◽

Marjorie Freedman ◽

Ralph Weischedel

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Rule Based ◽

Low Resource ◽

Named Entity
