End to End Parts of Speech Tagging and Named Entity Recognition in Bangla Language

Abs-Sum-Kan: An Abstractive Text Summarization Technique for an Indian Regional Language by Induction of Tagging Rules

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1193.0782s319 ◽

2019 ◽

Vol 8 (2S3) ◽

pp. 1028-1036

Keyword(s):

Named Entity Recognition ◽

Qualitative Evaluation ◽

Entity Recognition ◽

Indian Languages ◽

Parts Of Speech ◽

Named Entity ◽

Domain Specific ◽

Full Abstraction ◽

Regional Language ◽

Speech Tagging

This paper presents a full abstraction for Indian languages, specifically Kannada, in the context of guided summarization. The proposed process generates the abstractive summary by focusing on a unified presentation model with aspect based Information Extraction (IE) rules and scheme based Templates. TF/IDF rules are used for classification into categories. Lexical analysis (like Parts Of Speech tagging and Named Entity Recognition) reduces prolixity, which leads to robust IE rules. Usage of Templates for sentence generation makes the summaries succinct and information intensive. The IE rules are designed to accommodate the complexities of the considered languages. Later, the system aims to produce a guided summary of domain specific documents. An abstraction scheme is a collection of aspects and associated IE rules. Each abstraction scheme is designed based on a theme or subcategory. An extensive statistical and qualitative evaluation of the summaries generated by the system has been conducted and the results are found to be very promising.


Abs-Sum-Kan: An Abstractive Text Summarization Technique for an Indian Regional Language by Induction of Tagging Rules

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1043.0882s819 ◽

2019 ◽

Vol 8 (2S8) ◽

pp. 1225-1233

Keyword(s):

Named Entity Recognition ◽

Qualitative Evaluation ◽

Entity Recognition ◽

Indian Languages ◽

Parts Of Speech ◽

Named Entity ◽

Domain Specific ◽

Full Abstraction ◽

Regional Language ◽

Speech Tagging

This paper presents a full abstraction for Indian languages, specifically Kannada, in the context of guided summarization. The proposed process generates the abstractive summary by focusing on a unified presentation model with aspect based Information Extraction (IE) rules and scheme based Templates. TF/IDF rules are used for classification into categories. Lexical analysis (like Parts Of Speech tagging and Named Entity Recognition) reduces prolixity, which leads to robust IE rules. Usage of Templates for sentence generation makes the summaries succinct and information intensive. The IE rules are designed to accommodate the complexities of the considered languages. Later, the system aims to produce a guided summary of domain specific documents. An abstraction scheme is a collection of aspects and associated IE rules. Each abstraction scheme is designed based on a theme or subcategory. An extensive statistical and qualitative evaluation of the summaries generated by the system has been conducted and the results are found to be very promising.
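The abstract mentions TF/IDF-based classification into categories but does not spell out the rules. As a rough illustration only, the following sketch computes TF-IDF weights and assigns each document to the category whose cue terms carry the most weight. The toy English tokens, the category names, and the helper functions `tf_idf` and `top_category` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            # Term frequency (relative count) times inverse document frequency.
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights


def top_category(doc_weights, category_terms):
    """Pick the category whose cue terms carry the most TF-IDF weight."""
    scores = {
        cat: sum(doc_weights.get(term, 0.0) for term in terms)
        for cat, terms in category_terms.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus and category cue words (assumptions, not paper data).
docs = [
    ["flood", "river", "rain", "damage"],
    ["match", "team", "score", "rain"],
    ["election", "vote", "party", "team"],
]
categories = {
    "disaster": ["flood", "damage", "river"],
    "sports": ["match", "score", "team"],
    "politics": ["election", "vote", "party"],
}

weights = tf_idf(docs)
labels = [top_category(w, categories) for w in weights]
print(labels)
```

Terms shared across documents (here "rain" and "team") receive a lower weight than terms unique to one document, which is what lets the cue-term sums separate the categories in this toy setup.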


End-to-End Recurrent Neural Network Models for Vietnamese Named Entity Recognition: Word-Level Vs. Character-Level

Communications in Computer and Information Science - Computational Linguistics ◽

10.1007/978-981-10-8438-6_18 ◽

2018 ◽

pp. 219-232 ◽

Cited By ~ 5

Author(s):

Thai-Hoang Pham ◽

Phuong Le-Hong

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Named Entity Recognition ◽

Network Models ◽

Entity Recognition ◽

Neural Network Models ◽

Named Entity ◽

Word Level ◽

End To End


An End-to-End Progressive Multi-Task Learning Framework for Medical Named Entity Recognition and Normalization

10.18653/v1/2021.acl-long.485 ◽

2021 ◽

Author(s):

Baohang Zhou ◽

Xiangrui Cai ◽

Ying Zhang ◽

Xiaojie Yuan

Keyword(s):

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity ◽

Learning Framework ◽

Task Learning ◽

End To End


PhoNLP: A joint multi-task learning model for Vietnamese part-of-speech tagging, named entity recognition and dependency parsing

10.18653/v1/2021.naacl-demos.1 ◽

2021 ◽

Author(s):

Linh The Nguyen ◽

Dat Quoc Nguyen

Keyword(s):

Named Entity Recognition ◽

Learning Model ◽

Entity Recognition ◽

Dependency Parsing ◽

Named Entity ◽

Part Of Speech Tagging ◽

Task Learning ◽

Part Of Speech ◽

Speech Tagging


POS Tagging and NER System for Kannada Using Conditional Random Fields

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2021100101 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1-13

Author(s):

Arpitha Swamy ◽

Srinath S.

Keyword(s):

Random Fields ◽

Conditional Random Fields ◽

Named Entity Recognition ◽

Model Testing ◽

Entity Recognition ◽

Parts Of Speech ◽

Named Entity ◽

Pos Tagging ◽

Proper Nouns ◽

Pos Tagger

Parts-of-speech (POS) tagging assigns a POS tag to every word in a text, and named entity recognition (NER) identifies the proper nouns in a text and classifies them into predefined categories. A POS tagger and an NER system for Kannada text are proposed, both based on conditional random fields (CRFs). The POS tagging dataset consists of 147K tokens, of which 103K are used for training and the remainder for testing. The proposed CRF model for POS tagging of Kannada text achieved 91.3% precision, 91.6% recall, and a 91.4% F-score. For the NER system, the required data was created manually using a modified tag set containing 40 labels. The NER dataset consists of 16.5K tokens, with 70% used for training the model and the remaining 30% for testing. The resulting NER model achieved 94% precision, 93.9% recall, and a 93.9% F1-measure.
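As a quick consistency check on the figures quoted in this abstract, the sketch below recomputes the F-score as the harmonic mean of precision and recall and derives the implied split sizes. It assumes the reported scores are single aggregate values and treats the rounded token counts (147K, 103K, 16.5K) as exact; if the paper averages per-tag scores instead, the harmonic mean need not match exactly. The function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
pos_f1 = f1_score(91.3, 91.6)
ner_f1 = f1_score(94.0, 93.9)

# Split sizes implied by the abstract (assumption: rounded counts are exact).
pos_total, pos_train = 147_000, 103_000
pos_test = pos_total - pos_train
ner_total = 16_500
ner_train = round(ner_total * 0.70)
ner_test = ner_total - ner_train

print(f"POS F1 ~ {pos_f1:.1f}, NER F1 ~ {ner_f1:.1f}")
print(f"POS test tokens ~ {pos_test}, train share ~ {pos_train / pos_total:.0%}")
print(f"NER train/test tokens ~ {ner_train}/{ner_test}")
```

Under these assumptions the recomputed values land close to the reported 91.4% and 93.9%, and the POS split works out to roughly the same 70/30 proportion used for the NER data.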
