Hidden Markov Model and its Application in Natural Language Processing

Natural language processing plays an important role in providing an interface for human-computer communication. It enables people to communicate with a computer in their own natural language rather than in machine language. This study presents a part-of-speech (POS) tagger that assigns a word class to each word in a given sentence. Researchers have developed POS taggers for several languages, such as English, Amharic, Afan Oromo, and Tigrigna. Many other languages, including Shekki’noono, still lack one. A POS tagger is a basic component of most natural language processing tools, such as machine translation and information extraction, so developing one for a language is a prerequisite for building more advanced natural language applications, which in turn enhance machine-to-machine, machine-to-human, and human-to-human communication. However, a POS tagger built for one language cannot be applied directly to another. To develop the Shekki’noono POS tagger, we used a stochastic Hidden Markov Model (HMM). The study used 1500 sentences collected from different sources, including newspapers (covering social, economic, and political topics), modules, textbooks, radio programs, and bulletins. Language experts labeled each word in the collected sentences with its appropriate part of speech. The tagger was trained on the training sets using the HMM. In the experiments, HMM-based POS tagging achieved 92.77% accuracy for Shekki’noono. The resulting model is also compared with previous HMM-based experiments in related work. As future work, the proposed approach can be evaluated on a larger corpus.
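
Training a supervised HMM tagger from expert-labeled sentences usually comes down to counting tag-to-tag transitions and tag-to-word emissions. The sketch below illustrates that idea under stated assumptions: the two-sentence English corpus, the tag names, the `<s>` start symbol, and the add-alpha smoothing are all hypothetical choices made for illustration and are not taken from the study.

```python
from collections import Counter, defaultdict

def estimate_hmm(tagged_sentences, alpha=1.0):
    """Estimate bigram-HMM parameters from (word, tag) sentences by smoothed counting."""
    trans = defaultdict(Counter)   # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)    # emit[tag][word] = count
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"               # hypothetical sentence-start symbol
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(emit)

    def p_trans(prev, tag):
        counts = trans[prev]
        return (counts[tag] + alpha) / (sum(counts.values()) + alpha * len(tags))

    def p_emit(tag, word):
        counts = emit[tag]
        # The extra +1 reserves probability mass for unseen words.
        return (counts[word] + alpha) / (sum(counts.values()) + alpha * (len(vocab) + 1))

    return tags, p_trans, p_emit

# Toy corpus (illustrative only, not data from the study).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
tags, p_trans, p_emit = estimate_hmm(corpus)
print(tags)
print(p_trans("DET", "NOUN"), p_emit("NOUN", "dog"))
```

With real data, the same counts would be collected over the training portion of the annotated corpus, and the smoothing constant would be tuned on held-out sentences.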

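Analisis Morfologi untuk Menangani Out-of-Vocabulary Words pada Part-of-Speech Tagger Bahasa Indonesia Menggunakan Hidden Markov Model [Morphological Analysis for Handling Out-of-Vocabulary Words in an Indonesian Part-of-Speech Tagger Using a Hidden Markov Model]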

Jurnal Linguistik Komputasional (JLK) ◽

10.26418/jlk.v2i1.13 ◽

2019 ◽

Vol 2 (1) ◽

pp. 6 ◽

Cited By ~ 1

Author(s):

Febyana Ramadhanti ◽

Yudi Wibisono ◽

Rosa Ariani Sukamto

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Part Of Speech ◽

Pos Tagger ◽

Bahasa Indonesia

Part-of-speech (PoS) tagging is a task in natural language processing (NLP) that assigns a word category (part of speech) to each word in an input sentence. The hidden Markov model (HMM) is a probabilistic PoS tagging algorithm, so it depends heavily on the training corpus. The limited coverage of the training corpus and the breadth of the Indonesian vocabulary give rise to the problem of out-of-vocabulary (OOV) words. This study compares a PoS tagger that uses HMM+AM (AM stands for "analisis morfologi", morphological analysis) with an HMM PoS tagger without AM, using the same training and testing corpora. The testing corpus has a 30% OOV rate across 6,676 tokens, or 740 input sentences. The HMM-only system achieved 97.54% accuracy, while the HMM system with morphological analysis achieved the highest accuracy, 99.14%.
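
To put the two reported accuracies in perspective, the short calculation below converts them into approximate error counts on the 6,676-token test corpus and into a relative error reduction. The error counts are derived here from the rounded percentages and are therefore estimates; the abstract itself reports only the accuracies.

```python
tokens = 6676          # test-corpus size reported in the abstract
acc_hmm = 0.9754       # reported accuracy, HMM only
acc_hmm_am = 0.9914    # reported accuracy, HMM with morphological analysis

# Approximate number of mis-tagged tokens implied by each accuracy.
errors_hmm = round(tokens * (1 - acc_hmm))
errors_hmm_am = round(tokens * (1 - acc_hmm_am))

# Share of the HMM-only errors removed by adding morphological analysis.
relative_reduction = (acc_hmm_am - acc_hmm) / (1 - acc_hmm)

print(errors_hmm, errors_hmm_am)      # roughly 164 and 57
print(f"{relative_reduction:.1%}")    # roughly 65%
```

Read this way, a 1.6 percentage-point gain corresponds to removing roughly two thirds of the remaining tagging errors.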

Dual Sticky Hierarchical Dirichlet Process Hidden Markov Model and Its Application to Natural Language Description of Motions

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2017.2756039 ◽

2018 ◽

Vol 40 (10) ◽

pp. 2355-2373 ◽

Cited By ~ 6

Author(s):

Weiming Hu ◽

Guodong Tian ◽

Yongxin Kang ◽

Chunfeng Yuan ◽

Stephen Maybank

Keyword(s):

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Dirichlet Process ◽

Hidden Markov ◽

Hierarchical Dirichlet Process ◽

Language Description

A Statistical Method for Evaluating Performance of Part of Speech Tagger for Gujarati

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1492.078219 ◽

2019 ◽

Vol 8 (2) ◽

pp. 3899-3903

Keyword(s):

Natural Language Processing ◽

Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Model Error ◽

Part Of Speech Tagging ◽

Pos Tagging ◽

Part Of Speech ◽

Textual Content ◽

Speech Tagging

Part-of-speech (POS) tagging has long been a difficult task in natural language processing. This article presents POS tagging for Gujarati text using a Hidden Markov Model. An annotated Gujarati text corpus is randomly split into training and testing data sets. The model achieves 80% accuracy. An error analysis of where the mismatches occurred is also discussed in detail.

Part-Of Speech Tagging Base on Hidden Markov Model

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.198-199.852 ◽

2012 ◽

Vol 198-199 ◽

pp. 852-855

Author(s):

Xi Jie Wang ◽

Shun Yi Hu

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Viterbi Algorithm ◽

Hidden Markov ◽

Estimation Method ◽

Basic Principles ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Speech Tagging

Part-of-speech tagging is one of the fundamental problems in natural language processing. The paper introduces the representation of the Hidden Markov Model (HMM) and the problems that must be solved to apply it, then discusses parameter estimation for the HMM and studies the basic principles of part-of-speech tagging using the Viterbi algorithm.
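
For readers unfamiliar with Viterbi decoding, the following is a minimal sketch of how it selects the most probable tag sequence under a bigram HMM. It is not the paper's implementation: the tag set, the hand-set probabilities, and the `unk` floor for unseen words are hypothetical values chosen only to make the example runnable, and the function assumes a non-empty input sentence.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[i][t] holds the log-probability of the best path ending in tag t at word i.
    best = [{t: math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk))
             for t in tags}]
    back = []  # back[i][t] is the previous tag on the best path into t at word i + 1
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev, score = max(
                ((p, best[-1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda item: item[1],
            )
            scores[t] = score + math.log(emit_p[t].get(word, unk))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy parameters (illustrative only).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
emit_p = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.6},
}
predicted = viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p)
print(predicted)  # ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numerical underflow on long sentences, and the dynamic program runs in time proportional to the sentence length times the square of the number of tags.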

The Use of Hidden Markov Model in Natural ARABIC Language Processing: a survey

Procedia Computer Science ◽

10.1016/j.procs.2017.08.363 ◽

2017 ◽

Vol 113 ◽

pp. 240-247 ◽

Cited By ~ 9

Author(s):

Dima Suleiman ◽

Arafat Awajan ◽

Wael Al Etaiwi

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Arabic Language ◽

Arabic Language Processing

Natural Language Processing and Enhanced Clinical Decision Making Radiology and VINCI

PsycEXTRA Dataset ◽

10.1037/e615572012-015 ◽

2012 ◽

Author(s):

Eliot Siegel

Keyword(s):

Decision Making ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Decision Making ◽

Clinical Decision

Natural Language Processing in the Clinical Setting

PsycEXTRA Dataset ◽

10.1037/e615572012-013 ◽

2012 ◽

Author(s):

Thomas H. Payne

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Clinical Setting
