The Use of Hidden Markov Model in Natural ARABIC Language Processing: a survey

Abstract: Arabic pronunciation differs somewhat from Indonesian pronunciation, so learning it takes a long time. Within Arabia itself, the pronunciation of Arabic varies by dialect. A dialect is the form of a language, including the way its letters are pronounced, that is used by a particular group of people; it leads to differences in how texts are read and even in how people greet one another. Indonesian speakers, in turn, have a dialect that differs from that of native speakers. This study analyzes how closely readings of Arabic text by Indonesian speakers match those of Arabic speakers, using Linear Predictive Coding (LPC) for feature extraction. The same text produces different speech patterns, and this also happens when the text is read by a speaker for whom Arabic is not the mother tongue. The training data in this study are recordings of Arabic speakers. The extracted features are classified with a Hidden Markov Model (HMM): the voice signal is analyzed and searched for the most likely value that can be recognized. The parameters obtained from modeling are then compared with the speech of Arabic speakers. In testing, HMM classification with LPC feature extraction achieved an average accuracy of 78.6% for test data sampled at 8,000 Hz, 80.2% for test data sampled at 22,050 Hz, and 79% for test data sampled at 44,100 Hz.
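
The abstract names Linear Predictive Coding as the feature extraction step but gives no implementation details. The following is a minimal sketch of one common way to compute LPC coefficients for a single speech frame (the autocorrelation method with the Levinson-Durbin recursion). The function names, the frame, and the model order are assumptions for illustration; the study's actual frame length, windowing, and order are not stated.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the autocorrelation of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Levinson-Durbin recursion (illustrative helper, not from the paper).

    Returns (coeffs, error) where coeffs[k-1] is a_k in the predictor
    frame[t] ~ sum_k a_k * frame[t - k], and error is the residual energy.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)          # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error == 0:               # silent frame: nothing to predict
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error              # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error


# Toy frame (assumption): a decaying exponential, which a first-order
# predictor with coefficient close to 0.5 describes almost exactly.
frame = [0.5 ** t for t in range(30)]
coeffs, residual = lpc(frame, order=2)
```

In a pipeline like the one described, a coefficient vector of this kind would be computed per frame and the resulting sequence passed to the HMM classifier.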

Hidden Markov Model and its Application in Natural Language Processing

Information Technology Journal ◽

10.3923/itj.2013.4256.4261 ◽

2013 ◽

Vol 12 (17) ◽

pp. 4256-4261

Author(s):

Xuexia Gao ◽

Nan Zhu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov

Part-Of Speech Tagging Base on Hidden Markov Model

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.198-199.852 ◽

2012 ◽

Vol 198-199 ◽

pp. 852-855

Author(s):

Xi Jie Wang ◽

Shun Yi Hu

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Viterbi Algorithm ◽

Hidden Markov ◽

Estimation Method ◽

Basic Principles ◽

Part Of Speech Tagging ◽

Part Of Speech ◽

Speech Tagging

Part-of-speech tagging is a fundamental problem in natural language processing. The paper introduces the representation of the Hidden Markov Model (HMM) and the problems it needs to solve, then discusses the parameter estimation method for the HMM and examines the basic principles of part-of-speech tagging using the Viterbi algorithm.
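
As an illustration of the Viterbi decoding step this abstract refers to, here is a minimal sketch in Python. It is not the paper's implementation: the tag set, the probability tables, and the `floor` value used for unseen events are all toy assumptions.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under an HMM.

    Probabilities are combined in log space; events missing from a table
    get the small `floor` probability (an assumption, not real smoothing).
    """
    if not words:
        return []

    def lp(prob):
        return math.log(max(prob, floor))

    # best[t][tag] = (log prob of best path ending in tag at t, previous tag)
    best = [{tag: (lp(start_p.get(tag, 0.0))
                   + lp(emit_p.get(tag, {}).get(words[0], 0.0)), None)
             for tag in tags}]
    for word in words[1:]:
        column = {}
        for tag in tags:
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p.get(p, {}).get(tag, 0.0)))
                 for p in tags),
                key=lambda pair: pair[1])
            column[tag] = (score + lp(emit_p.get(tag, {}).get(word, 0.0)), prev)
        best.append(column)

    # Backtrace from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model (all numbers are made up for the example).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"NOUN": 0.9, "VERB": 0.05, "DET": 0.05},
    "NOUN": {"VERB": 0.6, "NOUN": 0.3, "DET": 0.1},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.6, "dog": 0.05},
}
print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
# prints ['DET', 'NOUN', 'VERB']
```

Working in log space avoids numerical underflow on long sentences, which is why the sketch adds log probabilities instead of multiplying raw ones.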

A model for the acoustic phonetic structure of Arabic language using a single ergodic hidden Markov model

Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96 ◽

10.1109/icslp.1996.607120 ◽

2002 ◽

Cited By ~ 1

Author(s):

M.A. Mokhtar ◽

A.Z. El-Abddin

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Hidden Markov ◽

Arabic Language

Review on Usage of Hidden Markov Model in Natural Language Processing

Smart Innovation, Systems and Technologies - Intelligent and Cloud Computing ◽

10.1007/978-981-15-5971-6_45 ◽

2020 ◽

pp. 415-423

Author(s):

Amrita Anandika ◽

Smita Prava Mishra ◽

Madhusmita Das

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov

Natural Language Processing Based Part of Speech Tagger using Hidden Markov Model

2019 Third International conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) ◽

10.1109/i-smac47947.2019.9032593 ◽

2019 ◽

Author(s):

Sindhya K Nambiar ◽

Antony Leons ◽

Soniya Jose ◽

Arunsree

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Part Of Speech

A Hidden Markov Model-based Part of Speech Tagger for Shekki’noono Language

International Journal of Computing ◽

10.47839/ijc.20.4.2448 ◽

2021 ◽

pp. 587-595

Author(s):

Alebachew Chiche ◽

Hiwot Kadi ◽

Tibebu Bekele

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Parts Of Speech ◽

Pos Tagging ◽

Part Of Speech ◽

Pos Tagger

Natural language processing plays a major role in providing an interface for human-computer communication. It enables people to talk with the computer in their formal language rather than machine language. This study presents a part-of-speech (POS) tagger that assigns a word class to each word in a given sentence. Researchers have developed POS taggers for languages such as English, Amharic, Afan Oromo, and Tigrigna, but many other languages, including Shekki’noono, still lack one. A POS tagger is a basic component of most natural language processing tools, such as machine translation and information extraction, so a language needs one before more advanced natural language applications can be built for it. Those applications enhance machine-to-machine, machine-to-human, and human-to-human communication. However, a POS tagger built for one language cannot be applied directly to another. To develop the Shekki’noono POS tagger, we used the stochastic Hidden Markov Model (HMM). For the study, we used 1500 sentences collected from different sources such as newspapers (covering social, economic, and political topics), modules, textbooks, radio programs, and bulletins. Language experts labeled each word in the collected sentences with its appropriate part of speech. In the experiments, the tagger was trained on the training sets using the HMM. HMM-based POS tagging achieved 92.77% accuracy for Shekki’noono, and the model is compared with previous HMM-based experiments in related work. As future work, the proposed approach can be evaluated on a larger corpus.
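
The abstract describes training an HMM tagger on expert-labeled sentences. One standard way to do this for a fully labeled corpus is relative-frequency estimation of the start, transition, and emission probabilities; whether the authors used exactly this procedure is not stated. The sketch below assumes that approach, and the tiny English corpus is a stand-in, not Shekki’noono data.

```python
from collections import Counter, defaultdict


def normalize(counter):
    """Turn raw counts into a probability distribution."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}


def estimate_hmm(tagged_sentences):
    """Estimate HMM tables by counting over (word, tag) sequences.

    Returns (start_p, trans_p, emit_p). No smoothing is applied, so
    unseen events simply have no entry (a simplification).
    """
    start = Counter()
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for sentence in tagged_sentences:
        prev = None
        for word, tag in sentence:
            emit[tag][word.lower()] += 1
            if prev is None:
                start[tag] += 1        # tag of the sentence-initial word
            else:
                trans[prev][tag] += 1  # tag bigram count
            prev = tag
    return (normalize(start),
            {tag: normalize(c) for tag, c in trans.items()},
            {tag: normalize(c) for tag, c in emit.items()})


# Hypothetical toy corpus for illustration only.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]
start_p, trans_p, emit_p = estimate_hmm(corpus)
```

With a corpus of 1500 sentences, many word and tag combinations never occur in training, so a practical tagger would add some form of smoothing on top of these raw estimates.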

Analisis Morfologi untuk Menangani Out-of-Vocabulary Words pada Part-of-Speech Tagger Bahasa Indonesia Menggunakan Hidden Markov Model

Jurnal Linguistik Komputasional (JLK) ◽

10.26418/jlk.v2i1.13 ◽

2019 ◽

Vol 2 (1) ◽

pp. 6 ◽

Cited By ~ 1

Author(s):

Febyana Ramadhanti ◽

Yudi Wibisono ◽

Rosa Ariani Sukamto

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Markov Model ◽

Hidden Markov Model ◽

Language Processing ◽

Hidden Markov ◽

Part Of Speech ◽

Pos Tagger ◽

Bahasa Indonesia

Part-of-speech (PoS) tagging is a task in natural language processing (NLP) that assigns a word category (part of speech) to each word in an input sentence. The hidden Markov model (HMM) is a probabilistic PoS tagging algorithm, so it depends heavily on the training corpus. The limited coverage of the training corpus and the breadth of the Indonesian vocabulary give rise to the problem of out-of-vocabulary (OOV) words. This study compares a PoS tagger that uses HMM+AM (AM stands for "analisis morfologi", morphological analysis) with an HMM PoS tagger without AM, using the same training corpus and testing corpus. The testing corpus has a 30% OOV rate over 6,676 tokens, or 740 input sentences. The HMM-only system achieved an accuracy of 97.54%, while the HMM system with morphological analysis achieved the highest accuracy, 99.14%.
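
The abstract does not say how the morphological analysis feeds into the HMM. One plausible arrangement, sketched below purely as an assumption, is to fall back to affix-based tag guesses when a word is missing from the training vocabulary. The two affix rules are simplified illustrations of common Indonesian patterns (for example, the verb-forming prefixes "me-" and "ber-"), not the rules used in the study, and the returned values for unknown words are pseudo-emission scores, not calibrated probabilities.

```python
# Illustrative affix rules (assumption): each pairs a test on the word
# with the tag it suggests. Rules are tried in order.
AFFIX_RULES = [
    (lambda w: w.startswith(("pe", "ke")) and w.endswith("an"), "NOUN"),
    (lambda w: w.startswith(("me", "ber", "di")), "VERB"),
]


def emission_score(word, tag, emit_p, vocabulary, tags):
    """Emission score for (word, tag) with an affix fallback for OOV words.

    Known words use the trained table. Unknown words that match an affix
    rule get score 1.0 for the suggested tag and 0.0 otherwise; unknown
    words with no matching rule get a uniform score over all tags.
    """
    word = word.lower()
    if word in vocabulary:
        return emit_p.get(tag, {}).get(word, 0.0)
    for matches, guessed_tag in AFFIX_RULES:
        if matches(word):
            return 1.0 if tag == guessed_tag else 0.0
    return 1.0 / len(tags)


# Toy tables (made up): "buku" (book) and "makan" (eat) are in vocabulary.
tags = ["NOUN", "VERB"]
emit_p = {"NOUN": {"buku": 1.0}, "VERB": {"makan": 1.0}}
vocabulary = {"buku", "makan"}

# "membaca" (to read) is OOV and starts with "me", so the VERB rule fires.
verb_score = emission_score("membaca", "VERB", emit_p, vocabulary, tags)
noun_score = emission_score("membaca", "NOUN", emit_p, vocabulary, tags)
```

A function like this would replace the plain emission lookup inside the decoder, so that OOV words no longer receive a zero score under every tag.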
