Integration of hidden Markov models in the automated speaker recognition system for critical use

2019 ◽  
Vol 1 (4) ◽  
pp. 178-182
Author(s):  
Vjatcheslav KOVTUN
Author(s):  
Dea Sifana Ramadhina ◽  
Rita Magdalena ◽  
Sofia Saidah

Voice is one of the parameters used to identify a person. A voice conveys information such as the gender, age, and even identity of the speaker. Speaker recognition is a method for narrowing down crimes and frauds committed by voice, and thus reduces the occurrence of identity faking. The Mel Frequency Cepstrum Coefficient (MFCC) method can be used in a speech recognition system. Feature extraction with MFCC produces the acoustic features of the speech signal. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker's voice against the voices in the database. In this research, the system is used for speaker verification, namely 15 text-dependent cases in Indonesian. When tested on speakers who are also in the database, the highest accuracy is 99.16%.
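The matching step described in this abstract can be illustrated with a minimal sketch. It assumes one discrete HMM per enrolled speaker and MFCC frames that have already been quantized to integer symbols; the abstract does not say whether the authors used discrete or continuous models, and the names `forward_log_likelihood`, `identify_speaker`, and `toy_models` are hypothetical. The unknown utterance is scored against every model with the scaled forward algorithm, and the best-scoring speaker is returned.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-probability of a discrete observation sequence under one HMM,
    computed with the scaled forward algorithm."""
    n_states = len(start)
    # Initialization: start probability times emission of the first symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_lik

def identify_speaker(obs, models):
    """Return the name of the speaker model with the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Toy models (illustrative values only): (start, transition, emission).
toy_models = {
    "speaker_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.8, 0.2], [0.6, 0.4]]),
    "speaker_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.2, 0.8], [0.4, 0.6]]),
}

best_match = identify_speaker([0, 0, 1, 0, 0], toy_models)
```

In a verification setting, the same score would instead be compared against a threshold for the claimed speaker's model; how that threshold is chosen is not covered by the abstract.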


2021 ◽  
Vol 8 (1) ◽  
pp. 119
Author(s):  
Syahroni Hidayat ◽  
Andi Sofyan Anas ◽  
Siti Agrippina Alodia Yusuf ◽  
Muhammad Tajuddin

Research in digital signal processing focused on speaker recognition began several decades ago and has produced many speaker recognition methods. Among the feature extraction algorithms that have been developed, two can provide high accuracy in a recognition system: Mel Frequency Cepstral Coefficient (MFCC) and Wavelet. The aim of this study is to examine and choose the best channel from the wavelet-MFCC process to be used as a new feature coefficient in a speaker recognition system (SRS), called the Wavelet-MFCC feature coefficient. The coefficient is built by converting the wavelet decomposition channels, namely approximation (cA), detail (cD), and their combination (cAcD), into MFCC coefficients. Dyadic wavelet decomposition at level 1 and level 2 is applied. Each feature coefficient serves as an input to the Hidden Markov Models (HMM) classifier. The accuracy of the HMM output is then calculated and analyzed. The results show that the detail channel (cD) achieves the same accuracy as the combined channel (cAcD), and higher accuracy than the approximation channel (cA), with an accuracy of 95%. It can therefore be concluded that the detail channel at level 1 decomposition contains the voice features of each speaker, so cD alone is sufficient as a Wavelet-MFCC feature. Using level 1 decomposition and the detail channel cD in the SRS can thus lighten and speed up the computing process.
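The channel split that this abstract relies on can be sketched as follows. This is a minimal illustration, assuming the Haar wavelet (the abstract does not name the wavelet family), padding of odd-length input by repeating the last sample, and cAcD formed by simple concatenation; the function names are hypothetical. The MFCC step that the authors then apply to each channel is not shown.

```python
import math

def haar_dwt_level1(signal):
    """One dyadic Haar decomposition step: returns (cA, cD)."""
    samples = list(signal)
    if len(samples) % 2:
        samples.append(samples[-1])  # assumption: pad odd lengths with the last sample
    norm = 1.0 / math.sqrt(2.0)
    pairs = [(samples[i], samples[i + 1]) for i in range(0, len(samples), 2)]
    approximation = [(a + b) * norm for a, b in pairs]  # cA: low-pass half
    detail = [(a - b) * norm for a, b in pairs]         # cD: high-pass half
    return approximation, detail

def dyadic_decompose(signal, levels):
    """Repeat the split on the approximation channel `levels` times.
    Returns the final cA and a list of cD channels, one per level."""
    approximation = list(signal)
    details = []
    for _ in range(levels):
        approximation, detail = haar_dwt_level1(approximation)
        details.append(detail)
    return approximation, details

# Level 1 channels of a toy signal, plus the combined channel (cAcD).
cA, cD = haar_dwt_level1([4.0, 6.0, 10.0, 12.0])
cAcD = cA + cD
```

Each level halves the number of samples, which is consistent with the abstract's point that using a single level 1 channel keeps the subsequent feature computation light.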


2022 ◽  
pp. 629-647
Author(s):  
Yosra Abdulaziz Mohammed

Infant cries can be seen as an indicator of pain. It has been shown that crying caused by pain, hunger, fear, stress, etc. exhibits different cry patterns. The work presented here is a comparative study of the performance of two classification techniques implemented in an automatic classification system for identifying two types of infant cries: pain and non-pain. The techniques are Continuous Density Hidden Markov Models (CDHMM) and Artificial Neural Networks (ANN). Two sets of acoustic features, MFCC and LPCC, were extracted from the cry samples, and the feature vectors generated by each were fed into the classification module for training and testing. The results show that the CDHMM-based system performs better than the ANN-based one: CDHMM gives the best identification rate, 96.1%, which is much higher than the 79% achieved by ANN. In general, the system based on MFCC features performed better than the one using LPCC features.

