Voice-Based Human Gender Recognition Using MFCC and GMM (Pengenalan Jenis Kelamin Manusia Berbasis Suara Menggunakan MFCC dan GMM)

Author(s):  
Faisal Dharma Adhinata ◽  
Diovianto Putra Rakhmadani ◽  
Alon Jala Tirta Segara

Human biometric information is unique to each individual. One of the most easily obtained kinds of biometric data is the human voice, which carries identifying characteristics that distinguish one individual from another. When we hear a voice directly, our ears can usually tell whether the speaker is male or female, but male voices can sometimes resemble female voices and vice versa. We therefore propose a system that detects gender from the human voice using machine learning, a branch of Artificial Intelligence (AI). In this study, we used Mel Frequency Cepstrum Coefficients (MFCC) to extract features from the voice and Gaussian Mixture Models (GMM) to classify the voice data as female or male. The experimental results showed that the system was able to detect human gender from biometric voice data with an accuracy of 81.18%.
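One common way to classify with GMMs, which the abstract does not spell out and which is assumed here, is to fit one mixture per class and assign an utterance to the class whose mixture gives its MFCC frames the higher total log-likelihood. The sketch below shows only that decision rule, using diagonal covariances. The names `gmm_log_likelihood`, `classify`, and `TOY_MODELS`, and all parameter values, are hypothetical; real models would be trained on MFCC vectors extracted from labelled recordings.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, weights, means, variances):
    """Total log-likelihood of a sequence of feature frames under one GMM."""
    total = 0.0
    for x in frames:
        # Per-component log(weight * density), combined with log-sum-exp
        # to avoid underflow.
        comps = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total


def classify(frames, models):
    """Return the label whose GMM scores the frames highest."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, *models[label]))


# Hypothetical 2-component, 2-dimensional models (weights, means, variances).
# These numbers are made up for illustration, not taken from the study.
TOY_MODELS = {
    "male": ([0.5, 0.5], [[-1.0, 0.0], [-2.0, 0.5]], [[1.0, 1.0], [1.0, 1.0]]),
    "female": ([0.5, 0.5], [[1.0, 0.0], [2.0, -0.5]], [[1.0, 1.0], [1.0, 1.0]]),
}

if __name__ == "__main__":
    print(classify([[1.2, 0.1], [1.8, -0.3]], TOY_MODELS))
```

Summing per-frame log-likelihoods treats frames as independent, which is the usual simplification for GMM-based speaker and gender models.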

2006 ◽  
Vol 27 (10) ◽  
pp. 935-951 ◽  
Author(s):  
Felicity R Allen ◽  
Eliathamby Ambikairajah ◽  
Nigel H Lovell ◽  
Branko G Celler

2021 ◽  
Vol 921 (2) ◽  
pp. 106
Author(s):  
Farnik Nikakhtar ◽  
Robyn E. Sanderson ◽  
Andrew Wetzel ◽  
Sarah Loebman ◽  
Sanjib Sharma ◽  
...  

2019 ◽  
Vol 105 (6) ◽  
pp. 1269-1277 ◽  
Author(s):  
Yousef A. Alotaibi ◽  
Sid-Ahmed Selouani ◽  
Mohammed Sidi Yakoub ◽  
Yasser Mohammed Seddiq ◽  
Ali Meftah

The robustness of speech classification and recognition systems can be improved by adopting language distinctive phonetic feature (DPF) elements, which can characterize a speech signal more effectively. This paper presents the results of applying Hidden Markov Models (HMMs) that perform Arabic phoneme recognition together with the inclusion and classification of their DPF element classes. The research focuses on classifying Modern Standard Arabic (MSA) phonemes within isolated words without a language context. HMM-based phoneme recognition is tested using 8, 16, and 32 HMM Gaussian mixture models. The monophone configuration is designed with consideration of a 2-gram language model to evaluate the inherent performance of the system. The overall correct rates for classifying DPF element classes are 83.29%, 88.96%, and 92.70% for the 8, 16, and 32 HMM Gaussian mixture model systems, respectively.


2021 ◽  
Vol 11 (15) ◽  
pp. 7149
Author(s):  
Ji-Yeoun Lee

This work focuses on deep learning methods, namely the feedforward neural network (FNN) and the convolutional neural network (CNN), for pathological voice detection using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), and higher-order statistics (HOSs) parameters. In total, 518 voice data samples were obtained from the publicly available Saarbruecken voice database (SVD), comprising recordings from 259 healthy and 259 pathological speakers, both women and men, producing the /a/, /i/, and /u/ vowels at normal pitch. Significant differences were observed between the normal and the pathological voice signals for normalized skewness (p = 0.000) and kurtosis (p = 0.000); the exception was normalized kurtosis estimated from the /u/ samples in women (p = 0.051). These parameters are useful and meaningful for classifying pathological voice signals. The highest accuracy, 82.69%, was achieved by the CNN classifier with the LPCCs parameter for the /u/ vowel in men. The second-best performance, 80.77%, was obtained by combining the FNN classifier, MFCCs, and HOSs for the /i/ vowel samples in women. Combining the acoustic measures with HOS parameters proved beneficial for characterization in terms of accuracy. Combining various parameters with deep learning methods was also useful for distinguishing normal from pathological voices.
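The higher-order statistics mentioned above can be illustrated with the standard moment-based definitions: skewness as the third central moment divided by the variance to the power 1.5, and kurtosis as the fourth central moment divided by the squared variance. The abstract does not give the exact normalization the author used, so the formulas and the function names below are assumptions for illustration.

```python
def central_moment(samples, order):
    """Population central moment of the given order."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** order for s in samples) / len(samples)


def normalized_skewness(samples):
    """Third central moment scaled by variance**1.5 (0 for symmetric data)."""
    m2 = central_moment(samples, 2)
    # Raises ZeroDivisionError for a constant signal (zero variance).
    return central_moment(samples, 3) / m2 ** 1.5


def normalized_kurtosis(samples):
    """Fourth central moment scaled by variance**2 (about 3 for Gaussian data)."""
    m2 = central_moment(samples, 2)
    return central_moment(samples, 4) / m2 ** 2


if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 4.0, 5.0]  # stand-in for one frame of voice samples
    print(normalized_skewness(frame), normalized_kurtosis(frame))
```

In a pipeline like the one described, such values would typically be computed per frame and appended to the MFCC or LPCC feature vector before classification.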


2019 ◽  
Vol 16 (8) ◽  
pp. 3410-3418
Author(s):  
Muhammed Shuaau ◽  
Ka Fei Thang

Autonomous anomaly detection has attracted a significant amount of attention in the past decade due to increased security concerns around the world. The volume of data produced by surveillance cameras has outrun human capacity to review it, and there is a growing need for anomaly detection systems for crime monitoring. This project proposes a solution to this problem in a reception-area context using trajectory analysis. Trajectories are extracted using Gaussian Mixture Models, with a Kalman Filter for data association. Trajectory analysis is then performed on the extracted trajectories to detect four anomalies: entering the staff area, running, loitering, and squatting down. The proposed anomaly detection method is tested on datasets recorded at Asia Pacific University’s reception area. The proposed algorithms achieved a detection accuracy of 89% and a false positive rate of 4.52%. The results show the effectiveness of the proposed method.
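As a rough illustration of rule-based trajectory analysis, the sketch below flags three of the four anomalies from an already extracted track of per-frame (x, y) positions. The rules, thresholds, zone format, and function names are all hypothetical and are not taken from the paper; squatting down is omitted because it would need bounding-box height rather than position alone.

```python
import math


def average_speed(track, fps):
    """Mean speed in pixels per second over per-frame (x, y) positions."""
    if len(track) < 2:
        return 0.0
    distance = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    return distance * fps / (len(track) - 1)


def flag_anomalies(track, fps, staff_zone, run_speed=200.0, loiter_seconds=60.0):
    """Return the set of anomaly labels triggered by one trajectory.

    staff_zone is an axis-aligned rectangle (x_min, y_min, x_max, y_max).
    run_speed and loiter_seconds are illustrative thresholds (assumed values).
    """
    flags = set()
    if average_speed(track, fps) > run_speed:
        flags.add("running")
    # Crude loitering rule: the person stays in view longer than the limit.
    if len(track) / fps > loiter_seconds:
        flags.add("loitering")
    x_min, y_min, x_max, y_max = staff_zone
    if any(x_min <= x <= x_max and y_min <= y <= y_max for x, y in track):
        flags.add("entering staff area")
    return flags


if __name__ == "__main__":
    fast_track = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
    print(flag_anomalies(fast_track, fps=30, staff_zone=(100, 100, 200, 200)))
```

A real system would tune such thresholds on recorded footage and would apply the rules to Kalman-smoothed tracks rather than raw detections.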

