Dimensionality reduction for voice disorders identification system based on Mel Frequency Cepstral Coefficients and Support Vector Machine

Birds are excellent environmental indicators and may indicate the sustainability of an ecosystem; they may also provide provisioning, regulating, and supporting services. Research on bird conservation therefore receives considerable attention. Because birds are airborne and tropical forest is dense, identifying birds by audio may be a better solution than visual identification. The goal of this study is to find the most appropriate cepstral features for classifying bird sounds accurately. Sounds of fifteen (15) endemic Bornean birds were selected and segmented using an automated energy-based algorithm. Three (3) types of cepstral features were extracted: linear prediction cepstrum coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and gammatone frequency cepstral coefficients (GTCC). Each feature type was used separately for classification with a support vector machine (SVM). A comparison of the prediction results shows that the model using GTCC features, with 93.3% accuracy, outperforms the models using MFCC and LPCC features. This demonstrates the robustness of GTCC for bird sound classification. The result is significant for the advancement of bird sound classification research, which has applications in areas such as eco-tourism and wildlife management.
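The abstract does not say how the energy-based segmentation works, so the following is only a minimal sketch of one common approach: frames whose short-time energy exceeds a fraction of the peak frame energy are treated as bird sound. The function name `segment_by_energy`, the frame length, and the threshold ratio are assumptions chosen for illustration, not the authors' settings.

```python
def segment_by_energy(samples, frame_len=256, threshold_ratio=0.1):
    """Return (start, end) sample index pairs for high-energy regions.

    A frame counts as active when its energy exceeds threshold_ratio
    times the largest frame energy in the signal (an assumed rule).
    """
    n_frames = len(samples) // frame_len
    # Short-time energy: sum of squared samples in each frame.
    energies = [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]
    if not energies or max(energies) == 0:
        return []
    threshold = threshold_ratio * max(energies)

    segments = []
    start = None
    for i, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = i * frame_len              # an active region begins
        elif energy <= threshold and start is not None:
            segments.append((start, i * frame_len))  # the region ends
            start = None
    if start is not None:                      # signal ended while active
        segments.append((start, n_frames * frame_len))
    return segments
```

Each returned pair marks a candidate syllable or call from which cepstral features could then be extracted.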


Comparison of feature extraction and normalization methods for speaker recognition using grid-audiovisual database

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i2.pp782-789 ◽

2020 ◽

Vol 18 (2) ◽

pp. 782

Author(s):

Musab T. S. Al-Kaltakchi ◽

Haithem Abd Al-Raheem Taha ◽

Mohanad Abd Shehab ◽

Mohamed A.M. Abdullah

Keyword(s):

Feature Extraction ◽

Speaker Recognition ◽

Speaker Identification ◽

Gaussian Mixture ◽

Identification Accuracy ◽

Identification System ◽

Good Representation ◽

Mel Frequency Cepstral Coefficients ◽

Normalization Methods ◽

Cepstral Coefficients

In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. To give a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate the effect of the linear channel, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The paper investigates a text-independent speaker identification system using 16 coefficients from both the MFCC and PNCC features. Eight speakers, two female and six male, are selected from the GRID-Audiovisual database. The speakers are modeled by coupling Gaussian Mixture Models with a Universal Background Model (GMM-UBM) in order to obtain fast scoring and better performance. The system achieves 100% speaker identification accuracy. The results show that PNCC features outperform MFCC features in identifying female speakers compared with male speakers. Furthermore, feature warping performed better than the CMVN method.
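As a rough illustration of the CMVN step named above, the sketch below normalizes each cepstral coefficient to zero mean and unit variance across the frames of one utterance. The function name `cmvn`, the list-of-frames layout, and the `eps` guard against division by zero are assumptions for illustration, not the authors' implementation.

```python
import math


def cmvn(frames, eps=1e-10):
    """Per-utterance cepstral mean-variance normalization.

    frames: list of feature vectors (one list of floats per frame).
    Returns a new list in which every coefficient has zero mean and
    approximately unit variance across the frames.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over all frames.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    # Population standard deviation of each coefficient.
    stds = [
        math.sqrt(sum((f[k] - means[k]) ** 2 for f in frames) / n_frames)
        for k in range(n_coeffs)
    ]
    return [
        [(f[k] - means[k]) / (stds[k] + eps) for k in range(n_coeffs)]
        for f in frames
    ]
```

Subtracting the mean removes a constant offset in the cepstral domain, which is how a fixed linear channel shows up; dividing by the standard deviation additionally equalizes the scale of each coefficient.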


Emotion Recognition in Speech Using with SVM, DSVM and Auto-Encoder

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37545 ◽

2021 ◽

Vol 9 (8) ◽

pp. 1021-1026

Author(s):

Jeena Augustine

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Emotion Recognition ◽

Support Vector ◽

Mel Frequency Cepstral Coefficients ◽

Classification Rate ◽

Signal Process ◽

Common Technique ◽

Audio Information ◽

Deep Support

Abstract: Emotion recognition from speech is one of the most important subdomains in the field of signal processing. In this work, our system is a two-stage approach, namely feature extraction and a classification engine. Firstly, two sets of features are investigated: thirty-nine Mel-frequency cepstral coefficients (MFCC) and sixty-five MFCC features extracted based on the work of [20]. Secondly, we use the Support Vector Machine (SVM) as the main classifier engine, since it is the most common technique in the field of speech recognition. Besides that, we investigate the importance of recent advances in machine learning, including deep kernel learning, as well as various types of auto-encoders (the basic auto-encoder and the stacked auto-encoder). A large set of experiments is conducted on the SAVEE audio database. The experimental results show that the DSVM method outperforms the standard SVM, with classification rates of 69.84% and 68.25%, respectively, using thirty-nine MFCC. In addition, the auto-encoder method outperforms the standard SVM, yielding a classification rate of 73.01%. Keywords: Emotion recognition, MFCC, SVM, Deep Support Vector Machine, Basic auto-encoder, Stacked Auto encode


Physical-oriented and machine learning-based emission modeling in a diesel compression ignition engine: Dimensionality reduction and regression

International Journal of Engine Research ◽

10.1177/14680874211070736 ◽

2022 ◽

pp. 146808742110707

Author(s):

Aran Mohammad ◽

Reza Rezaei ◽

Christopher Hayduk ◽

Thaddaeus Delebinski ◽

Saeid Shahpouri ◽

...

Keyword(s):

Principal Component Analysis ◽

Support Vector Machine ◽

Factor Analysis ◽

Dimensionality Reduction ◽

Principal Component ◽

Component Analysis ◽

Data Driven ◽

Support Vector ◽

Emission Models ◽

Emission Modeling

The development of internal combustion engines is affected by exhaust gas emissions legislation and the striving to increase performance. This calls for engine-out emission models that can be used for engine optimization under real driving emission controls. The prediction capability of physical and data-driven engine-out emission models is influenced by the system inputs, which are specified by the user; accuracy can improve as the number of inputs increases. However, irrelevant inputs then become more probable; these have a low functional relation to the emissions and can lead to overfitting. Alternatively, data-driven methods can be used to detect irrelevant and redundant inputs. In this work, thermodynamic states are modeled based on 772 stationary test bench measurements from a commercial vehicle diesel engine. Afterward, 37 measured and modeled variables are fed into a data-driven dimensionality reduction. For this purpose, supervised learning approaches, such as lasso regression and the linear support vector machine, and unsupervised learning methods, such as principal component analysis and factor analysis, are applied to select and extract the relevant features. The selected and extracted features are used for regression by the support vector machine and the feedforward neural network to model the NOx, CO, HC, and soot emissions. This enables an evaluation of the modeling accuracy as a result of the dimensionality reduction. Using the methods in this work, the 37 variables are reduced to 25, 22, 11, and 16 inputs for NOx, CO, HC, and soot emission modeling, respectively, while maintaining the accuracy. The features selected using the lasso algorithm provide more accurate learning of the regression models than the features extracted through principal component analysis and factor analysis. This results in test errors RMSETe for modeling NOx, CO, HC, and soot emissions of 19.22 ppm, 6.46 ppm, 1.29 ppm, and 0.06 FSN, respectively.
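To make the lasso-based feature selection mentioned above concrete, here is a minimal sketch using cyclic coordinate descent on the objective (1/(2n))·||y − Xw||² + α·||w||₁. It assumes standardized inputs and no intercept, and the names `soft_threshold` and `lasso_select`, the value of `alpha`, and the iteration count are hypothetical choices, not the settings used in the paper.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha; return 0.0 inside [-alpha, alpha]."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def lasso_select(X, y, alpha=0.1, n_iter=100):
    """Fit lasso by cyclic coordinate descent (no intercept).

    X: list of rows (one list of floats per sample), y: list of targets.
    Returns (weights, indices of features with nonzero weight).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            # Correlation of feature j with the residual that excludes j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            # Features weakly related to the residual are set exactly to zero.
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    selected = [j for j in range(d) if w[j] != 0.0]
    return w, selected
```

Inputs whose weight is driven exactly to zero are the ones the method treats as irrelevant; the remaining indices would then be passed on to the regression model.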


Acoustic comparison of electronics disguised voice using Different semitones

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.16.11502 ◽

2018 ◽

Vol 7 (2.16) ◽

pp. 98 ◽

Cited By ~ 2

Author(s):

Mahesh K. Singh ◽

A K. Singh ◽

Narendra Singh

Keyword(s):

Support Vector Machine ◽

Acoustic Analysis ◽

Speaker Identification ◽

Support Vector ◽

Acoustic Features ◽

Acoustic Feature ◽

Mel Frequency Cepstral Coefficients ◽

Identification Rate ◽

Normal Voice ◽

Feature Based

This paper presents an algorithm based on acoustic analysis of electronically disguised voice. The proposed work gives a comparative analysis of all acoustic features and their statistical coefficients. Acoustic features are computed with the Mel-frequency cepstral coefficients (MFCC) method and compared between the normal voice and voices disguised by different semitones. All acoustic features are passed through a feature-based classifier to determine the identification rate for each type of electronically disguised voice. Two types of classifier, support vector machine (SVM) and decision tree (DT), are used for speaker identification and compared in terms of classification efficiency on voices electronically disguised by different semitones.
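For readers unfamiliar with semitone-based disguise, the sketch below shows the standard equal-temperament relation: shifting by n semitones scales frequency by 2^(n/12). The abstract does not describe the disguising tool, so the function names `semitone_ratio` and `shift_frequency` are illustrative assumptions only.

```python
def semitone_ratio(semitones):
    """Frequency scaling factor for a shift of `semitones` (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def shift_frequency(freq_hz, semitones):
    """Frequency in Hz after raising (or lowering, if negative) by `semitones`."""
    return freq_hz * semitone_ratio(semitones)
```

A shift of +12 semitones doubles the fundamental frequency and a shift of -12 halves it, which gives a sense of how strongly larger shifts could change the pitch-related content of the voice.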
