Acoustic Comparison of Electronically Disguised Voice Using Different Semitones

2018 ◽  
Vol 7 (2.16) ◽  
pp. 98 ◽  
Author(s):  
Mahesh K. Singh ◽  
A K. Singh ◽  
Narendra Singh

This paper presents an algorithm based on acoustic analysis of electronically disguised voice. The proposed work gives a comparative analysis of the acoustic features and their statistical coefficients. Acoustic features are computed with the Mel-frequency cepstral coefficient (MFCC) method, and the normal voice is compared with voice disguised by different semitone shifts. All acoustic features are passed through feature-based classifiers to determine the identification rate for each type of electronically disguised voice. Two classifiers, a support vector machine (SVM) and a decision tree (DT), are used for speaker identification and compared in terms of classification efficiency on voice disguised by different semitones.
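The MFCC feature extraction at the core of this approach can be sketched in numpy as follows; the function names, frame length, and filterbank parameters below are illustrative choices, not taken from the paper:

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale conversion used in MFCC pipelines.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr=16000, n_filters=26, n_ceps=13):
    """Compute MFCCs for a single windowed frame (illustrative sketch)."""
    n_fft = len(frame)
    spectrum = np.abs(np.fft.rfft(frame)) ** 2           # power spectrum
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    # Triangular mel filterbank spanning 0 Hz to Nyquist.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    fbank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    energies = np.log(fbank @ spectrum + 1e-10)          # log filterbank energies
    # Type-II DCT decorrelates the log energies into cepstral coefficients.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return dct @ energies

# Toy example: a 400-sample Hamming-windowed 440 Hz tone at 16 kHz.
t = np.arange(400) / 16000.0
frame = np.sin(2 * np.pi * 440.0 * t) * np.hamming(400)
coeffs = mfcc_frame(frame)
print(coeffs.shape)   # (13,)
```

A semitone-disguised version of the same voice would yield a shifted spectrum and hence different MFCC vectors, which is what the classifiers compare.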

2014 ◽  
Vol 14 (3) ◽  
pp. 689-714
Author(s):  
Suzanne Franks ◽  
Rommel Barbosa

This article studies the acoustic characteristics of some oral vowels in tonic syllables of Brazilian Portuguese (BP) and which acoustic features are important for classifying native versus non-native speakers of BP. We recorded native and non-native speakers of BP for the purpose of the acoustic analysis of the vowels [a], [i], and [u] in tonic syllables. We analyzed the acoustic parameters of each segment using the Support Vector Machines algorithm to identify to which group, native or non-native, a new speaker belongs. When all of the variables were considered, a precision of 91% was obtained. The two most important acoustic cues to determine if a speaker is native or non-native were the durations of [i] and [u] in a word-final position. These findings can contribute to BP speaker identification as well as to the teaching of the pronunciation of Portuguese as a foreign language.
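The role of the two duration cues can be illustrated with a minimal sketch; the duration values below are synthetic and the nearest-centroid rule is a simple stand-in for the SVM used in the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical durations (seconds) of word-final [i] and [u] for two groups.
# These numbers are invented for illustration, not measured data.
native = rng.normal(loc=[0.06, 0.05], scale=0.01, size=(20, 2))
non_native = rng.normal(loc=[0.11, 0.10], scale=0.01, size=(20, 2))

# Nearest-centroid classifier over the two duration features.
centroids = np.stack([native.mean(axis=0), non_native.mean(axis=0)])

def classify(durations):
    d = np.linalg.norm(centroids - durations, axis=1)
    return ["native", "non-native"][int(np.argmin(d))]

print(classify(np.array([0.055, 0.048])))   # near the native centroid
print(classify(np.array([0.120, 0.110])))   # near the non-native centroid
```

The point of the sketch is only that two well-separated duration features already support a simple decision rule; the paper's 91% precision comes from an SVM over the full feature set.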


Author(s):  
Jeena Augustine

Abstract: Emotion recognition from speech is one of the most important subdomains in the field of signal processing. In this work, our system is a two-stage approach, namely feature extraction and a classification engine. First, two feature sets are investigated: 39 Mel-frequency cepstral coefficients (MFCC) and 65 MFCC features extracted based on the work of [20]. Second, we use the Support Vector Machine (SVM) as the main classifier engine, since it is the most common technique in the field of speech recognition. Besides that, we investigate the importance of recent advances in machine learning, including deep kernel learning, as well as various types of auto-encoders (the basic auto-encoder and the stacked auto-encoder). A large set of experiments is conducted on the SAVEE audio database. The experimental results show that the DSVM technique outperforms the standard SVM, with classification rates of 69.84% and 68.25%, respectively, using 39 MFCC. In addition, the auto-encoder technique outperforms the standard SVM, yielding a classification rate of 73.01%. Keywords: Emotion recognition, MFCC, SVM, Deep Support Vector Machine, Basic auto-encoder, Stacked auto-encoder
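The basic auto-encoder mentioned above can be sketched in a few lines of numpy: a single tanh bottleneck trained by gradient descent on reconstruction error. The 39-dimensional inputs stand in for MFCC vectors but are synthetic, and the layer sizes and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for 39-dimensional MFCC feature vectors.
X = rng.normal(size=(100, 39))

# Basic auto-encoder: 39 -> 8 -> 39, tanh encoder, linear decoder,
# trained with plain gradient descent on mean squared reconstruction error.
W1 = rng.normal(scale=0.1, size=(39, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 39)); b2 = np.zeros(39)
lr = 0.01

def forward(X):
    H = np.tanh(X @ W1 + b1)     # bottleneck code
    return H, H @ W2 + b2        # reconstruction

_, R0 = forward(X)
loss0 = np.mean((R0 - X) ** 2)

for _ in range(200):
    H, R = forward(X)
    dR = 2.0 * (R - X) / X.shape[0]
    dW2 = H.T @ dR
    db2 = dR.sum(axis=0)
    dH = dR @ W2.T * (1.0 - H ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dH
    db1 = dH.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

_, R1 = forward(X)
loss1 = np.mean((R1 - X) ** 2)
print(loss1 < loss0)   # True: reconstruction error decreases with training
```

In an emotion-recognition pipeline the learned bottleneck code (or reconstruction features) would then feed the SVM/DSVM classifier; a stacked auto-encoder repeats this layer-wise.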


2012 ◽  
Vol 459 ◽  
pp. 518-522
Author(s):  
Min Ma

A significant portion of Chinese characters are phonograms, whose phonetic component can be used to infer the character's overall pronunciation. Phonetic degree is an inherent problem in this inference, because a low phonetic degree implies little phonetic dependence between a phonogram and its phonetic component. Solving the phonetic-degree problem requires associating each phonogram with its acoustic features. This paper introduces acoustic feature-based clustering, a classification model that partitions the common phonograms by defining a new similarity measure over their sounds. This allows the phonetic degree to be evaluated more reasonably. We demonstrate that the clustering outperforms the traditional empirical estimation, with more accurate and realistic expressiveness. Acoustic feature-based clustering yields a phonetic degree of 48.6%, lower than the empirical claim of around 75%. As a clustering classifier, our model is competitive, with a much clearer boundary on the phonogram dataset.
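The clustering step can be illustrated with plain k-means over toy acoustic feature vectors; the features and the Euclidean similarity below are stand-ins, since the paper's own similarity measure is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy acoustic feature vectors for phonograms: two synthetic sound groups.
group_a = rng.normal(loc=0.0, scale=0.5, size=(30, 4))
group_b = rng.normal(loc=3.0, scale=0.5, size=(30, 4))
X = np.vstack([group_a, group_b])

def kmeans(X, k=2, iters=20):
    """Plain k-means: group phonograms by acoustic-feature similarity."""
    centers = X[[0, 30]]                       # simple deterministic init
    for _ in range(iters):
        # Assign each point to its nearest center (Euclidean distance).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its cluster.
        centers = np.stack([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

labels, centers = kmeans(X)
# The two synthetic sound groups should land in two separate clusters.
print(len(set(labels[:30])) == 1 and len(set(labels[30:])) == 1)   # True
```

Given such clusters, the phonetic degree can then be estimated as the fraction of phonograms whose pronunciation falls in the same cluster as their phonetic component.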


2016 ◽  
Vol 13 (10) ◽  
pp. 6616-6627
Author(s):  
B Kanisha ◽  
G Balakrishnan

Speech recognition applications are emerging as ever-growing and efficient mechanisms in the hi-tech universe, and a host of diverse interactive speech-aware applications is already on the market. With the rocketing requirement for upcoming embedded platforms and the incredible increase in demand for embedded computing, it is highly indispensable that speech recognition systems (SRS) are put in place at the right time and in the proper form, so that multimedia tasks can easily be performed on these mechanisms. In this work, the speech signal is first preprocessed to detach the noise; feature extraction then locates the peak signal frequency, which is compared with the standard signal for recognition. The noise-free signal is passed through Mel-frequency cepstral coefficient (MFCC), tri-spectral, and discrete wavelet transform (DWT) feature extraction, and the outputs of these features are given as input to a multi-class support vector machine (SVM), which converts the processed signal into text. By comparing the existing feed-forward back-propagation (FFBN) technique with the proposed technique, we show that the proposed technique performs better. The proposed technique is implemented in MATLAB.
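The DWT stage in the pipeline can be sketched as a single-level Haar transform (an illustrative choice; the abstract does not specify the wavelet or decomposition depth):

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform, returning the
    approximation (low-pass) and detail (high-pass) coefficients."""
    x = np.asarray(signal, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # smooth trend of the signal
    detail = (even - odd) / np.sqrt(2.0)   # local fluctuations / edges
    return approx, detail

# A constant signal has all its energy in the approximation band:
approx, detail = haar_dwt([4.0, 4.0, 4.0, 4.0])
print(np.allclose(detail, 0.0))   # True: no high-frequency detail
```

In the full system, such wavelet coefficients would be concatenated with the MFCC and tri-spectral features before being fed to the multi-class SVM.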

