Speaker-Independent Isolated Word Recognition using HTK for Varhadi – a Dialect of Marathi

Speech recognition is widely used in computer science to enable efficient communication between humans and computers. This paper addresses the problem of speech recognition for Varhadi, a regional language of the state of Maharashtra in India. Varhadi is widely spoken in Maharashtra, especially in the Vidarbha region. The Viterbi algorithm is used to recognize unknown words with Hidden Markov Models (HMMs). The dataset developed to train the system consists of 83 isolated Varhadi words. Mel-frequency cepstral coefficients (MFCCs) are used as features for the acoustic analysis of the speech signal. A word model is implemented in speaker-independent mode for the proposed Varhadi automatic speech recognition system (V-ASR). The training and test datasets consist of isolated words uttered by 8 native speakers of Varhadi. The V-ASR system recognized the Varhadi words satisfactorily, with a recognition performance of 92.77%.
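The recognition step described above, scoring an unknown utterance against one HMM per word with the Viterbi algorithm and choosing the best-scoring word, can be sketched as follows. This is a minimal illustration, not the paper's HTK setup: it assumes discrete observation symbols instead of continuous MFCC vectors, and the two toy models (`word_a`, `word_b`) and their probabilities are hypothetical.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF


def viterbi_log_score(obs, start, trans, emit):
    """Return the log probability of the single best state path for obs."""
    n_states = len(start)
    # delta[s] holds the best log score of any path that ends in state s.
    delta = [safe_log(start[s]) + safe_log(emit[s][obs[0]]) for s in range(n_states)]
    for symbol in obs[1:]:
        delta = [
            max(delta[prev] + safe_log(trans[prev][s]) for prev in range(n_states))
            + safe_log(emit[s][symbol])
            for s in range(n_states)
        ]
    return max(delta)


def recognize(obs, word_models):
    """Pick the word whose HMM gives the highest Viterbi score for obs."""
    return max(word_models, key=lambda w: viterbi_log_score(obs, *word_models[w]))


# Hypothetical two-state left-to-right models over the symbols {0, 1}.
# Each entry is (start probabilities, transition matrix, emission matrix).
word_models = {
    "word_a": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "word_b": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}

best_word = recognize([0, 0, 1, 1], word_models)
```

Working in the log domain avoids numerical underflow on long observation sequences, which is why the scores are summed rather than multiplied.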

2020 ◽  
Vol 9 (1) ◽  
pp. 2182-2187

This paper addresses the application of a single-word automatic speech recognition system (ASRS) to Odia, an Indian regional language. The toolkit used is based on the Hidden Markov Model (HMM). The data were obtained from 8 Odia speakers, and the system was then trained on 205 different Odia words. Samples were collected again from six separate speakers, and the system was evaluated on them in real time. A GUI, developed as a Java application, was created to make the system more interactive and served as the test framework. A comprehensive model of an ASR framework was developed to explain each HTK resource using HTK library modules and tools. The experimental results indicate an overall system performance of 93.45%.


2020 ◽  
Vol 24 ◽  
pp. 233121652093892
Author(s):  
Marc R. Schädler ◽  
David Hülsmeier ◽  
Anna Warzybok ◽  
Birger Kollmeier

The benefit in speech-recognition performance due to the compensation of a hearing loss can vary between listeners, even if unaided performance and hearing thresholds are similar. To accurately predict the individual performance benefit due to a specific hearing device, a prediction model is proposed which takes into account hearing thresholds and a frequency-dependent suprathreshold component of impaired hearing. To test the model, the German matrix sentence test was performed in unaided and individually aided conditions in quiet and in noise by 18 listeners with different degrees of hearing loss. The outcomes were predicted by an individualized automatic speech-recognition system where the individualization parameter for the suprathreshold component of hearing loss was inferred from tone-in-noise detection thresholds. The suprathreshold component was implemented as a frequency-dependent multiplicative noise (mimicking level uncertainty) in the feature-extraction stage of the automatic speech-recognition system. Its inclusion improved the root-mean-square prediction error of individual speech-recognition thresholds (SRTs) from 6.3 dB to 4.2 dB and of individual benefits in SRT due to common compensation strategies from 5.1 dB to 3.4 dB. The outcome predictions are highly correlated with both the corresponding observed SRTs (R² = .94) and the benefits in SRT (R² = .89) and hence might help to better understand, and eventually mitigate, the perceptual consequences of as yet unexplained hearing problems, also discussed in the context of hidden hearing loss.
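The two figures of merit quoted above, the root-mean-square prediction error in dB and R², can be computed as in the following minimal sketch. It assumes that R² denotes the squared Pearson correlation between predicted and observed values, which the abstract does not state explicitly. The SRT values are made-up placeholders, not data from the study.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, observed):
    """Squared Pearson correlation between predictions and observations."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov ** 2 / (var_p * var_o)


# Hypothetical SRTs in dB, for illustration only.
observed_srt = [-6.0, -2.0, 3.0, 10.0]
predicted_srt = [-5.0, -3.0, 5.0, 8.0]

error_db = rmse(predicted_srt, observed_srt)
fit = r_squared(predicted_srt, observed_srt)
```

The two measures answer different questions: the RMS error penalizes any systematic offset between prediction and observation, whereas the squared correlation is unaffected by a constant bias.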


2020 ◽  
pp. 72-79
Author(s):  
Ibrahim El-Henawy ◽  
Marwa Abo-Elazm

Arabic is a phonetically complex language, and building an accurate speech recognition system for it is a challenging task. The phonetic dictionary is an essential component of an automatic speech recognition (ASR) system. Pronunciation variation in Arabic is substantial and has been investigated widely using data-driven or knowledge-based approaches. Phonological rules are used to derive the pronunciation of each word accurately, reducing the mismatch between the actual phoneme representation of the spoken words and the ASR dictionary. Several studies of Arabic ASR systems have been conducted using different numbers of phonological rules. In this paper we focus on the rules that handle within-word and cross-word pronunciation variation. The experimental results indicate that handling within-word pronunciation variation with phonological rules does not improve recognition performance, whereas using these rules to handle cross-word variation provides good performance.

