Effects of recognition accuracy and vocabulary size of a speech recognition system on task performance and user acceptance

1991 ◽  
Vol 22 (5) ◽  
pp. 352
Author(s):  
Fang Chen ◽  
Cristiano Masi

Many studies have indicated that stress and workload can affect the recognition accuracy of speech recognition systems. The relevant factors include noise, vibration, G-force, information overload, vocal quality in noise, vocal quality and psychological stress, concurrent task performance, and vocal fatigue. Commercially available speech recognition systems are not yet able to recognize natural human speech perfectly. The military application of automatic speech recognition systems has been studied widely. Verbex’ Voice Master was recommended in its instruction book as especially well suited for use in noisy environments, and it was selected as a candidate system for use in cockpits. Before it is implemented in the cockpit, its strengths and weaknesses for special utterances need to be tested in a laboratory environment. The purpose of the study was to investigate the effects of noise on recognition accuracy in dual-task performance. The experiment was carried out in a noise-insulated room, with the Verbex’ Voice Master speech recognition system installed on a computer. Eleven male Swedish students served as subjects. Two noise levels were set up in combination with mental workload and physical workload. The results showed that without noise and mental workload, the recognition accuracy could be as good as 99.4%. With noise and mental workload, the recognition accuracy could be reduced to 95%. The results indicated that noise had significant effects on computer error, while mental workload had significant effects on both subject error and computer error.


Author(s):  
Na Wang ◽  
Xiaohong Zhang ◽  
Ashutosh Sharma

Computer-assisted speech recognition, which uses sound digitization to recognize and understand spoken words, is used extensively in education, scientific research, industry, and other fields. This article presents the technological perspective of automated speech recognition in order to realize a spoken English speech recognition system based on MATLAB. A speech recognition technology has been designed and implemented in this work that collects the speech signals of the spoken English learning system and then filters them. The paper mainly adopts a preprocessing module that processes the raw collected speech data using MATLAB commands. The feature extraction method is based on the HMM model, codebook generation, and template training. The research results show that the spoken English speech recognition system studied in this paper achieves a recognition accuracy of 98%. The MATLAB-based spoken English speech recognition system thus has high recognition accuracy and fast speed. This work addresses current research issues that need to be tackled in the speech recognition field. The approach is able to provide technical support and an interface for the spoken English learning system.


2019 ◽  
Vol 8 (3) ◽  
pp. 7827-7831

Kannada is a regional language of India spoken in Karnataka. This paper presents the development of a continuous Kannada speech recognition system using monophone and triphone modelling with HTK. Mel Frequency Cepstral Coefficients (MFCC) are used as the feature extractor; they exploit the cepstrum and a perceptual frequency scale, which leads to good recognition accuracy. A Hidden Markov Model is used as the classifier. Gaussian mixture splitting is performed to capture the variations of the phones. The paper presents the performance of the continuous Kannada Automatic Speech Recognition (ASR) system with respect to 2, 4, 8, 16, and 32 Gaussian mixtures with monophone and context-dependent triphone modelling. The experimental results show that context-dependent triphone modelling achieves better recognition accuracy than monophone modelling as the number of Gaussian mixtures is increased.


2011 ◽  
Vol 268-270 ◽  
pp. 82-87
Author(s):  
Zhi Peng Zhao ◽  
Yi Gang Cen ◽  
Xiao Fang Chen

In this paper, we propose a new noisy speech recognition method based on compressive sensing theory. Through compressive sensing, our method greatly increases the noise robustness of the speech recognition system, which improves the recognition accuracy. In our experiments, the proposed method achieved better recognition performance than the traditional isolated word recognition method based on the DTW algorithm.
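
For readers unfamiliar with the DTW baseline mentioned in this abstract, the following is a minimal sketch of isolated word recognition by template matching with dynamic time warping. It is not the authors' implementation: the function names `dtw_distance` and `recognize`, the use of one-dimensional feature sequences, and the absolute-difference local cost are all assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))
```

In a real recognizer each sequence element would typically be a feature vector (for example, one frame of cepstral coefficients) and the local cost a vector distance; the scalar version here only shows the alignment recurrence.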


2014 ◽  
Vol 926-930 ◽  
pp. 1729-1732
Author(s):  
Sha Yang ◽  
Tian Hu ◽  
Yun Lu Zhang

After about 50 years of development, speech recognition technology is now able to realize large-vocabulary, speaker-independent continuous speech recognition systems. Taking the features of Chinese pronunciation into account, we study small-vocabulary, speaker-independent Chinese speech recognition based on the continuous Hidden Markov Model approach. By comparing VQ/DTW, VQ/DHMM, the CHMM state-1 recognition algorithm, and the CHMM state-2 recognition algorithm, the results of our experiment show that: (1) the CHMM state-2 branch method performs well in reducing the recognition time; and (2) the recognition accuracy is also improved.


An improved variation of Automatic Speech Recognition (ASR) based on Vector Quantization (VQ) is presented. ASR has so far been introduced for different languages and different applications. In this paper, we present a speech recognition system that recognizes the hymns (paath) of Gurbani (sentences of Japji Sahib) as continuous speech. For this purpose, a speech corpus has been generated in which the entire paath has been recited by different speakers. The speech here is continuous speech accompanied by background music and different kinds of additional noise, which have been eliminated. The work uses the VQ approach to speech recognition and the LBG algorithm, which designs optimal codebooks for the recognition process. Experimental results are included which show that the recognition accuracy of the system was 92.6% and 95.8% for different and same speakers with different and same sentences.
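
The codebook design step named in this abstract can be illustrated with a simplified sketch of LBG-style training by repeated codeword splitting. This is not the authors' code: the helper names, the additive perturbation `eps`, and the fixed number of refinement passes (used here instead of a distortion-based stopping rule) are assumptions chosen to keep the example short.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def nearest(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(vector, codebook[k]))


def lbg(vectors, size, eps=0.01, passes=20):
    """Grow a codebook by splitting; `size` is assumed to be a power of two."""
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + shift for c in code]
                    for code in codebook for shift in (eps, -eps)]
        # Refine: assign vectors to the nearest codeword, then re-centre.
        for _ in range(passes):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else code
                        for cell, code in zip(cells, codebook)]
    return codebook
```

For recognition, one codebook would typically be trained per class, and an input would be assigned to the class whose codebook quantizes it with the lowest total distortion; that decision rule is a common convention, not a detail stated in the abstract.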


Author(s):  
J.M. KOO ◽  
H.S. KIM ◽  
C.K. UN

In this paper, we introduce a Korean large vocabulary speech recognition system. This system recognizes sentence utterances with a vocabulary size of 1160 words, and is designed for an automatic telephone number query service. The system consists of four subsystems. The first is an acoustic processor that recognizes words in an input sentence by a Hidden Markov Model (HMM) based speech recognition algorithm. The second subsystem is a linguistic processor which estimates input sentences from the results of the acoustic processor and determines the following words using syntactic information. The third is a time reduction processor that reduces recognition time by limiting the number of candidate words to be computed by the acoustic processor. The time reduction processor uses linguistic information and acoustic information contained in the input sentence. The last subsystem is a speaker adaptation processor which quickly adapts parameters of the speech recognition system to new speakers. This subsystem uses VQ adaptation and HMM parameter adaptation based on spectral mapping. We also present our recent work on improving the performance of the large vocabulary speech recognition system. This work focused on the enhancement of the acoustic processor and the time reduction processor for speaker-independent speech recognition. A new approach for speaker adaptation is also described.


1988 ◽  
Vol 32 (4) ◽  
pp. 232-236 ◽  
Author(s):  
Sherry P. Casali ◽  
Robert D. Dryden ◽  
Beverly H. Williges

The purpose of the present study was to determine the effects of recognizer accuracy and vocabulary size on the system performance of a speech recognition system. Subjects, ranging in age from 20 to 55 years, performed a data entry task using a simulated speech recognizer with three accuracy levels and three levels of available vocabulary. Task completion times and subjective measures of acceptability were recorded. Results indicated that the accuracy level at which the recognizer was performing significantly influenced the task completion time and the users' acceptability ratings. Vocabulary size also significantly affected task completion time; however, its effect on the acceptability ratings was negligible. Older subjects in general required longer times to complete the tasks; however, they consistently rated the speech input systems more favorably than the younger subjects.

