The Use of Word, Phrase and Intent Accuracy as Measures of Connected Speech Recognition Performance

Author(s):  
Tim Barry ◽  
Tom Solz ◽  
John Reising ◽  
Dave Williamson

Eleven subjects participated in a study designed to test the accuracy of a newer-generation connected speech recognition system using a 49-word vocabulary likely to be used in an aircraft cockpit environment. The 49 vocabulary words were used to create 392 test phrases. These phrases were divided into three groups: Complex phrases, which contain more than five words, and two groups of Simple phrases, which contain five words or fewer. The Simple phrases were divided into Simple Alternate and Simple No-Alternate phrases, depending on whether or not a phrase was the only one in the entire vocabulary capable of carrying out a particular action once recognized. Performance of the recognition system was measured with three accuracy statistics: word accuracy, the most commonly reported statistic in speech recognition research; phrase accuracy, which is gaining popularity in connected speech recognition research; and intent accuracy, which is arguably the most relevant statistic for research of this type. Significantly different word, phrase, and intent accuracy results were obtained for the three phrase types.
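
The three statistics can be illustrated with a short sketch. This is not from the paper; the helper names, the test pairs, and the phrase-to-action map are invented for illustration. Word accuracy is derived from word-level edit distance, phrase accuracy is the exact-match rate, and intent accuracy additionally counts a misrecognized phrase as correct when the recognized phrase would trigger the same action (which is how Simple Alternate phrases can score higher on intent than on phrase accuracy).

```python
def word_errors(ref, hyp):
    """Word-level Levenshtein distance between reference and hypothesis."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)]

def word_accuracy(pairs):
    """1 - (total word errors / total reference words) over (ref, hyp) pairs."""
    total = sum(len(ref.split()) for ref, _ in pairs)
    errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    return 1.0 - errors / total

def phrase_accuracy(pairs):
    """Fraction of phrases recognized exactly."""
    return sum(ref == hyp for ref, hyp in pairs) / len(pairs)

def intent_accuracy(pairs, action):
    """Fraction of phrases whose recognized form maps to the intended action."""
    return sum(action.get(ref) == action.get(hyp) for ref, hyp in pairs) / len(pairs)
```

For example, if "tune channel two" is recognized as the alternate phrase "tune channel to" and both map to the same action, the pair counts against word and phrase accuracy but not against intent accuracy.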

2011 ◽  
Vol 268-270 ◽  
pp. 82-87
Author(s):  
Zhi Peng Zhao ◽  
Yi Gang Cen ◽  
Xiao Fang Chen

In this paper, we propose a new noisy-speech recognition method based on compressive sensing theory. By applying compressive sensing, our method greatly increases the noise robustness of the speech recognition system, which in turn improves recognition accuracy. In our experiments, the proposed method achieved better recognition performance than a traditional isolated-word recognition method based on the DTW algorithm.
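
The DTW baseline the paper compares against can be sketched as follows. This is a minimal, illustrative implementation of classic dynamic time warping for isolated-word template matching; the function names are invented, and the 1-D feature sequences are a simplification (real systems match multi-dimensional frames such as MFCCs).

```python
def dtw_distance(a, b):
    """DTW alignment cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def recognize(utterance, templates):
    """Return the vocabulary word whose template has the lowest DTW cost."""
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))
```

Because DTW compares the noisy test utterance directly against clean templates, additive noise distorts every local cost; the paper's compressive-sensing front end aims to reduce that distortion before matching.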


2011 ◽  
Vol 1 (1) ◽  
pp. 9-13
Author(s):  
Pavithra M ◽  
Chinnasamy G ◽  
Azha Periasamy

A speech recognition system requires a combination of techniques and algorithms, each of which performs a specific task toward the main goal of the system. Speech recognition performance can be enhanced by selecting a proper acoustic model. In this work, feature extraction and matching are performed by SKPCA with an unsupervised learning algorithm and maximum probability. SKPCA reduces the dimensionality of the model data. It represents a sparse solution to KPCA, because the original data can be reduced according to the weights; i.e., the weights identify the vectors that most influence the maximization. The unsupervised learning algorithm is used to find a suitable representation of the labels, and maximum probability is used to maximize the normalized acoustic likelihood of the most likely state sequences of the training data. The experimental results show the efficiency of the SKPCA technique: the proposed approach with maximum probability yields strong performance in the speech recognition system.


2020 ◽  
Vol 24 ◽  
pp. 233121652093892
Author(s):  
Marc R. Schädler ◽  
David Hülsmeier ◽  
Anna Warzybok ◽  
Birger Kollmeier

The benefit in speech-recognition performance due to the compensation of a hearing loss can vary between listeners, even if unaided performance and hearing thresholds are similar. To accurately predict the individual performance benefit due to a specific hearing device, a prediction model is proposed which takes into account hearing thresholds and a frequency-dependent suprathreshold component of impaired hearing. To test the model, the German matrix sentence test was performed in unaided and individually aided conditions in quiet and in noise by 18 listeners with different degrees of hearing loss. The outcomes were predicted by an individualized automatic speech-recognition system where the individualization parameter for the suprathreshold component of hearing loss was inferred from tone-in-noise detection thresholds. The suprathreshold component was implemented as a frequency-dependent multiplicative noise (mimicking level uncertainty) in the feature-extraction stage of the automatic speech-recognition system. Its inclusion improved the root-mean-square prediction error of individual speech-recognition thresholds (SRTs) from 6.3 dB to 4.2 dB and of individual benefits in SRT due to common compensation strategies from 5.1 dB to 3.4 dB. The outcome predictions are highly correlated with both the corresponding observed SRTs (R² = .94) and the benefits in SRT (R² = .89) and hence might help to better understand, and eventually mitigate, the perceptual consequences of as yet unexplained hearing problems, also discussed in the context of hidden hearing loss.
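
Multiplicative noise on linear-domain features is equivalent to additive noise on log (dB) features, so the level-uncertainty idea can be sketched very simply. This is an illustrative sketch only: the abstract does not give the model's internals, and the parameter name `uncertainty_db` and the per-channel dB feature layout are assumptions.

```python
import random

def apply_level_uncertainty(features, uncertainty_db, rng=None):
    """Perturb log-spectral features with per-channel Gaussian noise.

    features: list of frames, each a list of per-channel log-magnitude (dB) values.
    uncertainty_db: noise standard deviation in dB for each frequency channel
        (larger values mimic a larger suprathreshold deficit in that channel).
    """
    rng = rng or random.Random(0)  # seeded for reproducible sketches
    return [
        [x + rng.gauss(0.0, s) for x, s in zip(frame, uncertainty_db)]
        for frame in features
    ]
```

Feeding such degraded features to a recognizer lowers its scores in a frequency-specific way, which is how a single inferred parameter per listener can shift the predicted SRT.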


2014 ◽  
Vol 623 ◽  
pp. 267-273
Author(s):  
Xin Fei Liu ◽  
Hui Zhou

This paper describes a Chinese small-vocabulary offline speech recognition system based on PocketSphinx, in which the acoustic models are regenerated by adapting the existing Sphinx models and the language model is generated with the online LMTool. An offline speech recognition system that runs on Android smartphones was then built in an Android development environment under Linux. The experimental results show that the system, used to recognize voice commands on a mobile phone, achieves good recognition performance.

