A Comparison of Microphone and Speech Recognition Engine Efficacy for Mobile Data Entry

Author(s):  
Joanna Lumsden ◽  
Scott Durling ◽  
Irina Kondratova
Author(s):  
David T. Williamson ◽  
Timothy P. Barry

This paper discusses the design, implementation, and evaluation of a prototype speech recognition interface to the Theater Air Planning (TAP) module of Theater Battle Management Core Systems (TBMCS). This effort was in support of a Kenney Battlelab Initiative proposal submitted to the Command and Control Battlelab at Hurlburt Field, FL to assess the operational benefits of speech recognition for data entry applications in a Joint Air Operations Center environment. Factors contributing to the design of the “TAPTalk” speech interface included interviews with subject matter experts, speech system selection, grammar development, and integration into TAP, which required only minor modification of existing software. Results from the two-week operational assessment with sixteen subjects from the Command and Control Training and Innovation Group, numbered Air Forces, Navy, and Marine Corps indicated that the Theater Air Planning process could be accomplished significantly faster with no increase in error rates. Subjectively, the sixteen planners unanimously agreed that the TAPTalk speech interface was a valuable addition to TAP and would recommend its inclusion in a future upgrade. Recommendations for further improving the TAPTalk system are discussed.


2017 ◽  
Vol 7 (1.3) ◽  
pp. 121
Author(s):  
Sreeja B P ◽  
Amrutha K G ◽  
Jeni Benedicta J ◽  
Kalaiselvi V ◽  
Ranjani R

Geometric modeling software conventionally relies on an interactive mode of operation. This paper describes a voice-assisted geometric modeling mechanism that uses speech recognition technology to improve modeling performance. After receiving a voice command, the system uses the speech recognition engine to identify it; the identified command is then parsed and processed to generate a geometric design based on the dimensions the user specified by voice. The resulting system can generate geometric designs for the user via speech recognition. This work also collects feedback from users and customizes the model based on that feedback.
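The parse step described in this abstract could look roughly like the sketch below. It is illustrative only and not the authors' implementation: the command vocabulary (`SHAPES`), the phrasing it accepts, and the function name `parse_command` are all assumptions, and it starts from text that a speech recognition engine has already produced.

```python
import re

# Hypothetical vocabulary: each shape and the dimensions it requires.
SHAPES = {
    "circle": ("radius",),
    "rectangle": ("width", "height"),
    "cube": ("side",),
}

def parse_command(text):
    """Turn recognized text into a shape specification, or None if incomplete."""
    words = text.lower()
    # Pick the first known shape name that appears as a whole word.
    shape = next((s for s in SHAPES if re.search(rf"\b{s}\b", words)), None)
    if shape is None:
        return None
    dimensions = {}
    for name in SHAPES[shape]:
        # Accept "width 4" or "width of 4", with an optional decimal part.
        match = re.search(rf"\b{name}\s+(?:of\s+)?(\d+(?:\.\d+)?)", words)
        if match is None:
            return None  # a required dimension was not spoken
        dimensions[name] = float(match.group(1))
    return {"shape": shape, "dimensions": dimensions}

spec = parse_command("Draw a rectangle with width 4 and height 2.5")
print(spec)
```

Returning `None` for an unknown shape or a missing dimension gives the caller a point at which to ask the user to repeat the command, rather than guessing a value.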


2019 ◽  
Vol 9 (10) ◽  
pp. 2166 ◽  
Author(s):  
Mohamed Tamazin ◽  
Ahmed Gouda ◽  
Mohamed Khedr

Many new consumer applications are based on the use of automatic speech recognition (ASR) systems, such as voice command interfaces, speech-to-text applications, and data entry processes. Although ASR systems have remarkably improved in recent decades, the speech recognition system performance still significantly degrades in the presence of noisy environments. Developing a robust ASR system that can work in real-world noise and other acoustic distorting conditions is an attractive research topic. Many advanced algorithms have been developed in the literature to deal with this problem; most of these algorithms are based on modeling the behavior of the human auditory system with perceived noisy speech. In this research, the power-normalized cepstral coefficient (PNCC) system is modified to increase robustness against the different types of environmental noises, where a new technique based on gammatone channel filtering combined with channel bias minimization is used to suppress the noise effects. The TIDIGITS database is utilized to evaluate the performance of the proposed system in comparison to the state-of-the-art techniques in the presence of additive white Gaussian noise (AWGN) and seven different types of environmental noises. In this research, one word is recognized from a set containing 11 possibilities only. The experimental results showed that the proposed method provides significant improvements in the recognition accuracy at low signal-to-noise ratios (SNR). In the case of subway noise at SNR = 5 dB, the proposed method outperforms the mel-frequency cepstral coefficient (MFCC) and relative spectral (RASTA)–perceptual linear predictive (PLP) methods by 55% and 47%, respectively. Moreover, the recognition rate of the proposed method is higher than the gammatone frequency cepstral coefficient (GFCC) and PNCC methods in the case of car noise. It is enhanced by 40% in comparison to the GFCC method at SNR = 0 dB, while it is improved by 20% in comparison to the PNCC method at SNR = −5 dB.
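For readers unfamiliar with how a test condition such as "AWGN at SNR = 5 dB" is constructed, the following is a minimal sketch and not the authors' evaluation code. It assumes the usual definition SNR(dB) = 10·log10(P_signal / P_noise); the sine tone standing in for speech, the 8 kHz sample rate, and the function names are all hypothetical.

```python
import math
import random

def add_awgn(signal, snr_db, rng):
    """Return the signal plus white Gaussian noise scaled to the target SNR."""
    signal_power = sum(x * x for x in signal) / len(signal)
    # Rearranging SNR(dB) = 10*log10(Ps/Pn) gives the required noise power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]

def measured_snr_db(clean, noisy):
    """Estimate the SNR actually achieved, from the clean and noisy signals."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)

rng = random.Random(0)  # fixed seed so the run is repeatable
# One second of a 440 Hz tone at 8 kHz, used here in place of a speech recording.
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noisy = add_awgn(clean, 5.0, rng)
print(round(measured_snr_db(clean, noisy), 2))
```

The measured value differs slightly from the target because the noise is a finite random sample; the gap shrinks as the signal gets longer.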


Author(s):  
R.D. Sharp ◽  
E. Bocchieri ◽  
C. Castillo ◽  
S. Parthasarathy ◽  
C. Rath ◽  
...  
