A Hybrid Speech Enhancement Algorithm for Voice Assistance Application

In recent years, speech recognition technology has become increasingly common. Speech quality and intelligibility are critical for the convenience and accuracy of information transmission in speech recognition. Speech processing systems used to transmit or store speech are usually designed for environments without background noise. In real-world environments, however, interference in the form of background noise and channel noise drastically reduces the performance of speech recognition systems, resulting in imprecise information transfer and listener fatigue. Speech enhancement techniques aim to improve the performance of communication systems whose input or output signals are affected by noise. To ensure the correctness of the text produced from speech, the external noise in the speech audio must be reduced. Reducing this noise is difficult because the speech may consist of isolated words, continuous speech, or spontaneous speech. Several typical speech enhancement algorithms for automatic speech recognition have gained considerable attention, but they work well only on simple, continuous audio signals. This study therefore proposes a hybrid algorithm to improve speech recognition accuracy. Non-linear spectral subtraction, a well-known speech enhancement algorithm, is optimized with a Hidden Markov Model and tested with 6660 medical speech transcription audio files and 1440 Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) audio files. The performance of the proposed model is compared with that of several typical speech enhancement algorithms: iterative signal enhancement, subspace-based speech enhancement, and non-linear spectral subtraction. The proposed cascaded hybrid algorithm achieved minimum word error rates of 9.5% and 7.6% for medical speech and RAVDESS speech, respectively.
Cascading the speech enhancement and speech-to-text conversion architectures yields higher recognition accuracy on the enhanced speech. The evaluation results support incorporating the proposed method into real-time medical automatic speech recognition applications, where the terminology is complex.
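The abstract reports results as word error rate (WER). As a point of reference, the following is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not describe the authors' scoring tool or text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of five reference words.
print(word_error_rate("the patient has a fever", "the patient has fever"))
```

Under this definition, a WER of 9.5% corresponds to roughly one word error per ten or eleven reference words.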


An improved switch speech enhancement algorithm for automatic speech recognition

2015 IEEE International Conference on Computer and Communications (ICCC) ◽

10.1109/compcomm.2015.7387610 ◽

2015 ◽

Author(s):

Yongbao Ma ◽

Yi Zhou ◽

Jingang Liu ◽

Jie Xia ◽

Hongqing Liu

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition ◽

Enhancement Algorithm


Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environments

10.21437/interspeech.2005-223 ◽

2005 ◽

Author(s):

Hesham Tolba ◽

Zili Li ◽

Douglas O'Shaughnessy

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition ◽

Spectral Amplitude ◽

Low SNR ◽

Enhancement Algorithm


A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition

10.21437/interspeech.2019-2511 ◽

2019 ◽

Author(s):

Li Chai ◽

Jun Du ◽

Chin-Hui Lee

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition ◽

Cross Entropy ◽

Front End


Dual Application of Speech Enhancement for Automatic Speech Recognition

2021 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt48900.2021.9383624 ◽

2021 ◽

Author(s):

Ashutosh Pandey ◽

Chunxi Liu ◽

Yun Wang ◽

Yatharth Saraf

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition


Speech Enhancement System for Automatic Speech Recognition in Automotive Environment

10.1109/icccnt51525.2021.9579986 ◽

2021 ◽

Author(s):

Gokul G. Nair ◽

C. Santhosh Kumar

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition


Tamil speech enhancement using non-linear spectral subtraction

2014 International Conference on Communication and Signal Processing ◽

10.1109/iccsp.2014.6950095 ◽

2014 ◽

Cited By ~ 1

Author(s):

Prabhakaran G. ◽

Indra J. ◽

Kasthuri N.

Keyword(s):

Speech Enhancement ◽

Spectral Subtraction ◽

Non Linear


An Efficient Speech Enhancement Algorithm for Digital Hearing Aids Based on Modified Spectral Subtraction and Companding

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences ◽

10.1093/ietfec/e90-a.8.1628 ◽

2007 ◽

Vol E90-A (8) ◽

pp. 1628-1635 ◽

Cited By ~ 1

Author(s):

Y. W. LEE ◽

S. M. LEE ◽

Y. S. JI ◽

J. S. LEE ◽

Y. J. CHEE ◽

...

Keyword(s):

Speech Enhancement ◽

Hearing Aids ◽

Spectral Subtraction ◽

Digital Hearing Aids ◽

Enhancement Algorithm


Auditory-Inspired Morphological Processing of Speech Spectrograms: Applications in Automatic Speech Recognition and Speech Enhancement

Cognitive Computation ◽

10.1007/s12559-012-9196-6 ◽

2012 ◽

Vol 5 (4) ◽

pp. 426-441 ◽

Cited By ~ 7

Author(s):

Joyner Cadore ◽

Francisco J. Valverde-Albacete ◽

Ascensión Gallardo-Antolín ◽

Carmen Peláez-Moreno

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Automatic Speech Recognition ◽

Morphological Processing


Speech enhancement algorithm based on improved spectral subtraction

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems ◽

10.1109/icicisys.2009.5358217 ◽

2009 ◽

Author(s):

Liuyang Gao ◽

Yunfei Guo ◽

Shaomei Li ◽

Fucai Chen

Keyword(s):

Speech Enhancement ◽

Spectral Subtraction ◽

Enhancement Algorithm


An Improved Spectral Subtraction Algorithm Study Based on Voice Human-Computer Interaction in Cockpit

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.139-141.2154 ◽

2010 ◽

Vol 139-141 ◽

pp. 2154-2157

Author(s):

Ji Xiang Lu ◽

Ping Wang ◽

Long Yi

Keyword(s):

Speech Recognition ◽

Human Computer Interaction ◽

Speech Enhancement ◽

Spectral Subtraction ◽

Masking Effect ◽

Auditory Masking ◽

Speech Output ◽

Voice Interaction ◽

Speech Information ◽

The Voice

Voice interaction in the cockpit mainly comprises speech recognition, enhancement, and synthesis. It converts speech information into the corresponding commands so that cockpit equipment operates without error, and it feeds the execution results back to users through speech output devices or other means. This paper studies speech enhancement for voice interaction. We propose an improved spectral subtraction (SS) algorithm based on the auditory masking effect that applies SS in two steps. Simulation results, measured by segmental SNR against traditional SS, show the effectiveness and superiority of the improved algorithm.
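For orientation, the sketch below shows plain single-pass magnitude spectral subtraction and a segmental SNR measure, the baseline and the metric the abstract refers to. It is not the paper's two-step, masking-based method, whose details the abstract does not give. The function names, the frame length, the over-subtraction factor `alpha`, the spectral floor `beta`, the assumption that the first few frames contain only noise, and the clipping range of -10 to 35 dB are all illustrative choices, not values taken from the source.

```python
import numpy as np


def spectral_subtraction(noisy, frame_len=256, noise_frames=5,
                         alpha=2.0, beta=0.01):
    """Basic magnitude spectral subtraction with 50% overlap-add.

    Assumes the first `noise_frames` frames contain noise only.
    """
    hop = frame_len // 2
    # Periodic Hann window: copies shifted by half a frame sum to one,
    # so overlap-add needs no extra normalization.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])

    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Noise magnitude spectrum estimated from the leading frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Over-subtract the noise estimate, then floor the result at a small
    # fraction of the noisy magnitude to avoid negative magnitudes.
    clean_mag = np.maximum(mag - alpha * noise_mag, beta * mag)

    # Resynthesize with the noisy phase and overlap-add the frames.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += clean_frames[i]
    return out


def segmental_snr(clean, estimate, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB, with each frame clipped to [lo, hi]."""
    n_frames = min(len(clean), len(estimate)) // frame_len
    values = []
    for i in range(n_frames):
        s = clean[i * frame_len:(i + 1) * frame_len]
        e = s - estimate[i * frame_len:(i + 1) * frame_len]
        snr = 10.0 * np.log10((np.sum(s ** 2) + 1e-12) /
                              (np.sum(e ** 2) + 1e-12))
        values.append(np.clip(snr, lo, hi))
    return float(np.mean(values))
```

A typical comparison would call `segmental_snr(clean, noisy)` and `segmental_snr(clean, spectral_subtraction(noisy))` on the same utterance. The first and last half-frames of the output are tapered by the window, so they are usually excluded or padded in a careful evaluation.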
