End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

Interspeech 2021 ◽

10.21437/interspeech.2021-1981 ◽

2021 ◽

Author(s):

Tomohiro Tanaka ◽

Ryo Masumura ◽

Mana Ihori ◽

Akihiko Takashima ◽

Shota Orihashi ◽

...

Keyword(s):

Speech Recognition ◽

Supervised Learning ◽

Automatic Speech Recognition ◽

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition

10.21437/interspeech.2020-1930 ◽

2020 ◽

Author(s):

Ryo Masumura ◽

Naoki Makishima ◽

Mana Ihori ◽

Akihiko Takashima ◽

Tomohiro Tanaka ◽

...

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Large Scale ◽

Low-Complexity DNN-Based End-to-End Automatic Speech Recognition using Low-Rank Approximation

2020 International SoC Design Conference (ISOCC) ◽

10.1109/isocc50952.2020.9332970 ◽

2020 ◽

Author(s):

Jongmin Park ◽

Youngjoo Lee

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Low Complexity ◽

Low Rank ◽

Low Rank Approximation ◽

Rank Approximation ◽

Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

The Journal of the Acoustical Society of America ◽

10.1121/1.1624065 ◽

2003 ◽

Vol 114 (6) ◽

pp. 3032-3035 ◽

Author(s):

Odette Scharenborg ◽

Louis ten Bosch ◽

Lou Boves ◽

Dennis Norris

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Human Speech ◽

Improving Aphasic Speech Recognition by Using Novel Semi-Supervised Learning Methods on AphasiaBank for English and Spanish

Applied Sciences ◽

10.3390/app11198872 ◽

2021 ◽

Vol 11 (19) ◽

pp. 8872

Author(s):

Iván G. Torre ◽

Mónica Romero ◽

Aitor Álvarez

Keyword(s):

Speech Recognition ◽

Supervised Learning ◽

Automatic Speech Recognition ◽

English Language ◽

Spanish Language ◽

Learning Methods ◽

Text Data ◽

Lower Performance ◽

Recognition Systems ◽

Automatic speech recognition for patients with aphasia is a challenging task, and studies have been published for only a few languages. Understandably, the systems reported in this field perform significantly worse than those that transcribe non-pathological clean speech. This is mainly due to the difficulty of recognizing less intelligible speech and to the scarcity of annotated aphasic data. This work applies novel semi-supervised learning methods to the AphasiaBank dataset to address these two issues, reporting improvements for English and providing the first benchmark for Spanish, for which less than one hour of transcribed aphasic speech was used for training. In addition, it describes the influence of reinforcing the training and decoding processes with out-of-domain acoustic and text data, using different strategies and configurations to fine-tune the hyperparameters and the final recognition systems. The results encourage extending this approach to other languages and scenarios where annotated data for training recognition models is scarce.

Combining De-noising Auto-encoder and Recurrent Neural Networks in End-to-End Automatic Speech Recognition for Noise Robustness

2018 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2018.8639597 ◽

2018 ◽

Author(s):

Tzu-Hsuan Ting ◽

Chia-Ping Chen

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Recurrent Neural Networks ◽

Noise Robustness ◽

Fast offline transformer‐based end‐to‐end automatic speech recognition for real‐world applications

ETRI Journal ◽

10.4218/etrij.2021-0106 ◽

2021 ◽

Author(s):

Yoo Rhee Oh ◽

Kiyoung Park ◽

Jeon Gue Park

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Real World Applications ◽

End-to-End Automatic Speech Recognition Integrated with CTC-Based Voice Activity Detection

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9054358 ◽

2020 ◽

Author(s):

Takenori Yoshimura ◽

Tomoki Hayashi ◽

Kazuya Takeda ◽

Shinji Watanabe

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Voice Activity Detection ◽

Activity Detection ◽

Modular End-to-End Automatic Speech Recognition Framework for Acoustic-to-Word Model

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2020.3009477 ◽

2020 ◽

Vol 28 ◽

pp. 2174-2183

Author(s):

Qi Liu ◽

Zhehuai Chen ◽

Hao Li ◽

Mingkun Huang ◽

Yizhou Lu ◽

...

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition

Speech Communication ◽

10.1016/j.specom.2019.01.004 ◽

2019 ◽

Vol 108 ◽

pp. 15-32 ◽

Author(s):

Dimitri Palaz ◽

Mathew Magimai-Doss ◽

Ronan Collobert

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Convolutional Neural Networks ◽

Automatic Speech Recognition ◽

Acoustic Modeling ◽

Evaluating the Vulnerability of End-to-End Automatic Speech Recognition Models to Membership Inference Attacks

10.21437/interspeech.2021-1188 ◽

2021 ◽

Author(s):

Muhammad A. Shah ◽

Joseph Szurley ◽

Markus Mueller ◽

Athanasios Mouchtaris ◽

Jasha Droppo

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Inference Attacks ◽
