Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition

KsponSpeech: Korean Spontaneous Speech Corpus for Automatic Speech Recognition

Applied Sciences ◽

10.3390/app10196936 ◽

2020 ◽

Vol 10 (19) ◽

pp. 6936 ◽

Cited By ~ 1

Author(s):

Jeong-Uk Bang ◽

Seung Yun ◽

Seung-Hi Kim ◽

Mu-Yeol Choi ◽

Min-Kyu Lee ◽

...

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Large Scale ◽

Open Data ◽

Spontaneous Speech ◽

Open Domain ◽

Speech Corpus ◽

Clean Environment ◽

End To End ◽

Repeated Words

This paper introduces KsponSpeech, a large-scale spontaneous speech corpus of Korean. The corpus contains 969 h of general open-domain dialogue utterances, spoken by about 2000 native Korean speakers in a clean environment. All data were constructed by recording two people conversing freely on a variety of topics and manually transcribing the utterances. Each transcription is dual, giving both orthography and pronunciation, and carries disfluency tags that mark the spontaneity of speech, such as filler words, repeated words, and word fragments. This paper also presents the baseline performance of an end-to-end speech recognition model trained with KsponSpeech. In addition, we investigate the performance of standard end-to-end architectures, the number of sub-word units suitable for Korean, and the issues that should be considered in spontaneous speech recognition in Korean. KsponSpeech is publicly available on an open data hub site of the Korean government.

Low-Complexity DNN-Based End-to-End Automatic Speech Recognition using Low-Rank Approximation

2020 International SoC Design Conference (ISOCC) ◽

10.1109/isocc50952.2020.9332970 ◽

2020 ◽

Author(s):

Jongmin Park ◽

Youngjoo Lee

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Low Complexity ◽

Low Rank ◽

Low Rank Approximation ◽

Rank Approximation ◽

End To End

Bridging automatic speech recognition and psycholinguistics: Extending Shortlist to an end-to-end model of human speech recognition (L)

The Journal of the Acoustical Society of America ◽

10.1121/1.1624065 ◽

2003 ◽

Vol 114 (6) ◽

pp. 3032-3035 ◽

Cited By ~ 9

Author(s):

Odette Scharenborg ◽

Louis ten Bosch ◽

Lou Boves ◽

Dennis Norris

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Human Speech ◽

End To End

Combining De-noising Auto-encoder and Recurrent Neural Networks in End-to-End Automatic Speech Recognition for Noise Robustness

2018 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2018.8639597 ◽

2018 ◽

Author(s):

Tzu-Hsuan Ting ◽

Chia-Ping Chen

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Recurrent Neural Networks ◽

Noise Robustness ◽

End To End

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition

10.21437/interspeech.2019-2641 ◽

2019 ◽

Author(s):

Khoi-Nguyen C. Mac ◽

Xiaodong Cui ◽

Wei Zhang ◽

Michael Picheny

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Large Scale ◽

Deep Neural Network ◽

Acoustic Modeling

Fast offline transformer‐based end‐to‐end automatic speech recognition for real‐world applications

ETRI Journal ◽

10.4218/etrij.2021-0106 ◽

2021 ◽

Author(s):

Yoo Rhee Oh ◽

Kiyoung Park ◽

Jeon Gue Park

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Real World ◽

Real World Applications ◽

End To End

LARGE-SCALE REMOTE ASSESSMENT OF VERBAL COGNITIVE FUNCTION USING AUTOMATIC SPEECH RECOGNITION

Alzheimer's & Dementia ◽

10.1016/j.jalz.2019.06.4334 ◽

2019 ◽

Vol 15 (7) ◽

pp. P162-P163

Author(s):

Francesca K. Cormack ◽

Nick Taptiklis ◽

Jennifer H. Barnett ◽

Merina Su

Keyword(s):

Speech Recognition ◽

Cognitive Function ◽

Automatic Speech Recognition ◽

Large Scale ◽

Remote Assessment

End-to-End Automatic Speech Recognition Integrated with CTC-Based Voice Activity Detection

ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp40776.2020.9054358 ◽

2020 ◽

Author(s):

Takenori Yoshimura ◽

Tomoki Hayashi ◽

Kazuya Takeda ◽

Shinji Watanabe

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Voice Activity Detection ◽

Activity Detection ◽

End To End ◽

Voice Activity

Modular End-to-End Automatic Speech Recognition Framework for Acoustic-to-Word Model

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2020.3009477 ◽

2020 ◽

Vol 28 ◽

pp. 2174-2183

Author(s):

Qi Liu ◽

Zhehuai Chen ◽

Hao Li ◽

Mingkun Huang ◽

Yizhou Lu ◽

...

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

End To End

End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition

Speech Communication ◽

10.1016/j.specom.2019.01.004 ◽

2019 ◽

Vol 108 ◽

pp. 15-32 ◽

Cited By ~ 17

Author(s):

Dimitri Palaz ◽

Mathew Magimai-Doss ◽

Ronan Collobert

Keyword(s):

Neural Networks ◽

Speech Recognition ◽

Convolutional Neural Networks ◽

Automatic Speech Recognition ◽

Acoustic Modeling ◽

End To End
