Text and Synthetic Data for Domain Adaptation in End-to-End Speech Recognition

Speech and Computer ◽

10.1007/978-3-030-87802-3_25 ◽

2021 ◽

pp. 271-278

Author(s):

Juan Hussain ◽

Christian Huber ◽

Sebastian Stüker ◽

Alexander Waibel

Keyword(s):

Speech Recognition ◽

Domain Adaptation ◽

Synthetic Data ◽

Domain Adaptation of End-to-end Speech Recognition in Low-Resource Settings

2018 IEEE Spoken Language Technology Workshop (SLT) ◽

10.1109/slt.2018.8639506 ◽

2018 ◽

Author(s):

Lahiru Samarakoon ◽

Brian Mak ◽

Albert Y.S. Lam

Keyword(s):

Speech Recognition ◽

Domain Adaptation ◽

Low Resource Settings ◽

Low Resource ◽

Accented Speech Recognition Based on End-to-End Domain Adversarial Training of Neural Networks

Applied Sciences ◽

10.3390/app11188412 ◽

2021 ◽

Vol 11 (18) ◽

pp. 8412

Author(s):

Hyeong-Ju Na ◽

Jeong-Sik Park

Keyword(s):

Speech Recognition ◽

Domain Adaptation ◽

Training Data ◽

Baseline Model ◽

Linguistic Differences ◽

Computational Costs ◽

Accented Speech ◽

Feature Extractor ◽

Adversarial Training ◽

Automatic speech recognition (ASR) performance can degrade on accented speech because it differs linguistically from standard speech. Conventional accented speech recognition studies have used accent embedding, in which accent embedding features are fed directly into the ASR network. Although this method improves accented speech recognition, it has drawbacks, such as increased computational cost. This study proposes an efficient method for training an ASR model for accented speech adversarially, based on the Domain Adversarial Neural Network (DANN). The DANN performs domain adaptation for cases in which the training data and test data have different distributions. Our approach is therefore expected to yield a reliable ASR model for accented speech by reducing the distribution differences between accented and standard speech. A DANN has three sub-networks: the feature extractor, the domain classifier, and the label predictor. To adapt the DANN to accented speech recognition, we constructed these three sub-networks independently, taking the characteristics of accented speech into account. In particular, we used an end-to-end framework based on Connectionist Temporal Classification (CTC) to build the label predictor, a key module that directly affects ASR results. To verify the proposed approach, we conducted accented speech recognition experiments on four English accents: Australian, Canadian, British (England), and Indian. The proposed DANN-based model outperformed the baseline model on all four accents, indicating that end-to-end domain adversarial training effectively reduced the distribution differences between accented and standard speech.
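
The abstract names the three DANN sub-networks but does not say how they are coupled. In the general DANN formulation, a gradient reversal layer sits between the feature extractor and the domain classifier. The sketch below illustrates that mechanism in plain Python and is not the authors' implementation. The names `grl_lambda`, `GradientReversal`, and `feature_gradient` are hypothetical. The ramp-up schedule with `gamma=10.0` follows the original DANN paper and is an assumption here, since the abstract gives no such detail.

```python
import math


def grl_lambda(progress, gamma=10.0):
    """Ramp the reversal strength from 0 towards 1 as training progress goes 0 -> 1.

    Assumption: this schedule is taken from the general DANN literature,
    not from the abstract above.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


class GradientReversal:
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    def __init__(self, lam):
        self.lam = lam

    def forward(self, features):
        # Features pass to the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # Flipping the sign pushes the feature extractor to make
        # accented and standard speech harder to tell apart.
        return [-self.lam * g for g in grad_from_domain_classifier]


def feature_gradient(grad_label, grad_domain, lam):
    """Gradient reaching the feature extractor.

    grad_label:  gradient from the label predictor (e.g. a CTC loss).
    grad_domain: gradient from the domain classifier loss.
    """
    reversed_grad = GradientReversal(lam).backward(grad_domain)
    return [gl + gr for gl, gr in zip(grad_label, reversed_grad)]
```

With `lam = 0` the feature extractor is trained on the recognition loss alone. As `lam` grows, the reversed domain gradient increasingly penalizes features that reveal the accent domain, which is the effect the abstract describes as reducing the distribution differences.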

Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition

2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) ◽

10.1109/asru46091.2019.9003776 ◽

2019 ◽

Author(s):

Zhong Meng ◽

Jinyu Li ◽

Yashesh Gaur ◽

Yifan Gong

Keyword(s):

Speech Recognition ◽

Student Learning ◽

Domain Adaptation ◽

Teacher Student ◽

End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems

10.18653/v1/2020.emnlp-main.439 ◽

2020 ◽

Author(s):

Siamak Shakeri ◽

Cicero Nogueira dos Santos ◽

Henghui Zhu ◽

Patrick Ng ◽

Feng Nan ◽

...

Keyword(s):

Question Answering ◽

Domain Adaptation ◽

Synthetic Data ◽

Data Generation ◽

Synthetic Data Generation ◽

Question Answering Systems ◽

Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition

2021 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme51207.2021.9428334 ◽

2021 ◽

Author(s):

Jian Luo ◽

Jianzong Wang ◽

Ning Cheng ◽

Edward Xiao ◽

Jing Xiao ◽

...

Keyword(s):

Speech Recognition ◽

Transfer Learning ◽

Automatic Speech Recognition ◽

Domain Adaptation ◽

Language Transfer ◽

Semi-supervised domain adaptation using unlabeled data for end-to-end speech recognition

Phonetics and Speech Sciences ◽

10.13064/ksss.2020.12.2.029 ◽

2020 ◽

Vol 12 (2) ◽

pp. 29-37

Author(s):

Hyeonjae Jeong ◽

Jahyun Goo ◽

Hoirin Kim

Keyword(s):

Speech Recognition ◽

Domain Adaptation ◽

Unlabeled Data ◽

Selective Adaptation of End-to-End Speech Recognition using Hybrid CTC/Attention Architecture for Noise Robustness

2020 28th European Signal Processing Conference (EUSIPCO) ◽

10.23919/eusipco47968.2020.9287836 ◽

2021 ◽

Author(s):

Cong-Thanh Do ◽

Shucong Zhang ◽

Thomas Hain

Keyword(s):

Speech Recognition ◽

Selective Adaptation ◽

Noise Robustness ◽

Improving Attention Based Sequence-to-Sequence Models for End-to-End English Conversational Speech Recognition

10.21437/interspeech.2018-1030 ◽

2018 ◽

Author(s):

Chao Weng ◽

Jia Cui ◽

Guangsen Wang ◽

Jun Wang ◽

Chengzhu Yu ◽

...

Keyword(s):

Speech Recognition ◽

Conversational Speech ◽

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition

10.21437/interspeech.2020-1930 ◽

2020 ◽

Author(s):

Ryo Masumura ◽

Naoki Makishima ◽

Mana Ihori ◽

Akihiko Takashima ◽

Tomohiro Tanaka ◽

...

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Large Scale ◽

Combination of End-to-End and Hybrid Models for Speech Recognition

10.21437/interspeech.2020-2141 ◽

2020 ◽

Author(s):

Jeremy H.M. Wong ◽

Yashesh Gaur ◽

Rui Zhao ◽

Liang Lu ◽

Eric Sun ◽

...

Keyword(s):

Speech Recognition ◽

Hybrid Models ◽