Audio style transfer in non-native speech recognition

Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2018 ◽

10.1117/12.2501495 ◽

2018 ◽

Author(s):

Kacper Radzikowski

Keyword(s):

Speech Recognition ◽

Style Transfer ◽

Non-native speech recognition using audio style transfer

Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2019 ◽

10.1117/12.2536535 ◽

2019 ◽

Author(s):

Kacper Radzikowski ◽

Mateusz Forc ◽

Le Wang ◽

Osamu Yoshie ◽

Robert M. Nowak

Keyword(s):

Speech Recognition ◽

Style Transfer ◽

Accent modification for speech recognition of non-native speakers using neural style transfer

EURASIP Journal on Audio Speech and Music Processing ◽

10.1186/s13636-021-00199-3 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Kacper Radzikowski ◽

Le Wang ◽

Osamu Yoshie ◽

Robert Nowak

Keyword(s):

Speech Recognition ◽

English Language ◽

Native Speakers ◽

Native Speaker ◽

Mother Tongue ◽

Recognition System ◽

Style Transfer ◽

Native Speech ◽

Accent Modification ◽

Abstract:

Nowadays, automatic speech recognition (ASR) systems can achieve increasingly high accuracy rates, depending on the methodology applied and the datasets used. The rate decreases significantly when the ASR system is used with a non-native speaker of the language to be recognized. The main reason for this is the specific pronunciation and accent features related to the mother tongue of that speaker, which influence the pronunciation. At the same time, the extremely limited volume of labeled non-native speech datasets makes it difficult to train, from the ground up, sufficiently accurate ASR systems for non-native speakers.

In this research, we address the problem and its influence on the accuracy of ASR systems using the style transfer methodology. We designed a pipeline for modifying the speech of a non-native speaker so that it more closely resembles native speech. This paper covers experiments on accent modification using different setups and different approaches, including neural style transfer and an autoencoder. The experiments were conducted on English as pronounced by Japanese speakers (UME-ERJ dataset). The results show a significant relative improvement in speech recognition accuracy. Our methodology reduces the need to train new algorithms for non-native speech (thus overcoming the obstacle of data scarcity) and can be used as a wrapper for any existing ASR system. The modification can be performed in real time, before a sample is passed to the speech recognition system itself.

Unsupervised Joint Estimation of Grapheme-to-Phoneme Conversion Systems and Acoustic Model Adaptation for Non-Native Speech Recognition

10.21437/interspeech.2016-919 ◽

2016 ◽

Author(s):

Satoshi Tsujioka ◽

Sakriani Sakti ◽

Koichiro Yoshino ◽

Graham Neubig ◽

Satoshi Nakamura

Keyword(s):

Speech Recognition ◽

Joint Estimation ◽

Acoustic Model ◽

Model Adaptation ◽

A hybrid approach to adapting acoustic and pronunciation models for non-native speech recognition

2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers ◽

10.1109/acssc.2009.5469755 ◽

2009 ◽

Author(s):

Yoo Rhee Oh ◽

Hong Kook Kim

Keyword(s):

Speech Recognition ◽

Hybrid Approach ◽

Lexical modeling of non-native speech for automatic speech recognition

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100) ◽

10.1109/icassp.2000.862074 ◽

2002 ◽

Author(s):

K. Livescu ◽

J. Glass

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Combined acoustic and pronunciation modelling for non-native speech recognition

10.21437/interspeech.2007-421 ◽

2007 ◽

Author(s):

G. Bouselmi ◽

Dominique Fohr ◽

I. Illina

Keyword(s):

Speech Recognition ◽

Predicting word accuracy for the automatic speech recognition of non-native speech

10.21437/interspeech.2010-282 ◽

2010 ◽

Author(s):

Su-Youn Yoon ◽

Lei Chen ◽

Klaus Zechner

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

A comparison between native and non-native speech for automatic speech recognition

The Journal of the Acoustical Society of America ◽

10.1121/1.5101679 ◽

2019 ◽

Vol 145 (3) ◽

p. 1827

Author(s):

Seongjin Park ◽

John Culnan

Keyword(s):

Speech Recognition ◽

Automatic Speech Recognition ◽

Non-native speech recognition sentences: A new materials set for non-native speech perception research

Behavior Research Methods ◽

10.3758/s13428-019-01251-z ◽

2019 ◽

Vol 52 (2) ◽

pp. 561-571 ◽

Author(s):

Louise Stringer ◽

Paul Iverson

Keyword(s):

Speech Recognition ◽

Speech Perception ◽

New Materials ◽

Acoustic Model Adaptation Based on Pronunciation Variability Analysis for Non-Native Speech Recognition

2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings ◽

10.1109/icassp.2006.1659976 ◽

2006 ◽

Author(s):

Yoo Rhee Oh ◽

Jae Sam Yoon ◽

Hong Kook Kim

Keyword(s):

Speech Recognition ◽

Acoustic Model ◽

Model Adaptation ◽

Variability Analysis ◽

Native Speech ◽

Pronunciation Variability
