A Time Delay Neural Network Acoustic Modeling for Hindi Speech Recognition

Advances in Data and Information Sciences - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-0694-9_40 ◽

2020 ◽

pp. 425-432

Author(s):

Ankit Kumar ◽

R. K. Aggarwal

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Acoustic Modeling

Hindi speech recognition using time delay neural network acoustic modeling with i-vector adaptation

International Journal of Speech Technology ◽

10.1007/s10772-020-09757-0 ◽

2020 ◽

Author(s):

Ankit Kumar ◽

Rajesh Kumar Aggarwal

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Acoustic Modeling

Efficient Acoustic Modeling Method for Unsupervised Speech Recognition using Multi-Task Deep Neural Network

Proceedings of the 2015 4th National Conference on Electrical, Electronics and Computer Engineering ◽

10.2991/nceece-15.2016.72 ◽

2016 ◽

Author(s):

Haitao Yao ◽

Maobo An ◽

Ji Xu ◽

Jian Liu

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Deep Neural Network ◽

Modeling Method ◽

Acoustic Modeling

A space-perturbance/time-delay neural network for speech recognition

Neural Networks for Signal Processing Proceedings of the 1991 IEEE Workshop ◽

10.1109/nnsp.1991.239503 ◽

2002 ◽

Author(s):

Ji Ming ◽

Chen Huihuang ◽

Shen Zhenkang

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition

10.21437/interspeech.2019-2641 ◽

2019 ◽

Author(s):

Khoi-Nguyen C. Mac ◽

Xiaodong Cui ◽

Wei Zhang ◽

Michael Picheny

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Automatic Speech Recognition ◽

Large Scale ◽

Deep Neural Network ◽

Acoustic Modeling

Time-Delay Recurrent Neural Network for Cross-Lingual Speech Recognition

Advances in Intelligent Systems and Computing - Recent Developments in Intelligent Computing, Communication and Devices ◽

10.1007/978-981-10-8944-2_40 ◽

2018 ◽

pp. 341-348

Author(s):

Xia Mao ◽

Yulv Zhang

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Recurrent Neural Network

An Investigation of Multilingual TDNN-BLSTM Acoustic Modeling for Hindi Speech Recognition

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327911666210118143758 ◽

2021 ◽

Vol 11

Author(s):

Ankit Kumar ◽

Rajesh Kumar Aggarwal

Keyword(s):

Neural Network ◽

Speech Recognition ◽

High Accuracy ◽

Training Data ◽

Acoustic Modeling ◽

Training Dataset ◽

Acoustic Model ◽

Indian Languages ◽

Acoustic Models

Background: Thousands of languages and dialects are in use in India, and most of them are low-resource. Well-performing Automatic Speech Recognition (ASR) systems for Indian languages are unavailable because of this lack of resources. Hindi is among the affected languages, as large-vocabulary Hindi speech datasets are not freely available; we have only a few hours of transcribed Hindi speech. Creating a well-transcribed speech dataset takes considerable time and money, so developing a real-time ASR system from only a few hours of training data is a very challenging task. Techniques such as data augmentation, semi-supervised training, multilingual architectures, and transfer learning have been reported in the past to address the shortage of speech data. In this paper, we examine the effect of multilingual acoustic modeling in ASR systems for the Hindi language.

Objective: This article's objective is to develop a high-accuracy Hindi ASR system with a reasonable computational load using a few hours of training data.

Method: To achieve this goal, we used multilingual training with Time Delay Neural Network-Bidirectional Long Short-Term Memory (TDNN-BLSTM) acoustic modeling. Multilingual acoustic modeling has significantly improved ASR performance for low- and limited-resource languages. The common practice is to train the acoustic model on merged data from similar languages. In this work, we use three Indian languages to train the proposed model: Hindi (2.5 hours of training data), Marathi (5.5 hours of training data), and Bengali (28.5 hours of transcribed data).

Results: All experiments were performed with the Kaldi toolkit. The investigation covers three main points. First, we present the monolingual ASR system using various Neural Network (NN) based acoustic models. Second, we show that Recurrent Neural Network (RNN) language modeling further improves ASR performance. Finally, we show that a multilingual ASR system significantly reduces the Word Error Rate (WER), with an absolute WER reduction of 2% for Hindi and 3% for Marathi. For all three languages, the proposed TDNN-BLSTM-A multilingual acoustic models yield the lowest WER.

Conclusion: The multilingual hybrid TDNN-BLSTM-A architecture shows a 13.67% relative improvement over the monolingual Hindi ASR system. The best WER recorded for Hindi ASR was 8.65%. For Marathi and Bengali, the proposed TDNN-BLSTM-A acoustic model reports best WERs of 30.40% and 10.85%, respectively.
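
The abstract above reports results both as absolute WER reductions and as a relative improvement, two measures that are easy to confuse. The following is a minimal sketch of how WER and the two kinds of reduction are commonly computed, assuming the standard word-level edit-distance definition of WER. The function names and the numbers in the usage lines are hypothetical illustrations; they are not taken from the paper or from the Kaldi toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in the same units as the inputs (e.g. percent)."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER that was removed."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one substitution and one deletion over 4 reference words.
wer = word_error_rate("a b c d", "a x c")

# Hypothetical WER values (in percent), chosen only to contrast the two measures.
abs_gain = absolute_reduction(20.0, 18.0)  # 2.0 percentage points
rel_gain = relative_reduction(20.0, 18.0)  # one tenth of the baseline
```

Under these definitions, an absolute reduction is measured in percentage points, while a relative reduction is a fraction of the baseline, so the two are not interchangeable when comparing reported results.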

Review of TDNN (time delay neural network) architectures for speech recognition

1991 IEEE International Symposium on Circuits and Systems ◽

10.1109/iscas.1991.176402 ◽

2002 ◽

Author(s):

M. Sugiyama ◽

H. Sawai ◽

A.H. Waibel

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Network Architectures ◽

Neural Network Architectures

Deep neural network acoustic modeling for native and non-native Mandarin speech recognition

The 9th International Symposium on Chinese Spoken Language Processing ◽

10.1109/iscslp.2014.6936617 ◽

2014 ◽

Author(s):

Xin Chen ◽

Jian Cheng

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Deep Neural Network ◽

Acoustic Modeling ◽

Mandarin Speech Recognition

Gated Time Delay Neural Network for Speech Recognition

Journal of Physics Conference Series ◽

10.1088/1742-6596/1229/1/012077 ◽

2019 ◽

Vol 1229 ◽

pp. 012077

Author(s):

Kaibin Chen ◽

Weibin Zhang ◽

Dongpeng Chen ◽

Xiaorong Huang ◽

Boji Liu ◽

...

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Time Delay Recurrent Neural Network for Speech Recognition

Journal of Physics Conference Series ◽

10.1088/1742-6596/1229/1/012078 ◽

2019 ◽

Vol 1229 ◽

pp. 012078 ◽

Author(s):

Boji Liu ◽

Weibin Zhang ◽

Xiangming Xu ◽

Dongpeng Chen

Keyword(s):

Neural Network ◽

Speech Recognition ◽

Recurrent Neural Network
