Automatic speech recognition for launch control center communication using recurrent neural networks with data augmentation and custom language model

<p>SMILES randomization, a form of data augmentation, has previously been shown to improve the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method, which we call “Levenshtein augmentation”, that considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. When used to train baseline models, Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented with conventional SMILES randomization. Furthermore, Levenshtein augmentation seemingly results in what we define as <i>attentional gain</i>, an enhancement in the pattern recognition capabilities of the underlying network for molecular motifs.</p>
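As a rough illustration of the similarity measure named in the abstract above, the following minimal sketch computes the character-level Levenshtein distance between two strings and picks, from a set of equivalent reactant strings, the one closest to a product string. It is not the authors' implementation: the helper names `levenshtein` and `closest_variant` and the example strings are assumptions made for illustration, and SMILES multi-character tokens (such as `Cl`) are compared character by character here.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) using a two-row DP table."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def closest_variant(candidates: Sequence[str], target: str) -> str:
    """Hypothetical helper: return the candidate with the smallest distance to target."""
    return min(candidates, key=lambda s: levenshtein(s, target))


# Illustrative strings only: two spellings of one reactant and one product string.
reactant_variants = ["OCC", "CCO"]
product = "CCOC"
best = closest_variant(reactant_variants, product)
```

In this sketch the pairing step simply keeps the reactant spelling that is closest to the product; how candidate spellings are generated and how sub-sequence similarity is scored in the original work is not specified in the abstract.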

Dynamic Sparsity Neural Networks for Automatic Speech Recognition

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽ 10.1109/icassp39728.2021.9414505 ◽ 2021

Author(s): Zhaofeng Wu ◽ Ding Zhao ◽ Qiao Liang ◽ Jiahui Yu ◽ Anmol Gulati ◽ ...

Keyword(s): Neural Networks ◽ Speech Recognition ◽ Automatic Speech Recognition

Towards end-to-end speech recognition for Chinese Mandarin using long short-term memory recurrent neural networks

10.21437/interspeech.2015-717 ◽ 2015

Author(s): Jie Li ◽ Heng Zhang ◽ Xinyuan Cai ◽ Bo Xu

Keyword(s): Neural Networks ◽ Speech Recognition ◽ Recurrent Neural Networks ◽ Short Term Memory ◽ Short Term ◽ Term Memory ◽ Long Short Term Memory ◽ End To End ◽ Chinese Mandarin

Multimodal Continuous Emotion Recognition with Data Augmentation Using Recurrent Neural Networks

Proceedings of the 2018 on Audio/Visual Emotion Challenge and Workshop - AVEC'18 ◽ 10.1145/3266302.3266304 ◽ 2018 ◽ Cited By ~ 4

Author(s): Jian Huang ◽ Ya Li ◽ Jianhua Tao ◽ Zheng Lian ◽ Mingyue Niu ◽ ...

Keyword(s): Neural Networks ◽ Emotion Recognition ◽ Recurrent Neural Networks ◽ Data Augmentation

Enhancement of Speech Recognition System by neural network approaches of Clustering

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽ 10.24297/ijct.v6i1.4456 ◽ 2013 ◽ Vol 6 (1) ◽ pp. 266-271

Author(s): Anurag Upadhyay ◽ Chitranjanjit Kaur

Keyword(s): Neural Networks ◽ Speech Recognition ◽ Recurrent Neural Networks ◽ Alternative Energy ◽ Recognition Rate ◽ Speech Sound ◽ Recognition System ◽ Training Methods ◽ Indian Languages ◽ Phone Recognition

This paper addresses the problem of speech recognition to identify various modes of speech data. Speaker sounds are the acoustic sounds of speech. Statistical models of speech have been widely used for speech recognition with neural networks. In this paper, we propose and try to justify a new model in which speech coarticulation, the effect of phonetic context on speech sound, is modeled explicitly under a statistical framework. We study speech phone recognition using recurrent neural networks and SOUL neural networks. A general framework for recurrent neural networks and considerations for network training are discussed in detail. The SOUL NN clusters the large vocabulary, which compresses huge speech data sets. This project also covers different Indian languages uttered by different speakers in different modes such as aggressive, happy, sad, and angry. Many alternative energy measures and training methods are proposed and implemented. A speaker-independent phone recognition rate of 82% with a 25% frame error rate has been achieved on the neural database. Neural speech recognition experiments on the NTIMIT database result in a phone recognition rate of 68% correct. The research results in this thesis are competitive with the best results reported in the literature.

Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition

Speech Communication ◽ 10.1016/j.specom.2021.08.002 ◽ 2021

Author(s): Gary Yeung ◽ Ruchao Fan ◽ Abeer Alwan

Keyword(s): Speech Recognition ◽ Fundamental Frequency ◽ Automatic Speech Recognition ◽ Data Augmentation ◽ Frequency Feature
