Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition

Computers ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 36
Author(s):  
Tessfu Geteye Fantaye ◽  
Junqing Yu ◽  
Tulu Tilahun Hailu

Deep neural networks (DNNs) have achieved great success in acoustic modeling for speech recognition tasks. Among these networks, the convolutional neural network (CNN) is effective at representing the local properties of speech formants. However, the CNN is not well suited to modeling the long-term context dependencies between speech signal frames. Recently, recurrent neural networks (RNNs) have shown great ability to model long-term context dependencies. However, RNNs perform poorly on low-resource speech recognition tasks, and are even worse than conventional feed-forward neural networks. Moreover, these networks often overfit severely on the training corpus in low-resource speech recognition tasks. This paper presents our contributions in combining CNNs and conventional RNNs with gate, highway, and residual networks to mitigate these problems. The optimal neural network structures and training strategies for the proposed models are explored. Experiments were conducted on the Amharic and Chaha datasets, as well as on the limited language packages (10-h) of the benchmark datasets released under the Intelligence Advanced Research Projects Activity (IARPA) Babel Program. The proposed neural network models achieve relative performance improvements of 0.1–42.79% over their corresponding feed-forward DNN, CNN, bidirectional RNN (BRNN), or bidirectional gated recurrent unit (BGRU) baselines across six language collections. These approaches are promising candidates for developing better-performing acoustic models for low-resource speech recognition tasks.
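
As a rough illustration of the two skip-connection styles named in the abstract, the sketch below applies a residual and a highway connection to a single feature vector in plain Python. The function names, the toy element-wise transform, and the scalar weights are assumptions made for illustration; the abstract does not give the paper's actual layer definitions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def transform(x, weight, bias):
    # Toy element-wise layer H(x) = tanh(w * x + b).
    # Hypothetical stand-in for a CNN or RNN layer.
    return [math.tanh(weight * xi + bias) for xi in x]

def residual_block(x, weight, bias):
    # Residual connection: output = H(x) + x.
    h = transform(x, weight, bias)
    return [hi + xi for hi, xi in zip(h, x)]

def highway_block(x, weight, bias, gate_weight, gate_bias):
    # Highway connection: output = T(x) * H(x) + (1 - T(x)) * x,
    # where the transform gate T(x) is a sigmoid with values in (0, 1).
    h = transform(x, weight, bias)
    t = [sigmoid(gate_weight * xi + gate_bias) for xi in x]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]
```

With a strongly negative `gate_bias` the highway block passes its input through almost unchanged, which is one reason such connections are often used to ease the training of deeper stacks.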

Author(s):  
Tshilidzi Marwala

In this chapter, a classifier technique is proposed that is based on a missing data estimation framework using autoassociative multi-layer perceptron neural networks and genetic algorithms. The proposed method is tested on a set of demographic properties of individuals obtained from the South African antenatal survey and compared to conventional feed-forward neural networks. The missing data approach based on the proposed autoassociative network model achieves an accuracy of 92%, compared with 84% for the conventional feed-forward neural network models. The area under the receiver operating characteristic curve is 0.86 for the proposed autoassociative network model, compared with 0.80 for the conventional feed-forward neural network model. The autoassociative network model proposed in this chapter therefore outperforms the conventional feed-forward neural network models and is an improved classifier. The reasons for this are: (1) the propagation of errors in the autoassociative network model is more distributed, whereas in a conventional feed-forward network it is more concentrated; and (2) there is no causality between the demographic properties and HIV status and, therefore, the HIV status does not change the demographic properties and vice versa. It is therefore better to treat the problem as a missing data problem rather than a feed-forward problem.
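
The idea of classification as missing data estimation can be sketched as follows: the class label is treated as an unknown field, each candidate value is tried, and the value that the autoassociative model reconstructs with the least error is kept. This is a minimal sketch under stated assumptions: `reconstruct` is a hypothetical nearest-prototype stand-in for a trained autoassociative network, and an exhaustive search over candidates replaces the genetic algorithm described in the chapter.

```python
def squared_error(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def reconstruct(record, prototypes):
    # Crude stand-in for a trained autoassociative network:
    # return the stored prototype closest to the input record.
    return min(prototypes, key=lambda p: squared_error(record, p))

def classify_as_missing_value(known, candidates, prototypes):
    # Treat the class label as a missing last field: try each candidate
    # value and keep the one with the smallest reconstruction error.
    # (Exhaustive search is used here instead of a genetic algorithm.)
    best_value, best_error = None, float("inf")
    for value in candidates:
        record = list(known) + [value]
        error = squared_error(record, reconstruct(record, prototypes))
        if error < best_error:
            best_value, best_error = value, error
    return best_value
```

For example, with two made-up prototypes `[[0.2, 0.1, 0.0], [0.9, 0.8, 1.0]]`, calling `classify_as_missing_value([0.85, 0.75], [0.0, 1.0], prototypes)` selects the label whose completed record lies closest to a stored pattern.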


2020 ◽  
Vol 4 (2) ◽  
pp. 73
Author(s):  
Sushan Poudel ◽  
Dr. R Anuradha

Speech is one of the most effective ways for humans and machines to interact. This project aims to build a speech command recognition system capable of predicting predefined speech commands. A dataset provided by Google’s TensorFlow and AIY teams is used to implement different neural network models, including a convolutional neural network and a recurrent neural network combined with a convolutional neural network. The combined convolutional and recurrent neural network outperforms the convolutional neural network alone by 8% and achieves 96.66% accuracy on 20 labels.


Author(s):  
Yu He ◽  
Jianxin Li ◽  
Yangqiu Song ◽  
Mutian He ◽  
Hao Peng

Traditional text classification algorithms are based on the assumption that data are independent and identically distributed. However, in most non-stationary scenarios, data may change smoothly due to long-term evolution and short-term fluctuation, which raises new challenges to traditional methods. In this paper, we present the first attempt to explore evolutionary neural network models for time-evolving text classification. We first introduce a simple way to extend arbitrary neural networks to evolutionary learning by using a temporal smoothness framework, and then propose a diachronic propagation framework to incorporate the historical impact into currently learned features through diachronic connections. Experiments on real-world news data demonstrate that our approaches greatly and consistently outperform traditional neural network models in both accuracy and stability.
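
One common way to read "temporal smoothness" is as a penalty that discourages the model at the current time step from drifting far from the model at the previous step. The sketch below assumes a squared-difference penalty on parameters and a hypothetical `smoothness_weight`; the abstract does not state the exact form used in the paper, so this is an illustration only.

```python
def temporal_smoothness_penalty(current_params, previous_params):
    # Sum of squared differences between consecutive parameter vectors.
    return sum((c - p) ** 2 for c, p in zip(current_params, previous_params))

def evolutionary_loss(task_loss, current_params, previous_params, smoothness_weight):
    # Total loss = classification loss on the current time slice
    #            + weight * penalty for moving away from the previous model.
    penalty = temporal_smoothness_penalty(current_params, previous_params)
    return task_loss + smoothness_weight * penalty
```

Setting `smoothness_weight` to zero recovers independent training on each time slice, while larger values tie consecutive models more tightly together.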


2021 ◽  
Vol 1074 (1) ◽  
pp. 012025
Author(s):  
A Poornima ◽  
M Shyamala Devi ◽  
M Sumithra ◽  
Mullaguri Venkata Bharath ◽  
Swathi ◽  
...  

Author(s):  
Robert J. O’Shea ◽  
Amy Rose Sharkey ◽  
Gary J. R. Cook ◽  
Vicky Goh

Abstract

Objectives: To perform a systematic review of design and reporting of imaging studies applying convolutional neural network models for radiological cancer diagnosis.

Methods: A comprehensive search of PUBMED, EMBASE, MEDLINE and SCOPUS was performed for published studies applying convolutional neural network models to radiological cancer diagnosis from January 1, 2016, to August 1, 2020. Two independent reviewers measured compliance with the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Compliance was defined as the proportion of applicable CLAIM items satisfied.

Results: One hundred eighty-six of 655 screened studies were included. Many studies did not meet the criteria for current design and reporting guidelines. Twenty-seven percent of studies documented eligibility criteria for their data (50/186, 95% CI 21–34%), 31% reported demographics for their study population (58/186, 95% CI 25–39%) and 49% of studies assessed model performance on test data partitions (91/186, 95% CI 42–57%). Median CLAIM compliance was 0.40 (IQR 0.33–0.49). Compliance correlated positively with publication year (ρ = 0.15, p = .04) and journal H-index (ρ = 0.27, p < .001). Clinical journals demonstrated higher mean compliance than technical journals (0.44 vs. 0.37, p < .001).

Conclusions: Our findings highlight opportunities for improved design and reporting of convolutional neural network research for radiological cancer diagnosis.

Key Points
• Imaging studies applying convolutional neural networks (CNNs) for cancer diagnosis frequently omit key clinical information including eligibility criteria and population demographics.
• Fewer than half of imaging studies assessed model performance on explicitly unobserved test data partitions.
• Design and reporting standards have improved in CNN research for radiological cancer diagnosis, though many opportunities remain for further progress.
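
The compliance measure and the reported proportions can be illustrated with a short calculation. The sketch below computes compliance as the fraction of applicable items satisfied and a 95% confidence interval for a proportion using the Wilson score method. The choice of Wilson is an assumption: the abstract does not say which interval method the authors used, so results may differ from the reported bounds by about a percentage point.

```python
import math

def compliance(items_satisfied, items_applicable):
    # Proportion of applicable checklist items that a study satisfies.
    return items_satisfied / items_applicable

def wilson_interval(successes, total, z=1.96):
    # Wilson score interval for a binomial proportion (z = 1.96 for 95%).
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z * z / (4 * total * total)
    ) / denom
    return center - half_width, center + half_width
```

For instance, `wilson_interval(50, 186)` gives an interval around the 50/186 proportion of studies that documented eligibility criteria.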


2018 ◽  
Vol 6 (11) ◽  
pp. 216-216 ◽  
Author(s):  
Zhongheng Zhang ◽  
Marcus W. Beck ◽  
David A. Winkler ◽  
Bin Huang ◽  
...  
