Convolutional recurrent neural networks with hidden Markov model bootstrap for scene text recognition

2017 · Vol 11 (6) · pp. 497-504
Author(s): Fenglei Wang, Qiang Guo, Jun Lei, Jun Zhang
2000 · Vol 23 (4) · pp. 494-495
Author(s): Ingmar Visser

Page's manifesto makes a case for localist representations in neural networks, one advantage being ease of interpretation. However, even localist networks can be hard to interpret, especially when distributed representations are employed at some hidden layer of the network, as is often the case. Hidden Markov models can be used to provide useful, interpretable representations.
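The interpretability of a hidden Markov model comes from its explicit parameters: an initial state distribution, a transition matrix, and per-state emission probabilities, from which sequence likelihoods follow by the forward algorithm. As a minimal sketch (all parameters below are illustrative, not taken from the article):

```python
import numpy as np

# Hypothetical 2-state, 2-symbol HMM; every number here is made up
pi = np.array([0.6, 0.4])           # initial state distribution
A = np.array([[0.7, 0.3],           # transition probabilities
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],           # emission probabilities:
              [0.2, 0.8]])          # rows = states, cols = symbols

def forward(obs):
    """Forward algorithm: P(observation sequence) under the HMM.

    alpha[i] after step t holds P(o_1..o_t, state_t = i);
    summing it at the end gives the sequence likelihood.
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())
```

Because each state and transition is a named, inspectable quantity, the model's behaviour can be read directly off `pi`, `A`, and `B`, which is the sense in which HMMs offer interpretable representations.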


2019 · Vol 34 (4) · pp. 349-363
Author(s): Thinh Van Nguyen, Bao Quoc Nguyen, Kinh Huy Phan, Hai Van Do

In this paper, we present our first Vietnamese speech synthesis system based on deep neural networks. To improve the training data collected from the Internet, a cleaning method is proposed. The experimental results indicate that deeper architectures achieve better TTS performance than shallow models such as the hidden Markov model. We also present the effect of using different amounts of training data on the TTS systems. In the VLSP TTS challenge 2018, our proposed DNN-based speech synthesis system won first place in all three categories: naturalness, intelligibility, and MOS.


2020 · Vol 5 (8) · pp. 958-965
Author(s): Akshay Madhav Deshmukh

Understanding human speech precisely by machine has been a major challenge for many years. Although Automatic Speech Recognition (ASR) is decades old and the technology has not yet reached the point where machines understand all speech, it is used on a regular basis in many applications and services. Hence, to advance research it is important to identify significant research directions, specifically those that have not been pursued or funded in the past. The performance of such ASR systems, traditionally built upon a Hidden Markov Model (HMM), has improved due to the application of Deep Neural Networks (DNNs). Despite this progress, building an ASR system has remained a challenging task requiring multiple resources and training stages. The use of DNNs for automatic speech recognition has gone from being a single component in a pipeline to forming the basis of the entire system. This paper provides a literature survey of state-of-the-art research on two major models, namely the Deep Neural Network - Hidden Markov Model (DNN-HMM) and Recurrent Neural Networks trained with Connectionist Temporal Classification (RNN-CTC). It also describes the differences between these two models at the architectural level.
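The key architectural difference the survey points to is how the two models align frames to labels: a DNN-HMM relies on the HMM's explicit states and a forced alignment, while RNN-CTC lets the network emit a label (or a blank) per frame and collapses the result with CTC's many-to-one mapping. A minimal sketch of that collapse step (greedy, best-path decoding; the function name and label encoding are my own, not from the paper):

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame best-path labelling into an output sequence.

    CTC's mapping B first merges consecutive repeated labels, then
    removes blanks; blanks between repeats are what allow genuinely
    doubled labels (e.g. the "ll" in "hello") to survive the merge.
    """
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```

This is only the decoding side; training with CTC additionally sums over all frame labellings that collapse to the target, which is what removes the need for the pre-computed frame alignments a DNN-HMM requires.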

