Comparison of Tied-Mixture and State-Clustered HMMs with Respect to Recognition Performance and Training Method

2014 ◽  
Vol 7 (3) ◽  
pp. 15-31
Author(s):  
Hiroyuki Segi ◽  
Kazuo Onoe ◽  
Shoei Sato ◽  
Akio Kobayashi ◽  
Akio Ando

Tied-mixture HMMs have been proposed as the acoustic model for large-vocabulary continuous speech recognition and have yielded promising results. They share base distributions and provide more flexibility in choosing the degree of tying than state-clustered HMMs. However, it is unclear which of the two acoustic models is superior when trained on the same data. Moreover, the LBG algorithm and the EM algorithm, which are the usual training methods for HMMs, have not been compared. Therefore, in this paper, the recognition performance of the respective HMMs and of the respective training methods is compared under the same conditions. It was found that the number of parameters and the word error rate for both HMMs are equivalent when the number of codebooks is sufficiently large. It was also found that the training method using the LBG algorithm achieves a 90% reduction in training time compared with the training method using the EM algorithm, without degradation of recognition accuracy.
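For readers unfamiliar with the LBG (Linde-Buzo-Gray) algorithm named in this abstract, the following is a minimal generic sketch of LBG codebook construction by centroid splitting and Lloyd refinement. It is not the authors' training procedure: the function names, the additive perturbation `eps`, the fixed iteration count `iters`, and the toy data are all assumptions made for illustration.

```python
def _mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def _sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lbg_codebook(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting plus Lloyd refinement.

    Assumes `size` is a power of two. `eps` and `iters` are
    illustrative defaults, not values taken from the paper.
    """
    # Start from a single codeword: the global mean of the data.
    codebook = [_mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c + sign * eps for c in cw]
                    for cw in codebook for sign in (+1, -1)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword.
            cells = [[] for _ in codebook]
            for v in vectors:
                nearest = min(range(len(codebook)),
                              key=lambda i: _sq_dist(v, codebook[i]))
                cells[nearest].append(v)
            # Move each codeword to the mean of its cell;
            # an empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


# Toy one-dimensional data with two obvious clusters (hypothetical).
toy_codebook = sorted(lbg_codebook([[0.0], [0.2], [10.0], [10.2]], 2))
```

On the toy data the two codewords settle near the cluster centres, around 0.1 and 10.1. Unlike EM, each refinement pass uses hard nearest-codeword assignments rather than soft posterior weights.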

2016 ◽  
Vol 31 (4) ◽  
pp. 267
Author(s):  
Bao Quoc Nguyen ◽  
Thang Tat Vu ◽  
Mai Chi Luong

In this paper, a pre-training method based on a denoising auto-encoder is investigated and shown to provide a good model for initializing the bottleneck networks of a Vietnamese speech recognition system, resulting in better recognition performance than the base bottleneck features reported previously. The experiments are carried out on a dataset of speech from the Voice of Vietnam (VOV) channel. The results show that DBNF extraction for Vietnamese recognition reduces the word error rate by a relative 14% and 39% compared with the base bottleneck features and the MFCC baseline, respectively.
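The relative word error rate reductions quoted in this abstract follow the usual convention of dividing the absolute improvement by the baseline error rate. A minimal sketch, in which the function name and the example error rates are hypothetical and not taken from the paper:

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative reduction in word error rate, as a percentage.

    Both arguments are error rates on the same scale (e.g. percent).
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical figures: a baseline WER of 20.0% lowered to 17.2%
# is a relative reduction of roughly 14%.
example = relative_wer_reduction(20.0, 17.2)
```

A relative figure says nothing about the absolute error rates, so the same 14% could describe a drop from 20.0% to 17.2% or from 10.0% to 8.6%.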


10.14311/1105 ◽  
2009 ◽  
Vol 49 (2) ◽  
Author(s):  
J. Rajnoha

Automatic speech recognition (ASR) systems frequently work in noisy environments. As they are often trained on clean speech data, noise reduction or adaptation techniques are applied to decrease the influence of background disturbance even under unknown conditions. Speech data mixed with noise recordings from a particular environment are often used for model adaptation. This paper analyses the improvement in recognition performance within such adaptation when multi-condition training data from a real environment is used for training the initial models. Although the quality of such models can decrease with the presence of noise in the training material, they are assumed to include initial information about noise and consequently to support the adaptation procedure. Experimental results show a significant improvement from the proposed training method in a robust ASR task under unknown noisy conditions. Compared with training on clean speech data, the word error rate decreased by 29% and 14% for the non-adapted and adapted systems, respectively.


1984 ◽  
Vol 28 (6) ◽  
pp. 522-526 ◽  
Author(s):  
Amir M. Mané

The effectiveness of two training methods was investigated in the training of a complex perceptual-motor skill. Subjects learned how to play a computer-controlled video game. In adaptive training, pacing of the game elements was the adaptive variable. In the part training condition, subjects received training in four subtasks which were designed to train specific elements of the skill necessary for the performance of the task. A comparison of training time as well as performance at fixed time points against a control group indicated that the training manipulations were effective. The part training method was clearly the best. The adaptive training method yielded mixed results, although some support for the effectiveness of the method was observed. The results are discussed in terms of the principles for design of training devices and programs.


2020 ◽  
Author(s):  
Mark Britten-Jones

2021 ◽  
Vol 13 (9) ◽  
pp. 1713
Author(s):  
Songwei Gu ◽  
Rui Zhang ◽  
Hongxia Luo ◽  
Mengyao Li ◽  
Huamei Feng ◽  
...  

Deep learning is an important research method in the remote sensing field. However, samples of remote sensing images are relatively few in real life, and those with markers are scarce. Many neural networks, represented by Generative Adversarial Networks (GANs), can learn from real samples to generate pseudosamples, unlike traditional methods, which often require more time and manpower to obtain samples. However, the generated pseudosamples often have poor realism and cannot be reliably used as the basis for various analyses and applications in the field of remote sensing. To address these problems, a pseudolabeled sample generation method is proposed in this work and applied to scene classification of remote sensing images. The improved unconditional generative model that can be learned from a single natural image (Improved SinGAN) with an attention mechanism can effectively generate enough pseudolabeled samples from a single remote sensing scene image sample. Pseudosamples generated by the improved SinGAN model have stronger realism and require relatively less training time, and the extracted features are easily recognized in the classification network. The improved SinGAN can better identify subjects from images with complex ground scenes compared with the original network. This mechanism solves the problem of geographic errors in generated pseudosamples. This study incorporated the generated pseudosamples into training data for the classification experiment. The results showed that the SinGAN model with the integrated attention mechanism can better guarantee feature extraction from the training data. Thus, the quality of the generated samples is improved, and the classification accuracy and stability of the classification network are also enhanced.


Author(s):  
Lucia Vigoroso ◽  
Federica Caffaro ◽  
Margherita Micheletti Cremasco ◽  
Eugenio Cavallo

Digital games have been successfully applied in different working sectors as an occupational safety training method, but with a very limited application in agriculture. In agriculture and other productive sectors, unintentional injuries tend to occur with similar dynamics. A literature review was carried out to understand how occupational risks are addressed during game-based safety training in different productive sectors and how this can be transferred to agriculture. Literature about “serious game” and “gamification” as safety training methods was searched in WEB OF SCIENCE, SCOPUS, PUBMED and PsycINFO databases. In the forty-two publications retained, the computer was identified as the most adopted game support, whereas “points”, “levels”, “challenges” and “discovery” were the preferred game mechanics. Moreover, an association can be detected between the game mechanics and the elements developed in the game. Finally, during the game assessment, much positive feedback was collected and the games proved to be able to increase the operators’ skills and safety knowledge. In light of the results, insights are provided to develop an effective, satisfying and engaging safety game training for workers employed in agriculture. Games can be best used to learn and they are certain to improve over the next few years.


Author(s):  
Binbing Song ◽  
Hiroko Itoh ◽  
Yasumi Kawamura

Vessel traffic service (VTS) is important for protecting the safety of maritime traffic. With the expansion of the monitoring area per VTS operator in Tokyo Bay, Japan, inexperienced operators must acquire the ability to quickly and accurately detect conditions that require attention (CRAs) from a monitoring screen. In our previous study (Song B, Itoh H, Kawamura Y, Fukuto J (2018) Analysis of Cognitive Processes of Operators of Vessel Traffic Service. In: Proceedings of the 2018 International Association of Institutes of Navigation. IAIN 2018, pp 529–534, Song et al., J Jpn Inst Navig 140:48–54, 2019), we established a task analysis method based on the assumption that the cognitive process model consists of three stages: “situational awareness”, “situation judgment”, and “decision making”. A simulation experiment was conducted for VTS operators with different levels of ability, and their cognitive processes were compared based on the observation of eye movements. The results showed that the inexperienced operators were less able to predict situation changes. Because oral transmission of this knowledge was considered difficult, new training methods are needed to help inexperienced operators understand the prediction methods of experienced operators. In this study, based on the cognitive process of an experienced operator, we analyzed the prediction procedures for situation changes and developed an educational tool called vessel traffic routine (VTR). The training method based on learning VTR aims to quickly improve inexperienced VTS operators’ abilities to predict situation changes. A simulation experiment to verify the effect of VTR was conducted with four inexperienced operators, who were divided into two groups with and without prior explanation of VTR. By evaluating the cognitive processes of the inexperienced operators, it was confirmed that those given prior explanations of VTR were better at detecting CRAs.

