TONE RECOGNITION OF CONTINUOUS THAI SPEECH UNDER TONAL ASSIMILATION AND DECLINATION EFFECTS USING HALF-TONE MODEL

Author(s):  
NUTTAKORN THUBTHONG ◽  
BOONSERM KIJSIRIKUL

This paper presents a method for continuous Thai tone recognition. One of the main problems in tone recognition is that several interacting factors affect the F0 realization of tones. In this paper, we focus on the tonal assimilation and declination effects. These effects are compensated for by the tone information of neighboring syllables, the F0 downdrift, and the context-dependent tone model. However, the context-dependent tone model is too large and its training time is very long. To overcome these problems, we propose a novel model called the half-tone model. The experiments, which compare all tone features and all tone models, were carried out with feedforward neural networks. The results show that the proposed tone features increase the recognition rates and that the half-tone model outperforms conventional tone models, i.e. context-independent and context-dependent tone models, in terms of recognition rate and speed. The best results are 94.77% and 93.82% for the inside test and outside test, respectively.
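A minimal sketch of the kind of classifier involved, assuming illustrative feature dimensions and hyperparameters (not the authors' implementation): a feedforward network that classifies the five Thai tones from a syllable's normalized F0 contour augmented with its neighbours' F0 (tonal assimilation context) and a position term approximating declination.

```python
# Illustrative sketch only; feature layout, sizes, and data are assumptions.
import numpy as np
from sklearn.neural_network import MLPClassifier

N_F0_POINTS = 10          # F0 samples per syllable (assumed)
N_TONES = 5               # Thai has five lexical tones

def make_feature(f0_prev, f0_curr, f0_next, position_in_utterance):
    """Concatenate normalized F0 of the syllable with its neighbours' F0 and a
    position term that approximates the declination (F0 downdrift) effect."""
    f0_all = np.concatenate([f0_prev, f0_curr, f0_next])
    f0_norm = (f0_all - f0_all.mean()) / (f0_all.std() + 1e-6)
    return np.append(f0_norm, position_in_utterance)

# Placeholder training data standing in for real F0 contours and tone labels.
rng = np.random.default_rng(0)
X = np.stack([make_feature(rng.normal(200, 20, N_F0_POINTS),
                           rng.normal(200, 20, N_F0_POINTS),
                           rng.normal(200, 20, N_F0_POINTS),
                           pos)
              for pos in rng.uniform(0.0, 1.0, 300)])
y = rng.integers(0, N_TONES, size=300)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X, y)
print("inside-test accuracy:", clf.score(X, y))
```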

2014 ◽  
Vol 571-572 ◽  
pp. 665-671 ◽  
Author(s):  
Sen Xu ◽  
Xu Zhao ◽  
Cheng Hua Duan ◽  
Xiao Lin Cao ◽  
Hui Yan Li ◽  
...  

As a feature that distinguishes Chinese from other languages, the tone changes of Chinese are mainly determined by its vowels, so the vowel variation of Chinese tones is important in speech recognition research. Conventional tone recognition methods are usually based on the fundamental frequency of the signal, which cannot preserve the integrity of the tone signal. We propose a mathematical morphological processing of spectrograms for the tones of Chinese vowels. First, the recorded tone signals are preprocessed with the Cool Edit Pro software and converted into spectrograms; second, the spectrograms are smoothed and normalized by mathematical morphological processing; finally, whole-direction-angle statistics of the tone signal are obtained by skeletonization. The neural network simulation shows that the recognition rate can reach 92.50%.
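An illustrative sketch of the described pipeline, using SciPy and scikit-image as stand-ins (the authors' tooling and parameters are not specified here; everything below is an assumption): spectrogram computation, morphological smoothing of the binarized spectrogram, skeletonization, and direction-angle statistics along the skeleton.

```python
# Hedged sketch, not the authors' code; library choices and parameters assumed.
import numpy as np
from scipy.signal import spectrogram
from skimage.filters import threshold_otsu
from skimage.morphology import closing, opening, skeletonize

def tone_direction_features(signal, fs, n_bins=8):
    # 1) Spectrogram of the recorded vowel.
    _, _, S = spectrogram(signal, fs=fs)
    img = S / (S.max() + 1e-12)

    # 2) Morphological smoothing / normalization of the binarized spectrogram.
    footprint = np.ones((3, 3), dtype=bool)
    binary = img > threshold_otsu(img)
    binary = opening(closing(binary, footprint), footprint)

    # 3) Skeletonize and collect the direction angles between neighbouring
    #    skeleton pixels as a whole-signal statistic.
    skel = skeletonize(binary)
    pts = set(zip(*np.nonzero(skel)))
    angles = [np.arctan2(dy, dx)
              for (y, x) in pts
              for dy in (-1, 0, 1) for dx in (-1, 0, 1)
              if (dy or dx) and (y + dy, x + dx) in pts]
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)   # feature vector for a neural network
```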


Author(s):  
M. BOUAMAR ◽  
M. LADJAL

Water quality is one of the major concerns of countries around the world, and its monitoring is attracting increasing attention because of its impact on human life. Controlling risks in the plants that produce and distribute water ensures the quality of this vital resource. Many techniques have been developed to improve this process through rigorous monitoring of water quality. In this paper, we present a comparative study of the performance of three techniques from the field of artificial intelligence, namely Artificial Neural Networks (ANN), RBF Neural Networks (RBF-NN), and Support Vector Machines (SVM). Grounded in statistical learning theory, these methods show strong training and generalization performance in many application fields, among them pattern recognition. To evaluate their performance in terms of recognition rate, training time, and robustness, a simulation using generated and real data is carried out. To validate their functionality, an application on real data is presented. Applied as a classification tool, the selected technique should provide, within a multisensor monitoring system, direct and quasi-permanent control of water quality.
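A hedged sketch of such a comparison using scikit-learn stand-ins: an MLP for the ANN and an RBF-kernel SVM for the SVM (a dedicated RBF-network implementation is omitted). The data, features, and hyperparameters are placeholders, not the study's dataset or configuration.

```python
# Illustrative comparison of recognition rate and training time; data is synthetic.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Placeholder "water quality" data (e.g. turbidity, pH, conductivity, ...).
X, y = make_classification(n_samples=600, n_features=6, n_informative=4,
                           n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "ANN (MLP)": MLPClassifier(hidden_layer_sizes=(20,), max_iter=1000, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", gamma="scale", C=1.0),
}

for name, model in models.items():
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    train_time = time.perf_counter() - t0
    print(f"{name}: recognition rate = {model.score(X_te, y_te):.3f}, "
          f"training time = {train_time:.3f}s")
```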


2021 ◽  
Vol 29 (3) ◽  
Author(s):  
Bennilo Fernandes ◽  
Kasiprasad Mannepalli

Deep Neural Networks (DNNs) are more than just neural networks with several hidden layers; they give better classification results in automated voice recognition tasks. Traditional feedforward neural networks do not capture the temporal structure of speech signals well, so recurrent neural networks (RNNs) were introduced. Long Short-Term Memory (LSTM) networks are a special case of RNNs suited to speech processing because they capture long-term dependencies; accordingly, deep hierarchical LSTM and BiLSTM networks are designed with dropout layers to reduce gradient and long-term learning errors in emotional speech analysis. Four combinations of the deep hierarchical learning architecture are designed with dropout layers: Deep Hierarchical LSTM and LSTM (DHLL), Deep Hierarchical LSTM and BiLSTM (DHLB), Deep Hierarchical BiLSTM and LSTM (DHBL), and Deep Hierarchical dual BiLSTM (DHBB). The performance of all four models is compared in this paper, and good classification efficiency is attained with a minimal Tamil-language dataset. The experimental results show that DHLB reaches the best precision, about 84%, in emotion recognition on the Tamil database, while DHBL gives 83% efficiency. The other designs perform comparably but below these models: DHLL and DHBB show 81% efficiency with the smaller dataset and minimal execution and training time.
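A hedged sketch of one of the four hierarchies, DHLB (an LSTM block followed by a BiLSTM block with dropout layers), written in PyTorch; layer sizes, feature dimensions, and the use of the last time step as the utterance summary are assumptions, not the paper's exact configuration.

```python
# Illustrative DHLB-style model: LSTM -> dropout -> BiLSTM -> dropout -> classifier.
import torch
import torch.nn as nn

class DHLB(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_emotions=5, p_drop=0.3):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.drop1 = nn.Dropout(p_drop)
        self.bilstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.drop2 = nn.Dropout(p_drop)
        self.classifier = nn.Linear(2 * hidden, n_emotions)

    def forward(self, x):                       # x: (batch, frames, n_features)
        h, _ = self.lstm(x)
        h = self.drop1(h)
        h, _ = self.bilstm(h)
        h = self.drop2(h[:, -1, :])             # last time step as utterance summary
        return self.classifier(h)

# Example: a batch of 8 utterances, 200 frames of 40-dim acoustic features each.
logits = DHLB()(torch.randn(8, 200, 40))
print(logits.shape)                             # torch.Size([8, 5])
```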


2001 ◽  
Vol 11 (03) ◽  
pp. 219-228 ◽  
Author(s):  
HAMID BEIGY ◽  
MOHAMMAD R. MEYBODI

Despite the many successful applications of backpropagation for training multi-layer neural networks, it has several drawbacks. For complex problems it may require a long time to train the networks, or it may fail to train at all. Long training times can result from non-optimal parameter settings, and it is not easy to choose appropriate parameter values for a particular problem. In this paper, by interconnecting fixed-structure learning automata (FSLA) with feedforward neural networks, we apply a learning automata (LA) scheme that adjusts these parameters based on observations of the network's random responses. The main motivation for using learning automata as the adaptation algorithm is their capability for global optimization on multi-modal surfaces. The feasibility of the proposed method is shown through simulations on three learning problems: exclusive-or, the encoding problem, and digit recognition. The simulation results show that adapting these parameters with this method not only increases the convergence rate of learning but also increases the likelihood of escaping from local minima.
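A minimal sketch of the idea, not the paper's exact scheme: a two-action fixed-structure (Tsetlin-type) learning automaton that increases or decreases the learning rate and is rewarded whenever the epoch loss of the network decreases. The `train_one_epoch` callable and all constants are hypothetical.

```python
# Hedged sketch of FSLA-based parameter adaptation for backpropagation.
import random

class TwoActionTsetlin:
    """2N-state Tsetlin automaton: states 1..N pick action 0, N+1..2N pick action 1."""
    def __init__(self, n_states_per_action=4):
        self.N = n_states_per_action
        self.state = random.choice([self.N, self.N + 1])   # start at the boundary

    def action(self):
        return 0 if self.state <= self.N else 1

    def update(self, reward):
        a = self.action()
        if reward:                       # move deeper into the current action
            if a == 0 and self.state > 1:
                self.state -= 1
            elif a == 1 and self.state < 2 * self.N:
                self.state += 1
        else:                            # move toward (and possibly across) the boundary
            self.state += 1 if a == 0 else -1

def adapt_learning_rate(train_one_epoch, n_epochs=50, lr=0.1, factor=1.2):
    """train_one_epoch(lr) -> epoch loss (hypothetical callable); the automaton
    scales the learning rate up (action 0) or down (action 1) each epoch."""
    automaton = TwoActionTsetlin()
    prev_loss = float("inf")
    for _ in range(n_epochs):
        lr = lr * factor if automaton.action() == 0 else lr / factor
        loss = train_one_epoch(lr)
        automaton.update(reward=loss < prev_loss)
        prev_loss = loss
    return lr
```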


2021 ◽  
Vol 11 (4) ◽  
pp. 287-306
Author(s):  
Jarosław Bilski ◽  
Bartosz Kowalczyk ◽  
Andrzej Marjański ◽  
Michał Gandor ◽  
Jacek Zurada

In this paper a new neural network training algorithm is presented. The algorithm originates from the Recursive Least Squares (RLS) method commonly used in adaptive filtering. It uses QR decomposition in conjunction with Givens rotations to solve the normal equation resulting from minimization of the loss function. An important consideration in neural networks is training time: many commonly used algorithms require a large number of iterations to achieve a satisfactory outcome, while others are effective only for small neural networks. The proposed solution is characterized by a very short convergence time compared to the well-known backpropagation method and its variants. The paper contains a complete mathematical derivation of the proposed algorithm, along with extensive simulation results on various benchmarks, including function approximation, classification, encoder, and parity problems. The results show the advantages of the featured algorithm, which outperforms commonly used state-of-the-art neural network training algorithms, including the Adam optimizer and Nesterov's accelerated gradient.
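A hedged sketch of the QR-RLS core such an algorithm builds on: a recursive least-squares update that keeps an upper-triangular factor via Givens rotations and solves the resulting triangular system for the weights. The neural-network-specific derivation in the paper is not reproduced; this is the generic linear least-squares machinery, with assumed regularization and forgetting-factor defaults.

```python
# Illustrative QR-RLS update, not the paper's algorithm.
import numpy as np

def givens(a, b):
    """Return c, s such that [[c, s], [-s, c]] @ [a, b]^T = [r, 0]^T."""
    if b == 0.0:
        return 1.0, 0.0
    r = np.hypot(a, b)
    return a / r, b / r

class QRRLS:
    """Recursive least squares via QR decomposition with Givens rotations.
    Maintains an upper-triangular R and a vector z such that R w = z."""
    def __init__(self, n_params, forgetting=1.0, delta=1e-3):
        self.n = n_params
        self.lam = forgetting
        self.R = np.eye(n_params) * delta      # regularized start (assumed)
        self.z = np.zeros(n_params)

    def update(self, x, target):
        sqrt_lam = np.sqrt(self.lam)
        self.R *= sqrt_lam
        self.z *= sqrt_lam
        row, t = x.astype(float).copy(), float(target)
        for i in range(self.n):
            # Annihilate row[i] against the diagonal entry R[i, i].
            c, s = givens(self.R[i, i], row[i])
            Ri, rowi = self.R[i, :].copy(), row.copy()
            self.R[i, :] = c * Ri + s * rowi
            row = -s * Ri + c * rowi
            zi, ti = self.z[i], t
            self.z[i] = c * zi + s * ti
            t = -s * zi + c * ti

    def solve(self):
        return np.linalg.solve(self.R, self.z)  # R is triangular, so this is cheap

# Toy usage on a linear problem: the estimate should approach w_true.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0, 0.5])
rls = QRRLS(n_params=3)
for _ in range(500):
    x = rng.normal(size=3)
    rls.update(x, x @ w_true + 0.01 * rng.normal())
print(rls.solve())
```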


Biomimetics ◽  
2019 ◽  
Vol 5 (1) ◽  
pp. 1 ◽  
Author(s):  
Michelle Gutiérrez-Muñoz ◽  
Astryd González-Salazar ◽  
Marvin Coto-Jiménez

Speech signals are degraded in real-life environments by background noise and other factors, and processing such signals for voice recognition and voice analysis systems presents important challenges. One condition that makes adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone along multiple paths. To enhance signals under such adverse conditions, several deep learning-based methods have been proposed and proven effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have shown remarkable results in tasks involving time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of training, which has limited extended experimentation in several cases. In this work, we propose to evaluate hybrid neural network models that learn different reverberation conditions without any prior information. The results show that some combinations of LSTM and perceptron layers produce results comparable to those of pure LSTM networks, given a fixed number of layers. The evaluation was based on quality measurements of the signal's spectrum, the training time of the networks, and statistical validation of the results. In total, 120 artificial neural networks of eight different types were trained and compared. The results support the view that hybrid networks are an important option for speech signal enhancement, given that the reduction in training time is on the order of 30% in processes that can normally take several days or weeks, depending on the amount of data. The hybrid networks thus offer advantages in efficiency without a significant drop in quality.
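A hedged sketch of the kind of hybrid architecture compared in the study: an LSTM layer followed by perceptron (fully connected) layers mapping reverberant spectral frames to clean ones. Layer sizes and the STFT dimension are illustrative assumptions, not the configuration used in the paper.

```python
# Illustrative hybrid LSTM + perceptron enhancement model.
import torch
import torch.nn as nn

class HybridEnhancer(nn.Module):
    def __init__(self, n_bins=257, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(n_bins, hidden, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_bins),           # estimate of the clean spectrum
        )

    def forward(self, x):                        # x: (batch, frames, n_bins)
        h, _ = self.lstm(x)
        return self.mlp(h)

model = HybridEnhancer()
reverberant = torch.randn(4, 100, 257)           # 4 utterances, 100 STFT frames each
clean_estimate = model(reverberant)
loss = nn.functional.mse_loss(clean_estimate, torch.randn(4, 100, 257))
```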


2011 ◽  
Vol 464 ◽  
pp. 38-42 ◽  
Author(s):  
Ping Ye ◽  
Gui Rong Weng

This paper proposes a novel method for leaf classification and recognition. In the method, moment invariants and the fractal dimension are regarded as the characteristic parameters of the plant leaf. To extract representative characteristic parameters, the leaf images are preprocessed, including RGB-to-gray conversion, image binarization, and leafstalk removal. The extracted leaf characteristic parameters are then used as training sets to train the neural networks. The proposed method proved effective, reaching a recognition rate of about 92% for most of the testing leaf samples.
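A hedged sketch of the preprocessing and feature extraction described above, using OpenCV: grayscale conversion, Otsu binarization, a morphological opening as a rough stand-in for leafstalk removal, Hu moment invariants, and a box-counting estimate of the fractal dimension. All kernel sizes and box scales are illustrative assumptions.

```python
# Illustrative leaf feature extraction, not the authors' implementation.
import cv2
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32, 64)):
    counts = []
    for s in sizes:
        h, w = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(max(blocks.any(axis=(1, 3)).sum(), 1))  # occupied boxes
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope                     # N(s) ~ s^(-D), so D is minus the slope

def leaf_features(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Rough leafstalk removal: opening with a kernel wider than the stalk (assumed).
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    blade = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

    hu = cv2.HuMoments(cv2.moments(blade)).flatten()          # 7 moment invariants
    return np.concatenate([hu, [box_counting_dimension(blade > 0)]])

# The resulting 8-dimensional vectors can be used to train a neural network classifier.
```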


2020 ◽  
Vol 53 (2) ◽  
pp. 1108-1113
Author(s):  
Magnus Malmström ◽  
Isaac Skog ◽  
Daniel Axehill ◽  
Fredrik Gustafsson
