Research on the Continuous Speech Feature Extraction Method for Different Noise

2014 ◽  
Vol 513-517 ◽  
pp. 3589-3592
Author(s):  
Wei Liu ◽  
Yi Ming Sun ◽  
Yan Xiu Liu

We propose a noise-robust continuous speech recognition (CSR) method for modeling and recognition. During recognition, we divide the continuous speech vectors into segments using the proposed algorithm, then apply DRA to each segment. The efficiency of the proposed method is studied in noisy environments. DRA decreases the difference between the model and the continuous speech vectors being recognized. The new algorithm adjusts the vectors using a different maximum in each segment. Segment-based DRA brings noisy speech feature vectors closer to the model. The average recognition rate is improved across different noise types and SNR conditions.
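The per-segment adjustment described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes DRA stands for dynamic range adjustment (scaling each feature dimension by its maximum absolute value), and it uses fixed-length segments as a placeholder because the abstract does not describe the segmentation algorithm. The names `dra`, `segment_dra`, and `segment_len` are hypothetical.

```python
from typing import List

Frames = List[List[float]]


def dra(frames: Frames) -> Frames:
    """Scale each feature dimension by its peak absolute value over `frames`.

    Assumption: this is the usual dynamic range adjustment; the abstract
    only says that maxima are used to adjust the vectors.
    """
    if not frames:
        return []
    dims = len(frames[0])
    peaks = [max(abs(frame[d]) for frame in frames) for d in range(dims)]
    return [
        [frame[d] / peaks[d] if peaks[d] > 0 else 0.0 for d in range(dims)]
        for frame in frames
    ]


def segment_dra(frames: Frames, segment_len: int) -> Frames:
    """Apply DRA independently to consecutive segments of `segment_len` frames.

    Fixed-length segmentation is a stand-in (hypothetical) for the paper's
    own segmentation algorithm, which the abstract does not specify.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    adjusted: Frames = []
    for start in range(0, len(frames), segment_len):
        adjusted.extend(dra(frames[start:start + segment_len]))
    return adjusted


if __name__ == "__main__":
    features = [[1.0, -2.0], [0.5, 4.0], [10.0, 1.0], [5.0, -0.5]]
    print(segment_dra(features, segment_len=2))
```

Because each segment is scaled by its own maxima, a loud segment does not flatten the dynamic range of a quiet one, which is the intuition behind using different maxima in different segments.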

2013 ◽  
Vol 756-759 ◽  
pp. 4059-4062 ◽  
Author(s):  
Xiao Yan Wang

Building on the traditional MFCC feature, this paper proposes a new speech signal feature, CMFCC, by introducing nonlinear properties. Simulation results indicate that the method is strongly robust to noise and improves the recognition rate at low SNR.


This paper proposes a framework for accurate speech recognition, specifically continuous speech recognition (CSR) based on triphone modelling for the Kannada dialect. In the proposed framework, features are extracted from the Kannada speech data files using the well-known Mel-frequency cepstral coefficients (MFCC) technique and its transformations, such as linear discriminant analysis (LDA) and maximum likelihood linear transforms (MLLT). The system is then trained to estimate the hidden Markov model (HMM) parameters for continuous speech (CS) data. The continuous Kannada speech data is collected from 2600 speakers (1560 men and 1040 women) aged 14 to 80 years. The speech data is acquired under degraded conditions from different geographical regions of Karnataka (one of India's 29 states, located in the south of the country). It comprises 21,551 words covering 30 locales. Both monophone and triphone models are evaluated in terms of word error rate (WER), and the results are compared with those for standard databases such as TIMIT and Aurora-4. A significant reduction in WER is obtained for triphone models. The speech recognition (SR) rate is verified in both offline and online recognition modes for all speakers. The results show that the recognition rate (RR) for the Kannada speech corpus improves on that of existing state-of-the-art databases.
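Word error rate, the metric used in this abstract, is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below shows that standard definition; it is not taken from the paper, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error ratio rather than a bounded percentage.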


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Haikuan Wang ◽  
Feixiang Zhou ◽  
Wenju Zhou ◽  
Ling Chen

Human pose recognition based on machine vision usually suffers from a low recognition rate, low robustness, and low operating efficiency. This is mainly caused by the complexity of the background, as well as the diversity of human poses, occlusion, and self-occlusion. To solve this problem, this paper proposes a feature extraction method that combines the directional gradient of depth feature (DGoD) and the local difference of depth feature (LDoD). The method uses a novel strategy that mutually compares the eight neighborhood points around a pixel to calculate the difference between pixels. A new data set is then established to train the random forest classifier, and a random forest two-way voting mechanism is adopted to classify the pixels belonging to different parts of the human body in the depth image. Finally, the gravity center of each part is calculated and a reasonable point is selected as the joint to extract the human skeleton. Experimental results on the proposed data set show that robustness and accuracy are significantly improved while operating efficiency remains competitive.
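As a rough illustration of an eight-neighborhood depth comparison, the sketch below computes the depth difference between a pixel and each of its eight neighbors. It is a simplified stand-in and not the paper's DGoD or LDoD definition, which the abstract does not give; the function name, the `step` parameter, and the border clamping are all assumptions.

```python
from typing import List

# Offsets of the eight neighbors around a pixel, in row-major order.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]


def eight_neighbour_differences(depth: List[List[float]], row: int, col: int,
                                step: int = 1) -> List[float]:
    """Return depth(neighbor) - depth(center) for the eight neighbors.

    `step` (hypothetical) is the pixel distance to each neighbor; neighbors
    that fall outside the image are clamped to the nearest border pixel.
    """
    rows, cols = len(depth), len(depth[0])
    centre = depth[row][col]
    differences = []
    for d_row, d_col in OFFSETS:
        r = min(max(row + d_row * step, 0), rows - 1)
        c = min(max(col + d_col * step, 0), cols - 1)
        differences.append(depth[r][c] - centre)
    return differences


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    print(eight_neighbour_differences(image, 1, 1))
```

A vector like this, computed per pixel, is the kind of input a random forest pixel classifier could consume; how the paper actually combines the comparisons is not stated in the abstract.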


2014 ◽  
Vol 571-572 ◽  
pp. 205-208
Author(s):  
Guan Yu Li ◽  
Hong Zhi Yu ◽  
Yong Hong Li ◽  
Ning Ma

Speech feature extraction is discussed. The Mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP) coefficient methods are analyzed. Both types of features are extracted in a Lhasa large-vocabulary continuous speech recognition system, and the recognition results are then compared.


2018 ◽  
Vol 53 ◽  
pp. 01037
Author(s):  
Mi Zz ◽  
C Cong ◽  
Y Cheng ◽  
Zhang Hm

To address the low efficiency of traditional methods for detecting surface defects on precision optical elements, and the inconvenience of inspecting optical elements of different calibers, an adjustable optical element defect detection device for large laser devices is designed. The key technical points of system composition, detection environment, illumination design, and image stitching are described. Based on the characteristics of optical element surface defects, such as differences in contour, gray scale, contrast, and ambiguity, a classification method based on FCM is proposed. The experimental results show that the system can automatically detect surface defects, effectively distinguish micron-scale defects, and achieve good defect recognition performance. The overall average recognition rate reached 93.3%.
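Assuming FCM here refers to fuzzy c-means clustering, the standard algorithm alternates between a membership update and a weighted center update. The sketch below shows those textbook steps on generic feature vectors; it is not the paper's classifier, and the function names, the fuzzifier default `m=2.0`, and the fixed iteration count are illustrative assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]


def memberships(points: List[Vector], centers: List[Vector],
                m: float = 2.0) -> List[List[float]]:
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for point in points:
        dists = [math.dist(point, center) for center in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share full
            # membership among those centers only.
            hits = dists.count(0.0)
            table.append([1.0 / hits if d == 0.0 else 0.0 for d in dists])
        else:
            table.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                          for d_i in dists])
    return table


def update_centers(points: List[Vector], table: List[List[float]],
                   old_centers: List[Vector], m: float = 2.0) -> List[Vector]:
    """Recompute each center as the membership-weighted mean of the points."""
    dims = len(points[0])
    centers = []
    for k, old in enumerate(old_centers):
        weights = [row[k] ** m for row in table]
        total = sum(weights)
        if total == 0.0:
            centers.append(list(old))  # no point supports this cluster; keep it
            continue
        centers.append([sum(w * p[d] for w, p in zip(weights, points)) / total
                        for d in range(dims)])
    return centers


def fuzzy_c_means(points: List[Vector], centers: List[Vector], m: float = 2.0,
                  iterations: int = 50) -> Tuple[List[Vector], List[List[float]]]:
    """Run a fixed number of FCM iterations from the given initial centers."""
    for _ in range(iterations):
        table = memberships(points, centers, m)
        centers = update_centers(points, table, centers, m)
    return centers, memberships(points, centers, m)


if __name__ == "__main__":
    data = [[0.0], [0.2], [10.0], [10.2]]
    final_centers, final_table = fuzzy_c_means(data, [[1.0], [9.0]])
    print(final_centers)
```

In a defect-classification setting, each point would be a feature vector for one defect (for example contour, gray scale, and contrast measurements), and the soft memberships allow ambiguous defects to be flagged rather than forced into one class.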

