Research on the Continuous Speech Feature Extraction Method for Different Noise

We propose a noise-robust continuous speech recognition (CSR) method covering both modeling and recognition. During recognition, the continuous speech feature vectors are divided into segments using the proposed algorithm, and DRA is then applied to each segment. The efficiency of the proposed method is studied in noisy environments. DRA decreases the difference between the model and the continuous speech vectors presented for recognition. The new algorithm adjusts the vectors using a different maximum in each segment. Segment-based DRA brings noisy speech feature vectors closer to the model. The average recognition rate improved across different noise types and SNR conditions.
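
The idea of adjusting vectors with a different maximum in each segment can be sketched as follows. This is only an illustration under stated assumptions: DRA is taken to mean scaling each feature dimension by its maximum absolute value, and the fixed-length segmentation and the names `dra`, `segment_dra`, and `segment_len` are hypothetical stand-ins, since the abstract does not describe the paper's own segmentation algorithm.

```python
def dra(frames):
    """Scale each feature dimension by its maximum absolute value over the frames.

    Assumption: this is the normalization meant by "DRA" in the abstract.
    """
    if not frames:
        return []
    dims = len(frames[0])
    # One peak per feature dimension, taken over all frames passed in.
    peaks = [max(abs(frame[d]) for frame in frames) for d in range(dims)]
    return [
        [frame[d] / peaks[d] if peaks[d] > 0 else 0.0 for d in range(dims)]
        for frame in frames
    ]


def segment_dra(frames, segment_len):
    """Apply DRA separately to consecutive segments of the utterance.

    Assumption: fixed-length segments replace the paper's unspecified
    segmentation algorithm.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    adjusted = []
    for start in range(0, len(frames), segment_len):
        adjusted.extend(dra(frames[start:start + segment_len]))
    return adjusted


# Four frames of two-dimensional features, split into two segments.
features = [[1.0, -2.0], [2.0, 4.0], [10.0, 1.0], [5.0, -0.5]]
normalized = segment_dra(features, segment_len=2)
```

Because each segment is scaled by its own peaks, a loud stretch of the utterance does not compress the values of a quieter stretch, which is one plausible reading of why per-segment maxima help.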

The Improved MFCC Speech Feature Extraction Method and its Application

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.756-759.4059 ◽

2013 ◽

Vol 756-759 ◽

pp. 4059-4062 ◽

Cited By ~ 1

Author(s):

Xiao Yan Wang

Keyword(s):

Feature Extraction ◽

Extraction Method ◽

Recognition Rate ◽

Feature Extraction Method ◽

Low Snr ◽

Nonlinear Properties ◽

Speech Feature ◽

Simulation Results ◽

Robust To Noise ◽

Speech Feature Extraction

Building on the traditional MFCC feature, this paper proposes a new speech signal feature, CMFCC, by introducing nonlinear properties. Simulation results indicate that the method is strongly robust to noise and improves the recognition rate at low SNR.

New continuous speech feature adjustment for a noise-robust CSR system

2011 11th International Symposium on Communications & Information Technologies (ISCIT) ◽

10.1109/iscit.2011.6089754 ◽

2011 ◽

Author(s):

Yiming Sun ◽

Yoshikazu Miyanaga

Keyword(s):

Continuous Speech ◽

Speech Feature ◽

Noise Robust

Binaural bark subband pre-processing of nonstationary signals for noise robust speech feature extraction

1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258) ◽

10.1109/icassp.1999.758117 ◽

1999 ◽

Author(s):

M. Peters

Keyword(s):

Feature Extraction ◽

Nonstationary Signals ◽

Speech Feature ◽

Noise Robust ◽

Speech Feature Extraction

Structured discriminative models for noise robust continuous speech recognition

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2011.5947426 ◽

2011 ◽

Cited By ~ 9

Author(s):

A. Ragni ◽

M. J. F. Gales

Keyword(s):

Speech Recognition ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Discriminative Models ◽

Noise Robust

Creation and Instigation of Triphone based Big-Lexicon Speaker-Independent Continuous Speech Recognition Framework for Kannada Language

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1090.1292s19 ◽

2019 ◽

Vol 9 (2S) ◽

pp. 152-158

Keyword(s):

Speech Recognition ◽

Recognition Rate ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Mel Frequency Cepstral Coefficients ◽

Speech Corpus ◽

Linear Discriminant ◽

Geographical Regions ◽

Speech Data ◽

Speech Information

This paper proposes a framework for accurate continuous speech recognition (CSR) based on triphone modelling for the Kannada language. Features are obtained from Kannada speech data files using the well-known Mel-frequency cepstral coefficients (MFCC) feature extraction technique and its transformations, linear discriminant analysis (LDA) and maximum likelihood linear transforms (MLLT). The system is then trained to estimate the hidden Markov model (HMM) parameters for continuous speech (CS) data. The continuous Kannada speech data were collected from 2600 speakers (1560 men and 1040 women) aged 14 to 80 years. The speech data were acquired under degraded conditions from different geographical regions of the state of Karnataka (one of the 29 states of India, located in the south). The corpus comprises 21,551 words spanning 30 locales. Monophone and triphone models are evaluated in terms of word error rate (WER), and the results are compared with those for standard databases such as TIMIT and Aurora4. A significant reduction in WER is obtained for triphone models. The speech recognition (SR) rate is verified in both offline and online recognition modes for all speakers. The results show that the recognition rate (RR) on the Kannada speech corpus improves on that obtained for existing state-of-the-art databases.
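
The word error rate used to compare the monophone and triphone models is conventionally the word-level edit distance between reference and hypothesis, divided by the number of reference words. A minimal sketch of that standard definition follows; the function name `word_error_rate` and whitespace tokenization are assumptions for illustration, not the paper's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
rate = word_error_rate("a b c d", "a x c")
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is not strictly the complement of a recognition rate.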

Structured support vector machines for noise robust continuous speech recognition

10.21437/interspeech.2011-406 ◽

2011 ◽

Author(s):

Shi-Xiong Zhang ◽

M. J. F. Gales

Keyword(s):

Speech Recognition ◽

Support Vector Machines ◽

Support Vector ◽

Continuous Speech ◽

Continuous Speech Recognition ◽

Vector Machines ◽

Noise Robust

Human Pose Recognition Based on Depth Image Multifeature Fusion

Complexity ◽

10.1155/2018/6271348 ◽

2018 ◽

Vol 2018 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Haikuan Wang ◽

Feixiang Zhou ◽

Wenju Zhou ◽

Ling Chen

Keyword(s):

Random Forest ◽

Recognition Rate ◽

Depth Image ◽

Operating Efficiency ◽

Body Depth ◽

Data Set ◽

Feature Extraction Method ◽

The Difference ◽

Human Pose ◽

Depth Feature

Human pose recognition based on machine vision often suffers from a low recognition rate, low robustness, and low operating efficiency. This is mainly caused by the complexity of the background, as well as the diversity of human poses, occlusion, and self-occlusion. To solve this problem, this paper proposes a feature extraction method combining the directional gradient of depth feature (DGoD) and the local difference of depth feature (LDoD), which uses a novel strategy that incorporates eight neighborhood points around a pixel for mutual comparison to calculate the difference between the pixels. A new data set is then established to train the random forest classifier, and a random forest two-way voting mechanism is adopted to classify the pixels on different parts of the human body depth image. Finally, the gravity center of each part is calculated and a reasonable point is selected as the joint to extract the human skeleton. Experiments evaluating the approach on the proposed data set show that robustness and accuracy are significantly improved while operating efficiency remains competitive.
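
Two steps of this pipeline lend themselves to a small sketch: comparing a pixel with its eight neighbors, and taking the gravity center of the pixels assigned to one body part. The code below is a simplified illustration, not the paper's DGoD or LDoD definition; it assumes plain center-to-neighbor depth differences, and the names `neighbor_differences` and `part_centroid` are hypothetical.

```python
# Offsets of the eight neighbors around a pixel, in row-major order.
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)]


def neighbor_differences(depth, row, col):
    """Depth of each in-bounds neighbor minus the depth of the center pixel."""
    rows, cols = len(depth), len(depth[0])
    center = depth[row][col]
    diffs = []
    for d_row, d_col in NEIGHBOR_OFFSETS:
        r, c = row + d_row, col + d_col
        # Neighbors that fall outside the image are skipped (an assumption).
        if 0 <= r < rows and 0 <= c < cols:
            diffs.append(depth[r][c] - center)
    return diffs


def part_centroid(labels, part):
    """Mean (row, col) of all pixels labeled `part`, or None if there are none."""
    pixels = [(r, c)
              for r, row in enumerate(labels)
              for c, label in enumerate(row)
              if label == part]
    if not pixels:
        return None
    count = len(pixels)
    return (sum(r for r, _ in pixels) / count,
            sum(c for _, c in pixels) / count)


depth_image = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]
center_features = neighbor_differences(depth_image, 1, 1)
```

In a full system the per-pixel differences would feed the random forest, and the centroid of each classified part would serve as a candidate joint position.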

Features Extraction for Lhasa Tibetan Speech Recognition

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.571-572.205 ◽

2014 ◽

Vol 571-572 ◽

pp. 205-208

Author(s):

Guan Yu Li ◽

Hong Zhi Yu ◽

Yong Hong Li ◽

Ning Ma

Keyword(s):

Speech Recognition ◽

Linear Prediction ◽

Recognition System ◽

Continuous Speech Recognition ◽

Mel Frequency Cepstral Coefficients ◽

Linear Prediction Coefficient ◽

Speech Feature ◽

Perceptual Linear Prediction ◽

Prediction Coefficient ◽

Speech Feature Extraction

Speech feature extraction is discussed. The Mel frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP) coefficient methods are analyzed. Both types of features are extracted in a Lhasa Tibetan large-vocabulary continuous speech recognition system, and the recognition results are compared.

Study on defects detection technique of precise optical element

E3S Web of Conferences ◽

10.1051/e3sconf/20185301037 ◽

2018 ◽

Vol 53 ◽

pp. 01037

Author(s):

Mi Zz ◽

C Cong ◽

Y Cheng ◽

Zhang Hm

Keyword(s):

Surface Defects ◽

Recognition Performance ◽

Recognition Rate ◽

Optical Element ◽

Detection Methods ◽

Optical Elements ◽

Average Recognition Rate ◽

The Difference ◽

Low Efficiency ◽

Defect Recognition

To address the low efficiency of traditional methods for detecting surface defects on precision optical elements, and the inconvenience of inspecting optical elements of different calibers, an adjustable defect detection device for the optical elements of large laser devices is designed. The key technical points of system composition, detection environment, illumination design, and image stitching are described. Based on the characteristics of surface defects of optical elements, such as differences in contour, gray scale, contrast, and ambiguity, a classification method based on FCM is proposed. The experimental results show that the system can automatically detect surface defects, effectively distinguish micron-scale defects, and achieve good defect recognition performance. The overall average recognition rate reached 93.3%.
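
Assuming FCM here stands for fuzzy c-means clustering (the abstract does not expand the acronym), the core of the method is a soft membership of each sample in every cluster. The sketch below shows the standard membership update for a single scalar feature; the function name `fcm_memberships`, the one-dimensional feature, and the default fuzzifier `m = 2` are illustrative assumptions rather than details taken from the paper.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means membership of a scalar sample x in each cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the distance from x to center i.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    distances = [abs(x - center) for center in centers]
    # A sample lying exactly on one or more centers belongs fully to them.
    on_center = [d == 0.0 for d in distances]
    if any(on_center):
        share = 1.0 / sum(on_center)
        return [share if hit else 0.0 for hit in on_center]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# A sample at 1.0 is closer to the center at 0.0 than to the one at 3.0.
memberships = fcm_memberships(1.0, [0.0, 3.0])
```

The memberships always sum to one, so a defect whose features sit between two classes is split between them instead of being forced into a single class.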
