Compact Wake-Up Word Speech Recognition on Embedded Platforms

2014 ◽  
Vol 596 ◽  
pp. 402-405 ◽  
Author(s):  
An Hao Xing ◽  
Ta Li ◽  
Jie Lin Pan ◽  
Yong Hong Yan

The wake-up word speech recognition system is a new paradigm in the field of automatic speech recognition (ASR). This paradigm is not yet widely recognized but is useful in many applications, such as mobile phones and smart home systems. In this paper we describe the development of a compact wake-up word recognizer for embedded platforms. To keep the resource cost low, a variety of simplification techniques are used: speech feature observations are compressed to a lower dimension, and a simple distance-based template matching method is used in place of complex Viterbi scoring. We apply a double-scoring method to achieve better performance, and a support vector machine classifier is used in conjunction with it. We achieved a performance improvement, with the false rejection rate reduced from 6.88% to 5.50% and the false acceptance rate reduced from 8.40% to 3.01%.
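A minimal sketch of distance-based template matching of the kind described above, as a lightweight alternative to Viterbi scoring, is shown below. The DTW formulation, the decision threshold, and the feature layout are illustrative assumptions, not the authors' actual configuration.

```python
# Sketch: distance-based template matching for wake-up word detection.
# Assumes features are (frames x dims) matrices; threshold is illustrative.
import numpy as np


def dtw_distance(test: np.ndarray, template: np.ndarray) -> float:
    """Dynamic time warping distance between two (frames x dims) feature matrices."""
    n, m = len(test), len(template)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(test[i - 1] - template[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)   # length-normalised path cost


def is_wake_word(test_feats, templates, threshold=4.0):
    """Accept if the best template distance falls below a tuned threshold."""
    best = min(dtw_distance(test_feats, t) for t in templates)
    return best < threshold
```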

2011 ◽  
Vol 467-469 ◽  
pp. 1905-1910
Author(s):  
Jun Feng Zhao ◽  
Ye Ping Zhu

This paper introduces the characteristics and requirements of speech recognition technology on embedded platforms. It also describes the basic theory and related properties of support vector machines. The advantages and disadvantages of multiclass SVM algorithms are analyzed, providing the algorithmic principles for training and recognition when applying SVMs in embedded speech recognition systems. Finally, we propose a design strategy based on a multiclass SVM decision tree classifier, combined with the features of embedded speech recognition.
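A minimal sketch of a decision-tree multiclass SVM of the general kind discussed above follows: each internal node holds a binary SVM that splits the remaining classes into two groups. The grouping rule (simple halving of the label list) and the SVC parameters are illustrative assumptions, not the authors' design.

```python
# Sketch: multiclass classification via a binary tree of two-class SVMs.
import numpy as np
from sklearn.svm import SVC


class SVMTreeNode:
    def __init__(self, classes):
        self.classes = list(classes)
        self.svm = None
        self.left = self.right = None

    def fit(self, X, y):
        if len(self.classes) == 1:           # leaf: a single class remains
            return self
        half = len(self.classes) // 2
        left_cls, right_cls = self.classes[:half], self.classes[half:]
        mask = np.isin(y, self.classes)
        Xn, yn = X[mask], y[mask]
        target = np.isin(yn, right_cls).astype(int)   # 0 = left group, 1 = right group
        self.svm = SVC(kernel="rbf", gamma="scale").fit(Xn, target)
        self.left = SVMTreeNode(left_cls).fit(X, y)
        self.right = SVMTreeNode(right_cls).fit(X, y)
        return self

    def predict_one(self, x):
        node = self
        while node.svm is not None:
            node = node.right if node.svm.predict(x.reshape(1, -1))[0] else node.left
        return node.classes[0]
```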


To enable fast communication between humans and machines, speech recognition systems are used. A number of such systems have been developed by various researchers, for example for speech recognition, speaker verification, and speaker recognition. The basic stages of a speech recognition system are pre-processing, feature extraction, feature selection, and classification, and numerous works have improved all of these stages to obtain more accurate results. In this paper the main focus is on adding machine learning to the speech recognition system. The paper covers the architecture of ASR, which gives an overview of the basic stages of a speech recognition system, and then focuses on the use of machine learning in ASR. The work done by various researchers using support vector machines and artificial neural networks is covered in a dedicated section, along with a review of work using SVM, ELM, ANN, Naive Bayes, and kNN classifiers. The simulation results show that the best accuracy is achieved with the ELM classifier. The last section of the paper covers the results obtained with the proposed approaches, in which SVM, ANN with the cuckoo search algorithm, and ANN with back-propagation are used as classifiers. Attention is also given to improving the pre-processing and feature extraction stages.
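A minimal sketch of the kind of classifier comparison this review describes is given below, using scikit-learn baselines (SVM, kNN, Naive Bayes, and an MLP as a stand-in for the ANN). ELM and the cuckoo-search-tuned ANN are not available in scikit-learn and are omitted; the feature matrix X is assumed to hold pre-extracted speech features (e.g., MFCC statistics) with labels y.

```python
# Sketch: cross-validated accuracy comparison of common speech classifiers.
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier


def compare_classifiers(X, y):
    models = {
        "SVM": SVC(kernel="rbf", gamma="scale"),
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "NaiveBayes": GaussianNB(),
        "ANN (MLP)": MLPClassifier(hidden_layer_sizes=(64,), max_iter=500),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy = {scores.mean():.3f}")
```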


2021 ◽  
Vol 11 (19) ◽  
pp. 8842
Author(s):  
Aisha Aiman ◽  
Yao Shen ◽  
Malika Bendechache ◽  
Irum Inayat ◽  
Teerath Kumar

The ongoing development of audio datasets for numerous languages has spurred research activities towards designing smart speech recognition systems. A typical speech recognition system can be applied in many emerging applications, such as smartphone dialing, airline reservations, and automatic wheelchairs, among others. Urdu is the national language of Pakistan and is also widely spoken in many other South Asian countries (e.g., India, Afghanistan). Therefore, we present a comprehensive dataset of spoken Urdu digits ranging from 0 to 9. Our dataset has 25,518 sound samples collected from 740 participants. To test the proposed dataset, we apply different existing classification algorithms to it, including Support Vector Machine (SVM), Multilayer Perceptron (MLP), and variants of EfficientNet. These algorithms serve as a baseline. Furthermore, we propose a convolutional neural network (CNN) for audio digit classification. We conduct experiments with these networks, and the results show that the proposed CNN is efficient and outperforms the baseline algorithms in terms of classification accuracy.
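A minimal sketch of a CNN for spoken-digit classification of the general kind the paper proposes is shown below. The input shape (1 x 40 x 100, i.e., 40 MFCCs over roughly 100 frames) and the layer sizes are illustrative assumptions, not the authors' published architecture.

```python
# Sketch: small CNN over MFCC "images" for 10-way spoken-digit classification.
import torch
import torch.nn as nn


class DigitCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
            nn.Linear(32 * 4 * 4, n_classes),
        )

    def forward(self, x):               # x: (batch, 1, 40, 100) MFCC features
        return self.classifier(self.features(x))


# Example forward pass on a dummy batch of MFCC features.
logits = DigitCNN()(torch.randn(8, 1, 40, 100))   # -> shape (8, 10)
```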


2014 ◽  
Vol 571-572 ◽  
pp. 205-208
Author(s):  
Guan Yu Li ◽  
Hong Zhi Yu ◽  
Yong Hong Li ◽  
Ning Ma

Speech feature extraction is discussed, and the Mel-frequency cepstral coefficient (MFCC) and perceptual linear prediction (PLP) methods are analyzed. These two types of features are extracted in a Lhasa large-vocabulary continuous speech recognition system, and the recognition results are compared.
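A minimal sketch of MFCC feature extraction with librosa follows (PLP is not built into librosa and is usually computed with toolkits such as HTK or Kaldi, so it is omitted here). The file name and parameter values are illustrative assumptions.

```python
# Sketch: MFCC extraction with a 25 ms window and 10 ms hop at 16 kHz.
import librosa

y, sr = librosa.load("utterance.wav", sr=16000)     # hypothetical input file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                            n_fft=400, hop_length=160)
print(mfcc.shape)   # (13, number_of_frames)
```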


Author(s):  
Youllia Indrawaty Nurhasanah ◽  
Irma Amelia Dewi ◽  
Bagus Ade Saputro

Historically, the study of the Qur'an in Indonesia evolved along with the spread of Islam. Learning methods for reading the Qur'an have been developed, ranging from al-Baghdadi, al-Barqi, Qiraati, Iqro', Human, and Tartila to others, which make it easier to learn to read the Qur'an. Currently, speech recognition technology can be used to detect the pronunciation of Iqro volume 3 readings. Speech recognition consists of two general stages: feature extraction and speech matching. Feature extraction derives features from the speech signal, and speech matching compares the test utterance against the training utterances. The method used to recognize Iqro readings extracts speech-signal features using Mel-frequency cepstral coefficients (MFCC) and classifies them using vector quantization (VQ) to obtain the recognition result. The Iqro reading recognition system was tested on a sample of 30 people; 6 utterances were recognized incorrectly, giving the system a success rate of 80%.
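A minimal sketch of MFCC plus vector quantization matching in the style described above follows: one k-means codebook is trained per reference reading, and a test utterance is assigned to the codebook with the lowest average quantization distortion. The codebook size and the data layout are illustrative assumptions, not the authors' setup.

```python
# Sketch: per-class VQ codebooks over MFCC frames, matched by distortion.
import numpy as np
from sklearn.cluster import KMeans


def train_codebooks(train_mfcc_by_class, codebook_size=32):
    """train_mfcc_by_class: {label: (frames x 13) MFCC matrix of training audio}."""
    return {label: KMeans(n_clusters=codebook_size, n_init=10).fit(feats)
            for label, feats in train_mfcc_by_class.items()}


def classify(test_mfcc, codebooks):
    """Return the label whose codebook reconstructs the test frames best."""
    def distortion(km):
        centers = km.cluster_centers_
        d = np.linalg.norm(test_mfcc[:, None, :] - centers[None, :, :], axis=2)
        return d.min(axis=1).mean()
    return min(codebooks, key=lambda label: distortion(codebooks[label]))
```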


2020 ◽  
pp. 1-11
Author(s):  
Qian Hou ◽  
Cuijuan Li ◽  
Min Kang ◽  
Xin Zhao

English feature recognition influences the development of intelligent English-language technology; in particular, speech recognition accuracy remains a problem when recognizing English features. To improve English feature recognition, this study builds a recognition system around an intelligent learning algorithm combined with support vector machines, and uses both linear and nonlinear classifiers for the recognition task. Moreover, spectral subtraction is introduced at the front end of feature extraction: the estimated noise spectral amplitude is subtracted from the spectral amplitude of the noisy signal to obtain the spectral amplitude of the clean signal. Taking advantage of speech's insensitivity to phase, the phase information from before spectral subtraction is reused to reconstruct the signal after spectral subtraction, yielding the denoised speech. In addition, this study uses a nonlinear power function that simulates the hearing characteristics of the human ear to extract features from the denoised speech signal, and combines them with English features for recognition. Finally, the performance of the proposed algorithm is analyzed through comparative experiments. The results show that the proposed algorithm is effective.
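A minimal sketch of the spectral-subtraction front end described above follows: an estimated noise magnitude spectrum is subtracted from the noisy magnitude, and the noisy phase is reused to reconstruct the signal. The noise estimate (the first 10 frames assumed to be noise-only) and the flooring constant are illustrative assumptions.

```python
# Sketch: magnitude-domain spectral subtraction with noisy-phase reconstruction.
import numpy as np
import librosa


def spectral_subtraction(noisy: np.ndarray, n_fft=512, hop=128, noise_frames=10):
    stft = librosa.stft(noisy, n_fft=n_fft, hop_length=hop)
    mag, phase = np.abs(stft), np.angle(stft)
    noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)  # noise estimate
    clean_mag = np.maximum(mag - noise_mag, 0.01 * mag)            # floor residual
    clean_stft = clean_mag * np.exp(1j * phase)                    # keep noisy phase
    return librosa.istft(clean_stft, hop_length=hop)
```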

