Introduction to Speech Recognition

Intelligent Information Technologies | DOI: 10.4018/978-1-59904-941-0.ch007 | 2011 | pp. 141-161

Author(s): Sergio Suárez-Guerra, Jose Luis Oropeza-Rodriguez

Keyword(s): Artificial Intelligence, Signal Processing, Speech Recognition, Automatic Speech Recognition, Speech Signal, Essential Information, Science Field, Applied Artificial Intelligence, And Linguistics, Successful Technology

This chapter presents state-of-the-art automatic speech recognition (ASR) technology, a highly successful technology in computer science that draws on multiple disciplines, including signal processing and analysis, mathematical statistics, applied artificial intelligence, and linguistics. In the most widely used ASR systems, the phoneme is the basic unit of information used to characterize the speech signal. Recently, however, several researchers have questioned this representation and demonstrated the limitations of phonemes, suggesting that better-performing ASR systems can be built by replacing the phoneme with triphones and syllables as the basic unit. The chapter gives an overview of the most successful techniques used in ASR systems, together with some recently proposed systems that aim to improve on conventional ones.

Introduction to Speech Recognition

Pattern Recognition Technologies and Applications | DOI: 10.4018/978-1-59904-807-9.ch005 | 2008 | pp. 90-109

Author(s): Sergio Suárez-Guerra, Jose Luis Oropeza-Rodriguez

Keyword(s): Artificial Intelligence, Signal Processing, Speech Recognition, Automatic Speech Recognition, Speech Signal, Essential Information, Science Field, Applied Artificial Intelligence, And Linguistics, Successful Technology

This chapter presents state-of-the-art automatic speech recognition (ASR) technology, a highly successful technology in computer science that draws on multiple disciplines, including signal processing and analysis, mathematical statistics, applied artificial intelligence, and linguistics. In the most widely used ASR systems, the phoneme is the basic unit of information used to characterize the speech signal. Recently, however, several researchers have questioned this representation and demonstrated the limitations of phonemes, suggesting that better-performing ASR systems can be built by replacing the phoneme with triphones and syllables as the basic unit. The chapter gives an overview of the most successful techniques used in ASR systems, together with some recently proposed systems that aim to improve on conventional ones.

Signal Processing Cues to Improve Automatic Speech Recognition for Low Resource Indian Languages

DOI: 10.21437/sltu.2018-6 | 2018 | Cited By ~ 1

Author(s): Arun Baby, Karthik Pandia D S, Hema A Murthy

Keyword(s): Signal Processing, Speech Recognition, Automatic Speech Recognition, Indian Languages, Low Resource

Automatic Speech Recognition of Continuous Speech Signal of Gujarati Language Using Machine Learning

Advances in Intelligent Systems and Computing - Mathematical Modeling, Computational Intelligence Techniques and Renewable Energy | DOI: 10.1007/978-981-15-9953-8_13 | 2021 | pp. 147-159

Author(s): Purnima Pandit, Priyank Makwana, Shardav Bhatt

Keyword(s): Machine Learning, Speech Recognition, Automatic Speech Recognition, Speech Signal, Continuous Speech, Gujarati Language

A study on bias-based speech signal conditioning techniques for improving the robustness of automatic speech recognition

2009 Canadian Conference on Electrical and Computer Engineering | DOI: 10.1109/ccece.2009.5090212 | 2009

Author(s): Md Foezur Rahman Chowdhury, Sid-Ahmed Selouani, Douglas O'Shaughnessy

Keyword(s): Speech Recognition, Automatic Speech Recognition, Speech Signal, Signal Conditioning

Automatic Speech Recognition with Stuttering Speech Removal using Long Short-Term Memory (LSTM)

International Journal of Recent Technology and Engineering - 2 | DOI: 10.35940/ijrte.e6230.018520 | 2020 | Vol 8 (5) | pp. 1677-1681

Keyword(s): Speech Recognition, Automatic Speech Recognition, Speech Signal, Short Term Memory, Long Short Term Memory, Increase In Accuracy, Two Stages, The Given, ASR System

Stuttering, or stammering, is a speech disorder in which sounds, syllables, or words are repeated or prolonged, disrupting the normal flow of speech. Stuttering can make it hard to communicate with other people, which often affects an individual's quality of life. An automatic speech recognition (ASR) system converts an audio speech signal into the corresponding text. ASR systems now play a major role in controlling, or providing input to, various applications. Both ASR systems and machine translation applications suffer considerably from stuttering (speech dysfluency). Dysfluencies reduce the word recognition accuracy of an ASR system by increasing word insertion, substitution, and deletion rates. This work focuses on detecting and removing prolongations, silent pauses, and repetitions to generate a proper text sequence for a given stuttered speech signal. Stuttered speech recognition consists of two stages: classification using LSTM and testing in ASR. The major phases of the classification system are re-sampling, segmentation, pre-emphasis, epoch extraction, and classification. The work is carried out on the UCLASS stuttering dataset using MATLAB, with a 4% to 6% increase in accuracy compared with ANN and SVM.

An Overview of Methods for Generating, Augmenting and Evaluating Room Impulse Response Using Artificial Neural Networks

Mokslas - Lietuvos ateitis | DOI: 10.3846/mla.2021.15152 | 2021 | Vol 13 (0) | pp. 1-5

Author(s): Mantas Tamulionis

Keyword(s): Neural Networks, Signal Processing, Artificial Neural Networks, Speech Recognition, Impulse Response, Automatic Speech Recognition, Audio Signal, Training Data, Audio Signal Processing, Artificial Neural

Methods based on artificial neural networks (ANN) are widely used in various audio signal processing tasks, where they offer opportunities to optimize processes and save computational resources. A key quantity needed to capture the acoustics of a room numerically is the room impulse response (RIR). Increasingly, researchers choose not to record these responses in a real room but to generate them using ANN, as this gives them the freedom to prepare training datasets of unlimited size. Neural networks are also used to augment the generated responses so that they resemble recorded ones. The widest use of ANN so far is in evaluating the generated results, for example, in automatic speech recognition (ASR) tasks. This review also describes datasets of recorded RIRs that are commonly used in various studies as training data for neural networks.
