Incorporating speech recognition engine into an intelligent assistive reading system for dyslexic students

Geometric modeling software typically relies on a conventional interactive mode. This paper describes a voice-assisted geometric modeling mechanism, built on speech recognition technology, that aims to improve modeling performance. After receiving a voice command, the system uses the speech recognition engine to identify it; the identified command is then parsed and processed to generate a geometric design based on the dimensions the user speaks. The system can thus generate geometric designs for the user via speech recognition. The work also collects feedback from users and customizes the model based on that feedback.
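The parse step described in this abstract can be illustrated with a minimal sketch. It assumes the speech recognition engine has already returned a plain-text transcript with numbers written as digits; the command grammar, the `SHAPES` table, and the names `ShapeCommand` and `parse_command` are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabulary: each shape and the dimensions it requires.
SHAPES = {
    "circle": ("radius",),
    "rectangle": ("width", "height"),
    "box": ("width", "height", "depth"),
}


@dataclass
class ShapeCommand:
    shape: str
    dimensions: dict


def parse_command(transcript: str) -> ShapeCommand:
    """Turn a recognized transcript into a shape name plus numeric dimensions."""
    text = transcript.lower()

    # Pick the first known shape word that appears in the transcript.
    shape = next((s for s in SHAPES if re.search(rf"\b{s}\b", text)), None)
    if shape is None:
        raise ValueError(f"no known shape in: {transcript!r}")

    dimensions = {}
    for name in SHAPES[shape]:
        # Accept "width 4" as well as "width of 4".
        match = re.search(rf"\b{name}\s+(?:of\s+)?(\d+(?:\.\d+)?)", text)
        if match is None:
            raise ValueError(f"missing dimension {name!r} for {shape}")
        dimensions[name] = float(match.group(1))

    return ShapeCommand(shape, dimensions)


if __name__ == "__main__":
    print(parse_command("Draw a rectangle width 4 height 2.5"))
```

A real system would also need to handle spoken number words ("four") and recognition errors, which this sketch leaves out.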

Real-time speech recognition engine for accent correction using Hidden Markov Model

10.1063/1.5080882 ◽

2018 ◽

Author(s):

J. B. Lazaro ◽

M. C. P. Po ◽

L. M. Ramones ◽

P. M. L. Tolidanes

Keyword(s):

Speech Recognition ◽

Markov Model ◽

Hidden Markov Model ◽

Real Time ◽

Hidden Markov ◽

Speech Recognition Engine

Speech Recognition Engine using ConvNet for the development of a Voice Command Controller for Fixed Wing Unmanned Aerial Vehicle (UAV)

2019 12th International Conference on Information & Communication Technology and System (ICTS) ◽

10.1109/icts.2019.8850961 ◽

2019 ◽

Author(s):

Cherry Mae J. Galangque ◽

Sherwin A. Guirnaldo

Keyword(s):

Speech Recognition ◽

Unmanned Aerial Vehicle ◽

Voice Command ◽

Aerial Vehicle ◽

Speech Recognition Engine

Dynamic Improvements in a Cloud-Based Speech Recognition Engine by Incorporating Trending Data

2016 4th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud) ◽

10.1109/mobilecloud.2016.12 ◽

2016 ◽

Cited By ~ 1

Author(s):

Milind Bhavsar ◽

Prudhvi Kosaraju ◽

G. Ananthakrishnan ◽

Gurudas Subray Shet ◽

Saurav Anand

Keyword(s):

Speech Recognition ◽

Speech Recognition Engine

The Watson speech recognition engine

1997 IEEE International Conference on Acoustics, Speech, and Signal Processing ◽

10.1109/icassp.1997.604839 ◽

2002 ◽

Cited By ~ 14

Author(s):

R.D. Sharp ◽

E. Bocchieri ◽

C. Castillo ◽

S. Parthasarathy ◽

C. Rath ◽

...

Keyword(s):

Speech Recognition ◽

Speech Recognition Engine

Low-power neuromorphic speech recognition engine with coarse-grain sparsity

2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC) ◽

10.1109/aspdac.2017.7858305 ◽

2017 ◽

Cited By ~ 3

Author(s):

Shihui Yin ◽

Deepak Kadetotad ◽

Bonan Yan ◽

Chang Song ◽

Yiran Chen ◽

...

Keyword(s):

Speech Recognition ◽

Low Power ◽

Coarse Grain ◽

Speech Recognition Engine

A NOVEL TASK-ORIENTED APPROACH TOWARD AUTOMATED LIP-READING SYSTEM IMPLEMENTATION

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliv-2-w1-2021-85-2021 ◽

2021 ◽

Vol XLIV-2/W1-2021 ◽

pp. 85-89

Author(s):

D. Ivanko ◽

D. Ryumin

Keyword(s):

Speech Recognition ◽

Visual Information ◽

Visual Speech ◽

System Implementation ◽

Visual Speech Recognition ◽

Rough Approximation ◽

Lip Reading ◽

Reading System ◽

Task Oriented ◽

Oriented Approach

Abstract. Visual information plays a key role in automatic speech recognition (ASR) when audio is corrupted by background noise, or even inaccessible. Speech recognition using visual information is called lip-reading. The initial idea of visual speech recognition comes from human experience: we are able to recognize spoken words by observing a speaker's face with limited or no access to the sound of the voice. Based on the conducted experimental evaluations and on an analysis of the research field, we propose a novel task-oriented approach towards practical lip-reading system implementation. Its main purpose is to serve as a roadmap for researchers who need to build a reliable visual speech recognition system for their task. As a rough approximation, the task of lip-reading can be divided into two parts, depending on the complexity of the problem: first, recognizing isolated words, numbers or small phrases (e.g. telephone numbers with a strict grammar, or keywords); second, recognizing continuous speech (phrases or sentences). All these stages are disclosed in detail in this paper. Based on the proposed approach, we implemented from scratch automatic visual speech recognition systems of three different architectures: GMM-CHMM, DNN-HMM and purely end-to-end. The methodology, tools, step-by-step development and all necessary parameters are described in detail in the paper. It is worth noting that such systems were created for Russian speech recognition for the first time.

A Comparison of Microphone and Speech Recognition Engine Efficacy for Mobile Data Entry

On the Move to Meaningful Internet Systems: OTM 2008 Workshops - Lecture Notes in Computer Science ◽

10.1007/978-3-540-88875-8_75 ◽

2008 ◽

pp. 519-527 ◽

Cited By ~ 1

Author(s):

Joanna Lumsden ◽

Scott Durling ◽

Irina Kondratova

Keyword(s):

Speech Recognition ◽

Data Entry ◽

Mobile Data ◽

Speech Recognition Engine

Performance of forced-alignment algorithms on children’s speech

10.31234/osf.io/97jp4 ◽

2020 ◽

Author(s):

Tristan Mahr ◽

Visar Berisha ◽

Kan Kawabata ◽

Julie Liss ◽

Katherine Hustad

Keyword(s):

Speech Recognition ◽

Gold Standard ◽

Alignment Accuracy ◽

Speech Sample ◽

Older Children ◽

Adaptive Training ◽

Alignment Algorithms ◽

Child Speech ◽

Speech Recognition Engine ◽

Children's Speech

Aim. We compared the performance of five forced-alignment algorithms on a corpus of child speech. Method. The child speech sample included 42 children between 3 and 6 years of age. The corpus was force-aligned using the Montreal Forced Aligner with and without speaker adaptive training, triphone alignment from the Kaldi speech recognition engine, the Prosodylab Aligner, and the Penn Phonetics Lab Forced Aligner. The sample was also manually aligned to create gold-standard alignments. We evaluated alignment algorithms in terms of accuracy (whether the interval covers the midpoint of the manual alignment) and difference in phone-onset times between the automatic and manual intervals. Results. The Montreal Forced Aligner with speaker adaptive training showed the highest accuracy and smallest timing differences. Vowels were consistently the most accurately aligned class of sounds across all the aligners, and alignment accuracy increased with age for fricative sounds across the aligners too. Interpretation. The best-performing aligner fell just short of human-level reliability for forced alignment. Researchers can use forced alignment with child speech for certain classes of sounds (vowels, fricatives for older children), especially as part of a semi-automated workflow where alignments are later inspected for gross errors.
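The two evaluation measures named in this abstract (midpoint coverage and phone-onset difference) can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the `Interval` type, the function names, the one-to-one pairing of automatic and manual intervals, and the choice to average absolute onset differences are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    label: str    # phone label
    start: float  # onset time in seconds
    end: float    # offset time in seconds


def covers_midpoint(auto: Interval, manual: Interval) -> bool:
    """True if the automatic interval contains the manual interval's midpoint."""
    midpoint = (manual.start + manual.end) / 2
    return auto.start <= midpoint <= auto.end


def onset_difference(auto: Interval, manual: Interval) -> float:
    """Signed onset difference (automatic minus manual), in seconds."""
    return auto.start - manual.start


def score_alignment(auto_intervals, manual_intervals):
    """Return (accuracy, mean absolute onset difference) over paired intervals.

    Assumes both sequences list the same phones in the same order.
    """
    pairs = list(zip(auto_intervals, manual_intervals))
    if not pairs:
        raise ValueError("no intervals to score")
    hits = sum(covers_midpoint(a, m) for a, m in pairs)
    mean_abs_onset = sum(abs(onset_difference(a, m)) for a, m in pairs) / len(pairs)
    return hits / len(pairs), mean_abs_onset


if __name__ == "__main__":
    manual = [Interval("a", 0.0, 0.2), Interval("s", 0.2, 0.5)]
    auto = [Interval("a", 0.0, 0.25), Interval("s", 0.4, 0.5)]
    print(score_alignment(auto, manual))
```

In the toy example, the second automatic interval starts after the manual midpoint, so only one of the two phones counts as accurately aligned.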

Design of Speech Recognition Engine

Text, Speech and Dialogue - Lecture Notes in Computer Science ◽

10.1007/3-540-45323-7_44 ◽

2000 ◽

pp. 259-264 ◽

Cited By ~ 6

Author(s):

Ludek Müller ◽

Josef Psutka ◽

Luboš Šmídl

Keyword(s):

Speech Recognition ◽

Speech Recognition Engine
