Highly parallel implementation of Sphinx-3 voice recognition algorithm

2013 Africon ◽  
2013 ◽  
Author(s):  
Dimitris Tsiamasiotis ◽  
Ioannis Papaefstathiou ◽  
Charalampos Manifavas
Author(s):  
Song Li ◽  
Mustafa Ozkan Yerebakan ◽  
Yue Luo ◽  
Ben Amaba ◽  
William Swope ◽  
...  

Voice recognition has become an integral part of our lives, commonly used in call centers and as part of virtual assistants. It is also increasingly applied to industrial uses. Each of these use cases has unique characteristics that may affect the effectiveness of voice recognition, which in turn could affect industrial productivity, performance, or even safety. One of the most prominent is the background noise that dominates each industry, to which different machinery and work layouts are the primary contributors. Another important characteristic is the type of communication present in these settings. Daily communication often involves longer sentences uttered under relatively quiet conditions, whereas communication in industrial settings is often short and conducted in loud conditions. In this study, we demonstrate the importance of taking these two elements into account by comparing the performance of two voice recognition algorithms under several background noise conditions: a regular Convolutional Neural Network (CNN) based voice recognition algorithm and an Automatic Speech Recognition (ASR) based model with a denoising module. Our results indicate a significant performance drop between the background noise typically used (white noise) and the other background noises. In addition, our custom ASR model with the denoising module outperformed the CNN-based model, with an overall performance increase of 14-35% across all background noises. Both results indicate that specialized voice recognition algorithms need to be developed for these environments before they can be reliably deployed as control mechanisms.


2021 ◽  
Author(s):  
Salman Sohrabi ◽  
Rebecca S. Moore ◽  
Coleen Tara Murphy

C. elegans is used as a model organism to study a wide range of topics in molecular and cellular biology. Conventional C. elegans assays often require a large sample size with frequent manipulations, rendering them labor-intensive. Automated high-throughput workflows may not always be the best way to reduce benchwork labor, as they may introduce more complexity. Thus, most assays are carried out manually, where logging and digitizing experimental data can be as time-consuming as picking and scoring worms. Here we report the development of CeAid, C. elegans Application for inputting data, which significantly expedites the data entry process by using swiping gestures and a voice recognition algorithm to log data on a standard smartphone or Android device. This modular platform can also be adapted for a wide range of assays where recording data is laborious, even beyond worm research.


2012 ◽  
Vol 263-266 ◽  
pp. 2328-2331 ◽  
Author(s):  
Yun Hong Li ◽  
Zi Ling Li

To reduce the computational cost of the DTW algorithm and improve the recognition rate, the traditional DTW algorithm is analyzed and an improved DTW algorithm is proposed that combines local path constraints with regional restrictions. Experiments show that the improved DTW algorithm reduces the number of operations and improves the recognition rate.
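
As a rough illustration of how a regional restriction cuts DTW's workload, the sketch below limits the search to a band around the diagonal and uses the usual three-neighbor local step. The band (often called a Sakoe-Chiba band), the function name `dtw_distance`, and the `window` parameter are assumptions made for illustration; the abstract does not specify which constraints the authors actually used.

```python
import math


def dtw_distance(a, b, window=None):
    """Return (DTW distance, number of cells evaluated) for two 1-D sequences.

    `window` is a hypothetical band half-width: cell (i, j) is only
    evaluated when |i - j| <= window. None means no restriction.
    """
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must be at least |n - m| wide so the final cell is reachable.
    window = max(window, abs(n - m))

    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    cells = 0
    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Local path constraint: step from the left, lower, or diagonal cell.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cells += 1
    return cost[n][m], cells


if __name__ == "__main__":
    template = [1, 2, 3, 4]
    utterance = [1, 2, 2, 3, 4]
    print(dtw_distance(template, utterance))            # full grid
    print(dtw_distance(template, utterance, window=1))  # banded grid
```

A narrower band evaluates fewer cells, at the risk of excluding the best alignment when the two sequences differ strongly in tempo.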


2013 ◽  
Vol 416-417 ◽  
pp. 1156-1159
Author(s):  
Bo Nian Yi

Speech recognition is one of the most active and promising new information technologies. This paper studies speech preprocessing and the extraction of MFCC feature parameters, and constructs a keyword recognition algorithm built around the VQ and HMM models. MATLAB is used to train and simulate the algorithm, and an FPGA-based voice recognition system is simulated and implemented in hardware and software. The work lays a foundation for FPGA-based speech recognition and control.
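
The preprocessing stage that commonly precedes MFCC extraction can be sketched as pre-emphasis followed by framing with a Hamming window. This is only a generic sketch, not the paper's implementation: the function names, the pre-emphasis coefficient 0.97, and the frame and hop lengths are assumptions, and the later MFCC steps (FFT, mel filter bank, log, DCT) are omitted.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n - 1] (alpha is assumed)."""
    if not signal:
        return []
    return [signal[0]] + [signal[i] - alpha * signal[i - 1] for i in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames and apply a Hamming window to each."""
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    frames = []
    # Trailing samples that do not fill a whole frame are dropped.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


if __name__ == "__main__":
    samples = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
    frames = frame_signal(pre_emphasis(samples), frame_len=8, hop=4)
    print(len(frames), "frames of", len(frames[0]), "samples")
```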


2018 ◽  
Vol 7 (1.9) ◽  
pp. 268
Author(s):  
C Rukkumani ◽  
Dr Krishna Mohanta.S ◽  
Govindaraj S

The steady growth in the number of mobile devices and their owners introduces a variety of limitations, a set of which revolves around interactivity. Heavy dependence on touch-based (haptic) interaction has caused device falls, slower time to interaction, health concerns, and limited support for the disabled, among other problems. There is a need for innovative techniques that make it easier for users to interact with these devices. To this end, a Real-time Voice Recognition Algorithm is formulated that gives users of mobile devices the freedom to move about and reduces the need to glance constantly at their screens. This is achieved by allowing users to verbally command their devices to carry out ordinary tasks. An additional feature is offline access: any command given by a user is processed and executed locally on the device.


Author(s):  
Hossein Ghaffari Nik ◽  
Gregory M. Gutt ◽  
Nathalia Peixoto

Author(s):  
Rahma Della ◽  
Yasdinul Huda

This study aims to design and create a media application for learning the Quran on Android smartphones, in which users read and listen to the reading of letters according to their meaning and can immediately practice by speaking. The system was developed using the waterfall method. The application is built on the Android platform and implements voice recognition. The voice recognition algorithm used is divide and conquer, which converts sound into discrete form so that the timing of speech recognition is synchronized. The application test results showed that the algorithm and the application design match the design objectives. Keywords: tahsin, interactive, android, voice recognition


2011 ◽  
Vol 340 ◽  
pp. 156-160
Author(s):  
Peksinski Jakub ◽  
Mikolajczak Grzegorz

In this article the authors present an algorithm for recognizing the human voice. It is based on the analysis of spectra using NMSE (normalized mean square error), a quality measure normally used to compare digital images. The proposed voice recognition algorithm has been applied in practice to the voice control of a visual surveillance system.
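
The idea of comparing spectra with NMSE can be sketched as follows: compute the magnitude spectrum of an utterance and of a stored template, then score them by the squared error normalized by the template's energy. This is a minimal sketch under stated assumptions, not the authors' method: the normalization shown is one of several NMSE definitions in use, the naive DFT stands in for a real FFT, and the names `magnitude_spectrum`, `nmse`, and `tone` are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitude of the DFT for bins 0..n/2 (naive O(n^2) transform)."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def nmse(reference, candidate):
    """Squared error between two spectra, normalized by the reference energy."""
    if len(reference) != len(candidate):
        raise ValueError("spectra must have the same length")
    energy = sum(r * r for r in reference)
    if energy == 0:
        raise ValueError("reference spectrum has zero energy")
    return sum((r - c) ** 2 for r, c in zip(reference, candidate)) / energy


def tone(freq_bin, n=64):
    """Synthetic test signal: a sine wave centered on one DFT bin."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]


if __name__ == "__main__":
    template = magnitude_spectrum(tone(5))
    print(nmse(template, magnitude_spectrum(tone(5))))  # matching command
    print(nmse(template, magnitude_spectrum(tone(9))))  # different command
```

A recognizer built this way would pick the template with the lowest NMSE and reject the input when even the best score exceeds a chosen threshold.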

