Dynamic Improvements in a Cloud-Based Speech Recognition Engine by Incorporating Trending Data

Author(s):  
Milind Bhavsar ◽  
Prudhvi Kosaraju ◽  
G. Ananthakrishnan ◽  
Gurudas Subray Shet ◽  
Saurav Anand
2017 ◽  
Vol 7 (1.3) ◽  
pp. 121
Author(s):  
Sreeja B P ◽  
Amrutha K G ◽  
Jeni Benedicta J ◽  
Kalaiselvi V ◽  
Ranjani R

Geometric modeling software conventionally relies on an interactive mode of operation. This paper describes a voice-assisted geometric modeling mechanism that uses speech recognition technology to improve modeling performance. After receiving a voice command, the system uses a speech recognition engine to identify it; the identified command is then parsed and processed to generate a geometric design based on the dimensions given in the user's voice input. The resulting system can generate geometric designs for the user via speech recognition. This work also collects feedback from users and customizes the model based on that feedback.
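
The parse step described in this abstract can be illustrated with a minimal sketch. It assumes the speech recognition engine has already produced a text transcript, and it uses a made-up two-shape command grammar; the names `ShapeCommand` and `parse_command` and the phrasing patterns are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ShapeCommand:
    """A parsed modeling command: which shape to build and its dimensions."""
    shape: str
    dimensions: Dict[str, float]


# Hypothetical command grammar (an assumption for illustration only).
_NUMBER = r"\d+(?:\.\d+)?"
_PATTERNS = {
    "circle": re.compile(
        rf"circle (?:with |of )?radius (?P<radius>{_NUMBER})"
    ),
    "rectangle": re.compile(
        rf"rectangle (?:with |of )?width (?P<width>{_NUMBER}) "
        rf"(?:and )?height (?P<height>{_NUMBER})"
    ),
}


def parse_command(transcript: str) -> Optional[ShapeCommand]:
    """Turn a recognized transcript into a ShapeCommand, or None if unmatched."""
    text = transcript.lower().strip()
    for shape, pattern in _PATTERNS.items():
        match = pattern.search(text)
        if match:
            # Every named group in the pattern is a numeric dimension.
            dims = {name: float(value) for name, value in match.groupdict().items()}
            return ShapeCommand(shape=shape, dimensions=dims)
    return None
```

A transcript that matches none of the patterns returns `None`, which a real system could use as the point to ask the user to repeat the command.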


Author(s):  
R.D. Sharp ◽  
E. Bocchieri ◽  
C. Castillo ◽  
S. Parthasarathy ◽  
C. Rath ◽  
...  

2020 ◽  
Author(s):  
Tristan Mahr ◽  
Visar Berisha ◽  
Kan Kawabata ◽  
Julie Liss ◽  
Katherine Hustad

Aim. We compared the performance of five forced-alignment algorithms on a corpus of child speech. Method. The child speech sample included 42 children between 3 and 6 years of age. The corpus was force-aligned using the Montreal Forced Aligner with and without speaker adaptive training, triphone alignment from the Kaldi speech recognition engine, the Prosodylab Aligner, and the Penn Phonetics Lab Forced Aligner. The sample was also manually aligned to create gold-standard alignments. We evaluated alignment algorithms in terms of accuracy (whether the interval covers the midpoint of the manual alignment) and difference in phone-onset times between the automatic and manual intervals. Results. The Montreal Forced Aligner with speaker adaptive training showed the highest accuracy and smallest timing differences. Vowels were consistently the most accurately aligned class of sounds across all the aligners, and alignment accuracy for fricative sounds increased with age across the aligners. Interpretation. The best-performing aligner fell just short of human-level reliability for forced alignment. Researchers can use forced alignment with child speech for certain classes of sounds (vowels, fricatives for older children), especially as part of a semi-automated workflow where alignments are later inspected for gross errors.
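
The two evaluation measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes automatic and manual intervals are already paired one-to-one, that times are in seconds, and that the onset difference is summarized as a mean absolute difference. The names `Interval`, `covers_midpoint`, and `score_alignment` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    """One phone interval; onset and offset are in seconds (assumed unit)."""
    label: str
    onset: float
    offset: float


def covers_midpoint(auto: Interval, manual: Interval) -> bool:
    """True if the automatic interval contains the manual interval's midpoint."""
    midpoint = (manual.onset + manual.offset) / 2
    return auto.onset <= midpoint <= auto.offset


def score_alignment(
    auto: List[Interval], manual: List[Interval]
) -> Tuple[float, float]:
    """Return (accuracy, mean absolute onset difference) for paired intervals.

    Pairing by position and averaging absolute differences are assumptions
    made for this sketch.
    """
    if len(auto) != len(manual):
        raise ValueError("automatic and manual intervals must be paired one-to-one")
    if not auto:
        raise ValueError("at least one interval pair is required")
    pairs = list(zip(auto, manual))
    hits = sum(covers_midpoint(a, m) for a, m in pairs)
    onset_diff = sum(abs(a.onset - m.onset) for a, m in pairs)
    return hits / len(pairs), onset_diff / len(pairs)
```

Scoring each sound class (for example vowels or fricatives) separately would only require filtering the paired intervals by `label` before calling `score_alignment`.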


2011 ◽  
Author(s):  
Theologos Athanaselis ◽  
Stelios Bakamidis ◽  
Ioannis Dologlou ◽  
Evmorfia N. Argyriou ◽  
Antonis Symvonis
