Greek Verbs and User Friendliness in the Speech Recognition and the Speech Production Module of Dialog Systems for the Broad Public

Author(s):  
Christina Alexandris ◽  
Ioanna Malagardi
2020 ◽  
Author(s):  
Karthik Gopalakrishnan ◽  
Behnam Hedayatnia ◽  
Longshaokan Wang ◽  
Yang Liu ◽  
Dilek Hakkani-Tür

Author(s):  
Lakshika Kavmini ◽  
Thilini Dinushika ◽  
Uthayasanker Thayasivam ◽  
Sanath Jayasena

Recent advancements in conversational Artificial Intelligence (AI) are rapidly being integrated into every realm of human life. Conversational agents that can learn, understand human languages, and mimic the human thinking process have already revolutionized the human lifestyle. Understanding the intention of a speaker from their natural speech is a significant step in conversational AI. A major challenge that hinders the efficacy of this process is the lack of language resources. In this research, we address this issue and develop a domain-specific speech command classification system for Sinhala, a low-resourced language. An effective speech command classification system can be used in several value-added applications such as speech dialog systems. Our speech command classification system is developed by integrating Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU). The ASR engine is implemented using a Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) and converts a Sinhala speech command into a corresponding text representation. The text classifier, which is implemented as an ensemble of several classifiers, predicts the intent of the speaker from this text output. In this paper, we discuss and evaluate various algorithms and techniques that can be used to optimize the performance of both the ASR engine and the text classifier. We also present our novel Sinhala speech data corpus of 4.15 h, which is based on the banking domain. As the final outcome, our system achieves a Sinhala speech command classification accuracy of 91.03%, outperforming the state-of-the-art speech-to-intent mapping systems developed for the Sinhala language. An individual evaluation of the ASR system reports a 9.91% Word Error Rate and a 19.95% Sentence Error Rate, suggesting that advanced speech recognition techniques are applicable despite the limited language resources. Finally, our findings deliver useful insights for further research on speech command classification in the low-resourced context.
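
The abstract does not say how the Word Error Rate (WER) and Sentence Error Rate (SER) were computed. As a minimal sketch, assuming the conventional definitions (WER is the word-level edit distance divided by the number of reference words; SER is the fraction of utterances with at least one error), the two metrics could be calculated as follows. The function names and the English toy transcripts are hypothetical and are not taken from the paper or its corpus.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(refs, hyps):
    """Total word-level edits divided by total reference words."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    words = sum(len(r.split()) for r in refs)
    return errors / words


def sentence_error_rate(refs, hyps):
    """Fraction of utterances whose hypothesis differs from the reference."""
    wrong = sum(r.split() != h.split() for r, h in zip(refs, hyps))
    return wrong / len(refs)


# Hypothetical toy transcripts (illustration only, not corpus data).
refs = ["check my account balance", "transfer money now"]
hyps = ["check my balance", "transfer money now"]

print(f"WER = {word_error_rate(refs, hyps):.4f}")
print(f"SER = {sentence_error_rate(refs, hyps):.4f}")
```

Under these definitions a single misrecognized word marks the whole utterance as wrong, so SER is usually higher than WER, as in the figures reported above.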


2021 ◽  
Vol 111 (09) ◽  
pp. 579-582
Author(s):  
Daniel Schulte ◽  
Martin Sudhoff ◽  
Bernd Kuhlenkötter

This paper describes the design and testing of a system for data acquisition using speech recognition in manual assembly. The system was used for process time recording in a real assembly system in the Learning and Research Factory (Lern- und Forschungsfabrik, LFF) of the Chair of Production Systems (Lehrstuhl für Produktionssysteme, LPS). The quality of the recorded data and the user-friendliness of the system were then examined. The results show that speech recognition is a good complement to manual data acquisition.

