An open-source end-to-end ASR system for Brazilian Portuguese using DNNs built from newly assembled corpora

2020 ◽  
Vol 35 (1) ◽  
pp. 230-242
Author(s):  
Igor Quintanilha ◽  
Sergio Netto ◽  
Luiz Biscainho
Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3063
Author(s):  
Aleksandr Laptev ◽  
Andrei Andrusenko ◽  
Ivan Podluzhny ◽  
Anton Mitrofanov ◽  
Ivan Medennikov ◽  
...  

With the rapid development of speech assistants, adapting server-oriented automatic speech recognition (ASR) solutions to run directly on devices has become crucial. For on-device speech recognition tasks, researchers and industry prefer end-to-end ASR systems, as they can be made resource-efficient while maintaining higher quality than hybrid systems. However, building end-to-end models requires a significant amount of speech data. Personalization, which mainly means handling out-of-vocabulary (OOV) words, is another challenging task associated with speech assistants. In this work, we consider building an effective end-to-end ASR system in low-resource setups with a high OOV rate, embodied in the Babel Turkish and Babel Georgian tasks. We propose a method of dynamic acoustic unit augmentation based on the Byte Pair Encoding with dropout (BPE-dropout) technique. The method non-deterministically tokenizes utterances to extend the tokens’ contexts and to regularize their distribution, which helps the model recognize unseen words. It also reduces the need to search for an optimal subword vocabulary size. The technique provides a steady improvement in regular and personalized (OOV-oriented) speech recognition tasks (at least 6% relative word error rate (WER) and 25% relative F-score) at no additional computational cost. Owing to the use of BPE-dropout, our monolingual Turkish Conformer has achieved a competitive result with 22.2% character error rate (CER) and 38.9% WER, which is close to the best published multilingual system.
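The non-deterministic tokenization described in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified version of the general BPE-dropout idea, in which each applicable merge is skipped with some probability at every step, so the same word can be split into different subword sequences on different passes. The function name `bpe_dropout_tokenize`, the toy merge table, and the dropout value are all assumptions made for illustration.

```python
import random


def bpe_dropout_tokenize(word, merges, dropout=0.1, rng=None):
    """Split `word` into subwords, skipping each candidate merge with
    probability `dropout` (hypothetical helper, simplified sketch)."""
    rng = rng or random.Random()
    # Earlier merges in the table have higher priority (lower rank).
    ranks = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)
    while len(tokens) > 1:
        # Collect adjacent pairs that are in the merge table and
        # survive the random drop on this step.
        candidates = [
            (ranks[(a, b)], i)
            for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
            if (a, b) in ranks and rng.random() >= dropout
        ]
        if not candidates:
            break
        # Apply the highest-priority surviving merge.
        _, i = min(candidates)
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]
    return tokens


# Toy merge table (an assumption, not taken from the paper).
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]

# dropout=0.0 reproduces ordinary deterministic BPE.
print(bpe_dropout_tokenize("lower", toy_merges, dropout=0.0))

# With dropout > 0, repeated calls may yield different segmentations.
rng = random.Random(0)
for _ in range(3):
    print(bpe_dropout_tokenize("lower", toy_merges, dropout=0.5, rng=rng))
```

With `dropout=0.0` the sketch behaves like ordinary BPE, and with `dropout=1.0` it falls back to single characters; values in between expose the model to several segmentations of the same word, which is the regularizing effect the abstract refers to.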


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3691
Author(s):  
Ciprian Orhei ◽  
Silviu Vert ◽  
Muguras Mocofan ◽  
Radu Vasiu

Computer Vision is a cross-research field whose main purpose is to understand the surrounding environment as closely as possible to human perception. Image processing systems are continuously growing and expanding into more complex systems, usually tailored to the specific needs or applications they serve. To better serve this purpose, research on the architecture and design of such systems is also important. We present the End-to-End Computer Vision Framework (EECVF), an open-source solution that aims to support researchers and teachers within the vast field of image processing. The framework incorporates Computer Vision features and Machine Learning models that researchers can use. Given the continuous need to add new Computer Vision algorithms for day-to-day research activity, our proposed framework has an advantage in its configurable and scalable architecture. Although the main focus of the framework is the Computer Vision processing pipeline, it also offers solutions for incorporating more complex activities, such as training Machine Learning models. EECVF aims to become a useful tool for learning activities in the Computer Vision field, as it allows the learner and the teacher to handle only the topics at hand, and not the interconnections necessary for the visual processing flow.


2021 ◽  
Author(s):  
Joni Rasanen ◽  
Aaro Altonen ◽  
Alexandre Mercat ◽  
Jarno Vanne

2014 ◽  
Vol 10 (8) ◽  
pp. e1003806 ◽  
Author(s):  
Greg Finak ◽  
Jacob Frelinger ◽  
Wenxin Jiang ◽  
Evan W. Newell ◽  
John Ramey ◽  
...  

2020 ◽  
Author(s):  
Abhinav Garg ◽  
Gowtham P. Vadisetti ◽  
Dhananjaya Gowda ◽  
Sichen Jin ◽  
Aditya Jayasimha ◽  
...  

2012 ◽  
pp. 333-352
Author(s):  
Fatma Meawad ◽  
Geneen Stubbs

This chapter discusses the principles underpinning the design and development of a framework, MobiGlam, which supports ubiquitous and scalable access to learning activities. The framework allows full end-to-end interconnectivity among open-source virtual learning environments (VLEs) and Java-enabled mobile devices. Through this framework, interoperability and adaptivity techniques are combined to address the technical, pedagogical, and institutional challenges of mobile learning. The framework achieved a level of flexibility and simplicity that led to its wide institutional acceptance, allowing its use in various real-world settings.

