Speech-to-text technology as a documentation tool for interpreters: A new method for compiling an ad hoc corpus and extracting terminology from video-recorded oral speeches

2020 ◽  
pp. 263-281
Author(s):  
Mahmoud Gaber ◽  
Gloria Corpas Pastor ◽  
Ahmed Omer

Although interpreting has not yet benefited from technology as much as its sister field, translation, interest in developing tailor-made solutions for interpreters has risen sharply in recent years. In particular, Automatic Speech Recognition (ASR) is being used as a central component of Computer-Assisted Interpreting (CAI) tools, either bundled or standalone. This study pursues three main aims: (i) to establish the most suitable ASR application for building ad hoc corpora by comparing several ASR tools and assessing their performance; (ii) to use ASR to extract terminology from the transcriptions obtained from video-recorded speeches, in this case talks on climate change and adaptation; and (iii) to promote the adoption of ASR as a new documentation tool among interpreters. To the best of our knowledge, this is one of the first studies to explore the potential of Speech-to-Text (S2T) technology for meeting the preparatory needs of interpreters as regards terminology and background/domain knowledge.
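The abstract above describes extracting terminology from ASR transcriptions of video-recorded talks but does not specify the extraction method. A minimal frequency-based sketch of single-word term candidates (my own illustration, not the authors' pipeline; the stopword list and sample transcript are invented) might look like:

```python
from collections import Counter
import re

# Tiny stopword list for the sketch; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "on", "is", "are",
             "for", "that", "this", "with", "as", "we", "our", "it", "be"}

def candidate_terms(transcript: str, top_n: int = 5):
    """Return the most frequent content words of an ASR transcript
    as single-word term candidates (purely frequency-based)."""
    tokens = re.findall(r"[a-z]+", transcript.lower())
    content = [t for t in tokens if t not in STOPWORDS and len(t) > 3]
    return [term for term, _ in Counter(content).most_common(top_n)]

# Invented example transcript fragment on the climate-adaptation domain.
transcript = ("climate adaptation requires local adaptation planning "
              "because climate risks differ across regions and adaptation "
              "funding follows climate policy")
print(candidate_terms(transcript, 3))
```

Real term extractors would add multi-word candidates and corpus-contrast statistics (e.g. TF-IDF against a reference corpus), but the principle of surfacing frequent domain words from the transcription is the same.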

2021 ◽  
Vol 11 (15) ◽  
pp. 6695
Author(s):  
Cristian Tejedor-García ◽  
Valentín Cardeñoso-Payo ◽  
David Escudero-Mancebo

General-purpose automatic speech recognition (ASR) systems have improved in quality and are being used for pronunciation assessment. However, the assessment of isolated short utterances, such as words in minimal pairs for segmental approaches, remains an important challenge, even more so for non-native speakers. In this work, we compare the performance of our own tailored ASR system (kASR) with that of Google ASR (gASR) for the assessment of Spanish minimal pair words produced by 33 native Japanese speakers in a computer-assisted pronunciation training (CAPT) scenario. Participants in a pre/post-test training experiment spanning four weeks were split into three groups: experimental, in-classroom, and placebo. The experimental group used the CAPT tool described in the paper, which we specially designed for autonomous pronunciation training. A statistically significant improvement for the experimental and in-classroom groups was revealed, and moderate correlation values between gASR and kASR results were obtained, in addition to strong correlations between the post-test scores of both ASR systems and the CAPT application scores found at the final stages of application use. These results suggest that, in the current configuration, both ASR alternatives are valid for assessing minimal pairs in CAPT tools. A discussion of possible ways to improve our system and of possibilities for future research is included.
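The comparison above rests on correlating per-learner scores from the two ASR back-ends. A minimal sketch of the sample Pearson coefficient is shown below; the score arrays are invented illustrative numbers, not the study's data:

```python
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-learner pronunciation scores from the two systems.
gasr_scores = [0.62, 0.71, 0.55, 0.80, 0.67, 0.74]
kasr_scores = [0.58, 0.69, 0.60, 0.77, 0.70, 0.72]

print(f"Pearson r = {pearson(gasr_scores, kasr_scores):.2f}")
```

A "moderate" correlation in the abstract's sense would be a lower r than these made-up values produce; the point of the sketch is only the computation, not the magnitude.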


2020 ◽  
Vol 5 (2) ◽  
pp. 193-197
Author(s):  
Esti Junining ◽  
Sony Alif ◽  
Nuria Setiarini

This study is intended to help English as a Foreign Language (EFL) learners in Indonesia reduce their anxiety when speaking in front of other people. It aims to create an atmosphere that encourages students to practice speaking independently. Such an atmosphere can be achieved with Automatic Speech Recognition (ASR), which lets every student practice speaking individually in front of a computer or gadget without feeling anxious or pressured. The study used a research and development design, as it sought to develop a product that creates an atmosphere encouraging students to practice speaking. The instrument used was a questionnaire analyzing the students' needs in learning English. The product, which utilizes ASR technology and was implemented in the C# programming language, was found to enable students to practice speaking individually without anxiety or pressure.


2017 ◽  
Vol 8 (2) ◽  
pp. 48
Author(s):  
Lina Fathi Sidig Sidgi ◽  
Ahmad Jelani Shaari

Technology such as computer-assisted language learning (CALL) is used in teaching and learning in foreign language classrooms, where it is most needed. One promising emerging technology that supports language learning is automatic speech recognition (ASR). Integrating such technology, especially in the instruction of pronunciation in the classroom, is important in helping students achieve correct pronunciation. In Iraq, English is a foreign language, and it is not surprising that learners make many pronunciation mistakes. One contributing factor is the difference between the Arabic and English phonetic systems; the sound transfer from the mother tongue (Arabic) to the target language (English) is thus one barrier for Arab learners. The purpose of this study is to investigate the effectiveness of the automatic speech recognition (ASR) software EyeSpeak in improving the pronunciation of Iraqi learners of English. An experimental research project with a pretest-posttest design was conducted over a one-month period in the Department of English at Al-Turath University College in Baghdad, Iraq. The ten participants were randomly selected first-year college students enrolled in a pronunciation class that used traditional teaching methods alongside the ASR EyeSpeak software. The findings show that using EyeSpeak led to a significant improvement in the students' English pronunciation, evident from the test scores achieved after using the software.


Target ◽  
2020 ◽  
Author(s):  
Bart Defrancq ◽  
Claudio Fantinuoli

Abstract Automatic Speech Recognition (ASR) has been proposed as a means to enhance state-of-the-art computer-assisted interpreting (CAI) tools and to allow machine-learning techniques to enter the workflow of professional interpreters. In this article, we test the usefulness of real-time transcription with number highlighting of a source speech for simultaneous interpreting using InterpretBank ASR. The system’s precision is high (96%) and its latency low enough to fit interpreters’ ear–voice span (EVS). We evaluate the potential benefits among first-time users of this technology by applying an error matrix and by investigating the users’ subjective perceptions through a questionnaire. The results show that the ASR provision improves overall performance for almost all number types. Interaction with the ASR support is varied and participants consult it for just over half of the stimuli. The study also provides some evidence of the psychological benefits of ASR availability and of overreliance on ASR support.
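Number highlighting of the kind tested above can be sketched as a pattern match over incoming transcript segments. The pattern below is a deliberately simple assumption of mine, far cruder than what a production tool like InterpretBank implements:

```python
import re

# Toy pattern: digit strings (with separators/decimals) plus a few
# spelled-out magnitude words. Real CAI tools handle far more forms.
NUMBER_RE = re.compile(
    r"\b(\d[\d.,]*\d|\d|hundred|thousand|million|billion)\b",
    re.IGNORECASE,
)

def highlight_numbers(segment: str) -> str:
    """Wrap each number in an ASR transcript segment with brackets
    so the interpreter's eye is drawn to it."""
    return NUMBER_RE.sub(lambda m: f"[{m.group(0)}]", segment)

print(highlight_numbers(
    "GDP fell by 2.4 percent, a loss of 13 billion euros in 2020"))
```

In a live setting this function would run on each partial hypothesis the ASR emits, so the latency of highlighting is essentially the latency of the recognizer itself, which is why the EVS-compatible latency reported above matters.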



Tradterm ◽  
2018 ◽  
Vol 32 ◽  
pp. 9-31
Author(s):  
Luis Eduardo Schild Ortiz ◽  
Patrizia Cavallo

In recent years, several studies have indicated that interpreters resist adopting new technologies. Yet such technologies have enabled the development of several tools to help these professionals. In this paper, using bibliographical and documentary research, we briefly analyse the tools cited by several authors to identify which ones remain up to date and available on the market. We then present concepts related to automation and examine the use of automatic speech recognition (ASR), analysing its potential benefits and the current level of maturity of this approach, especially regarding Computer-Assisted Interpreting (CAI) tools. The goal of this paper is to offer the community of interpreters and researchers a view of the state of the art in technology for interpreting, as well as some future perspectives for the area.


ReCALL ◽  
2008 ◽  
Vol 20 (2) ◽  
pp. 225-243 ◽  
Author(s):  
Ambra Neri ◽  
Catia Cucchiarini ◽  
Helmer Strik

Abstract Although the success of automatic speech recognition (ASR)-based Computer Assisted Pronunciation Training (CAPT) systems is increasing, little is known about the pedagogical effectiveness of these systems. This is particularly regrettable because ASR technology still suffers from limitations that may result in the provision of erroneous feedback, possibly leading to learning breakdowns. To study the effectiveness of ASR-based feedback for improving pronunciation, we developed and tested a CAPT system providing automatic feedback on Dutch phonemes that are problematic for adult learners of Dutch. Thirty immigrants who were studying Dutch were assigned to three groups using either the ASR-based CAPT system with automatic feedback, a CAPT system without feedback, or no CAPT system. Pronunciation quality was assessed for each participant before and after the training by human experts who evaluated overall segmental quality and the quality of the phonemes addressed in the training. The participants' impressions of the CAPT system used were also studied through anonymous questionnaires. The results on global segmental quality show that the group receiving ASR-based feedback made the largest mean improvement, but the groups' mean improvements did not differ significantly. The group receiving ASR-based feedback showed a significantly larger improvement than the no-feedback group in the segmental quality of the problematic phonemes targeted.


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 235
Author(s):  
Natalia Bogach ◽  
Elena Boitsova ◽  
Sergey Chernonog ◽  
Anton Lamtev ◽  
Maria Lesnichaya ◽  
...  

This article contributes to the discourse on how contemporary computer and information technology may help improve foreign language learning, not only by supporting a better and more flexible workflow and digitizing study materials, but also by creating completely new use cases made possible by advances in signal processing algorithms. We discuss an approach and propose a holistic solution to teaching the phonological phenomena which are crucial for correct pronunciation: the phonemes; the energy and duration of syllables and pauses, which construct the phrasal rhythm; and the tone movement within an utterance, i.e., the phrasal intonation. The working prototype of the StudyIntonation Computer-Assisted Pronunciation Training (CAPT) system is a tool for mobile devices which offers a set of tasks based on a "listen and repeat" approach and gives audio-visual feedback in real time. The present work summarizes the efforts taken to enrich the current version of this CAPT tool with two new functions: phonetic transcription and rhythmic patterns of model and learner speech. Both are built on the third-party automatic speech recognition (ASR) toolkit Kaldi, which was incorporated into the StudyIntonation signal-processing core. We also examine the scope of ASR applicability within the CAPT workflow and evaluate the Levenshtein distance between transcriptions made by human experts and those obtained automatically by our code. We developed an algorithm for rhythm reconstruction using acoustic and language ASR models. It is also shown that, even with sufficiently correct production of phonemes, learners often fail to produce correct phrasal rhythm and intonation; therefore, the joint training of sounds, rhythm and intonation within a single learning environment is beneficial. To mitigate recording imperfections, voice activity detection (VAD) is applied to all processed speech recordings. Trials showed that StudyIntonation can create transcriptions and process rhythmic patterns, although some specific problems with connected-speech transcription were detected. The learner feedback for pronunciation assessment was also updated: a conventional mechanism based on dynamic time warping (DTW) was combined with a cross-recurrence quantification analysis (CRQA) approach, which resulted in better discriminating ability. The CRQA metrics combined with those of DTW were shown to add to the accuracy of learner performance estimation. The major implications for computer-assisted English pronunciation teaching are discussed.
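The Levenshtein distance used above to compare expert and automatic transcriptions is a standard dynamic-programming edit distance. A minimal sketch follows; the phone strings in the example are hypothetical, not taken from the study:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between
    two symbol sequences, computed with a rolling DP row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Hypothetical expert vs. automatic phone strings, one symbol per phone:
# they differ in a single vowel, so the distance is 1.
expert = "stʌdi"
automatic = "stadi"
print(levenshtein(expert, automatic))  # → 1
```

Normalizing this distance by the length of the expert transcription gives a per-utterance error rate comparable across utterances, which is the usual way such expert-vs-ASR agreement is reported.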

