speech translation
Recently Published Documents


TOTAL DOCUMENTS

482
(FIVE YEARS 138)

H-INDEX

15
(FIVE YEARS 4)

2021 ◽  
Vol E104.D (12) ◽  
pp. 2195-2208
Author(s):  
Sashi NOVITASARI ◽  
Sakriani SAKTI ◽  
Satoshi NAKAMURA

STEM Journal ◽  
2021 ◽  
Vol 22 (4) ◽  
pp. 59-74
Author(s):  
Christopher Irvin

For native English-speaking teachers, overcoming communication issues caused by not sharing a first language with their pupils is a challenge, especially with low-level students. The increased use of video lectures due to COVID-19 has made this even more difficult. This study investigated whether Artificial Intelligence-powered interlingual Simultaneous Speech Translation subtitled video lectures could be a practical way to overcome this challenge. To that end, 14 participants from a first-semester prerequisite General English course took part in the study. Semi-structured interviews were combined with surveys and descriptive statistics, and the data were analyzed qualitatively through thematic, descriptive, and inductive procedures that relied on simultaneous analysis and category construction. Key findings were as follows: First, respondents found the subtitled videos highly satisfactory and fairly accurate. Second, respondents reported greater content understanding as the main advantage and reduced emphasis on improving listening ability as the primary disadvantage. Third, the main suggestions for the future were to use English rather than Korean subtitles, or to subtitle only certain sections of the video in Korean. Specific responses from the student interviews and implications for future work are discussed.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Wenbo Zhu ◽  
Hao Jin ◽  
WeiChang Yeh ◽  
Jianwen Chen ◽  
Lufeng Luo ◽  
...  

Speech translation (ST) is a bimodal conversion task from source speech to target text. Deep learning-based ST systems generally require sufficient training data to obtain competitive results, even with a state-of-the-art model, yet the available training data usually fails to meet this completeness condition because of small-sample problems. Most low-resource ST approaches improve data coverage within a single modality, but this optimization operates along a single dimension and has limited effectiveness. Multimodality, in contrast, leverages different dimensions of the data for multi-perspective modeling: the modalities mutually fill each other's gaps, enriching the data representation and improving the utilization of training samples. Leveraging the enormous amount of multimodal out-of-domain information to improve low-resource tasks is therefore a new challenge. This paper describes how to use multimodal out-of-domain information to improve low-resource models. First, we propose a low-resource ST framework that reconstructs large-scale unlabeled audio via self-supervised learning. We also introduce a machine translation (MT) pretraining model to complement text embeddings and fine-tune decoding. In addition, we analyze similarity at the decoder side: we reduce invalid multimodal pseudolabels by performing random depth pruning in the similarity layers to minimize error propagation, and we apply an additional CTC loss in the non-similarity layers to optimize the ensemble loss. Finally, we study the weighting ratio of the fusion technique in the multimodal decoder. Experimental results show that the proposed method is promising for low-resource ST, with improvements of up to +3.6 BLEU points over baseline low-resource ST models.
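The weighting ratio of the fusion technique studied in this abstract can be illustrated with a minimal sketch: two decoders (one speech-side, one MT-pretrained text-side) each produce a next-token distribution over the same target vocabulary, and the distributions are linearly interpolated. The function name, the fixed weight `alpha`, and the toy vocabulary are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_decoder_probs(p_speech, p_text, alpha=0.7):
    """Weighted fusion of two decoders' next-token distributions.

    alpha weights the speech-translation decoder; (1 - alpha) weights
    the MT-pretrained text decoder. Both inputs are probability
    vectors over the same target vocabulary.
    """
    p_speech = np.asarray(p_speech, dtype=float)
    p_text = np.asarray(p_text, dtype=float)
    fused = alpha * p_speech + (1.0 - alpha) * p_text
    return fused / fused.sum()  # renormalize for numerical safety

# Toy 4-token vocabulary: the two decoders disagree on the best token.
p_st = [0.6, 0.2, 0.1, 0.1]   # speech decoder prefers token 0
p_mt = [0.1, 0.7, 0.1, 0.1]   # MT decoder prefers token 1
fused = fuse_decoder_probs(p_st, p_mt, alpha=0.5)
```

With `alpha=0.5` the fused distribution is `[0.35, 0.45, 0.10, 0.10]`, so the MT decoder's stronger preference wins; sweeping `alpha` on a validation set is the natural way to pick the ratio.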


Author(s):  
Ryo Fukuda ◽  
Sashi Novitasari ◽  
Yui Oka ◽  
Yasumasa Kano ◽  
Yuki Yano ◽  
...  

Author(s):  
Kak Soky ◽  
Masato Mimura ◽  
Tatsuya Kawahara ◽  
Sheng Li ◽  
Chenchen Ding ◽  
...  
Keyword(s):  

2021 ◽  
Author(s):  
Gareth Morlais

When you're making plans to get people using your language as much and as often as possible, there's a list of things related to Wikipedia which can really help. I'll share our experience with the Welsh language. Supporting the Welsh-language Wikipedia community forms Work Package 15 of 27 in the Welsh Government's Welsh Language Technology Action Plan https://gov.wales/sites/default/files/publications/2018-12/welsh-language-technology-and-digital-media-action-plan.pdf. We like supporting Welsh-language Wikipedia editing workshops, video workshops and other channels that encourage people to create and publish Welsh-language video, audio, graphic and text content, because we're on a mission to help double the daily use of Welsh by 2050. I'll share developments we're funding in speech, translation and conversational AI. The partners we fund publish what they develop under an open licence, so we can share what we've funded with many companies; we think Microsoft might have used some of it to make their new synthetic Welsh voices. We're excited by the potential Wikidata offers, and we'll look at using it to populate Welsh-language maps this year. We've already used Wikipedia search data to prioritise the training of a Welsh virtual assistant. Welsh may not spend as much as Icelandic and Estonian do on language technologies, but we'd like to share what we're learning, as a smaller language, about the important areas to focus on and how Wikipedia can help.


2021 ◽  
Vol 4 (9) ◽  
pp. 166-178
Author(s):  
Adekunle Lawal

Language differences between parents and teachers, if not carefully managed, can cause miscommunication or communication gaps that hinder both the school's and the students' progress. This paper explores ways of translating real-time conversations between teachers and parents who speak different languages. Fourteen K-12 teachers in the United States were surveyed, and nine were interviewed, to determine how English-speaking teachers can communicate effectively with non-English-speaking parents. The findings suggest using Microsoft Translator's speech-translation conversation feature to break the language barrier, bridge communication gaps, and promote effective bi-/multilingual parent-teacher conferences.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Zhuo Chen

The signal corresponding to English speech contains substantial redundant information and environmental interference, which introduces distortion during English speech translation signal recognition. For this reason, many studies focus on encoding and processing English speech to achieve high-precision recognition. The traditional wavelet denoising algorithm performs well on English speech translation signals, owing to the excellent local time-frequency characteristics of the wavelet transform, but threshold selection remains difficult and recognition accuracy is easily degraded. This paper therefore improves the traditional wavelet denoising algorithm: it abandons the single-threshold judgment of the original algorithm and instead combines soft and hard thresholds. This alleviates the distortion problem in English speech translation signal recognition, improves the signal-to-noise ratio, and further reduces the root mean square error of the signal, yielding a good noise reduction effect and higher recognition accuracy. In experiments, the algorithm was compared with the traditional algorithm in MATLAB simulation. The simulation results are consistent with the theoretical analysis, and the proposed algorithm shows clear advantages in the recognition accuracy of English speech translation signals, demonstrating its superiority and practical value.


Author(s):  
Dikshita Patel ◽  
Minakshi Kudalkar ◽  
Shashank Gupta ◽  
Renuka Pawar
Keyword(s):  
