Cross-language forced alignment to assist community-based linguistics for low resource languages

2017 ◽  
Author(s):  
Timothy Kempton
Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 179 ◽  
Author(s):  
Chongchong Yu ◽  
Yunbing Chen ◽  
Yueqiao Li ◽  
Meng Kang ◽  
Shixuan Xu ◽  
...  

To rescue and preserve an endangered language, this paper studied an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. At the level of the international phonetic alphabet (IPA) labels, using a Chinese corpus as an extension of the Tujia corpus can effectively address the shortage of Tujia data; to this end, a cross-language corpus and an IPA dictionary unified between Chinese and Tujia were constructed. A convolutional neural network (CNN) and a bi-directional long short-term memory (BiLSTM) network were used to extract cross-language acoustic features and to train shared hidden-layer weights for the Tujia and Chinese phonetic corpora. In addition, automatic speech recognition for Tujia was realized using an end-to-end method that consists of symmetric encoding and decoding. Furthermore, transfer learning was used to establish the cross-language end-to-end Tujia recognition system. The experimental results showed that the recognition error rate of the proposed model is 46.19%, which is 2.11% lower than that of the model trained only on Tujia language data. Therefore, this approach is feasible and effective.
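The idea of a label dictionary unified across two languages can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon entries are invented placeholders rather than real Tujia or Chinese data, the function names are hypothetical, and reserving index 0 for padding is an assumption made for illustration.

```python
# Toy pronunciation lexicons: word -> list of IPA symbols.
# All entries are invented placeholders, not real Tujia or Chinese data.
lexicon_a = {"word_a1": ["p", "a"], "word_a2": ["t", "i", "a"]}
lexicon_b = {"word_b1": ["m", "a"], "word_b2": ["t", "u"]}


def build_shared_inventory(*lexicons):
    """Map every IPA symbol seen in any lexicon to one shared integer id."""
    symbols = sorted(
        {sym for lex in lexicons for pron in lex.values() for sym in pron}
    )
    # Ids start at 1; index 0 is left free for padding (an assumption here).
    return {sym: i + 1 for i, sym in enumerate(symbols)}


def encode(pronunciation, inventory):
    """Turn a list of IPA symbols into the shared integer label sequence."""
    return [inventory[sym] for sym in pronunciation]


inventory = build_shared_inventory(lexicon_a, lexicon_b)
labels_a = encode(lexicon_a["word_a2"], inventory)
labels_b = encode(lexicon_b["word_b2"], inventory)
```

Because both lexicons are encoded against the same inventory, a symbol such as "t" receives the same id in either language, which is what lets a single output layer be trained on both corpora.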


2012 ◽  
Vol 18 (Suppl 1) ◽  
pp. A26.1-A26
Author(s):  
SR Mashreky ◽  
A Rahman ◽  
L Svanström ◽  
MJ Linnan ◽  
S Shafinaz ◽  
...  

Burns ◽  
2011 ◽  
Vol 37 (5) ◽  
pp. 770-775 ◽  
Author(s):  
S.R. Mashreky ◽  
A. Rahman ◽  
L. Svanström ◽  
M.J. Linnan ◽  
S. Shafinaz ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8313
Author(s):  
Łukasz Lepak ◽  
Kacper Radzikowski ◽  
Robert Nowak ◽  
Karol J. Piczak

Models for keyword spotting in continuous recordings can significantly improve the experience of navigating vast libraries of audio recordings. In this paper, we describe the development of such a keyword spotting system detecting regions of interest in Polish call centre conversations. Unfortunately, in spite of recent advancements in automatic speech recognition systems, human-level transcription accuracy reported on English benchmarks does not reflect the performance achievable in low-resource languages, such as Polish. Therefore, in this work, we shift our focus from complete speech-to-text conversion to acoustic similarity matching in the hope of reducing the demand for data annotation. As our primary approach, we evaluate Siamese and prototypical neural networks trained on several datasets of English and Polish recordings. While we obtain usable results in English, our models’ performance remains unsatisfactory when applied to Polish speech, after both mono- and cross-lingual training. This performance gap shows that generalisation with limited training resources is a significant obstacle for actual deployments in low-resource languages. As a potential countermeasure, we implement a detector using audio embeddings generated with a generic pre-trained model provided by Google. It has a much more favourable profile when applied in a cross-lingual setup to detect Polish audio patterns. Nevertheless, despite these promising results, its performance on out-of-distribution data is still far from stellar. This would indicate that, in spite of the richness of internal representations created by more generic models, such speech embeddings are not entirely malleable to cross-language transfer.
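Acoustic similarity matching with class prototypes can be sketched as follows. This is a minimal illustration of the general prototypical-network classification rule (nearest class mean in embedding space), not the system described in the abstract: the two-dimensional embeddings are toy values standing in for the output of a trained audio encoder, and the labels and function names are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_prototypes(support):
    """One prototype per class: the mean of that class's support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Assign the query embedding to the class with the nearest prototype."""
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


# Toy support set: a few labelled embeddings per class (assumed values).
support = {
    "keyword": [[1.0, 0.0], [0.8, 0.2]],
    "background": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = build_prototypes(support)
prediction_a = classify([0.7, 0.3], prototypes)
prediction_b = classify([0.1, 0.6], prototypes)
```

Only a handful of labelled examples per class are needed to form the prototypes, which is why this family of methods is attractive when annotated data is scarce; the quality of the result then depends entirely on how well the encoder's embeddings transfer to the target language.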


Author(s):  
Lucia CORSINI ◽  
Clara B. ARANDA-JAN ◽  
Henderson Cassi ◽  
James MOULTRIE

Participatory design is a widely recognised approach in Design for Development projects. It supports collaborative, community-based practices and empowers users to take ownership. Despite the importance of participatory design in solving global challenges, the majority of research has focused on its application in the Global North. Recently, some studies have explored participatory design methods in low-resource settings. Still, there is a gap between the existence of these methods and designers being able to use them successfully, because of the complex realities they face in low-resource settings. Existing knowledge is fragmented and there is a lack of best practice guidance for practitioners using participatory design in low-resource settings. We address this problem by reporting the experiences of Simprints, a technology company based in the UK, providing biometric identification solutions in the Global South. Our study reveals key recommendations for participatory design in low-resource settings, providing useful insights for practitioners and design researchers.

