Domain Adaptation for Machine Translation with Instance Selection

2015 ◽  
Vol 103 (1) ◽  
pp. 5-20 ◽  
Author(s):  
Ergun Biçici

Abstract: Domain adaptation for machine translation (MT) can be achieved by selecting training instances close to the test set from a larger set of instances. We consider 7 different domain adaptation strategies and answer 7 research questions, which give us a recipe for domain adaptation in MT. We perform English to German statistical MT (SMT) experiments in a setting where test and training sentences can come from different corpora, and one of our goals is to learn the parameters of the sampling process. Domain adaptation with training instance selection can obtain a 22% increase in target 2-gram recall and can gain up to 3.55 BLEU points compared with random selection. Domain adaptation with the feature decay algorithm (FDA) not only achieves the highest target 2-gram recall and BLEU performance but also learns the test sample distribution parameter almost perfectly, with correlation 0.99. Moses SMT systems built with 10K FDA-selected training sentences are able to obtain F1 results as good as the baselines that use up to 2M sentences. Moses SMT systems built with 50K FDA-selected training sentences are able to obtain better F1 results than the baselines.
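The selection idea in this abstract can be illustrated with a minimal sketch. It is not the author's FDA implementation: the initial feature weight of 1.0, the multiplicative `decay` factor, the length normalization, and the function names `select_instances` and `bigram_recall` are all assumptions made for illustration. The sketch greedily picks training sentences whose n-grams overlap the test set, lowers the weight of n-grams that are already covered, and then measures 2-gram recall of the selection against the test sentences (the abstract reports recall on the target side; the helper here works on whichever side it is given).

```python
def ngrams(tokens, n_max=2):
    """All 1..n_max-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def select_instances(test_sents, train_sents, k, decay=0.5):
    """Greedy, decay-based selection of k training sentence indices.

    Assumption (not from the source): every test-set n-gram starts with
    weight 1.0, and its weight is multiplied by `decay` each time a
    selected sentence covers it.
    """
    weights = {f: 1.0 for s in test_sents for f in ngrams(s.split())}
    remaining = list(range(len(train_sents)))
    chosen = []

    def score(i):
        # Sum of current weights of the sentence's distinct n-grams,
        # normalized by sentence length so long sentences are not favored.
        toks = train_sents[i].split()
        feats = set(ngrams(toks))
        return sum(weights.get(f, 0.0) for f in feats) / max(len(toks), 1)

    while remaining and len(chosen) < k:
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
        # Decay the features just covered so later picks add new coverage.
        for f in set(ngrams(train_sents[best].split())):
            if f in weights:
                weights[f] *= decay
    return chosen


def bigram_recall(selected_sents, test_sents):
    """Fraction of distinct test-set 2-grams that appear in the selection."""
    def bigrams(sents):
        return {tuple(t[i:i + 2])
                for t in (s.split() for s in sents)
                for i in range(len(t) - 1)}

    test = bigrams(test_sents)
    return len(test & bigrams(selected_sents)) / len(test) if test else 0.0


test = ["the cat sat"]
train = ["the cat sat down", "dogs bark loudly", "the cat"]
picked = select_instances(test, train, k=2)
recall = bigram_recall([train[i] for i in picked], test)
```

In this toy run the unrelated sentence is never picked, because it shares no n-gram with the test set and so scores zero throughout.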

Author(s):  
Marta R. Costa-jussà

In recent years, deep learning algorithms have revolutionized several areas, including speech, image and natural language processing. The specific field of Machine Translation (MT) has not remained unaffected. Integration of deep learning in MT ranges from re-modeling existing features in standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This extended abstract focuses on describing the foundational works on the neural MT approach, mentioning its strengths and weaknesses, and including an analysis of the corresponding challenges and future work. The full manuscript [Costa-jussà, 2018] describes, in addition, how these neural networks have been integrated to enhance different aspects and models of statistical MT, including language modeling, word alignment, translation, reordering, and rescoring. It also describes the new neural MT approach together with recent approaches using subwords, characters and multilingual training, among others.


2018 ◽  
Vol 61 ◽  
pp. 947-974 ◽  
Author(s):  
Marta R. Costa-jussà

In recent years, deep learning algorithms have revolutionized several areas, including speech, image and natural language processing. The specific field of Machine Translation (MT) has not remained unaffected. Integration of deep learning in MT ranges from re-modeling existing features in standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This manuscript focuses on describing how these neural networks have been integrated to enhance different aspects and models of statistical MT, including language modeling, word alignment, translation, reordering, and rescoring. Then, we report the new neural MT approach together with a description of the foundational related works and recent approaches using subwords, characters and multilingual training, among others. Finally, we include an analysis of the corresponding challenges and future work in using deep learning in MT.


2021 ◽  
Vol 11 (5) ◽  
pp. 603
Author(s):  
Chunlei Shi ◽  
Xianwei Xin ◽  
Jiacai Zhang

Machine learning methods are widely used in autism spectrum disorder (ASD) diagnosis. Due to the lack of labelled ASD data, multisite data are often pooled together to expand the sample size. However, the heterogeneity that exists among different sites degrades the performance of machine learning models. Herein, three-way decision theory is introduced into unsupervised domain adaptation for the first time and applied to optimize the pseudolabels of the target domain/site from functional magnetic resonance imaging (fMRI) features related to ASD patients. The experimental results using multisite fMRI data show that our method not only narrows the gap of the sample distribution among domains but is also superior to the state-of-the-art domain adaptation methods in ASD recognition. Specifically, the ASD recognition accuracy of the proposed method is improved on all the six tasks, by 70.80%, 75.41%, 69.91%, 72.13%, 71.01% and 68.85%, respectively, compared with the existing methods.
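As a rough illustration of how a three-way decision could be applied to pseudolabels, the sketch below splits target-domain samples into an acceptance region, a rejection region, and a deferred (non-commitment) region using two thresholds. The abstract does not say how the authors compute or use these regions, so the thresholds `alpha` and `beta`, the use of a single positive-class probability, and the function name `three_way_pseudolabels` are all assumptions.

```python
def three_way_pseudolabels(probs, alpha=0.7, beta=0.3):
    """Split target samples into positive / negative / deferred index lists.

    probs: predicted probability of the positive class for each target sample.
    alpha, beta: hypothetical acceptance and rejection thresholds (beta < alpha).
    """
    if not beta < alpha:
        raise ValueError("beta must be smaller than alpha")
    positive, negative, deferred = [], [], []
    for i, p in enumerate(probs):
        if p >= alpha:
            positive.append(i)   # confident enough to pseudolabel as positive
        elif p <= beta:
            negative.append(i)   # confident enough to pseudolabel as negative
        else:
            deferred.append(i)   # boundary region: decision postponed
    return positive, negative, deferred


pos, neg, later = three_way_pseudolabels([0.9, 0.1, 0.5, 0.7, 0.3])
```

Only the samples in the first two lists would receive a pseudolabel in this sketch; the deferred ones would be revisited once the model has been updated.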


2021 ◽  
Author(s):  
Hoang Trung Chinh ◽  
Nguyen Hong Buu Long ◽  
Luong An Vinh

Author(s):  
Gainiya Tazhina ◽  
Alessandro Figus ◽  
Ramón Bouzas-Lorenzo ◽  
Diana Spulber

The DeSTT concept of teacher training for leadership examines the importance of non-formal education, i.e., training for teachers. The monitoring study revealed that Kazakhstani teachers urgently need training in leadership skills. The paper analyzes two sets of research questions (each consisting of 6 sub-questions), which we defined as follows: 1) What are the challenges of teacher training/upskilling for leadership and their involvement in the local community? This group of questions was studied at the stage of preparing the project proposal. 2) What are the impacts of DeSTT training on teachers' leadership skills and experiences? This group of questions was studied during the 2nd year of the project lifetime. The purpose of this paper is to present the findings and the implementation of the concept of preparing teachers for leadership, from the project proposal launch to the execution of pilot trainings. The research methods employed for the project proposal are interviews with university specialists and analyses of State data/reports. Observations of training participants and post-training interviews were used to study the 2nd group of research questions. The findings of the study confirm the data obtained in both groups of interviews and observations. Participants were enthusiastic about and interested in the training sessions, and aware of and confident in the need to continue learning, share experiences, and develop the leadership skills gained in DeSTT training. The reflection on the central terms of leadership and training has proved to be crucial for teachers. Further research is to survey the implications of the DeSTT project for all its consumers. The dissemination and sustainability perspective of the project is to collaborate with the National Center ORLEU in training leadership skills for the instructors from 17 regional branches who, in turn, will train teachers for leadership.
The authors acknowledge the Erasmus Plus CBHE for funding the DeSTT project.


2017 ◽  
Author(s):  
Rui Wang ◽  
Masao Utiyama ◽  
Lemao Liu ◽  
Kehai Chen ◽  
Eiichiro Sumita
