A Set of Recommendations for Assessing Human–Machine Parity in Language Translation

2020 ◽  
Vol 67 ◽  
Author(s):  
Samuel Läubli ◽  
Sheila Castilho ◽  
Graham Neubig ◽  
Rico Sennrich ◽  
Qinlan Shen ◽  
...  

The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations. We reassess Hassan et al.'s 2018 investigation into Chinese to English news translation, showing that the finding of human–machine parity was owed to weaknesses in the evaluation design—which is currently considered best practice in the field. We show that the professional human translations contained significantly fewer errors, and that perceived quality in human evaluation depends on the choice of raters, the availability of linguistic context, and the creation of reference translations. Our results call for revisiting current best practices to assess strong machine translation systems in general and human–machine parity in particular, for which we offer a set of recommendations based on our empirical findings.  
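Such parity assessments come down to paired human judgments of machine and human translations of the same documents. As a purely illustrative sketch in Python (the ratings, scale, and sample size below are hypothetical, and the study itself uses a more careful protocol with professional raters and document-level context), a paired significance test can check whether a rater group reliably prefers the professional translation:

```python
import numpy as np
from scipy import stats

# Hypothetical paired adequacy ratings (1-6 scale) from one rater group,
# for the same source documents: professional human translation (ht)
# versus machine translation (mt).
ht = np.array([5, 6, 5, 4, 6, 5, 5, 6, 4, 5])
mt = np.array([4, 5, 5, 4, 5, 4, 6, 5, 4, 4])

# Wilcoxon signed-rank test on the paired differences: a significant
# result rejects human-machine parity for this rater group. Repeat per
# rater group and per context condition (sentence vs. document level).
stat, p = stats.wilcoxon(ht, mt)
print(f"W={stat:.1f}, p={p:.3f}")
```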

Machine Translation (MT) is the process of converting text from one language (the source language) to another (the target language). MT draws on ideas from linguistics, computer science, artificial intelligence, sociology, psychology, and other fields. A linguistically rich country like India needs a full-fledged MT system to convert text across its many languages. Although MT has been researched for the past 60 years, it is still considered a challenging task, and building a fully automatic MT system remains extremely difficult. This paper surveys various approaches to MT systems for Indian languages and discusses the advantages and limitations of some important Dravidian-language translation systems built with MT techniques.


Symmetry ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 9
Author(s):  
John H. Graham

Best practices in studies of developmental instability, as measured by fluctuating asymmetry, have developed over the past 60 years. Unfortunately, they are haphazardly applied in many of the papers submitted for review. Most often, research designs suffer from lack of randomization, inadequate replication, poor attention to size scaling, lack of attention to measurement error, and unrecognized mixtures of additive and multiplicative errors. Here, I summarize a set of best practices, especially in studies that examine the effects of environmental stress on fluctuating asymmetry.
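To make the size-scaling and error-structure cautions concrete, here is a small illustrative sketch (hypothetical measurements, not data from any study) of two common fluctuating-asymmetry indices and a crude check on measurement error from repeated measurements; when errors are multiplicative, the analysis belongs on a log scale instead:

```python
import numpy as np

# Hypothetical right/left trait measurements (mm), two repeated
# measurements per side, as recommended for estimating measurement error.
right = np.array([[10.1, 10.2], [12.3, 12.4], [9.8, 9.7]])
left = np.array([[10.0, 10.1], [12.0, 12.1], [9.9, 9.9]])
r_mean, l_mean = right.mean(axis=1), left.mean(axis=1)

# FA1: raw |R - L|, appropriate when error is additive and
# asymmetry does not scale with trait size.
fa1 = np.abs(r_mean - l_mean)

# FA2: size-scaled |R - L| / ((R + L) / 2), one common correction when
# asymmetry grows with trait size; |log R - log L| handles the
# multiplicative case.
fa2 = fa1 / ((r_mean + l_mean) / 2)

# Within-side spread across repeats must be small relative to the
# between-sides signal, or FA estimates are dominated by measurement error.
measurement_sd = np.concatenate([right, left]).std(axis=1, ddof=1).mean()
print(fa1, fa2, measurement_sd)
```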


Dermatology ◽  
2021 ◽  
pp. 1-10
Author(s):  
Yaron Har-Shai ◽  
Lior Har-Shai ◽  
Viktor A. Zouboulis ◽  
Christos C. Zouboulis

Background: Auricular keloids are among the most perplexing medical conditions and have a significant psychosocial impact on the patient's body image and quality of life. Summary: This article aims to provide dermatologists and plastic surgeons with the best proven practice for using intralesional cryosurgery to treat the different auricular keloid types, in order to obtain superior clinical results by minimizing the probability of recurrence. Over the past 20 years, the authors have developed novel procedures to increase the effectiveness of intralesional cryosurgery on auricular keloids, including hydrodissection, the warm gauze technique, and excision of dangling skin. Long-lasting clinical results, with a low recurrence rate and a satisfactory aesthetic outcome, are achieved with no deformation of the ear framework.


2011 ◽  
Vol 20 (01) ◽  
pp. 146-155
Author(s):  
A. V. Alekseyenko ◽  
Y. Aphinyanaphongs ◽  
S. Brown ◽  
D. Fenyo ◽  
L. Fu ◽  
...  

Summary. Objectives: To survey major developments and trends in the field of bioinformatics in 2010 and their relationships to those of previous years, with emphasis on long-term trends, on best practices, on the quality of the science of informatics, and on the quality of science as a function of informatics. Methods: A critical review of articles in the bioinformatics literature over the past year. Results: Our main results suggest that bioinformatics continues to be a major catalyst for progress in biology and translational medicine, as a consequence of new assaying technologies, most prominently next-generation sequencing, which are changing the landscape of modern biological and medical research. These assays critically depend on bioinformatics and have led to rapid growth in the corresponding informatics methods. Clinical-grade molecular signatures are proliferating at a rapid rate. However, a highly publicized incident at a prominent university showed that deficiencies in informatics methods can have catastrophic consequences for important scientific projects. Developing evidence-driven protocols and best practices is greatly needed, given how serious the implications are for the quality of translational and basic science. Conclusions: Several exciting new methods have appeared over the past 18 months that open new roads for progress in bioinformatics methods and their impact on biomedicine. At the same time, the range of open problems of great significance is extensive, ensuring the vitality of the field for many years to come.


Author(s):  
A.V. Kozina ◽  
Yu.S. Belov

Automatically assessing the quality of machine translation is an important yet challenging task in machine translation research. Translation quality estimation is understood as predicting the quality of a translation without access to a reference translation. Translation quality depends on the specific machine translation system and often requires post-editing, and manual editing is a long and expensive process. Since the need to determine translation quality quickly is increasing, automation is required. In this paper, we propose a quality assessment method based on ensembles of supervised machine learning methods. The bilingual WMT 2019 corpus for the English-Russian language pair was used as data. The dataset comprises 17,089 sentences; 85% of the data was used for training and 15% for testing the model. Linguistic features extracted from the text in the source and target languages were used to train the system, since these characteristics most accurately characterize the translation in terms of quality. The following tools were used for feature extraction: a free language modeling tool based on SRILM and the Stanford POS Tagger. The text was preprocessed before training. The model was trained using three regression methods: Bagging, Extra Trees, and Random Forest. The algorithms were implemented in Python using the scikit-learn library, and the parameters of the Random Forest method were optimized using a grid search. The performance of the model was assessed by the mean absolute error (MAE) and the root mean square error (RMSE), as well as by the Pearson coefficient, which measures correlation with human judgment. Testing was carried out using three machine translation systems: the Google and Bing neural systems, and the phrase-based and syntax-based Moses statistical machine translation systems. The Extra Trees method performed best, and for all categories of indicators considered, the best results were achieved with the Google machine translation system. The developed method showed good results, close to human judgment, and the system can be used for further research on the task of assessing translation quality.
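The pipeline described above maps directly onto scikit-learn. The sketch below substitutes random stand-in features and labels (the real features would come from the SRILM-based language model and the Stanford POS Tagger, as described), but the 85/15 split, the three ensemble regressors, the grid search over Random Forest parameters, and the MAE/RMSE/Pearson evaluation mirror the text:

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor, RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in data: one row per sentence pair, columns are linguistic
# features; y is the quality label. Real features would be extracted
# with SRILM and the Stanford POS Tagger.
rng = np.random.default_rng(0)
X = rng.normal(size=(17089, 17))
y = rng.normal(size=17089)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.15, random_state=0)

models = {
    "Bagging": BaggingRegressor(random_state=0),
    "Extra Trees": ExtraTreesRegressor(random_state=0),
    # Random Forest hyperparameters tuned with a grid search, as in the paper
    # (the grid values here are illustrative).
    "Random Forest": GridSearchCV(
        RandomForestRegressor(random_state=0),
        {"n_estimators": [100, 300], "max_depth": [None, 10]},
    ),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    mae = mean_absolute_error(y_te, pred)
    rmse = mean_squared_error(y_te, pred) ** 0.5
    r, _ = pearsonr(y_te, pred)  # correlation with (stand-in) human judgment
    print(f"{name}: MAE={mae:.3f} RMSE={rmse:.3f} Pearson={r:.3f}")
```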


2018 ◽  
Vol 34 (4) ◽  
pp. 752-771
Author(s):  
Chen-li Kuo

Abstract Statistical approaches have become the mainstream in machine translation (MT), for their potential in producing less rigid and more natural translations than rule-based approaches. However, on closer examination, the uses of function words between statistical machine-translated Chinese and the original Chinese are different, and such differences may be associated with translationese as discussed in translation studies. This article examines the distribution of Chinese function words in a comparable corpus consisting of MTs and the original Chinese texts extracted from Wikipedia. An attribute selection technique is used to investigate which types of function words are significant in discriminating between statistical machine-translated Chinese and the original texts. The results show that statistical MT overuses the most frequent function words, even when alternatives exist. To improve the quality of the end product, developers of MT should pay close attention to modelling Chinese conjunctions and adverbial function words. The results also suggest that machine-translated Chinese shares some characteristics with human-translated texts, including normalization and being influenced by the source language; however, machine-translated texts do not exhibit other characteristics of translationese such as explicitation.
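To illustrate the kind of attribute selection involved, the sketch below ranks hypothetical function-word frequency features by how well they separate machine-translated from original Chinese texts; the specific words, counts, and the chi-squared selector are stand-ins for the paper's corpus and its attribute-selection technique:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical per-document frequency counts of a few Chinese function
# words, e.g. columns for 的, 和, 并且, 也, 就 (stand-in values).
X = np.array([
    [40, 12, 6, 3, 2],   # machine-translated
    [38, 11, 7, 2, 3],   # machine-translated
    [25,  6, 1, 5, 6],   # original
    [27,  7, 2, 6, 5],   # original
])
y = np.array([1, 1, 0, 0])  # 1 = machine-translated, 0 = original

# Score each function-word feature by how strongly it discriminates the
# two classes, and keep the k most discriminative ones.
selector = SelectKBest(chi2, k=2).fit(X, y)
print(selector.scores_)        # higher score = more discriminative
print(selector.get_support())  # mask of the k selected function words
```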


2019 ◽  
Vol 28 (3) ◽  
pp. 455-464 ◽  
Author(s):  
M. Anand Kumar ◽  
B. Premjith ◽  
Shivkaran Singh ◽  
S. Rajendran ◽  
K. P. Soman

Abstract In recent years, multilingual content on the internet has grown exponentially along with the evolution of the internet itself. Regional-language users are excluded from much of this content because of the language barrier, so machine translation between languages is the only feasible way to make it available to them. Machine translation is the process of translating a text from one language to another. Machine translation has been investigated thoroughly for English and other European languages, but it is still at a nascent stage for Indian languages. This paper presents an overview of the Machine Translation in Indian Languages shared task conducted on September 7–8, 2017, at Amrita Vishwa Vidyapeetham, Coimbatore, India. The shared task focused on the development of English-Tamil, English-Hindi, English-Malayalam, and English-Punjabi language pairs, with the following objectives: (a) to examine state-of-the-art machine translation systems when translating from English to Indian languages; (b) to investigate the challenges faced in translating between English and Indian languages; (c) to create an open-source parallel corpus for Indian languages, which is currently lacking. Evaluating machine translation output is another challenging task, especially for Indian languages. In this shared task, we evaluated the participants' outputs with the help of human annotators. As far as we know, this is the first shared task that depends completely on human evaluation.


2018 ◽  
Vol 6 ◽  
pp. 145-157 ◽  
Author(s):  
Zaixiang Zheng ◽  
Hao Zhou ◽  
Shujian Huang ◽  
Lili Mou ◽  
Xinyu Dai ◽  
...  

Existing neural machine translation systems do not explicitly model what has been translated and what has not during the decoding phase. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated Past contents and untranslated Future contents, which are modeled by two additional recurrent layers. The Past and Future contents are fed to both the attention model and the decoder states, which provides Neural Machine Translation (NMT) systems with the knowledge of translated and untranslated contents. Experimental results show that the proposed approach significantly improves the performance in Chinese-English, German-English, and English-German translation tasks. Specifically, the proposed model outperforms the conventional coverage model in terms of both the translation quality and the alignment error rate.
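A minimal PyTorch sketch of the idea (an illustration under assumed dimensions and interfaces, not the authors' implementation) keeps two extra GRU states per target sentence: the Past state starts empty and accumulates the attended source context at each step, while the Future state starts from a summary of the full source and is updated as content is consumed. Both states are then available to the attention model and the decoder state:

```python
import torch
import torch.nn as nn

class PastFutureTracker(nn.Module):
    """Sketch of Past/Future content modeling with two extra recurrent layers."""

    def __init__(self, ctx_dim, state_dim):
        super().__init__()
        self.past_rnn = nn.GRUCell(ctx_dim, state_dim)    # grows with translated content
        self.future_rnn = nn.GRUCell(ctx_dim, state_dim)  # shrinks as content is translated

    def init_states(self, src_summary):
        # Past starts empty; Future starts from a summary of the whole source.
        return torch.zeros_like(src_summary), src_summary

    def step(self, attn_context, past, future):
        # At each decoding step, the attended source context conceptually
        # moves from the Future state into the Past state.
        return self.past_rnn(attn_context, past), self.future_rnn(attn_context, future)

# Usage at decoding time (dimensions are illustrative):
tracker = PastFutureTracker(ctx_dim=512, state_dim=512)
src_summary = torch.randn(8, 512)            # e.g., mean of encoder states
past, future = tracker.init_states(src_summary)
attn_context = torch.randn(8, 512)           # attention output at step t
past, future = tracker.step(attn_context, past, future)
# [decoder_state; past; future] would then feed attention and the output softmax.
```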


2018 ◽  
Vol 2 (6) ◽  
pp. 763-767
Author(s):  
Trevor Lane

The quality of the research record in the form of peer-reviewed journal archives is a reflection of not only the quality of the research publication and correction process, but also the quality of the underlying knowledge creation process. Key to the integrity of the research record are honesty and accountability from all parties involved in governing, performing, and publishing scholarly work. A concerted effort is needed to nurture an ethical research publishing culture by promoting ethical practice, relevant training, and effective systems for responding to allegations of research or publication misconduct. The Committee on Publication Ethics (COPE) is a membership organisation that aims to promote integrity in research publishing, for example, by developing and encouraging best practices to ensure that research is reported ethically, completely, and transparently. COPE uses the Principles of Transparency and Best Practice in Scholarly Publishing as part of its criteria when evaluating publishers and journals as members. Researchers can also make use of these guidelines to assess a journal's quality and to gain insights into what peer-reviewed journals expect from authors. The present article outlines and discusses these guidelines to help life science researchers publish ethically, as well as to identify ethical journals as readers, authors, and reviewers.


EDUTECH ◽  
2015 ◽  
Vol 14 (1) ◽  
pp. 52
Author(s):  
Rudi Susilana

Abstract. The implementation of the new 2013 Curriculum in schools began in July 2013. The implementation of the curriculum is expected to raise the quality of management and of the educational process at every unit of education, leading to improvements in the quality of learning and education. In connection with the application of the new curriculum, this research examines how elementary school teachers in Bandung City responded to the implementation of the 2013 Curriculum with regard to the planning, implementation, and assessment of the curriculum. It also studied the best practices worth emulating in the planning, implementation, and assessment of the curriculum as conducted by elementary school teachers in Bandung City. The results showed that the response of elementary school teachers to the implementation of the 2013 Curriculum in Bandung City was in the positive category: planning activities were in the very positive category, while implementation and assessment activities were in the positive category. Best practices worth emulating in the planning, implementation, and assessment carried out by the teachers in implementing the 2013 Curriculum are the sharing, hearing, in-house training, and real-teaching modeling activities conducted in the Teacher Working Group (KKG).

Keywords: 2013 Curriculum, teachers' response, best practice, curriculum implementation

