Automatic Generation of German Translation Candidates for SNOMED CT Textual Descriptions

Author(s):  
Andrea Prunotto ◽  
Stefan Schulz ◽  
Martin Boeker

We present an approach called MTP (multiple translation paths) that aims to assist human translation in SNOMED CT localisation projects using free, web-based machine translation tools. For a chosen target language, MTP generates a scored output of translation candidates (TCs) for each input concept. This paper describes the basic idea of MTP and the distribution of its output TCs, and discusses typical examples with German as the target language. The MTP approach capitalises on the combinatorial growth that results from combining input languages, support languages, and translation engines. We applied MTP to the SNOMED CT Starter Set, using Google Translator, DeepL and Systran, together with the four source languages English, Spanish, Swedish and French, and with Danish, Dutch, Norwegian, Italian, Portuguese, Polish and Russian as support languages. The descriptive assessment of TC variety, together with an analysis of typical results, is the focus of this paper. For each input concept, MTP defines translation paths (TPs) by combining input languages, support languages and translation engines, resulting in 91 translation results with various degrees of coincidence (cardinality). Most configurations produce an average number of TCs that indicates the same TC is often derived via different translation paths. Combinations of translation engines result in distributions with a higher number of distinct TCs per concept. We present work in progress on using machine translation (MT) for terminology translation, leveraging several free MT tools fed by different languages and language combinations. A first qualitative analysis was promising and supports our hypothesis that majority voting applied to many translation candidates yields higher-quality results than a single engine and input language.
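The majority-voting idea in this abstract can be illustrated with a minimal Python sketch. It assumes that each translation path is identified by a (source language, support language, engine) tuple and that a candidate's score is simply the share of paths that produced it. The function name `score_candidates`, the path keys and the sample strings are hypothetical and are not taken from the paper, whose actual scoring may differ.

```python
from collections import Counter


def score_candidates(path_outputs):
    """Group identical translation results and score each candidate.

    path_outputs maps a translation path (any hashable key) to the text
    that path produced. The score is the fraction of paths that yielded
    the candidate; this scoring rule is an assumption for illustration.
    """
    # Normalise whitespace so trivially different strings are merged.
    results = [text.strip() for text in path_outputs.values() if text.strip()]
    counts = Counter(results)
    total = sum(counts.values())
    # most_common() orders candidates by how many paths produced them.
    return [(cand, n, n / total) for cand, n in counts.most_common()]


# Invented toy data: three paths, two of which agree.
toy_paths = {
    ("en", None, "engine_a"): "Herzinfarkt",
    ("es", "da", "engine_b"): "Myokardinfarkt",
    ("sv", None, "engine_a"): "Herzinfarkt ",
}
for candidate, cardinality, share in score_candidates(toy_paths):
    print(candidate, cardinality, round(share, 2))
```

The count returned for each candidate plays the role of the cardinality mentioned above: the more paths converge on the same string, the higher it ranks.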

Author(s):  
Long Zhou ◽  
Jiajun Zhang ◽  
Chengqing Zong

Existing approaches to neural machine translation (NMT) generate the target language sequence token-by-token from left to right. However, this kind of unidirectional decoding framework cannot make full use of the target-side future contexts which can be produced in a right-to-left decoding direction, and thus suffers from the issue of unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. Then, we present an interactive decoding model in which left-to-right (right-to-left) generation not only depends on its previously generated outputs, but also relies on future contexts predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model achieves significant improvements over the strong Transformer model by 3.92, 1.49, and 1.04 BLEU points, respectively, and obtains state-of-the-art performance on Chinese-English and English-German translation tasks.


2014 ◽  
Vol 998-999 ◽  
pp. 1178-1181
Author(s):  
Nan Lu

This paper proposes a novel method to extract bilingual translation pairs from the web. Based on the observation that translation pairs tend to appear collectively on the web, a recursive process is used to extract high-quality translation pairs. First, the search engine is queried with some seed data and the returned pages are crawled. Then the Collective Translation Pair Block (CTPB), which contains the collective translation pairs, is identified using a heuristic evaluation method. After the CTPB has been identified, a PAT tree is employed to generate the extraction patterns automatically. A ranking SVM model is then used to re-rank these patterns based on the F measure. The top 10 patterns are adopted to extract the translation pairs with the help of surface patterns. Finally, to obtain high-quality translations, the extracted translation pairs are verified by an SVM classifier based on the translation relevance between the source and the target language.
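The pattern-ranking step can be sketched in Python as follows. This is a simplification under stated assumptions: it sorts patterns directly by their F measure against a small set of known pairs, whereas the paper trains a ranking SVM for this purpose. The names `f_measure` and `top_patterns` and the sample data are hypothetical.

```python
def f_measure(extracted, gold):
    """Balanced F measure (F1) of one pattern's extractions against known pairs."""
    extracted, gold = set(extracted), set(gold)
    true_pos = len(extracted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(extracted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def top_patterns(pattern_outputs, gold, k=10):
    """Return the k pattern names with the highest F measure.

    pattern_outputs maps a pattern name to the pairs it extracted.
    Plain sorting stands in for the ranking SVM used in the paper.
    """
    ranked = sorted(
        pattern_outputs,
        key=lambda name: f_measure(pattern_outputs[name], gold),
        reverse=True,
    )
    return ranked[:k]


# Invented toy data for illustration only.
gold_pairs = {("cat", "chat"), ("dog", "chien")}
outputs = {
    "pattern_a": [("cat", "chat"), ("dog", "chien"), ("sun", "lune")],
    "pattern_b": [("cat", "chat")],
}
print(top_patterns(outputs, gold_pairs, k=1))
```

With `k=10` this mirrors the abstract's choice of keeping the top 10 patterns before the final verification step.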


2017 ◽  
Vol 5 (1) ◽  
pp. 49
Author(s):  
Rizka Mahardika

Translation tools are commonly used for translating a text written in one language (source language) into another language (target language). They help translators handle large volumes of translation work efficiently. Three types of translation tools are studied in the article Machine Translation Tools: Tools of the Translator’s Trade, written by Peter Katsberg and published in 2012: Fully Automated Machine Translation (FAMT), Human Aided Machine Translation (HAMT) and Machine Aided Human Translation (MAHT). Katsberg analyzed how each translation tool works, the naturalness and appropriateness of its translations, and the compatibility of using it. In this digital era, translation tools are popular not only among translators but also among EFL learners, beginning with portable dictionaries such as Alfalink and expanding to more sophisticated translation tools such as Google Translate. Some novice learners use these translation tools to do their tasks without correcting the translation result, perhaps because they do not have enough background knowledge to evaluate it. Thus, it is better for learners to have a good mastery of basic English and to be trained to evaluate the results from translation tools. In other words, Human Aided Machine Translation may be the wise choice for completing translation tasks effectively and efficiently, particularly with regard to time management.


Author(s):  
Zhenpeng Chen ◽  
Sheng Shen ◽  
Ziniu Hu ◽  
Xuan Lu ◽  
Qiaozhu Mei ◽  
...  

Sentiment classification typically relies on a large amount of labeled data. In practice, the availability of labels is highly imbalanced among different languages. To tackle this problem, cross-lingual sentiment classification approaches aim to transfer knowledge learned from one language that has abundant labeled examples (i.e., the source language, usually English) to another language with fewer labels (i.e., the target language). The source and the target languages are usually bridged through off-the-shelf machine translation tools. Through such a channel, cross-language sentiment patterns can be successfully learned from English and transferred into the target languages. This approach, however, often fails to capture sentiment knowledge specific to the target language. In this paper, we employ emojis, which are widely available in many languages, as a new channel to learn both the cross-language and the language-specific sentiment patterns. We propose a novel representation learning method that uses emoji prediction as an instrument to learn respective sentiment-aware representations for each language. The learned representations are then integrated to facilitate cross-lingual sentiment classification.


Author(s):  
Mir Aadil ◽  
M. Asger

Machine translation is a combination of many complex sub-processes, and the quality of the results of each sub-process, executed in a well-defined sequence, determines the overall accuracy of the translation. The Statistical Machine Translation (SMT) approach considers each sentence in the target language as a possible translation of any source-language sentence. Each possibility is assigned a probability, and the sentence with the highest probability is treated as the best translation. SMT is the most favoured approach not only because of its good results for corpus-rich language pairs, but also because of the tools with which it has been enhanced over the past two and a half decades. The paper gives a brief introduction to SMT: its steps and the different tools available for each step.
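Selecting the highest-probability sentence can be illustrated with a small Python sketch. It assumes the standard noisy-channel decomposition, in which a candidate e for a source sentence f is scored by P(f|e) · P(e); the abstract itself does not spell out this decomposition. The function name and the toy probabilities are invented for illustration.

```python
import math


def best_translation(candidates, translation_model, language_model):
    """Pick the candidate e that maximises P(f|e) * P(e).

    translation_model[e] stands for P(f|e) for one fixed source sentence f,
    and language_model[e] stands for P(e). Both are toy lookup tables here.
    """
    def log_score(e):
        # Summing logs avoids underflow when probabilities are tiny.
        return math.log(translation_model[e]) + math.log(language_model[e])

    return max(candidates, key=log_score)


# Invented numbers: the fluent candidate wins despite a lower P(f|e).
tm = {"the house": 0.4, "house the": 0.5}
lm = {"the house": 0.3, "house the": 0.01}
print(best_translation(["the house", "house the"], tm, lm))
```

In a real SMT system the candidate set is far too large to enumerate, so a decoder searches it heuristically; the sketch only shows the selection criterion.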


Author(s):  
Herry Sujaini

Extended Word Similarity Based (EWSB) Clustering is a word clustering algorithm based on word similarity values computed from a corpus. One of the benefits of clustering with this algorithm is improved translation quality in statistical machine translation. Previous research showed that the EWSB algorithm could improve an Indonesian-English translator, where the algorithm was applied to Indonesian as the target language. This paper discusses the results of research using the EWSB algorithm on an Indonesian-to-Minang statistical machine translator, where the algorithm is applied to Minang as the target language. The research found that the EWSB algorithm is quite effective when used with Minang as the target language. The results of this study indicate that the EWSB algorithm can improve translation accuracy by 6.36%.


Proceedings ◽  
2020 ◽  
Vol 63 (1) ◽  
pp. 56
Author(s):  
Bianca Han

This paper reflects the technology-induced novelty of translation, which is perceived as a bridge between languages and cultures. We debate the extent to which the translation process maintains its specificity in the light of the new technology-enhanced working methods ensured by a large variety of Computer-Assisted Translation (CAT) and Machine Translation (MT) tools that aim to enhance the process, which includes the translation itself, the translator, the translation project manager, the linguist, the terminologist, the reviewer, and the client. This paper also hints at the topic from the perspective of the translation teacher, who needs to provide students with transversal competencies that are suitable for the digital area, supported by the ability to tackle Cloud-based translation tools, in view of Industry 4.0 requirements.


2014 ◽  
Vol 102 (1) ◽  
pp. 69-80 ◽  
Author(s):  
Torregrosa Daniel ◽  
Forcada Mikel L. ◽  
Pérez-Ortiz Juan Antonio

We present a web-based open-source tool for interactive translation prediction (ITP) and describe its underlying architecture. ITP systems assist human translators by making context-based computer-generated suggestions as they type. Most of the ITP systems in the literature are strongly coupled with a statistical machine translation system that is conveniently adapted to provide the suggestions. Our system, however, follows a resource-agnostic approach and suggestions are obtained from any unmodified black-box bilingual resource. This paper reviews our ITP method and describes the architecture of Forecat, a web tool, partly based on the recent technology of web components, that eases the use of our ITP approach in any web application requiring this kind of translation assistance. We also evaluate the performance of our method when using an unmodified Moses-based statistical machine translation system as the bilingual resource.
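A resource-agnostic suggestion loop of this kind might look like the following Python sketch. It is not Forecat's code. It assumes that the source sentence is split into short word sub-segments, that each sub-segment is sent to a black-box `translate` callable, and that suggestions are the translations starting with what the user has typed so far. All function names and the toy dictionary are hypothetical.

```python
def candidate_suggestions(source_sentence, translate, max_len=3):
    """Translate every sub-segment of up to max_len words with a black box."""
    words = source_sentence.split()
    suggestions = []
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length > len(words):
                break
            segment = " ".join(words[start:start + length])
            target = translate(segment)
            # Skip empty results and duplicates, keep first-seen order.
            if target and target not in suggestions:
                suggestions.append(target)
    return suggestions


def suggest(typed_prefix, suggestions, limit=4):
    """Offer the stored translations that start with the typed prefix."""
    prefix = typed_prefix.lower()
    return [s for s in suggestions if s.lower().startswith(prefix)][:limit]


# A toy dictionary stands in for the unmodified bilingual resource.
toy_resource = {"my": "mi", "house": "casa", "my house": "mi casa"}
pool = candidate_suggestions("my house", lambda seg: toy_resource.get(seg, ""))
print(suggest("m", pool))
```

Because `translate` is only ever called as an opaque function, the same loop would work with a machine translation system, a translation memory or a dictionary, which is the point of the black-box design.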


2014 ◽  
Vol 05 (01) ◽  
pp. 127-152 ◽  
Author(s):  
E. Sundvall ◽  
K.R. Gøeg ◽  
A.R. Højen

Summary: Inconsistent use of SNOMED CT concepts may reduce comparability of information in health information systems. Terminology implementation should be approached by common strategies for navigating and selecting proper concepts. This study aims to explore ways of illustrating common pathways and ancestors of particular sets of concepts, to support consistent use of SNOMED CT and also assess potential applications for such visualizations. The open source prototype presented is an interactive web-based re-implementation of the terminology visualization tool TermViz that provides an overview of concepts and their hierarchical relations. It provides terminological features such as interactively rearranging graphs, fetching more concept nodes, highlighting least common parents and shared pathways in merged graphs etc. Four teams of three to four people used the prototype to complete a terminology mapping task and then, in focus group interviews, discussed the user experience and potential future tool usage. Potential purposes discussed included SNOMED CT search and training, consistent selection of concepts and content management. The evaluation indicated that the tool may be useful in many contexts especially if integrated with existing systems, and that the graph layout needs further tuning and development.

Citation: Højen AR, Sundvall E, Gøeg KR. Methods and applications for visualization of SNOMED CT concept sets. Appl Clin Inf 2014; 5: 127–152. http://dx.doi.org/10.4338/ACI-2013-09-RA-0071


Author(s):  
Joshua Evans

Machine translation tools such as Google Translate are at best seen as useful approximators, rather than offering any literary potential. In this experiment and short methodological reflection, I use Google Translate to recursively translate Austrian poet Georg Trakl’s celebrated WWI poem, ‘Grodek’, between German and English, until the two versions stabilise. I am attentive to places in which the poem and its renderings are simplified and/or literary value may be lost, but also places in which new or unexpected renderings emerge. This is a preliminary foray, but I propose that the method of recursive machine translation offers a new way to explore the translation of literary texts—a timely proposal, given the increasing applications of computer programmes and machine learning both within the humanities and throughout wider literary culture.
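The recursive procedure described here amounts to iterating a round trip until it reaches a fixed point. The Python sketch below shows that loop under stated assumptions: `forward` and `backward` are placeholder callables for the two translation directions (no real translation service is called), and "stable" is taken to mean that translating back returns the same text that was just translated. The function name and toy lookup tables are hypothetical.

```python
def recursive_translate(text, forward, backward, max_rounds=50):
    """Translate back and forth until the pair of versions stops changing.

    Returns (source_version, target_version, rounds_used). Raises if no
    stable pair is found, for example when the versions cycle endlessly.
    """
    source = text
    for round_no in range(1, max_rounds + 1):
        target = forward(source)
        source_again = backward(target)
        if source_again == source:
            # The round trip reproduces its input: the versions are stable.
            return source, target, round_no
        source = source_again
    raise RuntimeError("no stable pair within max_rounds")


# Toy lookup tables stand in for the two translation directions.
to_target = {"a": "X", "b": "Y"}
to_source = {"X": "b", "Y": "b"}
print(recursive_translate("a", to_target.get, to_source.get))
```

Recording each intermediate pair instead of only the final one would make it possible to inspect where renderings are simplified or where unexpected ones emerge.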

