Translation description for assessment and post-editing

Target ◽  
2018 ◽  
Vol 30 (1) ◽  
pp. 112-136
Author(s):  
Noelia Ramón ◽  
Camino Gutiérrez-Lanza

Abstract This paper presents a corpus-based descriptive research procedure for the identification of significant divergences between original Spanish and Spanish translated from English. When considering the language pair English-Spanish, personal pronouns seem to be good markers of significant differences (anchor phenomena), since they must obligatorily occur in English, but not in Spanish. To test this hypothesis, empirical data have been extracted from a large reference corpus in Spanish (CREA) and from an English-Spanish parallel corpus (P-ACTRES), in both cases from the fiction subcorpora. Statistically significant differences have been found in some of the uses of personal pronouns, having textual and pragmatic implications in the target texts. The aim is to use the results obtained in the case of personal pronouns, together with results from other linguistic areas, to build a semi-automated tool for the post-editing of Spanish translations of texts written originally in English.
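The comparison of pronoun frequencies between original and translated Spanish can be illustrated with a short sketch. The abstract does not say which significance test was applied; the two-corpus log-likelihood statistic (G2) used here is one common choice in corpus linguistics, and the function name and all counts are hypothetical.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for the frequency of one item.

    freq_a, freq_b: occurrences of the item in corpus A and corpus B.
    size_a, size_b: total number of tokens in each corpus.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2


# Chi-square critical value for 1 degree of freedom at p < 0.05.
CRITICAL_P05 = 3.84

# Hypothetical counts of a subject pronoun in original vs. translated fiction.
g2 = log_likelihood(1200, 1_000_000, 1500, 1_000_000)
print(f"G2 = {g2:.2f}, significant at p < 0.05: {g2 > CRITICAL_P05}")
```

A G2 value above the critical threshold suggests that the difference in relative frequency is unlikely to be due to chance, which is the kind of evidence that would mark a pronoun use as a candidate anchor phenomenon.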

2016 ◽  
Vol 1 (1) ◽  
pp. 45-49
Author(s):  
Avinash Singh ◽  
Asmeet Kour ◽  
Shubhnandan S. Jamwal

The objective of this paper is to analyze translation based on an English-Dogri parallel corpus. Machine translation is the automatic translation of text from one language into another, and it is one of the biggest applications of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% in translating English to Dogri and 87% in translating Dogri to English.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Ikhwanuddin Ikhwanuddin ◽  
Muhammad Farid ◽  
Ahmad Syafi'i

Unsuri has a religious character (Aswaja culture), like a public campus characterized by “Islam”. This character serves as one of the means of seeking support and cooperation from the community. Admittedly or not, Islamic educational institutions and universities in several regions have until now felt that the community often dominates private universities. Unsuri has felt this as well: most of the people living around Unsuri do not immediately choose it as their campus. Based on this phenomenon, the author is interested in the initial state of the strategy of the new student admission committee, or team, of Sunan Giri University Surabaya, and in identifying the committee's strategies and programs. This is descriptive qualitative research, in which the data collected take the form of words and images, not numbers. According to Bogdan and Taylor, as quoted by Lexy J. Moleong, qualitative research is a research procedure that produces descriptive data in the form of written or spoken words from people and observed behavior. Descriptive research, meanwhile, is a form of research intended to describe existing phenomena, both natural and man-made.


1999 ◽  
Vol 2 (2) ◽  
pp. 211-229 ◽  
Author(s):  
Tony McEnery ◽  
Richard Xiao

This paper uses an English-Chinese parallel corpus, an L1 Chinese comparable corpus, and an L1 Chinese reference corpus to examine how aspectual meanings in English are translated into Chinese and explore the effects of domains, text types and translation on aspect marking. We will show that while English and Chinese both mark aspect grammatically, the aspect system in the two languages differs considerably. Even though Chinese, as an aspect language, is rich in aspect markers, covert marking (LVM) is a frequent and important strategy in Chinese discourse. The distribution of aspect markers varies significantly across domain and text type. The study also sheds new light on the translation effect by contrasting aspect marking in translated Chinese texts and L1 Chinese texts.


Author(s):  
Łukasz Grabowski ◽  
Nicholas Groom

Abstract This study uses both parallel and comparable reference corpora in the English-Polish language pair to explore how translators deal with recurrent multi-word items performing specific discoursal functions. We also consider whether the observed tendencies overlap with those found in native texts, and the extent to which the discoursal functions realised by the multi-word items under scrutiny are “preserved” in translation. Capitalizing on findings from earlier research (Granger, 2014; Grabar & Lefer, 2015), we analyzed a pre-selected set of phrases signaling stance-taking and those functioning as textual, discourse-structuring devices originally found in the European Parliament proceedings corpus (Koehn, 2005) and included in the English-Polish parallel corpus Paralela (Pęzik, 2016). Since our goal was to explore whether and to what extent English functionally-defined phrases reflect the same level of formulaicity and regularity in both Polish translations and native Polish texts, the findings provided insights into the translation tendencies of such items, and revealed – using inter-rater agreement metrics – that the discoursal functions of recurrent n-grams may change in translation.
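The inter-rater agreement step mentioned above can be sketched as follows. The abstract does not name the metric; Cohen's kappa for two annotators is assumed here as one widely used option, and the labels and function name are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical functional labels assigned to four translated n-grams.
rater_1 = ["stance", "stance", "textual", "textual"]
rater_2 = ["stance", "textual", "textual", "textual"]
print(cohens_kappa(rater_1, rater_2))
```

Kappa corrects raw agreement for chance, so a low value on translated items would indicate that annotators disagree about whether a phrase keeps its discoursal function in translation.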


1999 ◽  
Vol 2 (1) ◽  
pp. 115-130 ◽  
Author(s):  
Diana Santos ◽  
Signe Oksefjell

This paper examines the results from two corpus-based contrastive studies. Both studies offer cross-linguistic claims about the language pair English-Portuguese. We attempt to replicate the studies and check the findings against a different corpus, viz. the English-Portuguese part of the English-Norwegian Parallel Corpus, to see whether the regularities observed in the original corpora can be confirmed. After a brief presentation of each study, we describe how we gathered equivalent data, present our findings in the new corpus, and discuss some possible reasons for discrepancies in relation to the earlier studies. The topics investigated are boundary-crossing movement descriptions (after Slobin 1997) and perception verbs (after Santos 1998).


Author(s):  
Edyta Więcławska

Abstract Context is a notion that is commonly invoked in many linguistic studies, either with very general reference or, more specifically, in the light of one of a number of research approaches which assign distinct definitions to context, ranging from factors that can be recovered from a text, through social parameters serving as an index for the appropriation of discursive performance, to factors that bring texts into being and give them meaning. This exploratory and descriptive research problematises the notion of context specifically on the grounds of English/Polish translation of corporate documentation processed in company registration proceedings, touching upon factors that are presumed to be discursively relevant in this communicative situation. The study is conducted from the perspective of the sociocultural approach and it adopts the parallel corpus methodology. The author discusses the concept of context on the ground of legal communication and secondarily presents a corpus-based description of the context categories that are idiosyncratic and potentially discursively relevant for the said communicative situation in the cross-linguistic perspective. The contextual variation is tested for its capacity to affect translation performance. The results reveal specific tendencies as regards the distribution patterns in the values corresponding to the investigated context categories. They point to some divergencies in translation output caused by the source text variantivity and they pave the way and directions for further research. Already at this stage the findings may have significant pedagogical value and they constitute a solid starting point for sociolinguistic research on discourse variantivity.


2006 ◽  
Vol 51 (2) ◽  
pp. 343-367 ◽  
Author(s):  
Ho-Jeong Cheong

Abstract This paper contradicts the prevailing assumptions among the advocates of translation universals (TU’s) that explicitation, a translation behavior which consists of spelling things out rather than leaving them implicit in translation, is a potential TU, irrespective of the specific language pairs involved in the process of translation. Specifically, via a study employing a newly built 517,609-word parallel corpus, it is shown that implicitation and the subsequent TT contraction as well as explicitation and TT expansion entailed were both observed in translations involving Korean and English. The significance of the direction of language combinations in translations employing the same language pair was identified, together with the introduction and verification of the validity of the four measurement units devised for this study to capture diverse aspects of explicitation/implicitation which in turn entail TT expansion/contraction.


2020 ◽  
Vol 10 (2) ◽  
Author(s):  
Fashihah Fashihah ◽  
Abd Qohar

Representation is the interpretation of students' thinking about a problem in the form of words (verbal), text, tables, graphs, or mathematical symbols as an aid to solving problems. Representation is an important process standard in mathematics learning. There are 3 representation process standards in learning. One of the materials that requires the ability of representation at junior high school level is relation. This study aimed at describing the mathematical representation process standard in learning mathematics in relation material. This study used a qualitative approach with a descriptive research design. The research procedures in this article were (1) designing learning instruments, (2) implementing learning design in peer teaching, and (3) conducting learning analysis related to the abilities. The results of the study showed that the first process standard was seen in apperception activity, Activity 1 on LKS (student worksheet), Activity 2 on LKS (student worksheet), group presentations, and exercises. The second process standard was seen prominently in the exercise activities, while the third process standard was seen prominently in Activity 2 on LKS (student worksheet). Based on the results of the study, it can be concluded that the three process standards have been seen in almost all learning activities.


2021 ◽  
Vol 14 (2) ◽  
pp. 494-508
Author(s):  
Francina Sole-Mauri ◽  
Pilar Sánchez-Gijón ◽  
Antoni Oliver

This article presents Cadlaws, a new English–French corpus built from Canadian legal documents, and describes the corpus construction process and preliminary statistics obtained from it. The corpus contains over 16 million words in each language and includes unique features since it is composed of documents that are legally equivalent in both languages but not the result of a translation. The corpus is built upon enactments co-drafted by two jurists to ensure legal equality of each version and to reflect the concepts, terms and institutions of two legal traditions. In this article the corpus definition as a parallel corpus instead of a comparable one is also discussed. Cadlaws has been pre-processed for machine translation, and baseline Bilingual Evaluation Understudy (BLEU) scores, which compare a candidate translation of a text with a gold-standard translation, are reported for a neural machine translation system. To the best of our knowledge, this is the largest parallel corpus of texts which convey the same meaning in this language pair and is freely available for non-commercial use.
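To make the BLEU comparison concrete, here is a minimal sketch of a simplified sentence-level score against a single reference. It is not the evaluation setup used for Cadlaws: the function names and sentences are hypothetical, tokenization is plain whitespace splitting, and no smoothing is applied. Reported BLEU scores are normally computed at corpus level with an established tool.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified BLEU: clipped n-gram precision with a brevity penalty."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Each candidate n-gram is credited at most as often as it
        # occurs in the reference (clipping).
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one zero precision zeroes the score
        log_precision_sum += math.log(clipped / total)
    # Penalize candidates that are shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Hypothetical reference and system output.
reference = "this act comes into force on the day it receives royal assent"
candidate = "this act comes into force on the day of royal assent"
print(round(sentence_bleu(candidate, reference), 3))
```

The score ranges from 0 to 1 (often reported multiplied by 100); an exact match with the reference yields 1, and a candidate sharing no words with it yields 0.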

