Translation description for assessment and post-editing

Abstract This paper presents a corpus-based descriptive research procedure for the identification of significant divergences between original Spanish and Spanish translated from English. When considering the language pair English-Spanish, personal pronouns seem to be good markers of significant differences (anchor phenomena), since they must obligatorily occur in English, but not in Spanish. To test this hypothesis, empirical data have been extracted from a large reference corpus in Spanish (CREA) and from an English-Spanish parallel corpus (P-ACTRES), in both cases from the fiction subcorpora. Statistically significant differences have been found in some of the uses of personal pronouns, having textual and pragmatic implications in the target texts. The aim is to use the results obtained in the case of personal pronouns, together with results from other linguistic areas, to build a semi-automated tool for the post-editing of Spanish translations of texts written originally in English.
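As a rough illustration of how a frequency difference between original and translated Spanish could be tested for significance, the sketch below computes a Pearson chi-square statistic for one item counted in two corpora. The abstract does not say which test the authors used, so the choice of chi-square is an assumption, and the function name, counts, and corpus sizes are all hypothetical.

```python
def chi_square_2x2(hits_a, size_a, hits_b, size_b):
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for the frequency of one item in two corpora.

    Assumes the item occurs at least once overall and does not fill
    either corpus completely, so no expected count is zero.
    """
    # Observed counts: rows are corpora, columns are item / all other tokens.
    table = [
        [hits_a, size_a - hits_a],
        [hits_b, size_b - hits_b],
    ]
    row_totals = [size_a, size_b]
    col_totals = [hits_a + hits_b, (size_a - hits_a) + (size_b - hits_b)]
    total = size_a + size_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Critical value of the chi-square distribution, 1 d.f., p = 0.05.
CRITICAL_05 = 3.841

# Hypothetical counts of an overt subject pronoun (invented for illustration):
# (occurrences, corpus size in tokens)
original_spanish = (1200, 1_000_000)
translated_spanish = (1500, 1_000_000)

stat = chi_square_2x2(*original_spanish, *translated_spanish)
print("significant at p < 0.05:", stat > CRITICAL_05)
```

With real data, the two count pairs would come from the original-language subcorpus and the translated subcorpus respectively; a log-likelihood test is a common alternative for corpus frequencies.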


English-Dogri Translation System using MOSES

Circulation in Computer Science ◽

10.22632/ccs-2016-251-25 ◽

2016 ◽

Vol 1 (1) ◽

pp. 45-49

Author(s):

Avinash Singh ◽

Asmeet Kour ◽

Shubhnandan S. Jamwal

Keyword(s):

Natural Language Processing ◽

Machine Translation ◽

Language Processing ◽

Statistical Machine Translation ◽

Translation System ◽

Parallel Corpus ◽

English System ◽

Machine Translation System ◽

Translation Machine ◽

Language Pair

The objective of this paper is to analyze translation using an English-Dogri parallel corpus. Machine translation is the automatic translation of text from one language into another and is the largest application of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% when translating English to Dogri and 87% when translating Dogri to English.


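POLA PENERIMAAN MAHASISWA BARU BERBASIS SYARI’AH SEBAGAI RESPEK KONSISTENSI KE-ISLAMAN DAN KEBERAGAMAN DI UNIVERSITAS SUNAN GIRI SURABAYA [Sharia-Based New Student Admission Patterns as Respect for Consistency of Islamic Identity and Diversity at Sunan Giri University Surabaya]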

Jurnal Kajian Hukum Islam ◽

10.52166/jkhi.v7i1.13 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Ikhwanuddin Ikhwanuddin ◽

Muhammad Farid ◽

Ahmad Syafi'i

Keyword(s):

Qualitative Research ◽

Private Universities ◽

Islamic Education ◽

Natural Phenomena ◽

Human Engineering ◽

Initial State ◽

Descriptive Data ◽

Descriptive Research ◽

The People ◽

Research Procedure

Unsuri has a religious character (aswaja culture), like a public campus with an "Islamic" identity. This character serves both as an effort and as a means to seek support and cooperation from the community. Admitted or not, Islamic education institutions and universities in several regions still feel that the community often dominates private universities. Unsuri feels this as well: most of the people living around Unsuri do not immediately choose it as their campus. Based on this phenomenon, the authors are interested in the initial state of the strategy of the committee, or new student admission team, of Sunan Giri University Surabaya, and in identifying the committee's strategies and programs. This is descriptive qualitative research, in which the data collected take the form of words and images, not numbers. According to Bogdan and Taylor, as quoted by Lexy J. Moleong, qualitative research is a research procedure that produces descriptive data in the form of written or spoken words from people and observed behavior. Descriptive research, in turn, is a form of research intended to describe existing phenomena, both natural phenomena and human engineering.


Domains, text types, aspect marking and English-Chinese translation

Languages in Contrast ◽

10.1075/lic.2.2.05mce ◽

1999 ◽

Vol 2 (2) ◽

pp. 211-229 ◽

Cited By ~ 8

Author(s):

Tony McEnery ◽

Richard Xiao

Keyword(s):

Chinese Translation ◽

Parallel Corpus ◽

Text Type ◽

Text Types ◽

Reference Corpus ◽

Aspect Markers ◽

Chinese Texts

This paper uses an English-Chinese parallel corpus, an L1 Chinese comparable corpus, and an L1 Chinese reference corpus to examine how aspectual meanings in English are translated into Chinese and explore the effects of domains, text types and translation on aspect marking. We will show that while English and Chinese both mark aspect grammatically, the aspect system in the two languages differs considerably. Even though Chinese, as an aspect language, is rich in aspect markers, covert marking (LVM) is a frequent and important strategy in Chinese discourse. The distribution of aspect markers varies significantly across domain and text type. The study also sheds new light on the translation effect by contrasting aspect marking in translated Chinese texts and L1 Chinese texts.


Functionally-defined recurrent multi-word units in English-to-Polish translation

Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics ◽

10.1075/resla.19037.gra ◽

2021 ◽

Author(s):

Łukasz Grabowski ◽

Nicholas Groom

Keyword(s):

European Parliament ◽

Rater Agreement ◽

Parallel Corpus ◽

Polish Language ◽

Language Pair

Abstract This study uses both parallel and comparable reference corpora in the English-Polish language pair to explore how translators deal with recurrent multi-word items performing specific discoursal functions. We also consider whether the observed tendencies overlap with those found in native texts, and the extent to which the discoursal functions realised by the multi-word items under scrutiny are “preserved” in translation. Capitalizing on findings from earlier research (Granger, 2014; Grabar & Lefer, 2015), we analyzed a pre-selected set of phrases signaling stance-taking and those functioning as textual, discourse-structuring devices originally found in the European Parliament proceedings corpus (Koehn, 2005) and included in the English-Polish parallel corpus Paralela (Pęzik, 2016). Since our goal was to explore whether and to what extent English functionally-defined phrases reflect the same level of formulaicity and regularity in both Polish translations and native Polish texts, the findings provided insights into the translation tendencies of such items, and revealed – using inter-rater agreement metrics – that the discoursal functions of recurrent n-grams may change in translation.


Using a Parallel Corpus to Study the Translation of Personal Pronouns

FORUM ◽

10.1075/forum.10.2.09wen ◽

2012 ◽

Vol 10 (2) ◽

pp. 187-204 ◽

Cited By ~ 1

Author(s):

Ting-hui Wen

Keyword(s):

Personal Pronouns ◽

Parallel Corpus


Using a Parallel Corpus to Validate Independent Claims

Languages in Contrast ◽

10.1075/lic.2.1.07san ◽

1999 ◽

Vol 2 (1) ◽

pp. 115-130 ◽

Cited By ~ 2

Author(s):

Diana Santos ◽

Signe Oksefjell

Keyword(s):

Boundary Crossing ◽

Parallel Corpus ◽

Perception Verbs ◽

Language Pair

This paper examines the results from two corpus-based contrastive studies. Both studies offer cross-linguistic claims about the language pair English-Portuguese. We attempt to replicate the studies and check the findings against a different corpus, viz. the English—Portuguese part of the English—Norwegian Parallel Corpus, to see whether the regularities observed in the original corpora can be confirmed. After a brief presentation of each study, we describe how we gathered equivalent data, present our findings in the new corpus, and discuss some possible reasons for discrepancies in relation to the earlier studies. The topics investigated are boundary-crossing movement descriptions (after Slobin 1997) and perception verbs (after Santos 1998).


Contextualising the Notion of Context in Jurilinguistic Studies

International Journal for the Semiotics of Law - Revue internationale de Sémiotique juridique ◽

10.1007/s11196-020-09701-0 ◽

2020 ◽

Vol 33 (3) ◽

pp. 637-656 ◽

Cited By ~ 1

Author(s):

Edyta Więcławska

Keyword(s):

Distribution Patterns ◽

Source Text ◽

General Reference ◽

Parallel Corpus ◽

Descriptive Research ◽

Contextual Variation ◽

Starting Point ◽

Sociocultural Approach ◽

Legal Communication ◽

The Way

Abstract Context is a notion that is commonly invoked in many linguistic studies, either with very general reference or, more specifically, in the light of one of a number of research approaches which assign distinct definitions to context, ranging from factors that can be recovered from a text, through social parameters serving as an index for the appropriation of discursive performance, to factors that bring texts into being and give them meaning. This exploratory and descriptive research problematises the notion of context specifically on the grounds of English/Polish translation of corporate documentation processed in company registration proceedings, touching upon factors that are presumed to be discursively relevant in this communicative situation. The study is conducted from the perspective of the sociocultural approach and it adopts the parallel corpus methodology. The author discusses the concept of context on the ground of legal communication and secondarily presents a corpus-based description of the context categories that are idiosyncratic and potentially discursively relevant for the said communicative situation in the cross-linguistic perspective. The contextual variation is tested for its capacity to affect translation performance. The results reveal specific tendencies as regards the distribution patterns in the values corresponding to the investigated context categories. They point to some divergencies in translation output caused by the source text variantivity and they pave the way and directions for further research. Already at this stage the findings may have significant pedagogical value and they constitute a solid starting point for sociolinguistic research on discourse variantivity.


Target Text Contraction in English-into-Korean Translations: A Contradiction of Presumed Translation Universals?*

Meta Journal des traducteurs ◽

10.7202/013261ar ◽

2006 ◽

Vol 51 (2) ◽

pp. 343-367 ◽

Cited By ~ 2

Author(s):

Ho-Jeong Cheong

Keyword(s):

Specific Language ◽

Parallel Corpus ◽

Korean And English ◽

Measurement Units ◽

Language Pair

Abstract This paper contradicts the prevailing assumptions among the advocates of translation universals (TU’s) that explicitation, a translation behavior which consists of spelling things out rather than leaving them implicit in translation, is a potential TU, irrespective of the specific language pairs involved in the process of translation. Specifically, via a study employing a newly built 517,609-word parallel corpus, it is shown that implicitation and the subsequent TT contraction as well as explicitation and TT expansion entailed were both observed in translations involving Korean and English. The significance of the direction of language combinations in translations employing the same language pair was identified, together with the introduction and verification of the validity of the four measurement units devised for this study to capture diverse aspects of explicitation/implicitation which in turn entail TT expansion/contraction.


Analysis of Mathematical Representation Process Standard in Learning Mathematics on Relation Material

Formatif Jurnal Ilmiah Pendidikan MIPA ◽

10.30998/formatif.v10i2.5259 ◽

2020 ◽

Vol 10 (2) ◽

Author(s):

Fashihah Fashihah ◽

Abd Qohar

Keyword(s):

Junior High School ◽

Mathematics Learning ◽

School Level ◽

Mathematical Representation ◽

High School Level ◽

Descriptive Research ◽

Learning Mathematics ◽

Research Procedure ◽

Junior High School Level ◽

Almost All

Representation is the interpretation of students' thinking about a problem in the form of spoken or written words, tables, graphs, or mathematical symbols, used as an aid to solving the problem. Representation is an important process standard in mathematics learning, and there are three representation process standards in learning. One of the topics that requires representation ability at junior high school level is relations. This study aimed to describe the mathematical representation process standards in learning mathematics on relation material. The study used a qualitative approach of the descriptive research type. The research procedures in this article were (1) designing learning instruments, (2) implementing the learning design in peer teaching, and (3) conducting a learning analysis related to these abilities. The results showed that the first process standard was seen in the apperception activity, Activity 1 on the LKS (student worksheet), Activity 2 on the LKS, group presentations, and exercises. The second process standard was seen prominently in the exercise activities, while the third process standard was seen prominently in Activity 2 on the LKS. Based on these results, it can be concluded that the three process standards were seen in almost all learning activities.


Cadlaws – An English–French Parallel Corpus of Legally Equivalent Documents

Mutatis Mutandis Revista Latinoamericana de Traducción ◽

10.17533/udea.mut.v14n2a10 ◽

2021 ◽

Vol 14 (2) ◽

pp. 494-508

Author(s):

Francina Sole-Mauri ◽

Pilar Sánchez-Gijón ◽

Antoni Oliver

Keyword(s):

Machine Translation ◽

Translation System ◽

Neural Machine Translation ◽

Parallel Corpus ◽

Legal Documents ◽

Legal Traditions ◽

Corpus Construction ◽

Machine Translation System ◽

French Corpus ◽

Language Pair

This article presents Cadlaws, a new English–French corpus built from Canadian legal documents, and describes the corpus construction process and preliminary statistics obtained from it. The corpus contains over 16 million words in each language and has unique features, since it is composed of documents that are legally equivalent in both languages but are not the result of a translation. The corpus is built upon enactments co-drafted by two jurists to ensure the legal equality of each version and to reflect the concepts, terms and institutions of two legal traditions. The article also discusses why the corpus is defined as a parallel corpus rather than a comparable one. Cadlaws has been pre-processed for machine translation, and baseline Bilingual Evaluation Understudy (BLEU) scores, which compare a candidate translation of a text to a gold-standard translation, are reported for a neural machine translation system. To the best of our knowledge, this is the largest parallel corpus of texts that convey the same meaning in this language pair, and it is freely available for non-commercial use.
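To make the BLEU comparison concrete, here is a minimal sketch of a simplified sentence-level BLEU with a single reference, whitespace tokenization, and no smoothing. It is not the evaluation setup used for Cadlaws, which the abstract does not describe; the function names and example sentences are hypothetical, and published scores are normally computed at corpus level with a standard tool.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified BLEU: geometric mean of modified n-gram precisions
    (n = 1..max_n) times a brevity penalty. Single reference, no smoothing,
    so any n-gram order with zero overlap gives a score of 0."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates no longer than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Hypothetical example sentences (not taken from the corpus).
reference = "the court may make the order"
candidate = "the court may make an order"
print(round(bleu(candidate, reference), 3))
```

An identical candidate and reference score 1.0, and a candidate sharing no words with the reference scores 0.0; everything else falls in between.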
