Large-scale Word Alignment Using Soft Dependency Cohesion Constraints

Dependency cohesion refers to the observation that phrases dominated by disjoint dependency subtrees in the source language generally do not overlap in the target language. It has been verified to be a useful constraint for word alignment. However, previous work either treats this as a hard constraint or uses it as a feature in discriminative models, which is ineffective for large-scale tasks. In this paper, we take dependency cohesion as a soft constraint, and integrate it into a generative model for large-scale word alignment experiments. We also propose an approximate EM algorithm and a Gibbs sampling algorithm to estimate model parameters in an unsupervised manner. Experiments on large-scale Chinese-English translation tasks demonstrate that our model achieves improvements in both alignment quality and translation quality.
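As a reading aid, the cohesion definition in this abstract can be sketched in a few lines of Python. This is only an illustration of the definition (disjoint source subtrees should map to non-overlapping target spans), not the generative model or the estimation algorithms the paper proposes. The input encoding (`heads` as a list of 0-based head indices with -1 for the root, `alignment` as a set of `(source, target)` index pairs) and all function names are assumptions made for this example.

```python
from itertools import combinations


def subtree(heads, root):
    """Return the set of source positions dominated by `root`, including itself."""
    children = {}
    for child, head in enumerate(heads):
        children.setdefault(head, []).append(child)
    nodes, stack = set(), [root]
    while stack:
        node = stack.pop()
        nodes.add(node)
        stack.extend(children.get(node, []))
    return nodes


def target_span(nodes, alignment):
    """Smallest target interval covering every word aligned to `nodes`."""
    targets = [t for s, t in alignment if s in nodes]
    return (min(targets), max(targets)) if targets else None


def cohesion_violations(heads, alignment):
    """List pairs of disjoint source subtrees whose target spans overlap."""
    n = len(heads)
    subtrees = [subtree(heads, i) for i in range(n)]
    spans = [target_span(nodes, alignment) for nodes in subtrees]
    violations = []
    for i, j in combinations(range(n), 2):
        if subtrees[i] & subtrees[j]:
            continue  # one subtree contains the other, so they are not disjoint
        if spans[i] is None or spans[j] is None:
            continue  # an unaligned subtree cannot overlap anything
        (lo_i, hi_i), (lo_j, hi_j) = spans[i], spans[j]
        if lo_i <= hi_j and lo_j <= hi_i:
            violations.append((i, j))
    return violations


# Token 1 is the root; tokens 0 and 2 are its dependents.
heads = [1, -1, 1]
monotone = {(0, 0), (1, 1), (2, 2)}
crossing = {(0, 0), (0, 2), (1, 3), (2, 1)}
print(cohesion_violations(heads, monotone))
print(cohesion_violations(heads, crossing))
```

A hard constraint would reject any alignment for which this list is non-empty; a soft constraint, as the abstract describes, would instead make such alignments less likely without ruling them out.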

Parents Perspective on Translation Quality of Children Bilingual Storybooks

International Journal of English and Applied Linguistics (IJEAL) ◽

10.47709/ijeal.v1i2.1017 ◽

2021 ◽

Vol 1 (2) ◽

pp. 45-51

Author(s):

Dina Maharani ◽

Chusna Apriyanti ◽

Agustina Sri Hafidah

Keyword(s):

Elementary Schools ◽

Quantitative Research ◽

Target Language ◽

Data Display ◽

Source Language ◽

Translation Quality ◽

Bahasa Indonesia ◽

Learning English ◽

At Home

Parents believe that bilingual storybooks can serve as media for children learning English. However, not all bilingual books are well translated. This research aims to determine parents' perspectives on the quality of translation in children's bilingual storybooks. This is descriptive quantitative research. The data were gathered with a questionnaire distributed through Google Forms to 52 parents as respondents. The respondents were chosen on two criteria: the parents have children at kindergarten or elementary school level, and the parents use bilingual storybooks at home. The storybooks in this research are written in English and Indonesian, with Bahasa Indonesia as the source language and English as the target language. The research was conducted from April to June 2021. After collection, the data were presented in a data display stage, from which the researchers drew conclusions. The results show that 48 of the 52 parents consider bilingual storybooks to be learning media. Fifty parents also buy bilingual storybooks for their children. Among the respondents, 37 parents check the language of the books and 15 do not. Forty-two parents believe that the books are suitable for learning English. Parents weigh five considerations when buying books: story, pictures, language, price, and publishers/authors.

Constructing a Large-Scale English-Persian Parallel Corpus

Meta Journal des traducteurs ◽

10.7202/029804ar ◽

2009 ◽

Vol 54 (1) ◽

pp. 181-188 ◽

Cited By ~ 10

Author(s):

Tayebeh Mosavi Miangah

Keyword(s):

Large Scale ◽

Target Language ◽

Translation Memory ◽

Web Documents ◽

Parallel Corpus ◽

Translation Quality ◽

Text Corpora ◽

Develop Software ◽

General Translation ◽

The Web

In recent years the exploitation of large text corpora in solving various kinds of linguistic problems, including those of translation, has become commonplace. Yet a large-scale English-Persian corpus is still unavailable, because of certain difficulties and the amount of work required to overcome them. The project reported here is an attempt to build an English-Persian parallel corpus composed of digital texts and Web documents containing little or no noise. The Internet is useful because translations of existing texts are often published on the Web. The task is to find parallel pages in English and Persian, to judge their translation quality, and to download and align them. The corpus so created is of course open; that is, more material can be added as the need arises. One of the main activities associated with building such a corpus is to develop software for parallel concordancing, in which a user can enter a search string in one language and see all the citations for that string in that language together with the corresponding sentences in the target language. Our intention is to construct general translation memory software using the present English-Persian parallel corpus.

Translation Analysis of Taxis in “The Old Man and the Sea” Novel (Systemic Functional Linguistics Approach)

Theory and Practice in Language Studies ◽

10.17507/tpls.0902.16 ◽

2019 ◽

Vol 9 (2) ◽

pp. 245

Author(s):

Arso Setyaji ◽

Sri Samiati Tarjana ◽

M. R. Nababan ◽

Tri Wiratno

Keyword(s):

Ernest Hemingway ◽

Target Language ◽

Systemic Functional Grammar ◽

Source Language ◽

Translation Quality ◽

Translation Analysis ◽

Translation Techniques ◽

Logical Semantics ◽

Clause Complex

The Old Man and the Sea is a literary work by Ernest Hemingway. It has been translated into many languages, including an Indonesian translation by Deera Army. Hemingway made heavy use of clause complexes in his works. This causes problems in translation: translators must pay more attention to the translation techniques used, readability decreases, and so on. Deera Army addressed these problems by splitting the clause complexes into shorter ones. A study of how clause complexes are translated is therefore needed, and it can be conducted clearly using a Systemic Functional Grammar (SFG) approach. This study aims at: (1) describing how the interdependency and logical semantics of clause complexes in the source language are realized as interdependency and logical semantics of clause complexes in the target language of The Old Man and the Sea; (2) describing which translation techniques are used for taxis markers in translating from the source language to the target language; and (3) describing the translation quality of clause complexes in the target language. The analysis showed that there are 400 sentences, which were broken into 701 clauses. Both paratactic and hypotactic forms occur: paratactic forms account for 65.30% and hypotactic forms for 34.50%. All of them affect translation quality. The average scores are 2.89 for accuracy, 2.96 for naturalness, and 2.97 for readability. The writer suggests that future researchers conduct similar research in greater depth.

KAJIAN TEKNIK, METODE, IDEOLOGI PENERJEMAHAN PADA KOMIK BABY BLUES SIAGA SATU ANAK PERTAMA KARYA RICK KIRKMAN DAN JERRY SCOTT DAN PENGARUHNYA TERHADAP KUALITAS TERJEMAHAN

PARAFRASE Jurnal Kajian Kebahasaan & Kesastraan ◽

10.30996/parafrase.v17i1.1361 ◽

2018 ◽

Vol 17 (1) ◽

Author(s):

Hosnol Wafa’ ◽

Indra Tjahyadi

Keyword(s):

Speech Act ◽

Target Language ◽

Source Language ◽

Form And Function ◽

Translation Quality ◽

Translation Technique ◽

Baby Blues ◽

And Function ◽

Data Medium

The objectives of this study are to analyze the techniques, methods, and ideologies used by the translator in rendering the form and function of directive illocutionary speech acts, and to assess the quality of those translations in the bilingual comic Baby Blues Siaga Satu Anak Pertama in terms of accuracy, acceptability, and readability. This was descriptive, qualitative, embedded translation research. The findings are as follows. First, the 273 data of directive illocutionary utterances applied 11 functions, such as commanding, asking, asserting, inviting, requesting, ordering, advising, suggesting, urging, rejecting, forbidding, recommending, reminding, and convincing. Second, across the 273 data of directive illocutionary utterances analyzed, the translation techniques identified comprised 248 data oriented to the source language and 163 data oriented to the target language. Third, the translation of directive illocutionary speech act utterances in the comic can be concluded to be accurate: 255 data were accurate, 17 less accurate, and 1 not accurate; 254 data were acceptable, 18 less acceptable, and 1 not acceptable; and 161 data had high readability, 97 medium readability, and 15 low readability. Keywords: Directive illocutionary, Techniques, Methods, Ideologies, Translation quality

Two approaches to compilation of bilingual multi-word terminology lists from lexical resources

Natural Language Engineering ◽

10.1017/s1351324919000615 ◽

2020 ◽

Vol 26 (4) ◽

pp. 455-479

Author(s):

Branislava Šandrih ◽

Cvetana Krstev ◽

Ranka Stanković

Keyword(s):

Information Science ◽

Target Language ◽

Support Vector ◽

Word Alignment ◽

Lexical Resources ◽

Terminology Extraction ◽

Source Language ◽

Term Extraction ◽

Two Parameters ◽

Shallow Parser

In this paper, we present two approaches and the implemented system for bilingual terminology extraction that rely on an aligned bilingual domain corpus, a terminology extractor for a target language, and a tool for chunk alignment. The two approaches differ in the way terminology for the source language is obtained: the first relies on an existing domain terminology lexicon, while the second one uses a term extraction tool. For both approaches, four experiments were performed with two parameters being varied. In the experiments presented in this paper, the source language was English, the target language was Serbian, and the selected domain was Library and Information Science, for which an aligned corpus exists, as well as a bilingual terminological dictionary. For term extraction, we used the FlexiTerm tool for the source language and a shallow parser for the target language, while for word alignment we used GIZA++. The evaluation results show that for the first approach the F1 score varies from 29.43% to 51.15%, while for the second it varies from 61.03% to 71.03%. On the basis of the evaluation results, we developed a binary classifier that decides whether a candidate pair, composed of aligned source and target terms, is valid. We trained and evaluated different classifiers on a list of manually labeled candidate pairs obtained after the implementation of our extraction system. The best results in a fivefold cross-validation setting were achieved with the Radial Basis Function Support Vector Machine classifier, giving an F1 score of 82.09% and accuracy of 78.49%.

Corpus-based contrastive analysis and translation universals

Babel ◽

10.1075/babel.55.4.01rab ◽

2009 ◽

Vol 55 (4) ◽

pp. 303-328 ◽

Cited By ~ 14

Author(s):

Rosa Rabadán ◽

Belén Labrador ◽

Noelia Ramón

Keyword(s):

Foreign Language ◽

Language Teaching ◽

Foreign Language Teaching ◽

Target Language ◽

Contrastive Analysis ◽

Source Language ◽

Translation Quality ◽

Translation Practice ◽

Translator Training ◽

The University

Assessing translation quality is generally seen as a difficult task because of the inadequacy of the tools available. The aim of this paper is to demonstrate the usefulness of a corpus-based contrastive methodology (ACTRES Project) developed at the University of León (Spain) for identifying instances of low-quality rendering of grammatical features when translating from English into Spanish using translation universals. The analysis provides information about: i) the resources available (or absence thereof) in each of the languages to express a given meaning and their relative centrality; ii) the solutions favored by translators to bridge the cross-linguistic disparities and/or gaps; iii) the erroneous or non-existent uses and structures transferred from the source language into the target language. These results can be systematized in terms of simplification, interference, or unique grammatical features. Additional areas that can benefit from this type of research are translation practice, translator training and foreign language teaching (FLT).

Bridging Pragmatic Gap in Translation Process through Developing Pragmatic Awareness

Journal for the Study of English Linguistics ◽

10.5296/jsel.v4i1.9667 ◽

2016 ◽

Vol 4 (1) ◽

pp. 98

Author(s):

Vahid Rafieyan

Keyword(s):

Undergraduate Students ◽

The Other ◽

Target Language ◽

Source Text ◽

Translation Process ◽

Source Language ◽

Translation Quality ◽

Pragmatic Awareness ◽

Positive Effect

In order for the translator to be able to translate the source text into the target language in a relevant way, the strata of the translated text through which relevance can be obtained (pragmatic, pragmatic-semantic, and semantic strata) should be equalized to those of the source text (Li & Luo, 2004). The translator can achieve this by raising his/her awareness of the source and target language pragmatic perspectives. To investigate the actual effect of developing knowledge of pragmatic perspectives of the source language and the target language on the quality of translation of culture-bound texts, the current study was conducted on 64 Iranian undergraduate students of English translation. The study consisted of three phases: 1) administering a culture-bound text to be translated by all participants, 2) dividing participants into two groups, one receiving only translation exercises and the other receiving metapragmatic discussions of the pragmatic perspectives of the source language along with translation exercises, and 3) assessing the translation quality of both groups immediately and two months following the treatment. The study revealed a significant positive effect of pragmatic instruction on improving the quality of translation of culture-bound texts and on maintaining the obtained knowledge. The pedagogical implications of the findings suggest making the pragmalinguistic and sociopragmatic perspectives of the source language, and how they differ from those of the target language, an integral part of translation classes.

On Mean Dependency Distance as a Metric of Translation Quality Assessment

Indian Journal of Language and Linguistics ◽

10.54392/ijll2143 ◽

2021 ◽

Vol 2 (4) ◽

pp. 23-30

Author(s):

Chenliang Zhou

Keyword(s):

Quality Assessment ◽

Target Language ◽

Efl Learners ◽

Source Language ◽

Translation Quality ◽

Automated Translation ◽

Distance Data ◽

Senior Students ◽

Junior Students ◽

Different Levels

This paper adopts a quantitative approach to a linguistic study within the theoretical framework of dependency grammar. Translation is a process in which the source language and the target language interact with each other. The present study explores the feasibility of mean dependency distance as a metric for automated translation quality assessment. The research hypothesized that translations of different levels differ significantly in mean dependency distance. The data were drawn from the written translations in the Parallel Corpus of Chinese EFL Learners, which is composed of translations by Chinese EFL learners on various topics. The translations were human-scored to determine their levels, according to which they were categorized. Our results indicated that: (1) senior students perform better in translation than junior students, and the mean dependency distance of translations from the senior group is significantly shorter than that of the junior group; (2) high-quality translations yield a shorter mean dependency distance than low-quality translations; (3) the mean dependency distance of translations is moderately correlated with the human score. These results suggest the potential of mean dependency distance for differentiating translations of different quality.
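For readers unfamiliar with the metric, here is a minimal Python sketch of mean dependency distance for a single sentence, under the common definition of dependency distance as the absolute difference between the linear positions of a word and its head. The abstract does not state the exact formula the study used, so this definition, the input encoding (a list of 1-based head positions with 0 marking the root, as in a CoNLL-style HEAD column), and the function name are assumptions made for this example.

```python
def mean_dependency_distance(heads):
    """Average |dependent position - head position| over all non-root words.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no head and contributes no distance.
    """
    distances = [abs((i + 1) - head) for i, head in enumerate(heads) if head != 0]
    if not distances:
        raise ValueError("sentence has no dependency relations")
    return sum(distances) / len(distances)


# "The boy ate apples": The -> boy, boy -> ate, ate = root, apples -> ate
print(mean_dependency_distance([2, 3, 0, 3]))

# A four-word sentence whose first word attaches to the third word
print(mean_dependency_distance([3, 3, 0, 3]))
```

In the first sentence every dependent is adjacent to its head, so each of the three relations has distance 1. In the second, one relation spans two positions, which raises the average.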

Maintaining Sentiment Polarity in Translation of User-Generated Content

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0010 ◽

2017 ◽

Vol 108 (1) ◽

pp. 73-84 ◽

Cited By ~ 2

Author(s):

Pintu Lohar ◽

Haithem Afli ◽

Andy Way

Keyword(s):

Empirical Evaluation ◽

Sentiment Classification ◽

Target Language ◽

User Generated Content ◽

Translation Process ◽

Source Language ◽

Translation Quality ◽

Per Se ◽

Positive Sentiment ◽

Share Information

The advent of social media has shaken the very foundations of how we share information, with Twitter, Facebook, and LinkedIn among many well-known social networking platforms that facilitate information generation and distribution. However, the maximum 140-character restriction in Twitter encourages users to (sometimes deliberately) write somewhat informally in most cases. As a result, machine translation (MT) of user-generated content (UGC) becomes much more difficult for such noisy texts. In addition to translation quality being affected, this phenomenon may also negatively impact sentiment preservation in the translation process. That is, a sentence with positive sentiment in the source language may be translated into a sentence with negative or neutral sentiment in the target language. In this paper, we analyse both sentiment preservation and MT quality per se in the context of UGC, focusing especially on whether sentiment classification helps improve sentiment preservation in MT of UGC. We build four different experimental setups for tweet translation: (i) using a single MT model trained on the whole Twitter parallel corpus, (ii) using multiple MT models based on sentiment classification, (iii) using MT models including additional out-of-domain data, and (iv) adding MT models based on the phrase-table fill-up method to accompany the sentiment translation models, with the aim of improving MT quality while maintaining sentiment polarity. Our empirical evaluation shows that despite a slight deterioration in MT quality, our system significantly outperforms the Baseline MT system (without using sentiment classification) in terms of sentiment preservation. We also demonstrate that using an MT engine that conveys a sentiment different from that of the UGC can even worsen both the translation quality and sentiment preservation.

KUALITAS HASIL TERJEMAHAN GOOGLE TRANSLATE DARI BAHASA ARAB KE BAHASA INDONESIA

Al Mi'yar: Jurnal Ilmiah Pembelajaran Bahasa Arab dan Kebahasaaraban ◽

10.35931/am.v3i1.205 ◽

2020 ◽

Vol 3 (1) ◽

pp. 127

Author(s):

Hidayatul Khoiriyah

Keyword(s):

Key Words ◽

Machine Translation ◽

Human Life ◽

Target Language ◽

Arabic Text ◽

Grammatical Structure ◽

Source Language ◽

Translation Quality ◽

Bahasa Indonesia

The development of technology has a big impact on human life. Machine translation is a result of technological advancement that aims to help humans translate one language into another. The focus of this research is to examine the quality of Google Translate in terms of vocabulary accuracy, clarity, and reasonableness of meaning. The mufradāt (vocabulary) data were taken from several Arabic translation dictionaries, while the text was taken from Dr. Aidh Qorni's phenomenal book Lā Tahzan. The method used in this research is translation criticism. The results showed that in terms of the accuracy of vocabulary and terms, Google Translate has good translation quality. In terms of clarity and reasonableness of meaning, Google Translate has not been able to convey ideas from the source language well into the target language. Furthermore, in terms of grammar, the Google Translate output does not have a good grammatical structure and does not follow the rules that apply in the target language, Indonesian. The data show that Google Translate should not be used as a basis for translating an Arabic text into Indonesian, especially when translating verses of the Qur'ān and Hadīts. A beginner translator should prefer a dictionary to Google Translate in order to practice and improve the ability to translate. Key Words: Translation, Google Translate, Arabic
