Parallel Corpora in Translation Studies: Issues in Corpus Design and Analysis

2017 ◽  
pp. 105-118 ◽  
Author(s):  
Federico Zanettin
2013 ◽  
Vol 27 ◽  
pp. 23-42 ◽  
Author(s):  
Bruno Cartoni ◽  
Sandrine Zufferey ◽  
Thomas Meyer

Europarl is a large multilingual corpus containing the minutes of the debates at the European Parliament. This article presents a method to extract different corpora from Europarl: monolingual and multilingual comparable corpora, as well as parallel corpora. Using state-of-the-art measures of homogeneity, we show that these corpora are very similar. In addition, we argue that they present many advantages for research in various fields of linguistics and translation studies, and we also discuss some of their limitations. We conclude by reviewing a number of previous studies that made use of these corpora, emphasizing in each case the possibilities offered by Europarl.


Babel ◽  
2017 ◽  
Vol 63 (1) ◽  
pp. 43-64 ◽  
Author(s):  
Miriam Seghiri

The sources of information that translators may use are extremely varied, ranging from oral consultation with an expert to searches in specialised dictionaries and glossaries. Nowadays, however, one of the most relevant documentation activities in the field of Translation involves the use of Internet resources and, closely related to this, the compilation and management of virtual corpora. For this reason, this paper presents a systematic methodology for extracting bilingual and bidirectional glossaries (English-Spanish/Spanish-English) from parallel corpora in order to translate TV User Manuals. In fact, according to art. 5 of the Council Resolution of 17 December 1998 on operating instructions for technical consumer goods (98/C 411/01), it is essential to control quality when writing and translating these manuals. To illustrate this methodology, we focus on corpus design (according to the skopos) and on the compilation protocol (in four steps: searching, downloading, text formatting and saving data) in order to ensure quality. As for quantity, we check quantitative representativeness with the ReCor software (cfr. Seghiri 2006: 387). Once the corpus is representative from both the qualitative and the quantitative points of view, it can be managed with a concordance program. We then illustrate how to extract terms semiautomatically in order to build a bilingual and bidirectional glossary with a parallel concordancer, ParaConc. Thus, the paper combines the main resource for researchers within the Translation field (cfr. Bowker 1998; Varantola 2000; Seghiri 2011), corpora, in order to ensure quality, with the main documentation resource for prospective translators (cfr. Corpas et al. 2001), bilingual glossaries.


2017 ◽  
Vol 22 (2) ◽  
pp. 270-297 ◽  
Author(s):  
Maïté Dupont ◽  
Sandrine Zufferey

The recent emergence of large parallel corpora has represented a leap ahead for cross-linguistic and translation studies. However, the specificities of these corpora and their influence on the nature of observed linguistic phenomena remain underexplored, especially in the field of contrastive linguistics. In this study, we compare the translation equivalences of four concessive adverbial connectives in English and in French across three corpora varying along three dimensions: register, directionality of the translation and translator expertise. Our results indicate that these dimensions affect the cross-linguistic equivalences observed between connectives. We conclude that, in future work, translation-based claims about cross-linguistic equivalences should be balanced according to the type of data analysed. We also identify a pressing need for more rigorously documented parallel corpora for the English-French language pair.


2010 ◽  
Vol 55 (2) ◽  
pp. 387-408 ◽  
Author(s):  
Chunshen Zhu ◽  
Po-Ching Yip

This article presents a report on a pilot project designed to construct a platform for large-scale teaching of translation or bilingual training at tertiary level. The programme, ClinkNotes, has the potential to accommodate parallel corpora of any language pair, although the primary data used in this project are in English and Chinese. The report begins with a brief overview of the development of the corpus-based approach to translation studies in relation to that of translation teaching as a profession. It then proceeds to describe the actual design (i.e., the theoretical framework, the methodology of annotation, and the simple execution of the software programme), and how it helps to cater to the pressing needs of the profession. The prospects for further development of the programme are also discussed.

