Motion to and Motion through: Evidence from a Multilingual Corpus

Author(s):  
Thomas Egan ◽  
2004 ◽  
Vol 4 (1) ◽  
pp. 13-43 ◽  
Author(s):  
Anne-Marie Simon-Vandenbergen ◽  
Karin Aijmer

The study of “of course” presented in this article has an applied, a descriptive and a theoretical aim. Since “of course” proves to be very frequent in English, learners will need to know what meanings the item has and in what pragmatic contexts it is used. It has indeed been shown that some learners tend to use “of course” in contexts where it is felt by native speakers to be inappropriate. In order to explain such inappropriate uses we need detailed descriptions of the semantics and pragmatics of “of course”. From a theoretical point of view such multifunctional items raise the question of whether semantic polysemy or pragmatic polysemy is the better explanatory account. It is argued in this paper that empirical cross-linguistic work can contribute to providing answers to all three research questions. First, the study of correspondences and differences between languages with regard to the meanings and uses of pragmatic markers is a necessary step in the explanation of learner problems. Second, the bidirectional approach to equivalents, which involves going back and forth from sources to translations, enables us to show to what extent the equivalents have partially overlapping pragmatic functions. An in-depth comparison of the semantic fields in which the translation equivalents operate is the ultimate goal. Third, the translation method helps us to see to what extent a core meaning account is justified. In this paper three languages are brought into the picture, viz. English, Swedish and Dutch. The cross-linguistic data have been gathered from three translation corpora, i.e. the English-Swedish Parallel Corpus, the Oslo Multilingual Corpus and the Namur Triptic Corpus.


2015 ◽  
Vol 76 (3) ◽  
pp. 4123-4139 ◽  
Author(s):  
Yuping Lin ◽  
Yonghong Song ◽  
Yingyu Li ◽  
Fang Wang ◽  
Kai He

2007 ◽  
Author(s):  
Filip Deprez ◽  
Jan Odijk ◽  
Jan De Moortel

2017 ◽  
Vol 26 (1) ◽  
pp. 45 ◽  
Author(s):  
Matheus Rigobelo Chaud ◽  
Ariani Di Felippo

Multilingual Multi-Document Summarization aims at ranking the sentences of a cluster with (at least) 2 news texts (1 in the user’s language and 1 in a foreign language), and selecting the top-ranked sentences for a summary in the user’s language. We explored three concept-based statistics and one superficial strategy for sentence ranking. We used a bilingual corpus (Brazilian Portuguese-English) encoded in UNL (Universal Network Language) with source and summary sentences aligned based on content overlap. Our experiment shows that “concept frequency normalized by the number of concepts in the sentence” is the measure that best ranks the sentences selected by humans. However, it does not outperform the superficial strategy based on the position of the sentences in the texts. This indicates that the most frequent concepts are not always contained in first sentences, usually selected by humans to build the summaries because they convey the main information of the collection.
Keywords: content selection; concept; statistical measure; multilingual corpus; multi-document summarization.
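
As an illustration only, the measure quoted in the abstract above ("concept frequency normalized by the number of concepts in the sentence") could be sketched as follows. The abstract does not give the exact formula, so this is one plausible reading: each sentence is scored by summing the cluster-wide frequencies of its concepts and dividing by the number of concepts it contains. The function name `rank_sentences`, the input format (one list of concept labels per sentence) and the sample data are assumptions made for this sketch, not part of the original study.

```python
from collections import Counter


def rank_sentences(sentences):
    """Rank sentences by normalized concept frequency.

    `sentences` is a list of lists of concept labels (a hypothetical
    stand-in for UNL-encoded sentences). Returns (index, score) pairs
    sorted from highest to lowest score.
    """
    # Frequency of each concept over the whole cluster of texts.
    cluster_freq = Counter(c for concepts in sentences for c in concepts)

    scores = []
    for idx, concepts in enumerate(sentences):
        if concepts:
            # Sum of concept frequencies, normalized by sentence size.
            score = sum(cluster_freq[c] for c in concepts) / len(concepts)
        else:
            # A sentence with no concepts gets the lowest possible score.
            score = 0.0
        scores.append((idx, score))

    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy cluster: three sentences with invented concept labels.
cluster = [
    ["election", "president"],
    ["election", "weather", "rain"],
    ["president", "election", "vote"],
]
ranking = rank_sentences(cluster)
top_index = ranking[0][0]
```

The superficial baseline mentioned in the abstract would instead rank sentences by their position in the source texts, which needs no concept counts at all.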


Author(s):  
Rohan Nanda ◽  
Llio Humphreys ◽  
Lorenzo Grossio ◽  
Adebayo Kolawole John

This paper presents a multilingual legal information retrieval system for mapping recitals to articles in European Union (EU) directives and normative provisions in national legislation. Such a system could be useful for purposive interpretation of norms. Previous work on mapping recitals and normative provisions was limited to EU legislation in English and used only one lexical text similarity technique. In this paper, we develop state-of-the-art text similarity models to investigate the interplay between directive recitals, directive (sub-)articles and provisions of national implementing measures (NIMs) on a multilingual corpus (from Ireland, Italy and Luxembourg). Our results indicate that directive recitals do not have a direct influence on NIM provisions, but they sometimes contain additional information that is not present in the transposed directive sub-article, and can therefore facilitate purposive interpretation.


Corpora ◽  
2020 ◽  
Vol 15 (3) ◽  
pp. 273-290
Author(s):  
Matylda Włodarczyk ◽  
Joanna Kopaczyk ◽  
Michał Kozak

This paper introduces the Electronic Repository of Greater Poland Oaths, eROThA (1386–1446), a digitisation project of a diplomatic edition of mediaeval land court oaths recorded in Latin and Old Polish, resulting in a small, lightly tagged specialised bilingual corpus. We present the background, aims, design and methodology of the project. We also discuss the problems and limitations entrenched in turning a printed diplomatic edition into a machine-readable diplomatic edition equipped with a new interpretative layer that is sensitive to the switches between Latin and Old Polish. In addition to the automatic annotation of code-switched items on the basis of typographic characteristics of the printed edition, flexible coding of recurrent language and discourse boundary phenomena has been introduced manually to account for linguistically ambiguous or neutral forms. The project offers a fully multilingual corpus, as well as customised Polish-only and Latin-only datasets, and enables filtered metadata searches in the online front-end. Overall, the report presents a methodology for constructing multilingual corpora in the context of legal cultures in medieval Central Europe that may be extrapolated to datasets originating in other periods and regions.


Author(s):  
Tobias Warner

This chapter sketches the beginnings of literary modernity in Senegal through an analysis of a remarkable nineteenth-century collection of textual artifacts. This collection includes a multilingual corpus of poetry, calligraphy, folktales, and songs, as well as the textual components of several leather-bound protective amulets that for centuries Europeans called “fetishes.” The collection was assembled by David Boilat, a mixed-race priest, who pasted his findings into the pages of a notebook before sending them to anthropologists in Paris. Boilat’s notebook reframes the residues of many different textual practices and performance genres as texts that can be quotable, transportable, and readable in new ways. This subsumes collected artifacts into a new textual order, founded on the principle of readability. Nearly a century later, a young Léopold Senghor would incorporate some of Boilat’s collections into an early anthology of African writing in French, thereby consecrating them as literature.

