Institutional Collaboration in the Creation of Digital Linguistic Resources: the Case of the British Telecom Correspondence Corpus

Author(s):  
Ralph Morton ◽  
Hilary Nesi

This chapter discusses the creation of the British Telecom Correspondence Corpus (BTCC), a searchable database of letters taken from the public archives of British Telecom (BT) that were written by nearly 400 authors on a wide variety of topics between 1853 and 1982. It first discusses some experiences working on the New Connections project, funded by Jisc (formerly the Joint Information Systems Committee) and a collaboration between Coventry University, BT Heritage, and The National Archives, focusing particularly on the methodological issues encountered. The corpus was created to address a gap in existing corpus resources, and so that researchers (primarily linguists) could access and, crucially, engage with the language of the letters. Since the completion of the BTCC there have been efforts to expand the corpus to include correspondence written to and from the Post Office, an institution with many historical links to BT. This chapter addresses issues surrounding institutional collaboration in both phases of this ongoing research.


2017 ◽  
Vol 68 (2) ◽  
pp. 169-178
Author(s):  
Leonid Iomdin

Microsyntax is a linguistic discipline dealing with idiomatic elements whose important properties are strongly related to syntax. In a way, these elements may be viewed as transitional entities between the lexicon and the grammar, which explains why they are often underrepresented in both of these resource types: the lexicographer fails to see such elements as full-fledged lexical units, while the grammarian finds them too specific to justify the creation of individual well-developed rules. As a result, such elements are poorly covered by linguistic models used in advanced modern computational linguistic tasks like high-quality machine translation or deep semantic analysis. A possible way to mend the situation and improve the coverage and adequate treatment of microsyntactic units in linguistic resources is to develop corpora with microsyntactic annotation, closely linked to specially designed lexicons. The paper shows how this task is solved in the deeply annotated corpus of Russian, SynTagRus.


Author(s):  
Nancy Farriss

Continuities in written doctrinal language contrast with semantic shifts within the indigenous speech community, revealed through petitions, testaments, trial testimony, and other records, as well as modern oral evidence. As the Mesoamerican cultural matrix has itself been modified by Christian practice and visual symbols, new associations have become attached to traditional linguistic resources. At the same time the Indians have reformulated and reinterpreted the Christian message along lines consonant with traditional cosmology and moral theology. Thus cultural gaps, and along with them linguistic gaps, have narrowed through the process of religious syncretism. Mutually reinforcing influences have converged in the creation of the particular variety of religious devotion defined as Mexican Christianity.


2016 ◽  
Vol 13 (1) ◽  
pp. 15-29
Author(s):  
Agata Križan

Football is probably the world’s most popular game, with a huge number of fans. There are numerous ways in which football fans express dedication to their club and the feelings they have for their team, for example, wearing certain colours, waving banners and flags, and singing. Football anthems are nothing new for football fans, and many clubs have a long-established tradition of them. In this paper, I will address and compare the language in some popular British and Slovene football anthems, and attempt to explain its contribution to the creation of fan identity, to the fans’ sense of belonging, unity, and motivation. The linguistic analysis identifies the linguistic resources used in football anthems to express attitudes, form bonds and create identities.


2020 ◽  
Vol 6 (2) ◽  
pp. 178
Author(s):  
Innocent Sourou Koutchadé

In most African writings, it is commonly noticed that culture and linguistic background affect the creation of literary idiolects. African writers use the English language in accordance with the situation in which they find themselves; they also make use of multilingual features, thus combining the English language with the linguistic resources they draw from their mother tongue. This paper aims to explore patterns of multilingualism in Mopelola: The Tale of a Beauty Goddess, a play produced by a Nigerian writer, Ayoade Okedokun. The paper mainly focuses on the linguistic and cultural influences of Yoruba that are reflected in the use of multilingual features in the play. The analysis shows that there are various instances of borrowing, code-switching and transliteration representing the cultural interferences which are used to accommodate some elements of the writer’s native culture and language into the English language.


Author(s):  
Bilous O ◽  
Mishchenko A ◽  
Datska T ◽  
Ivanenko N ◽  
...  

How often students use IT resources is a key factor in the acquisition of skills associated with new technologies. Strategies aimed at increasing student autonomy need to be developed and should offer resources that encourage students to make use of computing tools during class hours. The paper analyses modern linguistic technologies for intellectual language processing, which are necessary for the creation and functioning of highly effective technologies of knowledge operation. Computerization of the information sphere has triggered an extensive search for ways to use natural language mechanisms in automated systems of various types. One outcome was the creation of controlled languages, based on a set of features that make machine translation more refined. Driven by economic demand, they are not artificial languages like Esperanto but natural languages simplified in terms of vocabulary and grammatical and syntactic structures. More than ever, the tasks of modern computational linguistics involve creating software for natural language processing, information retrieval in large data sets, and support for technical authors creating professional texts and for users of computer technology, and hence creating new translation tools. Such powerful linguistic resources as text corpora, terminology databases and ontologies may facilitate more efficient use of modern multilingual information technology. Creating and improving all the methods considered will help make the translator’s job more efficient. One of the programs, CLAT, does not aim at producing machine translation but allows technical editors to create flawless, sequential professional texts through integrated punctuation and spelling modules. Other programs under consideration are to be implemented in Ukrainian translation departments. Moreover, the databases considered in the paper enable the study of the dynamics of the linguistic system and the development of areas of applied research such as terminography, terminology, and automated data processing. Effective cooperation of developers, translators and declarative institutes in the creation of innovative linguistic technologies will promote further development of translation and applied linguistics.


2017 ◽  
Vol 46 (2) ◽  
pp. 207-230
Author(s):  
Marco Santello

This article examines the intersections between migrant experiences, multilingual practices, and the creation of space. It does so by focusing on Italians who migrated to Tasmania, a group that has long been isolated from the rest of the Italian diaspora. Using an ethnographic approach within a constructivist framework, this research shows that when experiences of movement are recounted in interaction they bring about spaces of speech that are possible thanks to the articulation of local and transnational ‘centres’, which in turn are intertwined with a rich set of linguistic resources. These resources include code-choice, codeswitching, and intentional exposure of phonological variation, and are variously combined to allow the emergence of spaces for people to move through. Spaces of speech are thus situated interactional spaces where acts of (re)telling are related to centres as spatial resources through which not only social meaning is created but also location and locution are mutually constitutive. (Spaces of speech, centres, cultural presence, Italian, Tasmania)


2020 ◽  
Vol 15 (1) ◽  
pp. 13
Author(s):  
Anne Ferger ◽  
Hanna Hedeland

This paper describes the development of a systematic approach to the creation, management and curation of linguistic resources, particularly spoken language corpora. It also presents first steps towards a framework for continuous quality control to be used within external research projects by non-technical users, and discusses various domain- and discipline-specific problems and individual solutions. The creation of spoken language corpora is not only a time-consuming and costly process, but the created resources often represent intangible cultural heritage, containing recordings of, for example, extinct languages or historical events. Since high-quality resources are needed to enable re-use in as many future contexts as possible, researchers need to be provided with the necessary means for quality control. We believe that this includes methods and tools adapted to Humanities researchers as non-technical users, and that these methods and tools need to be developed to support existing tasks and goals of research projects.

