Exploring register with The Prime Machine

2021 ◽  
Author(s):  
Stephen Jeaco

Abstract Corpus approaches underpin a range of postgraduate studies and professional work in language, linguistics, translation and beyond. Awareness of the influences of contextual features on language choice is important for many activities: exploring new text varieties; finding relationships between social factors and language patterning; considering choices for post-editing machine translation; and understanding the very nature of language. Work on register relies on corpus methods, but more support and direction could be offered to help undergraduates gain earlier insights into the power of such corpus analysis. This paper introduces some ways register differences can be revealed through The Prime Machine corpus tool (Jeaco 2017a) and describes the design of a practically oriented undergraduate module which uses this concordancer. Software features include the organization of texts and presentation of source information for ready-made corpora, and methods which can be used to reveal useful starting points for register analysis of do-it-yourself corpora.
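The core display of any concordancer, including tools like The Prime Machine, is the keyword-in-context (KWIC) view. A minimal generic sketch of that idea (not The Prime Machine's actual code; the function name and formatting are illustrative):

```python
def kwic(text, keyword, width=30):
    """Return keyword-in-context (KWIC) lines, as a concordancer would:
    the keyword centred, with up to `width` characters of co-text on
    each side."""
    tokens = text.split()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[:i])[-width:]
            right = " ".join(tokens[i + 1:])[:width]
            lines.append(f"{left:>{width}}  {tok}  {right}")
    return lines

# Each occurrence of the node word becomes one aligned concordance line,
# which is what makes recurring patterns around a word easy to spot.
for line in kwic("the cat sat on the mat and the dog sat too", "sat"):
    print(line)
```

Sorting such lines by the words immediately left or right of the node is one simple way register-specific patterning begins to surface.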

1990 ◽  
Vol 89-90 ◽  
pp. 49-64
Author(s):  
Tunde Ajiboye

Abstract Studies in language use have become all the more relevant to Africa since they shifted from unilingual to multilingual situations. Multilingualism, which until the 1960s was not considered worthy of serious study by linguists, has since attracted a great deal of attention, especially in the narrower field of sociolinguistics, where attempts are being made to meet some of the challenges posed by the multiplicity of languages in otherwise homogeneous communities. African countries harbour many examples: Nigeria, Kenya and Uganda, among others. There are two main ways in which the present study differs from earlier studies in multilingualism, even though, as we shall see later, the results are basically the same. In the first place, we are dealing here with a temporary situation of multilingualism, in the strict sense that subjects are neither immigrants nor natives but birds of passage whose length of stay is pre-determined (by their mission). The languages included in the interaction should therefore be seen as such. Secondly, while the study of language choice by analysts like A. Tabouret-Keller (1968), Gumperz and Eduardo (1971) and Stark (1989) seems to emphasize the connection between language use and "a variety of social factors such as ethnic identity, age, and sex..., degree of solidarity or confidentiality" (Gumperz et al. 1971:122), the nature of our samples (students) tends to demonstrate that in analysing the pattern of language choice, the effect of an external, superimposed trigger such as the need to pass an examination may not be overlooked.


2018 ◽  
Vol 6 ◽  
pp. 145-157 ◽  
Author(s):  
Zaixiang Zheng ◽  
Hao Zhou ◽  
Shujian Huang ◽  
Lili Mou ◽  
Xinyu Dai ◽  
...  

Existing neural machine translation systems do not explicitly model what has been translated and what has not during the decoding phase. To address this problem, we propose a novel mechanism that separates the source information into two parts: translated Past contents and untranslated Future contents, which are modeled by two additional recurrent layers. The Past and Future contents are fed to both the attention model and the decoder states, which provides Neural Machine Translation (NMT) systems with the knowledge of translated and untranslated contents. Experimental results show that the proposed approach significantly improves the performance in Chinese-English, German-English, and English-German translation tasks. Specifically, the proposed model outperforms the conventional coverage model in terms of both the translation quality and the alignment error rate.
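The Past/Future mechanism described above can be sketched with toy GRU layers: the Past state starts empty and accumulates each attention context, while the Future state starts from a source summary and progressively sheds covered content. All dimensions, weights, and context vectors below are illustrative stand-ins, not the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

def gru_cell(params, x, h):
    """One GRU step: returns the updated hidden state."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = 1 / (1 + np.exp(-(Wz @ x + Uz @ h)))   # update gate
    r = 1 / (1 + np.exp(-(Wr @ x + Ur @ h)))   # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

def init_params():
    return [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]

past_rnn, future_rnn = init_params(), init_params()

src_summary = rng.standard_normal(d)  # encoder summary of the source
past = np.zeros(d)                    # nothing has been translated yet
future = src_summary.copy()           # everything is still untranslated

for step in range(3):                 # toy decoding loop
    c_t = rng.standard_normal(d)      # attention context vector (stand-in)
    past = gru_cell(past_rnn, c_t, past)        # accumulate translated content
    future = gru_cell(future_rnn, c_t, future)  # shed covered content
    # here, past and future would be fed to the attention model and
    # concatenated into the decoder state
```

The design point is that the decoder no longer has to infer coverage implicitly: two dedicated recurrent states carry "done" and "to do" explicitly at every step.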


2017 ◽  
Vol 18 (4) ◽  
pp. 480-495 ◽  
Author(s):  
Gavin D’Northwood

Purpose The purpose of this paper is to examine the statements by the chairman and CEO in BP plc’s Annual Report 2010 for linguistic evidence of reader positioning. This is based on the premise that reputational fallout from the Deepwater Horizon oil spill would have heightened the need for such positioning to repair the company’s legitimacy. Design/methodology/approach Applying Halliday’s systemic functional linguistics (SFL) framework, a comparative register analysis was undertaken of the respective statements of the chairman and CEO of BP plc. This was informed by corpus analysis of these statements, of comparative statements from industry competitors and of two larger-scale corpora constructed from the chairman and CEO statements extracted from the annual reports of 25 FTSE100 companies. Findings The findings suggest that readers’ perceptions are likely to be shaped by the statements of the chairman and CEO of BP plc in the company’s 2010 annual report, but similarities and differences are apparent in the way this positioning is engineered. Broader corpus analysis hints that these similarities and differences are not localised to BP plc. Research limitations/implications The analysis relies on the assumption that the chairman and CEO are the writers of each piece. As with prior research, questions of intent on the part of the corporate authors and impact upon target readers remain unanswered. Practical implications This paper demonstrates and highlights the issue of reader positioning through lexico-grammatical choices in corporate disclosures. Originality/value This paper makes a contribution to the literature by demonstrating how reader positioning may be engineered through lexico-grammatical choices in corporate disclosures.
This paper further responds to a call from Sydserff and Weetman (1999, 2002) for interdisciplinary approaches to investigating corporate narrative reports involving linguistics, through foregrounding Halliday’s SFL framework as an analytical tool.


2019 ◽  
Vol 4 (1) ◽  
pp. 9-66
Author(s):  
April D. DeConick

Abstract This paper owes a debt to the field of study known as sociology of knowledge, which is interested in the social location of groups and their constructions of knowledge and reality. This project, however, is not about ordinary knowledge, but how gnosis, the direct knowledge of a transcendent God beyond the traditional Gods, became the foundation of a new form of spirituality in antiquity, and how this form of Gnostic spirituality has reemerged in modern America, impacting traditional religious communities and fostering new religious movements. Several social factors are involved in the emergence of Gnostic spirituality, including the dislocation of the founders and collaborators of Gnostic movements, the prominence of the seeker response, the revelatory milieu in which they find themselves, their reliance on revelatory authority, their push for alternative legitimation, and their flip-and-reveal and do-it-yourself constructions of new knowledge. Gnostic countercultures arise when Gnostic spirituality is mobilized. Much of religion and society is overturned, so that we find constructions of the counter-self, calls for counter-conduct, the establishment of counter-cult, the deployment of counter-media, and the emergence of modes of Gnostic esoterization. The final section turns to the awakening, transport, and occulturation of Gnostic spirituality into modernity in America via artifact migration and alpha channels like Blavatsky.


2012 ◽  
Vol 27 (1) ◽  
pp. 1-21
Author(s):  
Im Tobin

While participatory e-government is increasingly advocated, few studies have investigated whether it is feasible across all national contexts. This study investigates how certain contextual features influence the success of participatory applications of e-government. In particular, it assesses how the political, economic, and social context in which a particular government operates influences the introduction of participatory e-government, and compares participatory e-government applications in Romania and South Korea. These nations possess important similarities and differences in their political, social, and economic contexts. The study results suggest that the success of participatory e-government projects is to a large extent contingent upon political and economic factors and less related to social factors.


Informatics ◽  
2019 ◽  
Vol 6 (3) ◽  
pp. 41 ◽  
Author(s):  
Jennifer Vardaro ◽  
Moritz Schaeffer ◽  
Silvia Hansen-Schirra

This study aims to analyse how translation experts from the German department of the European Commission’s Directorate-General for Translation (DGT) identify and correct different error categories in neural machine translated texts (NMT) and their post-edited versions (NMTPE). The term translation expert encompasses translator, post-editor as well as revisor. Even though we focus on neural machine-translated segments, translator and post-editor are used synonymously because of the combined workflow using CAT tools as well as machine translation. Only the distinction between post-editor, which refers to a DGT translation expert correcting the neural machine translation output, and revisor, which refers to a DGT translation expert correcting the post-edited version of the neural machine translation output, is important and made clear whenever relevant. Using an automatic error annotation tool and a more fine-grained manual error annotation framework to identify characteristic error categories in the DGT texts, a corpus analysis revealed that quality assurance measures by post-editors and revisors of the DGT are most often necessary for lexical errors. More specifically, the corpus analysis showed that, if post-editors correct mistranslations, terminology or stylistic errors in an NMT sentence, revisors are likely to correct the same error type in the same post-edited sentence, suggesting that the DGT experts were being primed by the NMT output. Subsequently, we designed a controlled eye-tracking and key-logging experiment to compare participants’ eye movements for test sentences containing the three identified error categories (mistranslations, terminology or stylistic errors) and for control sentences without errors. We examined the three error types’ effect on early (first fixation durations, first pass durations) and late eye movement measures (e.g., total reading time and regression path durations).
Linear mixed-effects regression models predict what kind of behaviour of the DGT experts is associated with the correction of different error types during the post-editing process.
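The regression setup can be illustrated with a simplified fixed-effects stand-in: dummy-code the error types against an error-free baseline and fit by least squares (the actual mixed-effects models would additionally include random intercepts, e.g. per participant and per item). All reading times below are invented for illustration, not the DGT data:

```python
import numpy as np

# Toy data: total reading times (ms) per error condition
# ("none" is the error-free control; values are illustrative).
conditions = ["none", "mistranslation", "terminology", "style"]
times = {"none": [900, 950, 880],
         "mistranslation": [1400, 1500, 1450],
         "terminology": [1200, 1250, 1300],
         "style": [1100, 1050, 1150]}

# Build a design matrix: intercept + one dummy per error type,
# with "none" as the reference level.
rows, y = [], []
for j, cond in enumerate(conditions):
    for t in times[cond]:
        rows.append([1.0] + [1.0 if j == k else 0.0 for k in range(1, 4)])
        y.append(t)
X, y = np.array(rows), np.array(y)

# Ordinary least squares: beta[0] is the baseline reading time,
# beta[1:] the estimated slow-down for each error type.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With this coding, each coefficient directly answers the question the experiment asks: how much longer do experts dwell on a sentence containing a given error type than on an error-free one.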


English Today ◽  
2006 ◽  
Vol 22 (1) ◽  
pp. 3-9 ◽  
Author(s):  
Kehinde A. Ayoola

THIS PAPER recounts the challenges faced by young Nigerian writers in a climate that is hostile to new authors. The experience presented and discussed here epitomises both the dilemma and the experiences of the new generation of creative writers. The problem of language choice – English or a mother tongue – is re-examined, while exploring the various reasons, noble and not so noble, behind such matters as: choice of genre, the new writer’s response to democracy and globalization, the problem of audience recognition, and the failure of do-it-yourself publishing and marketing.


Author(s):  
Muhammad Hermawan ◽  
Herry Sujaini ◽  
Novi Safriadi

The diversity of languages creates a need for translation so that communication between individuals of different languages can be appropriately established. A statistical machine translation (SMT) engine is a translation engine based on a statistical approach to parallel corpus analysis. One crucial part of SMT is language modeling (LM): the calculation of word probabilities from a corpus based on n-grams. LM employs smoothing algorithms, which assign non-zero probability to words whose observed count is zero. This study compares the best smoothing algorithm from each of the three LM toolkits supported by standard Moses, namely KenLM, SRILM, and IRSTLM: for SRILM, interpolation with Witten-Bell and interpolation with Ristad's natural discounting; for KenLM, interpolation with the modified Kneser-Ney smoothing algorithm; and for IRSTLM, the modified Kneser-Ney and Witten-Bell algorithms, as selected on the basis of previous research. The study uses a corpus of 10,000 sentences. Testing was carried out with BLEU and by Melayu Sambas linguists. Based on the results of the BLEU testing and the linguist testing, the best smoothing algorithm was modified Kneser-Ney in the KenLM language model, where the average automated-testing results for Indonesian-Melayu Sambas and vice versa were 41.6925% and 46.66%. For the linguist testing, the accuracy for Indonesian-Melayu Sambas and vice versa was 77.3165% and 77.9095%.
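Interpolated Witten-Bell smoothing, one of the compared algorithms, can be sketched for a bigram model: the mass given to the backoff distribution for a history h is proportional to T(h), the number of distinct word types observed after h. The training sentences below are toy examples, not the study's corpus:

```python
from collections import Counter, defaultdict

def train_witten_bell(sentences):
    """Train an interpolated bigram LM with Witten-Bell smoothing:
    P(w|h) = (c(h,w) + T(h) * P_uni(w)) / (c(h) + T(h)),
    where T(h) is the number of distinct types seen after history h."""
    unigrams = Counter()
    bigrams = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens)
        for h, w in zip(tokens, tokens[1:]):
            bigrams[h][w] += 1
    total = sum(unigrams.values())

    def prob(w, h):
        followers = bigrams[h]
        t = len(followers)             # distinct continuation types T(h)
        c_h = sum(followers.values())  # context count c(h)
        p_uni = unigrams[w] / total    # MLE unigram backoff (0 for OOV)
        if c_h + t == 0:
            return p_uni               # unseen history: pure backoff
        return (followers[w] + t * p_uni) / (c_h + t)

    return prob

prob = train_witten_bell(["saya pergi ke pasar", "saya pergi ke sekolah"])
# A seen bigram receives more mass than an unseen one with the same
# history, but the unseen bigram is no longer assigned zero.
```

Modified Kneser-Ney, the winning algorithm in the study, differs mainly in discounting observed counts and in backing off to a continuation probability rather than raw unigram frequency.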

