Review of Díaz-Negrillo, Ballier & Thompson (2013): Automatic Treatment and Analysis of Learner Corpus Data

2015 ◽  
Vol 1 (1) ◽  
pp. 172-177
Author(s):  
Gerold Schneider
2016 ◽  
Vol 2 (1) ◽  
Author(s):  
Peter Crosthwaite ◽  
Lavigne L.Y. Choy ◽  
Yeonsuk Bae

Abstract: We present an Integrated Contrastive Model of non-numerical quantificational NPs (NNQs, e.g. ‘some people’) produced by L1 English speakers and Mandarin and Korean L2 English learners. Learner corpus data was sourced from the ICNALE (Ishikawa, 2011, 2013) across four L2 proficiency levels. An average of 10% of L2 NNQs were specific to L2 varieties, including noun number mismatches (*‘many child’), omitting obligatory quantifiers after adverbs (*‘almost people’), adding unnecessary particles (*‘all of people’) and non-L1 English-like quantifier/noun agreement (*‘many water’). L2 learners produced significantly fewer ‘open-class’ NNQs (e.g. a number of people), preferring ‘closed-class’ single lexical quantifiers (following L1-like use). While such production is predictable via L1 transfer, Korean L2 English learners produced significantly more L2-like NNQs at each proficiency level, which was not entirely predictable under a transfer account. We thus consider whether positive transfer of other linguistic forms (i.e. definiteness marking) aids the learnability of other L2 forms (i.e. expression of quantification).


2012 ◽  
Vol 56 (4) ◽  
pp. 998-1021 ◽  
Author(s):  
Miguel Ángel Jiménez-Crespo ◽  
Maribel Tercedor

Localization is increasingly making its way into translation training programs at university level. However, empirical research remains scarce on issues such as how to define localization in relation to translation, what localization competence entails, and how best to incorporate intercultural differences between digital genres, text types and conventions. In this paper, we propose a foundation for the study of localization competence based upon previous research on translation competence. This project was developed following an empirical corpus-based contrastive study of student translations (learner corpus), combined with data from a comparable corpus made up of an original Spanish corpus and a Spanish localized corpus. The objective of the study is to identify differences in production between digital texts localized by students and professionals on the one hand, and original texts on the other. This contrastive study allows us to gain insight into how localization competence interrelates with the superordinate concept of translation competence, thus shedding light on which aspects need to be addressed during localization training in university translation programs.


2021 ◽  

This is the first book to investigate the field of phraseology from a learner corpus perspective. It includes cutting-edge studies which analyse a wide range of multiword units and extensive learner corpus data to provide the reader with a comprehensive theoretical, methodological and applied perspective on L2 use in a wide range of situations.


2021 ◽  
pp. 162-177
Author(s):  
Antra Kļavinska

Several text corpora have been created in Latvia, including learner corpora. One of the latest projects is the Latvian Language Learner Corpus (LaVA), which contains the works of international students studying in Latvian higher education institutions who are learning Latvian as a foreign language. The texts are morphologically tagged automatically, and learner errors are tagged manually. A sufficient scope of publications is available, which provides the theoretical basis for the creation of Latvian language learner corpora; however, there is a lack of studies or practical methodological guidelines concerning the opportunities for their application, and there is little data about the use of text corpora in language acquisition. The aim of this study is to explain from the theoretical perspective for what purposes learner corpus data may be used, as well as to illustrate the methodological groundwork with examples from the LaVA corpus. Analysis of theoretical literature has demonstrated the functions and meaning of learner corpora in research, and experience with the use of corpora in acquiring a foreign language has been analysed. Examples of the use of the LaVA corpus as a didactic resource have been prepared using Corpus Linguistics methods. The study was conducted within the state research programme project “The Latvian Language”. After studying the functions of learner corpora from the theoretical perspective, it was concluded that the target audience of the LaVA corpus mainly includes teachers of Latvian as a foreign language (LATS), authors of teaching materials, as well as Latvian language learners. To facilitate the use of the LaVA corpus, it is important to have basic knowledge of Corpus Linguistics, an understanding of the theory of language, as well as an understanding of foreign language teaching methodology. 
LATS teachers can use the LaVA corpus data in the creation of curricula and teaching materials, in the preparation of language proficiency tests, etc. Using the inductive approach in language acquisition, language learners can also become language researchers, can analyse the errors of other learners, etc. Undeniably, the LaVA corpus can be used in broader linguistic research, for example, in contrastive interlanguage analysis, comparing the data of language learners with the data of native speakers or the data of different groups of language learners.


2020 ◽  
pp. 39-49
Author(s):  
Vitalija Kazlauskienė

The noun phrase (NP), one of the key elements of a sentence, can reveal the characteristics of a learner’s linguistic competence. The present study focuses on actual decomposition of NP in the predicate construction. This is when the attribute elements of a noun phrase are included in the predicate construct as factor-actualized determinants. In this position, the copula verb is merely a grammatical means of allowing the attribute to become a predicate (Gaulmyn, Basset 1991: 177). Since the aim of the study is a more thorough investigation of the criteria and peculiarities of producing this type of predicate construct in learner language, the present research is based on the empirical material from the Lithuanian learner corpus. The paper briefly discusses the concept of the predicate construction and describes the process of compiling the learner corpus as well as the principles of data selection. The analysis of the NP in a predicate construction is then presented, and the characteristic cases from the corpus data are examined. The main limitation of the present study is related to the scope of the learner corpus. The results of the quantitative and qualitative research were summarized, and the following conclusions were formulated. In the learner language, NP predicate constructions with an adjective are more frequent than those with a noun. As to the verb conjunction, the typical attribute verb être ‘to be’ was the only one widely used in the corpus under investigation. The analysis of the corpus data also revealed a number of specific errors that are typical of the learners’ written language. In general, the predicate constructions in the learners’ language are characterized by the omission of redundancy tags in the grammatical categories of gender and number. The analysis of the learner corpus provided a broader look at the NP in a predicate construction, highlighting the simplicity, conciseness and repetition of this construct. 
The results of the study are important for a more comprehensive description of the learners’ language, for solving problems in foreign language analysis, and for contributing to the quality of teaching and learning the French language.


2018 ◽  
Vol 1 (2) ◽  
pp. 277-309 ◽  
Author(s):  
Stefan Th. Gries

Abstract: This paper critically discusses how corpus linguistics in general, but learner corpus research in particular, has been dealing with all sorts of frequency data in general, but over- and underuse frequencies in particular. I demonstrate on the basis of learner corpus data the pitfalls of using aggregate data and lacking statistical control that much work is unfortunately characterized by. In fact, I will demonstrate that monofactorial methods have very little to offer at all to research on observational data. While this paper is admittedly very didactic and methodological, I think the discussion of the empirical data offered here – a reanalysis of previously published work – shows how misleading many studies potentially are and provides far-reaching implications for much of corpus linguistics and learner corpus research. Ideally/maximally, this paper together with Paquot & Plonsky (2017, Intntl. J. of Learner Corpus Research) would lead to a complete revision of how learner corpus linguists use quantitative methods and study over-/underuse; minimally, this paper would stimulate a much-needed discussion of currently lacking methodological sophistication.

