learner corpus
Recently Published Documents


TOTAL DOCUMENTS

399
(FIVE YEARS 150)

H-INDEX

20
(FIVE YEARS 2)

Author(s):  
Barry Kavanagh

This study explores potential reasons why the tools and methods of corpus linguistics are not prevalent in English teaching in Norway, guided by the research question: What do in-service English teachers in Norway find useful about corpora, and what do they find challenging? The study provides interview data from in-service teachers, contributing to our understanding of the in-service perspective on corpora. The research design consists of teaching corpus use in seminars for in-service English teachers (featuring LancsLex, the concordancer AntConc and the OANC), integrated into a language course that is part of a further education programme, and semi-structured interviews with four of the students who took the course, during which they also interacted with Netspeak, SKELL and COCA. As in previous research, the in-service teachers found corpora particularly useful for teaching and learning vocabulary. They also identified challenges to corpus use, categorized here as usability (criticism of AntConc), IT challenges (a lack of IT skills among teachers), learner-corpus interaction challenges (the complexity of software and concordance lines for pupils; pupil uninterest in language), and lack of teacher need (mistakes being “obvious” to teachers in the lower years). The article discusses some implications of these findings.
Keywords: English language teaching, pedagogical corpus application, corpora


2021 ◽  
Vol 11 (4) ◽  
pp. 607-627
Author(s):  
Erdem Akbaş ◽  
Zeynep Ölçü-Dinçer

The present study empirically scrutinizes the fixed natural order of grammatical morphemes, relying on a manual analysis of an EFL learner corpus. Specifically, we test whether the accuracy order of L2 grammatical morphemes in the case of L1 Turkish speakers of English deviates from Krashen’s (1977) natural order and whether proficiency levels play a role in the order of acquisition of these morphemes. With this in mind, we focus on the (in)accuracy of nine English grammatical morphemes, with 2883 cases manually tagged using the UAM Corpus Tool in the written exam scripts of Turkish learners of English. The results based on target-like use scores provide evidence for deviation from what is widely believed to be a set order of acquisition of these grammatical morphemes by second language learners. In light of such findings, we challenge the view that the internally driven processes of mastering grammatical morphemes in English for interlanguage users are largely independent of their L1. Regardless of L2 grammar proficiency in our data, the observed accuracy of some morphemes ranked low in comparison with the so-called natural order. These grammatical morphemes were almost exclusively features absent from the participants’ mother tongue (e.g., third person singular –s, articles and the irregular past tense forms), thus suggesting the influence of L1 in this respect.
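The target-like use (TLU) score mentioned above is commonly computed as the number of correct suppliances in obligatory contexts, divided by the sum of obligatory contexts and suppliances in non-obligatory contexts. The Python sketch below assumes that formulation; the abstract does not state which variant the authors used, and the function name and all counts are hypothetical illustrations, not data from the study.

```python
def target_like_use(correct: int, obligatory: int, overused: int) -> float:
    """Return a TLU score under the commonly cited formulation (an assumption here).

    correct    -- correct suppliances in obligatory contexts
    obligatory -- obligatory contexts for the morpheme
    overused   -- suppliances in non-obligatory contexts
    """
    if min(correct, obligatory, overused) < 0 or correct > obligatory:
        raise ValueError("counts must be non-negative and correct <= obligatory")
    denominator = obligatory + overused
    if denominator == 0:
        raise ValueError("no contexts observed for this morpheme")
    return correct / denominator


# Hypothetical counts per morpheme: (correct, obligatory, overused).
counts = {
    "plural -s": (45, 50, 2),
    "third person singular -s": (20, 50, 5),
    "articles": (60, 100, 20),
}

scores = {morpheme: target_like_use(*c) for morpheme, c in counts.items()}

# Rank morphemes from most to least accurate to obtain an accuracy order.
accuracy_order = sorted(scores, key=scores.get, reverse=True)

for morpheme in accuracy_order:
    print(f"{morpheme}: {scores[morpheme]:.2f}")
```

Because overuse enters the denominator, a learner who supplies a morpheme where it is not required is penalized, which a plain percentage of correct suppliance in obligatory contexts would miss.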


2021 ◽  
Vol 27 (4) ◽  
pp. 144-156
Author(s):  
Shazila Abdullah ◽  
Roslina Abdul Aziz ◽  
Rafidah Kamaruddin

2021 ◽  
Author(s):  
James Algie

Accuracy in written L2 production can be influenced by many factors, including: (a) the relative similarity of the target structure to the equivalent structure in the learner’s L1, and (b) the complexity of the target structure itself. The question of which of these two factors plays a stronger role is fundamental to theories of L2 acquisition. This written learner corpus study uses the English genitive alternation – s-genitives (‘the country's future’) and of-genitives (‘the future of the country’) – to attempt to shed light on this issue. L1 Spanish speakers lag behind L1 Japanese speakers in terms of accuracy rates when the target structure is an s-genitive. This L1 influence appears secondary to structural complexity effects; learners in both groups consistently use the simpler of-genitive with far higher accuracy. Both L1 and complexity effects are stronger in plural possessor contexts, with the plural feature apparently exacerbating learner difficulties with the s-genitive.


Author(s):  
A. S. Vyrenkova ◽  
I. Yu. Smirnov

Learner corpora are among the most valuable sources of statistical data on learners' errors. For instance, data from foreign-language learner corpora can be used in Second Language Acquisition research. However, a corpus's representativeness depends strongly on the quality of its error markup, which is most often carried out manually and is therefore a time-consuming and painstaking routine for annotators. To make the annotation process easier, additional tools, such as spellcheckers, are usually used. This paper focuses on developing a program for the automatic correction of derivational errors made by learners of Russian as a foreign language. Derivational errors, which are uncommon among adult native speakers of Russian (L1) but occur quite often in the written texts or speech of learners of Russian as a foreign language (L2) [Chernigovskaya, Gor, 2000], were chosen as the scope of our research because correcting such mistakes presents a formidable challenge for existing spellcheckers. Using data from the Russian Learner Corpus (http://www.web-corpora.net/RLC/), we tested two existing approaches to this kind of problem. The first, based on a finite state automaton principle developed by Dickinson and Herring (2008), was tested as an algorithm for detecting derivational errors. The second, which relies on the Noisy Channel model of Brill and Moore (2000), was used to study error correction. After analyzing the effectiveness of these tests, we developed our own system for the autocorrection of derivational errors. In our program, the algorithm of Dickinson and Herring is used as the word-formation error detection module. The Noisy Channel model was rejected; instead, we use the Continuous Bag of Words FastText model, based on Harris's distributional semantics theory [1954]. In addition, filtering rules were developed for correcting frequent errors that the model is unable to handle. To restore the correct grammatical word form automatically, a dictionary of word paradigms is used. Model results were validated on data from the Russian Learner Corpus.
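The two-stage design described above (detect a malformed word, then propose a correction) can be sketched in Python. This is a toy stand-in under stated assumptions, not the authors' system: a plain lexicon lookup replaces the finite-state detection module, and string similarity from the standard-library `difflib` replaces the FastText model. The lexicon, the function names, and the invented form `читальник` are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical mini-lexicon of well-formed Russian nouns (illustration only).
LEXICON = ["учитель", "читатель", "писатель", "строитель"]


def is_suspect(word, lexicon=LEXICON):
    """Detection stand-in: flag any word that is missing from the lexicon.

    The paper uses a finite-state approach for this step; a lookup is
    assumed here only to keep the sketch short.
    """
    return word.lower() not in lexicon


def suggest(word, lexicon=LEXICON, cutoff=0.6):
    """Correction stand-in: return the most similar lexicon entry, or None.

    String similarity is swapped in for the embedding model described
    in the paper.
    """
    matches = get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def correct_tokens(tokens):
    """Run detection, then correction; keep a token if no candidate is found."""
    corrected = []
    for token in tokens:
        if is_suspect(token):
            corrected.append(suggest(token) or token)
        else:
            corrected.append(token)
    return corrected


# "читальник" is an invented malformed derivation used only for this demo.
print(correct_tokens(["читальник", "писатель"]))
```

Keeping detection and correction as separate functions mirrors the modular design in the abstract, so either stand-in could be replaced by a stronger component without touching the other.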


Author(s):  
Cristóbal Lozano ◽  
Joana Teixeira ◽  
Ana Madeira

This paper presents the L1 Portuguese – L2 Spanish subcorpus of Corpus Escrito del Español L2 (CEDEL2), a new methodological resource for second language acquisition (SLA) research, which is freely searchable and downloadable (http://cedel2.learnercorpora.com). CEDEL2 is a large-scale, multi-L1 learner corpus of L2 Spanish which contains written productions from learners at all proficiency levels as well as 6 native control subcorpora (total size: over 1,100,000 words from over 4,000 participants). CEDEL2 follows strict corpus design criteria (Sinclair, 2005) and learner corpus design recommendations (Tracy-Ventura & Paquot, 2021a). In its current version (CEDEL2 v. 2), its Portuguese component includes an L1 Portuguese – L2 Spanish subcorpus, with 21,662 words written by 164 participants, and an L1 Portuguese native subcorpus, with 3,500 words from 16 L1 speakers of European Portuguese. Thanks to their design features (e.g., same design across subcorpora, inclusion of metadata about SLA-relevant variables, dual native control subcorpora) and freely available web interface, CEDEL2 and its Portuguese subcorpora allow researchers to investigate a wide range of topics in SLA.


2021 ◽  
Vol 11 (2) ◽  
pp. 261-278
Author(s):  
Anne Golden

In this article I investigate to what extent the use of metaphorical expressions in language learners’ texts varies according to the topic they have chosen to write about. The data come from the Norwegian learner corpus ASK, where the texts are from written assignments produced by adult second-language learners as part of an official Norwegian test and texts. Texts from two different prompts are selected, which are related to friendship and nature. Metaphors are defined according to conceptual metaphor theory and a triangulation of methods is used, alternating between a manual and an automatic extraction method. The results confirm the hypothesis that the two different prompts given to the learners in a language test not only trigger different metaphorical expressions but also influence the amount of metaphor used in the learners’ writing. This knowledge is important to researchers for comparing the use of metaphors between different groups, such as between different learners or between students in different stages of education. It is also important for test designers who decide on topics to be used in tests and teachers who help learners prepare for their tests. In addition, it is of interest for researchers, educators in general and the learners themselves who are interested in the effect the use of metaphors in texts has on raters’ evaluations in high-stakes tests.


2021 ◽  
Vol 7 (2) ◽  
pp. 275-289
Author(s):  
Nadine Herry-Bénit ◽  
Stéphanie Lopez ◽  
Takeki Kamiyama ◽  
Jeff Tennant

This article presents the IPCE-IPAC corpus, an ongoing project for which data have been collected in France, Italy, Spain and China since 2014. The data are collected to investigate the acquisition of segmental and suprasegmental phenomena by L2 learners of English, with a focus on phonemes. The article discusses the methods for the compilation of this original spoken learner corpus, designed to study L2 “interphonology” (Detey, Racine, Kawaguchi, & Zay, 2016), or interlanguage phonology.

