Subcategorization frame identification for learner English

Author(s):  
Yan Huang ◽  
Akira Murakami ◽  
Theodora Alexopoulou ◽  
Anna Korhonen

Abstract As large-scale learner corpora become increasingly available, it is vital that natural language processing (NLP) technology is developed to provide rich linguistic annotations necessary for second language (L2) research. We present a system for automatically analyzing subcategorization frames (SCFs) for learner English. SCFs link lexis with morphosyntax, shedding light on the interplay between lexical and structural information in learner language. Meanwhile, SCFs are crucial to the study of a wide range of phenomena including individual verbs, verb classes and varying syntactic structures. To illustrate the usefulness of our system for learner corpus research and second language acquisition (SLA), we investigate how L2 learners diversify their use of SCFs in text and how this diversity changes with L2 proficiency.

ReCALL ◽  
2007 ◽  
Vol 19 (3) ◽  
pp. 252-268 ◽  
Author(s):  
Sylviane Granger ◽  
Olivier Kraif ◽  
Claude Ponton ◽  
Georges Antoniadis ◽  
Virginie Zampa

Abstract Learner corpora, electronic collections of spoken or written data from foreign language learners, offer unparalleled access to many hitherto uncovered aspects of learner language, particularly in their error-tagged format. This article aims to demonstrate the role that the learner corpus can play in CALL, particularly when used in conjunction with web-based interfaces which provide flexible access to error-tagged corpora that have been enhanced with simple NLP techniques such as POS-tagging or lemmatization and linked to a wide range of learner and task variables such as mother tongue background or activity type. This new resource is of interest to three main types of users: teachers wishing to prepare pedagogical materials that target learners' attested difficulties; learners themselves, for editing or language awareness purposes; and NLP researchers, for whom it serves as a benchmark for testing automatic error detection systems.


Author(s):  
Cristóbal Lozano ◽  
Joana Teixeira ◽  
Ana Madeira

This paper presents the L1 Portuguese – L2 Spanish subcorpus of Corpus Escrito del Español L2 (CEDEL2), a new methodological resource for second language acquisition (SLA) research, which is freely searchable and downloadable (http://cedel2.learnercorpora.com). CEDEL2 is a large-scale, multi-L1 learner corpus of L2 Spanish which contains written productions from learners at all proficiency levels as well as 6 native control subcorpora (total size: over 1,100,000 words from over 4,000 participants). CEDEL2 follows strict corpus design criteria (Sinclair, 2005) and learner corpus design recommendations (Tracy-Ventura & Paquot, 2021a). In its current version (CEDEL2 v. 2), its Portuguese component includes an L1 Portuguese – L2 Spanish subcorpus, with 21,662 words written by 164 participants, and an L1 Portuguese native subcorpus, with 3,500 words from 16 L1 speakers of European Portuguese. Thanks to their design features (e.g., same design across subcorpora, inclusion of metadata about SLA-relevant variables, dual native control subcorpora) and freely available web interface, CEDEL2 and its Portuguese subcorpora allow researchers to investigate a wide range of topics in SLA.


2019 ◽  
Vol 39 ◽  
pp. 74-92 ◽  
Author(s):  
Tony McEnery ◽  
Vaclav Brezina ◽  
Dana Gablasova ◽  
Jayanti Banerjee

Abstract In this article we explore the relationship between learner corpus and second language acquisition research. We begin by considering the origins of learner corpus research, noting its roots in smaller scale studies of learner language. This development of learner corpus studies is considered in the broader context of the development of corpus linguistics. We then consider the aspirations that learner corpus researchers have had to engage with second language acquisition research and explore why, to date, the interaction between the two fields has been minimal. By exploring some of the corpus building practices of learner corpus research, and the theoretical goals of second language acquisition studies, we identify reasons for this lack of interaction and make proposals for how this situation could be fruitfully addressed.


Author(s):  
Aicha Rahal

Given the constant debate between monolinguists and pluralists, this chapter explores the main developments in the study of learner language. It focuses on the shift from second language research to learner corpus research. It first presents second language theories and then draws particular attention to the limitations of second language acquisition. The discussion then turns to learner corpus research to show how the view of language changes from heterogeneity to diversity. Language is no longer seen as a monolithic entity or a standard variety but as a multilingual entity.


2021 ◽  
Author(s):  
Anna Siyanova ◽  
S Spina

© 2019 Language Learning Research Club, University of Michigan. In the present study, we sought to advance the field of learner corpus research by tracking the development of phrasal vocabulary in essays produced at two different points in time. To this aim, we employed a large pool of second language (L2) learners (N = 175) from three proficiency levels (beginner, elementary, and intermediate) and focused on an underrepresented L2 (Italian). Employing mixed-effects models, a flexible and powerful tool for corpus data analysis, we analyzed learner combinations in terms of five different measures: phrase frequency, mutual information, lexical gravity, delta P forward, and delta P backward. Our findings suggest a complex picture, in which higher proficiency and greater exposure to the L2 do not result in more idiomatic and targetlike output, and may, in fact, result in greater reliance on low frequency combinations whose constituent words are non-associated or mutually attracted.
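Two of the measures named in this abstract, mutual information and delta P, are commonly computed from a 2×2 contingency table of co-occurrence counts. The following is a minimal sketch under that assumption; it is not taken from the study, and the function name, parameter names, and example counts are all hypothetical. Lexical gravity is left out because its definition involves more than these four counts.

```python
import math


def association_measures(pair_freq, w1_freq, w2_freq, total_pairs):
    """Illustrative association scores for a two-word combination (w1, w2).

    Assumes the usual 2x2 contingency table:
        a = w1 followed by w2        b = w1 followed by another word
        c = another word before w2   d = neither w1 nor w2 in the pair
    All names here are hypothetical, not the study's own code.
    """
    a = pair_freq
    b = w1_freq - pair_freq
    c = w2_freq - pair_freq
    d = total_pairs - a - b - c
    if a <= 0 or b < 0 or c < 0 or d <= 0:
        raise ValueError("counts do not form a valid contingency table")

    # Pointwise mutual information: log2 of observed over expected frequency.
    mi = math.log2(a * total_pairs / ((a + b) * (a + c)))

    # Delta P forward: P(w2 | w1) - P(w2 | not w1).
    delta_p_forward = a / (a + b) - c / (c + d)

    # Delta P backward: P(w1 | w2) - P(w1 | not w2).
    delta_p_backward = a / (a + c) - b / (b + d)

    return {
        "mi": mi,
        "delta_p_forward": delta_p_forward,
        "delta_p_backward": delta_p_backward,
    }


# Made-up counts, for illustration only.
scores = association_measures(pair_freq=30, w1_freq=100, w2_freq=60, total_pairs=10_000)
print(scores)
```

Unlike mutual information, delta P is directional: the forward and backward scores can differ for the same pair, which is presumably why the abstract lists them as separate measures.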


2019 ◽  
Vol 9 (4) ◽  
pp. 737-744
Author(s):  
Paweł Scheffler

In a large scale survey of teachers’ perceptions of the challenges they face in teaching English to young primary school learners (Copland, Garton, & Burns, 2014), some of the key issues that are identified are as follows: teaching speaking, using only English in the classroom, enhancing motivation, maintaining discipline, catering for different individual needs (including special educational needs), dealing with parents, and teaching grammar as well as reading and writing. The relevance of Early Instructed Second Language Acquisition, edited by Rokita-Jaśkow and Ellis, is clearly shown by the fact that it addresses most of these central issues.

