2 The Functions of N-grams in Bilingual and Learner Corpora: An Integrated Contrastive Approach

2021 ◽  
pp. 25-48
Author(s):  
Signe Oksefjell Ebeling ◽  
Hilde Hasselgård
2015 ◽  
Vol III (6) ◽  
pp. 1-11
Author(s):  
Anastasia Kamarauli ◽  
Mariam Kamarauli ◽  
Zakharia Pourtskhvandze

Author(s):  
Shintaro Torigoe

This paper reports the second pilot study of the Portuguese Vocabulary Profile (PVP) project, a Portuguese vocabulary list for learners in Japan based on the Common European Framework of Reference for Languages. Inspired by the English Vocabulary Profile (Capel, 2010, 2012), the PVP takes a learner-centric approach. For this study, the author modified the first pilot version, which was constructed solely from learner corpora (Torigoe, 2016a), by comparing it with a word list based on a corpus of Portuguese textbooks published in Japan. The result is a broadened vocabulary for both the elementary and intermediate levels. The major improvement is that some intuitively basic words, including numbers, months of the year, foods, and facilities, are now correctly categorized as elementary-level words; these had previously been categorized as intermediate- or advanced-level words, or were missing from the first version because of their low frequency. However, the norm of word classification remains somewhat arbitrary, given that the small size of both the input (learner corpora) and the comparative data (textbook corpus) does not allow for the use of statistical methods with less frequent words.


Author(s):  
Nicolas Ballier ◽  
Philippe Martin

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Katrin Wisniewski

This contribution focuses on the use of the multifunctional German word form es in the learner corpora MERLIN and DISKO (1,452 texts; 3,700 manually annotated occurrences of es). These corpora cover a wide proficiency range (A1-C1), and they include an L1 control group. Due to its multiple functions, es is assumed to be challenging for learners to use. After laying out its main functional features, this paper first addresses the question of whether the frequency patterns of es actually differ between L1 and L2 texts, which is shown to be true only for beginning learners, and whether differences related to learners’ L1 can be observed, which seems to be the case. Secondly, the study links the emerging use of different es types and their relative frequencies to CEFR proficiency levels. A third focus is the accuracy of es usage, which is generally high but differs among the various es functions, with anaphoric es presenting the greatest challenge for learners. A closer look at interlanguage structures reveals that learners often omit compulsory es and that they use redundant es in peculiar syntactic slots. Furthermore, the use of anaphoric es without clear textual reference regularly hampers the reading of the texts.

