Utilizing lexical data from a Web-derived corpus to expand productive collocation knowledge

ReCALL ◽  
2010 ◽  
Vol 22 (1) ◽  
pp. 83-102 ◽  
Author(s):  
Shaoqun Wu ◽  
Ian H. Witten ◽  
Margaret Franken

Abstract: Collocations are of great importance for second language learners, and a learner’s knowledge of them plays a key role in producing language fluently (Nation, 2001: 323). In this article we describe and evaluate an innovative system that uses a Web-derived corpus and digital library software to produce a vast concordance and present it in a way that helps students use collocations more effectively in their writing. Instead of live search, we use an off-line corpus of short sequences of words, along with their frequencies. These sequences are preprocessed, filtered, and organized into a searchable digital library collection containing 380 million five-word sequences drawn from a vocabulary of 145,000 words. Although the phrases are short, learners can browse more extended contexts because the system automatically locates sample sentences that contain them, either on the Web or in the British National Corpus. Two evaluations were conducted: an expert user tested the system to see if it could generate suitable alternatives for given text fragments, and students used it for a particular exercise. Both suggest that, even within the constraints of a limited study, the system could and did help students improve their writing.
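The filter-then-search idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample records, the `min_freq` threshold, and the function names `build_index` and `search` are all hypothetical, and the real collection (380 million sequences) would need a proper search engine rather than an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical (five-word sequence, frequency) records, for illustration only.
NGRAMS = [
    ("play a key role in", 1200),
    ("play a major role in", 950),
    ("play a vital role in", 400),
    ("make a key decision on", 35),
    ("play a kye role in", 2),  # rare noise, dropped by the frequency filter
]


def build_index(ngrams, min_freq=10):
    """Map each word to the (sequence, frequency) pairs that contain it.

    Sequences rarer than min_freq (an assumed threshold) are filtered out.
    """
    index = defaultdict(list)
    for text, freq in ngrams:
        if freq < min_freq:
            continue
        for word in set(text.split()):
            index[word].append((text, freq))
    return index


def search(index, *words, limit=5):
    """Return sequences containing all query words, most frequent first."""
    if not words:
        return []
    candidates = index.get(words[0], [])
    hits = [
        (text, freq)
        for text, freq in candidates
        if all(w in text.split() for w in words[1:])
    ]
    return sorted(hits, key=lambda pair: -pair[1])[:limit]


index = build_index(NGRAMS)
results = search(index, "role", "play")
for text, freq in results:
    print(f"{freq:>6}  {text}")
```

Ranking by frequency means a learner who queries "play" and "role" sees the most common patterns first, which matches the abstract's emphasis on presenting typical usage.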

2015 ◽  
Vol 9 (1) ◽  
pp. 209
Author(s):  
Bei Yang

As an important yet intricate linguistic feature of the English language, synonymy poses a great challenge for second language learners. Using the 100-million-word British National Corpus (BNC) as data and the software Sketch Engine (SkE) as an analysis tool, this article compares the usage of “learn” and “acquire” in natural discourse through analysis of concordances, collocations, word sketches and sketch differences. The results show that the different functions of SkE each contribute differently to discriminating between “learn” and “acquire”. Pedagogical implications of introducing the results into the classroom are discussed.


Author(s):  
Shaoqun Wu ◽  
Ian H. Witten

We use digital library technology to help language learners express themselves by capitalizing on the human-generated text available on the Web. From a massive collection of n-grams and their occurrence frequencies we extract sequences that begin with the word “I”, sequences that begin a question, and sequences containing statistically significant collocations. These are preprocessed, filtered, and organized as a digital library collection using the Greenstone software. Users can search the collection to see how particular words are typically used and browse by syntactic class. The digital library is richly interconnected to other resources. It includes links to external vocabularies and thesauri so that users can retrieve words related to any term of interest, and links the collection to the Web by locating sample sentences containing these patterns and presenting them to the user. We have conducted an evaluation of how useful the system is in helping students, and the impact it has on their writing. Finally, language activities generated from the digital library content have been designed to help learners master important emotion-related vocabulary and expressions. We predict that the application of digital library technology to assist language students will revolutionize second language learning.
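As a rough illustration of two of the extraction steps named above, the sketch below selects sequences that begin with “I” and scores word pairs for collocation strength. The abstract does not say which significance statistic was used; pointwise mutual information (PMI) is swapped in here as one common choice. All counts, the thresholds, and the function names are hypothetical.

```python
import math

# Hypothetical toy counts; not data from the paper.
UNIGRAMS = {"i": 1000, "strongly": 50, "believe": 80, "the": 5000, "cat": 40}
BIGRAMS = {
    ("i", "believe"): 60,
    ("strongly", "believe"): 20,
    ("the", "cat"): 30,
    ("the", "believe"): 1,
}
TOTAL = 100_000  # assumed total number of tokens in the corpus


def starts_with_i(ngrams):
    """Keep only the sequences whose first word is "I" (case-insensitive)."""
    return [g for g in ngrams if g.lower().split()[:1] == ["i"]]


def pmi(w1, w2):
    """Pointwise mutual information: log2 of P(w1, w2) / (P(w1) * P(w2))."""
    ratio = BIGRAMS[(w1, w2)] * TOTAL / (UNIGRAMS[w1] * UNIGRAMS[w2])
    return math.log2(ratio)


def collocations(min_count=5, min_pmi=3.0):
    """Return bigrams that pass both (assumed) thresholds, strongest first."""
    scored = [
        (pair, pmi(*pair))
        for pair, count in BIGRAMS.items()
        if count >= min_count
    ]
    kept = [(pair, score) for pair, score in scored if score >= min_pmi]
    return sorted(kept, key=lambda item: -item[1])


i_sequences = starts_with_i(["I think that", "Do you think", "i am sure", "It is"])
ranked = collocations()
for pair, score in ranked:
    print(" ".join(pair), round(score, 2))
```

The minimum-count filter matters because PMI tends to overrate very rare pairs; a production system would more likely combine it with, or replace it by, a test such as log-likelihood.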


2018 ◽  
Vol 9 (3) ◽  
pp. 104
Author(s):  
H. Gülru Yüksel ◽  
Suzan Kavanoz

Metadiscourse is essential in establishing pragmatically effective academic written communication. However, little is known about how metadiscourse is used in written texts produced by tertiary level second language learners. This corpus-based linguistic research study aims to explore the frequencies and usages of metadiscourse markers in student essays written by Turkish learners of English and investigate the divergences from native speaker norms. As reference corpora, British Academic Written English (BAWE) and British National Corpus (BNC) were used. We found that in academic discourse, regardless of experience in writing (novice or expert) and L1 background, interpersonal metadiscourse markers are used more frequently than textual metadiscourse markers. The commonalities between novice non-native and expert native writers, together with differences between the two native speaker groups, suggest that pragmatic competence, particularly metadiscourse use, develops with experience regardless of L1 background.


2010 ◽  
Author(s):  
Katherine J. Midgley ◽  
Laura N. Soskey ◽  
Phillip J. Holcomb ◽  
Jonathan Grainger
