Korean lexical bundles in conversation and academic texts

Corpora ◽  
2009 ◽  
Vol 4 (2) ◽  
pp. 135-165 ◽  
Author(s):  
YouJin Kim

Over the past few decades, there has been growing interest in phraseology and the extent to which natural language use consists of prefabricated linguistic chunks (Biber et al., 2004; Ellis, 1996; Howarth, 1998; Moon, 1997; and Wray, 2002). One approach to this issue has involved the analytical construct of ‘lexical bundles’, which are defined simply as the most common recurrent sequences of words in a register. This study explores the use of multi-word sequences in Korean based on Biber et al.'s (1999) frequency-driven approach to the identification of lexical bundles. Building on previous lexical bundle studies in English and Spanish (Biber et al., 2004; Cortes, 2004; and Tracy-Ventura et al., 2007), the study examines lexical bundles in a large corpus of Korean texts consisting of academic prose and conversation. The analysis focusses on three major characteristics: the overall distribution of bundles, their typical structures, and their typical functions in discourse. More lexical bundles were identified in academic prose than in conversation. The findings support the notion that lexical bundles should be regarded as a basic linguistic construct with important functions for the construction of discourse for different languages.
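A frequency-driven identification of lexical bundles can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the study: the function name `lexical_bundles`, the whitespace tokenization, and the cut-off values (`min_per_million`, `min_texts`) are all assumptions chosen for the example, and Korean text would need a segmentation step suited to the language.

```python
from collections import Counter, defaultdict


def lexical_bundles(texts, n=4, min_per_million=20.0, min_texts=5):
    """Return n-word sequences that pass a frequency and a dispersion cut-off.

    The thresholds are hypothetical defaults, not values from the abstract.
    """
    counts = Counter()              # raw frequency of each n-gram
    text_range = defaultdict(set)   # ids of the texts each n-gram occurs in
    total_words = 0

    for text_id, text in enumerate(texts):
        tokens = text.lower().split()   # assumption: whitespace tokenization
        total_words += len(tokens)
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            counts[gram] += 1
            text_range[gram].add(text_id)

    bundles = {}
    for gram, count in counts.items():
        # Normalize to occurrences per million words so corpora of
        # different sizes can be compared.
        per_million = count * 1_000_000 / total_words
        if per_million >= min_per_million and len(text_range[gram]) >= min_texts:
            bundles[" ".join(gram)] = per_million
    return bundles
```

Requiring a sequence to occur in several different texts, and not only to be frequent overall, guards against sequences that are repeated by a single speaker or author.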

2017 ◽  
Vol 3 (1) ◽  
pp. 43
Author(s):  
Chih-Wei Kuo

An ongoing discussion on the disciplinary nature of educational technology has been taking place for years. Some view this discipline from the perspective of instructional design and implementation, whereas others conceptualize it from the perspective of media, tool, and system. This study examined educational technology from the perspective of language use by empirically investigating a special sequence of words, referred to as lexical bundles, in educational technology research articles. It aims to capture the distinctive nature of educational technology as soft technology and examine possible associations of educational technology with relevant disciplines. Using the text analysis tool AntConc 3.4.3, the researcher compiled a corpus of 323 research articles from six journals, totalling approximately 2.1 million words, to identify lexical bundles. All identified bundles were analyzed and further compared with past relevant studies based on the number of different bundles, the content of bundles, and the grammatical structure of bundles. It was found that educational technology as an inter-discipline more closely resembles soft science fields in terms of the content and structural categories of bundles. This study not only contributes to a better conceptual understanding of the nature of educational technology but also offers a pedagogically beneficial bundle list for informing academic writing instruction in this field.


Corpora ◽  
2019 ◽  
Vol 14 (3) ◽  
pp. 327-349
Author(s):  
Craig Frayne

This study uses the two largest available American English language corpora, Google Books and the Corpus of Historical American English (COHA), to investigate relations between ecology and language. The paper introduces ecolinguistics as a promising theme for corpus research. While some previous ecolinguistic research has used corpus approaches, there is a case to be made for quantitative methods that draw on larger datasets. Building on other corpus studies that have made connections between language use and environmental change, this paper investigates whether linguistic references to other species have changed in the past two centuries and, if so, how. The methodology consists of two main parts: an examination of the frequency of common names of species followed by aspect-level sentiment analysis of concordance lines. Results point to both opportunities and challenges associated with applying corpus methods to ecolinguistic research.


2014 ◽  
Vol 10 (1) ◽  
pp. 81-93
Author(s):  
Laurel Smith Stvan

Examination of the term stress in naturally occurring vernacular prose provides evidence of three separate senses being conflated. A corpus analysis of 818 instances of stress from non-academic texts in the Corpus of Contemporary American English (COCA) and the Corpus of American Discourses on Health (CADOH) shows a negative prosody for stress, which is portrayed variously as a source outside the body, a physical symptom within the body and an emotional state. The data show that contemporary speakers intermingle the three senses, making more difficult a discussion between doctors and patients of ways to ‘reduce stress’, when stress might be interpreted as a stressor, a symptom, or a state of anxiety. This conflation of senses reinforces the impression that stress is pervasive and increasing. In addition, a semantic shift is refining a new sense for stress, as post-traumatic stress develops as a specific subtype of emotional stress whose use has increased in circulation in the past 20 years.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Mo Li ◽  
Xiaotian Zhang ◽  
Barry Lee Reynolds

The use of formulaic language in written discourse is an important indicator of language competence. Nonetheless, the features of lexical bundles used by lower proficiency English as a Foreign Language learners have received little attention. The present study addressed this gap by employing a corpus-based method to investigate the quantity, function, and quality of four-word lexical bundles produced by low proficiency L2 English writers with 11 different L1 backgrounds in response to a timed English writing assessment. The investigation was specifically anchored on the data extracted from 1,330 essays using Wordsmith 7.0. Results of the investigation showed (1) an overdependence on writing topic related bundles; (2) an Indo-European L1 language background positively influencing lexical bundle production; (3) an overuse of stance expressions and discourse organizers at the expense of referential expression usage; (4) L1 Japanese, Korean, and Telugu writers producing more accurate lexical bundles and L1 German writers producing fewer accurate lexical bundles; and (5) the frequent use of lexical bundles not leading to highly accurate and appropriate use of lexical bundles. The implications of these results were discussed in connection with foreign language education.


2012 ◽  
Vol 35 (2) ◽  
pp. 169-195 ◽  
Author(s):  
Torben Juel Jensen ◽  
Marie Maegaard

The article presents a real-time study of standardization and regionalization processes with respect to the use of past participles of strong verbs in the western part of Denmark. Analyses of a large corpus of recordings of informants from two localities show that the use of the dialectal en form of the past participle suffix has been in decline during the last 30 years. The en forms are replaced by three other forms, one of which is (partly) dialectal, one regional and one standard Danish. The study indicates that a regionalization process has taken place prior to the time period studied, but that it has now been overtaken by a Copenhagen-based standardization process. The study also shows interesting differences between the two localities, arguably due to the geographical location and size, and to the status of the different participle forms in the traditional local dialects.


2021 ◽  
Vol 27 (1) ◽  
pp. 46-63
Author(s):  
Gilberto Gomes

External negation of conditionals occurs in sentences beginning with ‘It is not true that if’ or similar phrases, and it is not rare in natural language. A conditional may also be denied by another with the same antecedent and opposite consequent. Most often, when the denied conditional is implicative, the denying one is concessive, and vice versa. Here I argue that, in natural language pragmatics, ‘If $A$, $\sim B$’ entails ‘$\sim$(if $A, B$)’, but ‘$\sim$(if $A, B$)’ does not entail ‘If $A$, $\sim B$’. ‘If $A, B$’ and ‘If $A$, $\sim B$’ deny each other, but are contraries, not contradictories. Truth conditions that are relevant in human reasoning and discourse often depend not only on semantic but also on pragmatic factors. Examples are provided showing that sentences having the forms ‘$\sim$(if $A, B$)’ and ‘If $A$, $\sim B$’ may have different pragmatic truth conditions. The principle of Conditional Excluded Middle, therefore, does not apply to natural language use of conditionals. Three squares of opposition provide a representation of the aforementioned relations.


2019 ◽  
Vol 7 ◽  
Author(s):  
Tomáš Hlava

In English language instruction in Slovakia, a strong preference for declarative knowledge at the expense of procedural knowledge development has been reported over the last two decades. However, the cognitive aspects of language attainment predict no impact of instructional efforts, since mental representations of language to be attained are said to be supported by cognitive systems different from those that associative learning develops. Language variation materializes differences among languages based on differences in digitalizing the experience and thus understanding the world. For Slovak learners, the English present perfect is one such anomaly in categorization. This paper aims to answer what the specific interactions between past simple and present perfect are and how the predicted cognitive aspects of language attainment influence the use of different types of knowledge. A proficiency test focusing on declarative knowledge and language use without context and in context was distributed to 600 Slovak learners of English at the ISCED3a level. In past simple conditions, students proved highly proficient in all three types of tasks. In present perfect conditions, declarative knowledge strongly dominated over language use in context. In present perfect conditions, substitutions by past simple were significantly more frequent than substitutions of present perfect by past simple. Cognitive funneling was recognized as a process inhibiting fast proceduralization of the English present perfect compared to fast and reliable proceduralization of the past simple.


Author(s):  
Xiaohan Guan ◽  
Jianhui Han ◽  
Zhi Liu ◽  
Mengmeng Zhang

Many natural language processing tasks, such as information retrieval, intelligent question answering, and machine translation, require the calculation of sentence similarity. Traditional calculation methods have not solved semantic understanding problems well. First, model structures based on Siamese networks lack interaction between sentences; second, matching-based models have a matching problem: they lack position information and use only partial matching factors. In this paper, a combination of words and word dependencies is proposed to calculate sentence similarity. This combination can extract both word features and word dependency features. To extract more matching features, a bi-directional multi-interaction matching sequence model is proposed that uses word2vec and dependency2vec. This model obtains matching features by convolving and pooling the word-granularity (word vector, dependency vector) interaction sequences in two directions. Next, the model aggregates the bi-directional matching features. The paper evaluates the model on two tasks: paraphrase identification and natural language inference. The experimental results show that combining words with word dependencies enhances the ability to extract matching features between two sentences. The results also show that the model with dependency achieves higher accuracy than models that do not use dependency.
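The idea of matching two sentences in both directions over a word-level interaction matrix can be illustrated with a much simpler sketch. It is not the model described in the abstract: there is no convolution and no learned parameters, only cosine similarity followed by max-pooling and averaging. The function names are hypothetical, and each token is assumed to be represented by one plain list of floats (for instance, a word vector concatenated with a dependency vector, which is itself an assumption). Both sentences are assumed to be non-empty.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def interaction_matrix(sent_a, sent_b):
    """Entry [i][j] is the similarity of token i in A to token j in B."""
    return [[cosine(u, v) for v in sent_b] for u in sent_a]


def bidirectional_match(sent_a, sent_b):
    """Max-pool the interaction matrix in both directions, then aggregate."""
    m = interaction_matrix(sent_a, sent_b)
    a_to_b = [max(row) for row in m]         # best match in B for each token of A
    b_to_a = [max(col) for col in zip(*m)]   # best match in A for each token of B
    # Simple aggregation: average the two directional means.
    return (sum(a_to_b) / len(a_to_b) + sum(b_to_a) / len(b_to_a)) / 2
```

Pooling in both directions matters because the two views can disagree: every token of a short sentence may find a good match in a longer one, while many tokens of the longer sentence remain unmatched.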


1995 ◽  
Vol 107-108 ◽  
pp. 89-111
Author(s):  
Jan Daugaard ◽  
Sabine Kirchmeier-Andersen ◽  
Lene Schøsler

The above research team has for the past 4 years been working on a database of valency schemes for 4,000 Danish verbs. First we present the underlying theoretical assumptions for the creation of valency schemes. Then the tools to perform automatic extraction of valency information from corpora are described. Finally, the results are presented. Keywords: natural language parsing, Danish, lexical valency, the Pronominal Approach, corpus analysis.

