A new approach to (key) keywords analysis: Using frequency, and now also dispersion

2021 ◽  
Vol 9 (2) ◽  
pp. 1-33
Author(s):  
Stefan Th. Gries

A widely used method in corpus-linguistic approaches to discourse analysis, register/text type/genre analysis, and educational/curriculum questions is keywords analysis, a simple statistical method that aims to identify words that are key to, i.e. characteristic of, certain discourses, text types, or topic domains. The vast majority of keywords analyses have relied on the same statistical measure that most collocation studies use, the log-likelihood ratio, which is computed from frequencies of occurrence in the two corpora under consideration. In a recent paper, Egbert and Biber (2019) advocated a different approach, one that involves computing log-likelihood ratios for word types based on the range of their distribution rather than their frequencies in the target and reference corpora. In this paper, I argue that their approach is a most welcome addition to keywords analysis but can still be profitably extended by utilizing both frequency and dispersion for keyness computations. I present a new two-dimensional approach to keyness and exemplify it on the basis of the Clinton-Trump Corpus and the British National Corpus.
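The contrast between frequency-based and range-based keyness can be illustrated with a minimal Python sketch. It is not the procedure of either paper: the function names `g2`, `freq_and_range`, and `keyness_pair`, the toy corpora, and the choice of the full four-cell log-likelihood statistic are assumptions made for illustration only. The same statistic is applied once to token frequencies and once to range (the number of texts containing the word).

```python
import math

def g2(k1, n1, k2, n2):
    """Log-likelihood ratio (G2) for a 2x2 table.

    k1, k2: count of the item in the target / reference corpus
    n1, n2: number of units (tokens or texts) in each corpus
    """
    observed = [k1, n1 - k1, k2, n2 - k2]
    p = (k1 + k2) / (n1 + n2)  # pooled proportion under the null hypothesis
    expected = [n1 * p, n1 * (1 - p), n2 * p, n2 * (1 - p)]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

def freq_and_range(texts, word):
    """Return (token frequency, number of texts containing the word)."""
    freq = sum(text.count(word) for text in texts)
    rng = sum(1 for text in texts if word in text)
    return freq, rng

def keyness_pair(target, reference, word):
    """Return (frequency-based G2, range-based G2) for one word."""
    f_t, r_t = freq_and_range(target, word)
    f_r, r_r = freq_and_range(reference, word)
    tokens_t = sum(len(t) for t in target)
    tokens_r = sum(len(t) for t in reference)
    by_frequency = g2(f_t, tokens_t, f_r, tokens_r)
    by_range = g2(r_t, len(target), r_r, len(reference))
    return by_frequency, by_range

# Hypothetical toy corpora: each inner list is one tokenized text.
target = [["wall", "wall", "wall", "tax"], ["tax", "jobs"], ["jobs", "care"]]
reference = [["tax", "wall"], ["care", "jobs"], ["care", "tax"]]

if __name__ == "__main__":
    print(keyness_pair(target, reference, "wall"))
```

In this toy data, "wall" is repeated within a single target text, so the frequency-based score is positive while the range-based score is zero: the word occurs in one of three texts in both corpora. Keeping the two values side by side, rather than collapsing them, is one simple way to look at keyness along two dimensions.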

2021 ◽  
Vol 18 (48) ◽  
pp. 203-2018
Author(s):  
Nataša Milivojević ◽  

The paper focuses on the contrastive semantics of phase verbs or aspectualizers in English and Serbian, taking into account both typical and atypical phase verbs. Following Piper (Piper et al. 2005), we adopt the class of atypical aspectualizers in Serbian, which are primarily lexical verbs but yield an aspectual meaning when combined with an aspectual complement. We specifically consider the phase verbs BEGIN and START in English and their Serbian equivalents POČETI and KRENUTI. In contrast to both traditional and more contemporary linguistic approaches to phase verbs in English and Serbian, we claim that the true overall linguistic equivalent of the English phase verb START is not the Serbian phase verb POČETI, but another, atypical aspectualizer, KRENUTI. We base this claim on the equivalency of contrastive syntactic complementation of the inspected aspectualizers, as well as their argument structure, taking into account Freed’s (Freed 1979: 31) traditional view of the aspectual event, where the event is segmental, containing the onset, the nucleus, and the coda. Freed’s account is combined with the lexical-projectionist model proposed by Levin (Levin 1993) alongside the grammar of constructions (Goldberg 1995, 2006). Additionally, in contrast to the generally accepted claim that all phase verbs in Serbian as a rule take imperfective verbs as their complements (Ivić 1970: 44), we claim that KRENUTI with additional, phase-related meanings frequently and productively allows for perfective complementation. The present analysis is backed up by a parallel corpus of English and Serbian sentences compiled from the British National Corpus, the Corpus of Contemporary American English, and the Corpus of Contemporary Serbian Language.


2018 ◽  
Vol 14 (1) ◽  
pp. 133-167 ◽  
Author(s):  
Punjaporn Pojanapunya ◽  
Richard Watson Todd

Keyword analysis is used in a range of sub-disciplines of applied linguistics from genre analyses to critically-oriented studies for different purposes ranging from producing a general characterization of a genre to identifying text-specific ideological issues. This study compares the use of log-likelihood (LL), a probability statistic, and odds ratio (OR), an effect size statistic, for keyword identification and argues that the two methods produce different keywords applicable to research focusing on different purposes. Through two case studies, keyword analyses of advance fee scams against the British National Corpus and research articles in applied linguistics against research articles from other academic disciplines, we show that both the LL and OR keywords concern the aboutness of the corpus, but differ in their specificity and pervasiveness through the corpus. LL highlights words which are relatively common in general use serving genre purposes, whereas OR highlights more specialized words serving critically-oriented purposes. Methodological and practical contributions to keyword analysis are discussed.
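The difference between a probability statistic and an effect size statistic can be made concrete with a short Python sketch. The counts below are invented for illustration and are not taken from the study; the function names `log_likelihood` and `odds_ratio` are likewise hypothetical, and the odds ratio is computed without any smoothing, so it assumes that no cell of the table is zero.

```python
import math

def log_likelihood(k1, n1, k2, n2):
    """Four-cell log-likelihood ratio (G2) for a word occurring k1 times in a
    target corpus of n1 tokens and k2 times in a reference corpus of n2."""
    observed = [k1, n1 - k1, k2, n2 - k2]
    p = (k1 + k2) / (n1 + n2)
    expected = [n1 * p, n1 * (1 - p), n2 * p, n2 * (1 - p)]
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

def odds_ratio(k1, n1, k2, n2):
    """Odds of the word in the target corpus divided by its odds in the
    reference corpus. Assumes all four cells are non-zero (no smoothing)."""
    return (k1 / (n1 - k1)) / (k2 / (n2 - k2))

# Invented counts as (k1, n1, k2, n2): a common word with a modest
# difference and a rare word with a large relative difference.
common = (500, 100_000, 300, 100_000)
rare = (12, 100_000, 1, 100_000)

if __name__ == "__main__":
    for label, counts in (("common", common), ("rare", rare)):
        print(label, log_likelihood(*counts), odds_ratio(*counts))
```

With these invented counts, the common word receives the higher LL score and the rare word the higher OR. Multiplying every count by ten leaves the odds ratio unchanged while the LL score grows tenfold, which is why LL tends to reward sheer frequency and OR the strength of the contrast.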


2020 ◽  
pp. 007542422097914
Author(s):  
Karin Aijmer

Well has a long history and is found as an intensifier already in older English. It is argued that diachronically well has developed from its etymological meaning (‘in a good way’) on a cline of adverbialization to an intensifier and to a discourse marker. Well is replaced by other intensifiers in the fourteenth century but emerges in new uses in Present-Day English. The changes in frequency and use of the new intensifier are explored on the basis of a twenty-year time gap between the old British National Corpus (1994) and the new Spoken British National Corpus (2014). The results show that well increases in frequency over time and that it spreads to new semantic types of adjectives and participles, and is found above all in predicative structures with a copula. The emergence of a new well and its increase in frequency are also related to social factors such as the age, gender, and social class of the speakers, and the informal character of the conversation.


2014 ◽  
Vol 12 (4) ◽  
pp. 319-340
Author(s):  
Anu Koskela

This paper explores the lexicographic representation of a type of polysemy that arises when the meaning of one lexical item can either include or contrast with the meaning of another, as in the case of dog/bitch, shoe/boot, finger/thumb and animal/bird. A survey of how such pairs are represented in monolingual English dictionaries showed that dictionaries mostly represent as explicitly polysemous those lexical items whose broader and narrower readings are more distinctive and clearly separable in definitional terms. They commonly only represented the broader readings for terms that are in fact frequently used in the narrower reading, as shown by data from the British National Corpus.  


2017 ◽  
Vol 11 (5) ◽  
pp. 515-538 ◽  
Author(s):  
Zahra Mustafa-Awad ◽  
Monika Kirner-Ludwig

This article reports on the first stage of a research project on German university students’ conceptualization of Arab women and the extent to which it is affected by their representation in the Western press during the Arab Spring. We combined discourse analysis and corpus-linguistic approaches to investigate the relationship between lexical items used by the students to express their attitudes toward Arab women and those featuring in news headlines about them published in British, American, and German news media. Results show that the portrayal of Arab women in Western news headlines has a clear impact on German students’ opinions of them. The findings also show that our participants tend to be aware of this effect, which could be partly due to their familiarity with discourse analysis as students of linguistics. These results have implications for incorporating media education systematically in general university courses.


2021 ◽  
Vol 3 (1) ◽  
pp. 9-21
Author(s):  
Namkil Kang

The ultimate goal of this paper is to provide a comparative analysis of rely on and depend on in the Corpus of Contemporary American English and the British National Corpus. The COCA clearly shows that the expression rely on government is the most preferred by Americans, followed by rely on people and rely on data. The COCA further indicates that the expression depend on slate is the most preferred by Americans, followed by depend on government and depend on people. The BNC shows, on the other hand, that the expression rely on others is the most preferred by the British, followed by rely on people and rely on friends. The BNC further indicates that depend on factors and depend on others are the most preferred by the British, followed by depend on age and depend on food. Finally, in the COCA, the nouns government, luck, welfare, people, information, state, fossil, water, family, oil, food, and things are linked to both rely on and depend on, but many nouns are still not linked to both of them. In the BNC, by contrast, only the nouns state, chance, government, and others are linked to both rely on and depend on, and many nouns are still not linked to both. It can thus be inferred that rely on is slightly different from depend on in its use.

