Social Differentiation in the Use of English Vocabulary

Paul Rayson; Geoffrey N. Leech; Mary Hodges

doi:10.1075/ijcl.2.1.07ray

Social Differentiation in the Use of English Vocabulary

International Journal of Corpus Linguistics ◽

10.1075/ijcl.2.1.07ray ◽

1997 ◽

Vol 2 (1) ◽

pp. 133-152 ◽

Cited By ~ 62

Author(s):

Paul Rayson ◽

Geoffrey N. Leech ◽

Mary Hodges

Keyword(s):

Social Group ◽

Geographical Region ◽

Social Differentiation ◽

Future Research ◽

Analysis Tool ◽

Spoken English ◽

Transcription System ◽

Group A ◽

British National Corpus ◽

National Corpus

In this article, we undertake selective quantitative analyses of the demographically-sampled spoken English component of the British National Corpus (for brevity, referred to here as the "Conversational Corpus"). This is a subcorpus of c. 4.5 million words, in which speakers and respondents (see I below) are identified by such factors as gender, age, social group, and geographical region. Using a corpus analysis tool developed at Lancaster, we undertake a comparison of the vocabulary of speakers, highlighting those differences which are marked by a very high χ² value of difference between different sectors of the corpus according to gender, age, and social group. A fourth variable, that of geographical region of the United Kingdom, is not investigated in this article, although it remains a promising subject for future research. (As background we also briefly examine differences between spoken and written material in the British National Corpus [BNC].) This study is illustrative of the potentiality of the Conversational Corpus for future corpus-based research on social differentiation in the use of language. There are evident limitations, including (a) the reliance on vocabulary frequency lists and (b) the simplicity of the transcription system employed for the spoken part of the BNC. The conclusion of the article considers future advances in the research paradigm illustrated here.
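As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a Pearson chi-squared value for one word's frequency in two sectors of a corpus. It is not the Lancaster tool, and the abstract does not specify the exact procedure; the function name `chi_squared_2x2` and its parameters are hypothetical, and any counts passed to it would be the reader's own.

```python
def chi_squared_2x2(freq_a, total_a, freq_b, total_b):
    """Pearson chi-squared for one word's frequency in two subcorpora.

    freq_a, freq_b   -- occurrences of the word in sectors A and B
    total_a, total_b -- total word tokens in sectors A and B
    All names are illustrative assumptions, not taken from the article.
    """
    word_total = freq_a + freq_b
    grand_total = total_a + total_b
    # A word absent from both sectors (or filling both) shows no difference.
    if word_total == 0 or word_total == grand_total:
        return 0.0

    # 2x2 table: rows are sectors, columns are (this word, all other words).
    observed = [
        [freq_a, total_a - freq_a],
        [freq_b, total_b - freq_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [word_total, grand_total - word_total]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2
```

Ranking every word in the vocabulary by this value, highest first, would surface the words whose frequency differs most between the two sectors, which is one plausible reading of "marked by a very high χ² value".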

Teenage swearing in the UK

English World-Wide ◽

10.1075/eww.00040.dru ◽

2020 ◽

Vol 41 (1) ◽

pp. 59-88 ◽

Cited By ~ 1

Author(s):

Rob Drummond

Keyword(s):

Young People ◽

Key Words ◽

Future Research ◽

Specific Context ◽

Meaning Structure ◽

Mainstream School ◽

British National Corpus ◽

Linguistic Behaviour ◽

The Uk ◽

National Corpus

This article describes the swearing practices of a group of young people aged 14–16 in the UK. The young people are in a specific context – a Pupil Referral Unit catering for pupils who have been excluded from mainstream school. The study’s narrow focus builds on existing knowledge by providing a level of precision in terms of speaker and context not usually found in swearing research. 13 key words are examined in terms of meaning, structure, frequency, and use between genders. Shit and fuck, as the most common terms, are explored in more detail, with use of the latter compared to existing accounts based on the British National Corpus. Examining the swearing practices of this group of people adds detail to our knowledge of a particular style of English, paves the way for future research into the socio-pragmatic functions of teenage swearing, and helps us to better understand the linguistic behaviour of an often-marginalised section of society.

FUNCTIONAL LOAD: TRANSCRIPTION AND ANALYSIS OF THE 10,000 MOST FREQUENT WORDS IN SPOKEN ENGLISH

The Buckingham Journal of Language and Linguistics ◽

10.5750/bjll.v3i0.27 ◽

2010 ◽

Vol 3 ◽

pp. 135-162

Author(s):

Leah Gilner ◽

Frank Morales

Keyword(s):

Point Of View ◽

Subsequent Work ◽

Functional Load ◽

Spoken English ◽

Equal Importance ◽

British National Corpus ◽

National Corpus ◽

Language Description ◽

Fluent Speakers

Not all aspects of a language have equal importance for speakers or for learners. From the point of view of language description, functional load is a construct that attempts to establish quantifiable hierarchies of relevance among elements of a linguistic class. This paper makes use of analyses conducted on the 10-million-word spoken subcorpus of the British National Corpus in order to characterize what amounts to approximately 97% of the phonological forms and components heard and produced by fluent speakers in a range of contexts. Our aim is to provide segmental, sequential, and syllabic level rankings of spoken English that can serve as the basis for reference and subsequent work by language educators and researchers.

Swearing in informal spoken English: 1990s–2010s

Text & Talk - An Interdisciplinary Journal of Language Discourse Communication Studies ◽

10.1515/text-2020-0051 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Robbie Love

Keyword(s):

Large Scale ◽

Economic Status ◽

Age Groups ◽

British English ◽

Spoken English ◽

Gender And Age ◽

The Social ◽

Social Distribution ◽

British National Corpus ◽

National Corpus

This paper investigates changes in swearing usage in informal speech using large-scale corpus data, comparing the occurrence and social distribution of swear words in two corpora of informal spoken British English: the demographically-sampled part of the Spoken British National Corpus 1994 (BNC1994) and the Spoken British National Corpus 2014 (BNC2014); the compilation of the latter has facilitated large-scale, diachronic analyses of authentic spoken data on a scale which has, until now, not been possible. A form and frequency analysis of a set of 16 ‘pure’ swear word lemma forms is presented. The findings reveal that swearing occurrence is significantly lower in the Spoken BNC2014 but still within a comparable range to previous studies. Furthermore, FUCK is found to overtake BLOODY as the most popular swear word lemma. Finally, the social distribution of swearing across gender and age groups generally supports the findings of previous research: males still swear more than females, and swearing still peaks in the twenties and declines thereafter. However, the distribution of swearing according to socio-economic status is found to be more complex than expected in the 2010s and requires further investigation. This paper also reflects on some of the methodological challenges associated with making comparisons between the two corpora.

How large a vocabulary is needed for reading and listening?

10.26686/wgtn.12552221.v1 ◽

2020 ◽

Author(s):

Paul Nation

Keyword(s):

Vocabulary Size ◽

Written Text ◽

Modern Language ◽

Spoken English ◽

Spoken Text ◽

British National Corpus ◽

National Corpus

This article has two goals: to report on the trialling of fourteen 1,000 word-family lists made from the British National Corpus, and to use these lists to see what vocabulary size is needed for unassisted comprehension of written and spoken English. The trialling showed that the lists were properly sequenced and there were no glaring omissions from the lists. If 98% coverage of a text is needed for unassisted comprehension, then an 8,000 to 9,000 word-family vocabulary is needed for comprehension of written text and a vocabulary of 6,000 to 7,000 for spoken text. © 2006 The Canadian Modern Language Review/La Revue canadienne des langues vivantes.
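The coverage argument in this abstract can be sketched in a few lines of code. The example below is a simplified illustration, not the article's procedure: it treats each list entry as a single word form rather than a word family, and the function name `vocabulary_size_for_coverage` and its parameters are hypothetical.

```python
def vocabulary_size_for_coverage(tokens, ranked_words, target=0.98):
    """Return how many top-ranked words are needed to cover `target`
    of the running words in `tokens`, or None if the list never gets there.

    tokens       -- the text as a list of word tokens
    ranked_words -- vocabulary ordered from most to least frequent
    Simplifying assumption: entries are word forms, not word families.
    """
    if not tokens:
        return None

    # Map each listed word to its rank so tokens can be bucketed quickly.
    rank = {word: i for i, word in enumerate(ranked_words)}
    counts_by_rank = [0] * len(ranked_words)
    for token in tokens:
        i = rank.get(token)
        if i is not None:
            counts_by_rank[i] += 1

    # Walk down the list, accumulating the share of tokens covered so far.
    covered = 0
    for i, count in enumerate(counts_by_rank):
        covered += count
        if covered / len(tokens) >= target:
            return i + 1
    return None
```

With 1,000-item frequency bands in place of single words, the same cumulative walk would give the number of bands needed to reach the 98% threshold the abstract mentions.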

A Quantitative Study on English Polyfunctional Words

Glottometrics ◽

10.53482/2021_50_387 ◽

2021 ◽

pp. 42-56

Author(s):

Lu Wang ◽

Yahui Guo ◽

Chengcheng Ren

Keyword(s):

Frequency Distribution ◽

Quantitative Study ◽

Quantitative Research ◽

Previous Investigation ◽

Parts Of Speech ◽

Part Of Speech ◽

Group A ◽

British National Corpus ◽

Group Words ◽

National Corpus

This paper reports quantitative research on the parts of speech of English words using data from the British National Corpus. Most part-of-speech investigations focus on the rank-frequency distribution. However, in English and many other languages, parts of speech can be ambiguous. For example, hope can be a noun and a verb. Such words are called polyfunctional words, while other words, which belong to only one part of speech, are called monofunctional words. The number of parts of speech that a word belongs to is referred to as polyfunctionality. First, we study the polyfunctionality distribution of English words and find that the Shenton-Skees-geometric and the Waring distributions capture the data very well. Then, we group words according to their part of speech, e.g., monofunctional nouns, like Saturday, and polyfunctional nouns, like hope (noun, verb), compose the noun group, and try to work out a general model for all the groups. The result is that the extended positive binomial distribution captures all the groups except the article group, because of the sparsity of the data. Last, we study the diversification variants. Since there are polyfunctional words in each group, e.g., in a noun group, a polyfunctional noun may also be a verb, we consider the verb function as a diversification variant and try to model the rank-frequency distribution of variants with the Popescu-Altmann function, as used in the previous investigation. The results show a very good fit for all groups except the conjunction group.

On the interchangeability of actually and really in spoken English: quantitative and qualitative evidence from corpora

English Language and Linguistics ◽

10.1017/s1360674311000323 ◽

2012 ◽

Vol 16 (1) ◽

pp. 151-170 ◽

Cited By ~ 3

Author(s):

Mark Gray

Keyword(s):

Political Discussion ◽

Spoken Discourse ◽

Spoken English ◽

Qualitative Evidence ◽

Current Thinking ◽

Quantitative Analyses ◽

British National Corpus ◽

Test Current ◽

Semantic Properties ◽

National Corpus

Much of the research that has been carried out into the functions of actually and – to a lesser extent – really has focused on their so-called ‘discourse functions’. However, when they appear medially both actually and really are usually classified as intensifiers, and it has been argued that they are often interchangeable (see for example Lenk 1998; Oh 2000; Taglicht 2001). The purpose of this article is to test current thinking on this question by casting further light on the way medial actually and really are used in spoken discourse. Two complementary approaches are taken. Firstly, the interchangeability hypothesis is assessed on the basis of quantitative analyses of data from the British National Corpus. Secondly, the question of the extent to which actually and/or really function as intensifiers in preverbal position is addressed via a detailed qualitative analysis of data from a small corpus of recent BBC radio broadcasts of the panel-based political discussion programme Any Questions. The analyses presented here suggest that the interchangeability hypothesis is untenable and that the two adverbs have different core meanings, with any intensifying function being largely the result of interplay between the distinct semantic properties of each adverb and the discourse context.

The word on the street

English Today ◽

10.1017/s0266078400008415 ◽

1995 ◽

Vol 11 (3) ◽

pp. 29-35

Author(s):

Michael Rundell

Keyword(s):

Spoken English ◽

British National Corpus ◽

National Corpus

New insights on spoken English from the British National Corpus

Words We Would Want: Comparison of Three Pre-programmed Vocabulary Sets With Frequently Used Words in English

Perspectives on Augmentative and Alternative Communication ◽

10.1044/aac17.4.156 ◽

2008 ◽

Vol 17 (4) ◽

pp. 156-164

Author(s):

Bruce Helmbold

Keyword(s):

Descriptive Study ◽

Spoken English ◽

British National Corpus ◽

Word Frequencies ◽

National Corpus

In this descriptive study, three pre-programmed vocabulary sets—Picture WordPower 45 location (Inman Innovations), Unity 45 Full vs. 4.06 (Prentke-Romich Company), and Gateway 60 vs. 1.06.18 (Dynavox Technologies)—were examined for word-based vocabulary content and keystrokes per word. The vocabulary contents of each set were then compared to the thousand most common words as identified by two different listings apiece, that published in Word Frequencies in Written and Spoken English based on the British National Corpus (BNC), and Wiktionary TV/Movie Frequency Lists (2006). The pre-programmed vocabulary set best representing these frequency lists was Unity 45 Full, followed by Gateway 60 and Picture WordPower. The vocabulary sets using the fewest average keystrokes per word, based on frequency lists, were Picture WordPower and Gateway 60 followed by Unity 45 Full. Results provide an aid for evaluating the comparative merits of pre-programmed vocabulary sets, such as inclusion of frequently used English words and relative keystroke savings.

Asynchronous grammaticalization

Languages in Contrast ◽

10.1075/lic.15.1.03leu ◽

2015 ◽

Vol 15 (1) ◽

pp. 34-64 ◽

Cited By ~ 3

Author(s):

Torsten Leuschner ◽

Daan Van den Nest

Keyword(s):

Future Research ◽

Special Role ◽

Subordinate Clauses ◽

British National Corpus ◽

National Corpus

The present paper contrasts verb-first (V1-) conditionals in written usage in present-day English and German. Based on the hypothesis that V1-protases originated in independent interrogatives and then grammaticalized as conditional subordinate clauses in an asynchronous fashion in both languages, we use data from the British National Corpus (BNC) and the Deutsches Referenzkorpus (DeReKo) to investigate the lexical overlap of V1-protases with interrogatives and their functional overlap with ‘if-/wenn’-conditionals. The results show, inter alia, that English V1-conditionals are highly divergent from polar interrogatives and occupy a functional niche with respect to ‘if-’conditionals, with their German counterparts showing more transitional characteristics in both respects; they also suggest a special role for V1-protases with ‘should/sollte’ in expressing a subtype of neutral, rather than tentative, conditionality. Finally, prospects are discussed for future research regarding possible synchronic (i.e. discourse-functional) and diachronic (i.e. systemic) motivations for the differences and similarities observed between V1-conditionals in the two present-day languages.

Based on Research Connecting Word Corpus of Spoken English

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1030-1032.2689 ◽

2014 ◽

Vol 1030-1032 ◽

pp. 2689-2692

Author(s):

Yong Mei Peng ◽

Yun Hua Qu

Keyword(s):

Native Speakers ◽

Chinese Students ◽

Spoken Word ◽

English Teaching ◽

Spoken English ◽

English Majors ◽

Reference Corpus ◽

Native Speakers Of English ◽

British National Corpus ◽

National Corpus

This paper examines how Chinese English majors use connecting words in spoken English and the characteristics of that use. The learner corpus is the spoken corpus SECCL2.0 from the "Chinese students Spoken and Written English Corpus (SWECCL2.0)"; the reference corpus is the spoken component of the British National Corpus (BNC/S). The study finds both similarities and differences between native speakers of English and English majors in their use of spoken connecting words. It also finds that China's English majors misuse a number of connecting words. Based on these findings, the article offers some suggestions for the teaching of spoken English.