The Spoken British National Corpus 2014 – a new initiative launched by Lancaster University and Cambridge University Press

English Today ◽  
2018 ◽  
Vol 35 (1) ◽  
pp. 54-58
Author(s):  
Daria Bębeniec

The British National Corpus (BNC) has been available to the research community for more than two decades. Over the course of its three editions to date, this 100-million-word database, containing samples of both transcribed speech and written texts representing British English of the 1990s and earlier, has established itself as a valuable resource used around the world in a wide range of language-related applications.

2020 ◽  
pp. 007542422097914
Author(s):  
Karin Aijmer

Well has a long history and is found as an intensifier already in older English. It is argued that diachronically well has developed from its etymological meaning (‘in a good way’) on a cline of adverbialization to an intensifier and to a discourse marker. Well is replaced by other intensifiers in the fourteenth century but emerges in new uses in Present-Day English. The changes in frequency and use of the new intensifier are explored on the basis of a twenty-year time gap between the old British National Corpus (1994) and the new Spoken British National Corpus (2014). The results show that well increases in frequency over time and that it spreads to new semantic types of adjectives and participles, and is found above all in predicative structures with a copula. The emergence of a new well and its increase in frequency are also related to social factors such as the age, gender, and social class of the speakers, and the informal character of the conversation.


Author(s):  
Małgorzata Brożyna Reczko

LOVE in English and Polish

The paper presents a sample contrastive analysis of the linguistic picture of love in English and Polish. The material used in the survey is drawn from lexicographic data, including the British National Corpus and Narodowy Korpus Języka Polskiego [National Corpus of Polish]. The paper focuses on the similarities and differences in conceptualizing the abstract concept of love in the English and Polish languages. An analytical method, developed by Bartmiński and associates, serves as the theoretical basis for the reconstruction of the linguistic picture of the world.

MIŁOŚĆ w języku angielskim i polskim [LOVE in English and Polish]

This article attempts a contrastive comparison of the linguistic picture of the world for LOVE in English and Polish. The research material comes mainly from lexicographic sources: dictionaries and corpora (the Narodowy Korpus Języka Polskiego [National Corpus of Polish] and the English-language British National Corpus). The aim of the study was to look for similarities and differences in the conceptualization of LOVE in the two languages. The research method is drawn from the work of J. Bartmiński and concerns the reconstruction of the linguistic picture of the world for various concepts.


2017 ◽  
Vol 22 (3) ◽  
pp. 375-402 ◽  
Author(s):  
Jacqueline Laws ◽  
Chris Ryder ◽  
Sylvia Jaworska

Abstract The aim of this paper is to ascertain the degree to which lexical diversity, density and creativity in everyday spoken British English have changed over a 20-year period, as a function of age and gender. Usage patterns of four verb-forming suffixes, -ate, -en, -ify and -ize, were compared in contemporary speech from the Spoken British National Corpus 2014 Sample (Spoken BNC2014S) with its 20-year old counterpart, the BNC1994’s demographically-sampled component (the Spoken BNC1994DS). Frequency comparisons revealed that verb suffixation is denser in the Spoken BNC2014S than in the Spoken BNC1994DS, with the exception of the -en suffix, the use of which has decreased, particularly among female and younger speakers in general. Male speakers and speakers in the 35–59 age range showed the greatest type diversity; there is evidence that this peak is occurring earlier in the more recent corpus. Contrary to expectations, female rather than male speakers produced the largest number of neologisms and rare forms.


Author(s):  
Robbie Love

Abstract This paper investigates changes in swearing usage in informal speech using large-scale corpus data, comparing the occurrence and social distribution of swear words in two corpora of informal spoken British English: the demographically-sampled part of the Spoken British National Corpus 1994 (BNC1994) and the Spoken British National Corpus 2014 (BNC2014); the compilation of the latter has facilitated large-scale, diachronic analyses of authentic spoken data on a scale which has, until now, not been possible. A form and frequency analysis of a set of 16 ‘pure’ swear word lemma forms is presented. The findings reveal that swearing occurrence is significantly lower in the Spoken BNC2014 but still within a comparable range to previous studies. Furthermore, FUCK is found to overtake BLOODY as the most popular swear word lemma. Finally, the social distribution of swearing across gender and age groups generally supports the findings of previous research: males still swear more than females, and swearing still peaks in the twenties and declines thereafter. However, the distribution of swearing according to socio-economic status is found to be more complex than expected in the 2010s and requires further investigation. This paper also reflects on some of the methodological challenges associated with making comparisons between the two corpora.


Author(s):  
Dr. Hamad Abdullah H Aldawsari

Many people use pause fillers such as um, erm, and er in order to signal to the other person that they have not finished speaking yet. This paper aims to investigate pause fillers and their relationship with the two sociolinguistic variables of age and gender. The data-driven analysis is based on the British National Corpus (BNC). The results show that the sociolinguistic variables of age and gender influence the use of pause fillers among British English speakers, which is proposed to be linked to the advancement of age and an improved fluency among female speakers.


MANUSYA ◽  
2007 ◽  
Vol 10 (3) ◽  
pp. 4-17 ◽  
Author(s):  
Wirote Aroonmanakun

This paper reports on the progress of Thai National Corpus development. The TNC is designed as a general corpus of standard Thai. Only written texts are collected in the first phase. It aims to include at least eighty million words. Various text types produced by various authors are included in the TNC so that it would closely represent written language in general. Texts are word segmented and tagged following the Text Encoding Initiative (TEI) guidelines on text encoding. The TNC was designed as a resource for general applications, such as lexicography, language teaching, and linguistic research. In addition, the TNC is designed to be comparable to the British National Corpus so that a comparative study between the two languages is also possible.


Corpora ◽  
2010 ◽  
Vol 5 (1) ◽  
pp. 45-74 ◽  
Author(s):  
Soili Nokkonen

This paper explores need to, a semi-modal of obligation and necessity, and its semantic variation in connection with the sociolinguistic variables of gender, age and social class in the spoken demographic part of the British National Corpus. The semantic/pragmatic uses of need to include internal, deontic, dynamic and epistemic domains based both on traditional concepts and cross-linguistic studies. The sociolinguistic analysis applies the generalisations by Labov, but pays attention to the interactional styles and the communicative needs of the various social groups as well. The results reveal that need to is undergoing change. It shows monotonic distribution among adults, but it is slightly more common among men than women, and, in terms of social class, the upper middle class takes the lead. The semantic variation corroborates these findings – older speakers stick to the more traditional domains – but also reflects the gendered life stages and discourse styles of the speaker groups.


1990 ◽  
Vol 13 (2) ◽  
pp. 187-199
Author(s):  
Kim Plunkett

The Child Language Data Exchange System — CHILDES — is the largest child language archive in the world. The archive includes a wide range of languages covering both normal and abnormal populations. The database is freely accessible to the research community and the user is supported with guidelines for carrying out transcription work and software packages for the automatic analysis of transcriptions. The article provides a brief overview of the CHAT transcription notation and the CLAN programs that can be used to analyse transcripts written in CHAT format. Current drawbacks of the CHILDES system are discussed and some pointers to future developments highlighted.


2020 ◽  
pp. 1-26 ◽  
Author(s):  
PAULA RAUTIONAHO ◽  
ROBERT FUCHS

The spread of the progressive from dynamic to stative verbs started in the seventeenth century, and slowed down in the late twentieth century. The present study investigates recent change in the use of stative progressives in conversational British English from the early 1990s to the early 2010s. The analysis focuses on a total of 100 stative verb lemmata in the spoken, demographic sections of the original and new British National Corpus, restricted to a variable context where a progressive could potentially occur. Results indicate that overall, stative progressives have not become more frequent in the last twenty years, and that the group of stative verbs is highly heterogeneous. However, particular verbs, such as expect and think, do indeed combine more frequently with the progressive now, which could be the cause of the popular impression of the continuing spread of stative progressives. In addition to a frequency-based analysis, a distinctive collexeme analysis offers a more fine-grained analysis of the collostructional preferences of individual verb lemmata and semantic classes of stative verbs. This analysis reveals that the stative verbs are heterogeneous and that the lemmata most distinctly associated with the progressive belong to the group of stance verbs.


2004 ◽  
Vol 13 (3) ◽  
pp. 235-268 ◽  
Author(s):  
Anthony McEnery ◽  
Zhonghua Xiao

Swearing is a part of everyday language use. To date it has been infrequently studied, though some recent work on swearing in American English, Australian English and British English has addressed the topic. Nonetheless, there is still no systematic account of swear-words in English. In terms of approaches, swearing has been approached from the points of view of history, lexicography, psycholinguistics and semantics. There have been few studies of swearing based on sociolinguistic variables such as gender, age and social class. Such a study has been difficult in the absence of corpus resources. With the production of the British National Corpus (BNC), a 100,000,000-word balanced corpus of modern British English, such a study became possible. In addition to parts of speech, the corpus is richly annotated with metadata pertaining to demographic features such as age, gender and social class, and textual features such as register, publication medium and domain. While bad language may be related to religion (e.g. Jesus, heaven, hell and damn), sex (e.g. fuck), racism (e.g. nigger), defecation (e.g. shit), homophobia (e.g. queer) and other matters, we will, in this article, examine only the pattern of uses of fuck and its morphological variants, because this is a typical swear-word that occurs frequently in the BNC. This article will build and expand upon the examination of fuck by McEnery et al. (2000) by examining the distribution pattern of fuck within and across spoken and written registers.

