Informativeness is a determinant of compound stress in English

There have been claims in the literature that the variability of compound stress assignment in English can be explained with reference to the informativeness of the constituents (e.g. Bolinger 1972, Ladd 1984). Until now, however, large-scale empirical evidence for this idea has been lacking. This paper addresses this deficit by investigating a large number of compounds taken from the British National Corpus. It is the first study of compound stress variability in English to show that measures of informativeness (the morphological family sizes of the constituents and the constituents' degree of semantic specificity) are indeed highly predictive of prominence placement. Using these variables as predictors, in conjunction with other factors believed to be relevant (see Plag et al. 2008), we build a probabilistic model that can successfully assign prominence to a given construction. Our finding, that the more informative constituent of a compound tends to be most prominent, fits with the general propensity of speakers to accentuate important information, and can therefore be interpreted as evidence for an accentual theory of compound stress.

Detecting innovations in a parsed corpus of learner English

International Journal of Learner Corpus Research, DOI: 10.1075/ijlcr.2.2.03sch, 2016, Vol 2 (2), pp. 177-204, Cited By ~ 7

Author(s): Gerold Schneider, Gaëtanelle Gilquin

Keyword(s): Large Scale, Data Driven, Phrasal Verbs, Scale Method, T Score, Reference Corpus, Similarities And Differences, British National Corpus, Qualitative Analyses, National Corpus

In research on L2 English, recent corpus-based studies indicate that some non-standard forms are shared by indigenized (ESL) and foreign (EFL) varieties of English, which challenges the idea of a clear dichotomy between innovation and error. We present a data-driven large-scale method to detect innovations, test it on verb + preposition structures (including phrasal verbs) and adjective + preposition structures, and describe similarities and differences between EFL and ESL. We use a dependency-parsed version of the International Corpus of Learner English to automatically extract potential innovations, defined as patterns of overuse compared to the British National Corpus as reference corpus. We measure overuse by means of collocation measures like O/E or T-score, and compare our results with similar results for ESL. In both quantitative and qualitative analyses, we detect similarities between the two varieties (e.g. discuss about) and dissimilarities (e.g. accuse for, only distinctive for EFL). We report more verb/adjective + preposition combinations than previous studies and discuss the roles of analogy and transfer.
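The two overuse measures named in this abstract can be illustrated with a minimal sketch. It assumes that the expected frequency E is the reference-corpus frequency scaled to the size of the learner corpus, and that the t-score is the common approximation (O - E) / sqrt(O). The function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def overuse_scores(freq_learner, size_learner, freq_ref, size_ref):
    """Return (O/E ratio, t-score) for one pattern.

    O is the observed frequency in the learner corpus. E is the frequency
    expected if the pattern were as common as in the reference corpus
    (an assumption of this sketch, not a detail given in the abstract).
    """
    if freq_learner <= 0 or freq_ref <= 0:
        raise ValueError("both frequencies must be positive for these measures")
    # Scale the reference frequency to the size of the learner corpus.
    expected = freq_ref * size_learner / size_ref
    oe_ratio = freq_learner / expected
    # Common t-score approximation used for collocation-style measures.
    t_score = (freq_learner - expected) / math.sqrt(freq_learner)
    return oe_ratio, t_score


# Hypothetical counts: 40 hits in a 1M-word learner corpus,
# 100 hits in a 10M-word reference corpus.
oe, t = overuse_scores(40, 1_000_000, 100, 10_000_000)
print(f"O/E = {oe:.2f}, t = {t:.2f}")
```

An O/E ratio above 1 together with a high t-score would mark the pattern as a candidate for overuse; any cut-off values would have to be chosen by the analyst.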

Swearing in informal spoken English: 1990s–2010s

Text & Talk - An Interdisciplinary Journal of Language Discourse Communication Studies, DOI: 10.1515/text-2020-0051, 2021, Vol 0 (0)

Author(s): Robbie Love

Keyword(s): Large Scale, Economic Status, Age Groups, British English, Spoken English, Gender And Age, The Social, Social Distribution, British National Corpus, National Corpus

This paper investigates changes in swearing usage in informal speech using large-scale corpus data, comparing the occurrence and social distribution of swear words in two corpora of informal spoken British English: the demographically-sampled part of the Spoken British National Corpus 1994 (BNC1994) and the Spoken British National Corpus 2014 (BNC2014); the compilation of the latter has facilitated large-scale, diachronic analyses of authentic spoken data on a scale which has, until now, not been possible. A form and frequency analysis of a set of 16 ‘pure’ swear word lemma forms is presented. The findings reveal that swearing occurrence is significantly lower in the Spoken BNC2014 but still within a comparable range to previous studies. Furthermore, FUCK is found to overtake BLOODY as the most popular swear word lemma. Finally, the social distribution of swearing across gender and age groups generally supports the findings of previous research: males still swear more than females, and swearing still peaks in the twenties and declines thereafter. However, the distribution of swearing according to socio-economic status is found to be more complex than expected in the 2010s and requires further investigation. This paper also reflects on some of the methodological challenges associated with making comparisons between the two corpora.

Making meaning with be able to: modality and actualisation

English Language and Linguistics, DOI: 10.1017/s1360674320000489, 2021, pp. 1-22

Author(s): BENOÎT LECLERCQ, ILSE DEPRAETERE

Keyword(s): Data Analysis, Qualitative Analysis, Empirical Evidence, General Assumption, Conversational Implicature, Making Meaning, Truth Conditional, British National Corpus, National Corpus, Main Distinguishing Feature

This article sheds new light on the usage constraints of be able to, by combining empirical evidence from the British National Corpus (BNC, Davies 2004–) with theoretical insights on the semantics–pragmatics interface. First, we show that be able to does not, contrary to the general assumption, express only ‘ability’ but it shares most of the root meanings usually associated with the possibility modals can and could (Coates 1983: 124). The data analysis shows that what is called ‘opportunity’ in Depraetere & Reed's (2011) taxonomy is the most frequent meaning of be able to. We then turn to the notion of actualisation, which is often claimed to be the main distinguishing feature between be able to and can/could. The qualitative analysis of the BNC dataset provides the empirical evidence, lacking in previous research, for the claim that actualisation is indeed a defining property of the modal periphrastic form. Starting from a reassessment of the semantics–pragmatics interface in terms of a fourfold distinction, we argue that actualisation is a generalised conversational implicature and constitutes conventional pragmatic meaning (that is, conventional non-truth-conditional meaning).

“That’s well good”: A Re-emergent Intensifier in Current British English

Journal of English Linguistics, DOI: 10.1177/0075424220979143, 2020, pp. 007542422097914

Author(s): Karin Aijmer

Keyword(s): Social Class, Fourteenth Century, Social Factors, British English, Discourse Marker, Time Gap, British National Corpus, Semantic Types, Over Time, National Corpus

Well has a long history and is found as an intensifier already in older English. It is argued that diachronically well has developed from its etymological meaning (‘in a good way’) on a cline of adverbialization to an intensifier and to a discourse marker. Well is replaced by other intensifiers in the fourteenth century but emerges in new uses in Present-Day English. The changes in frequency and use of the new intensifier are explored on the basis of a twenty-year time gap between the old British National Corpus (1994) and the new Spoken British National Corpus (2014). The results show that well increases in frequency over time and that it spreads to new semantic types of adjectives and participles, and is found above all in predicative structures with a copula. The emergence of a new well and its increase in frequency are also related to social factors such as the age, gender, and social class of the speakers, and the informal character of the conversation.

On the origin of variable structures in the winds of hot luminous stars

Monthly Notices of the Royal Astronomical Society, DOI: 10.1093/mnras/stt2102, 2013, Vol 440 (1), pp. 2-9, Cited By ~ 9

Author(s): Yannick J. L. Michaux, Anthony F. J. Moffat, André-Nicolas Chené, Nicole St-Louis

Keyword(s): Magnetic Field, Empirical Evidence, Temporal Variability, Large Scale, Small Scale, Ionization Zone, Corotating Interaction Regions, Wind Variability, Global Magnetic Field, Interaction Regions

Examination of the temporal variability properties of several strong optical recombination lines in a large sample of Galactic Wolf–Rayet (WR) stars reveals possible trends, especially in the more homogeneous WC than the diverse WN subtypes, of increasing wind variability with cooler subtypes. This could imply that a serious contender for the driver of the variations is stochastic, magnetic subsurface convection associated with the 170 kK partial-ionization zone of iron, which should occupy a deeper and larger zone of greater mass in cooler WR subtypes. This empirical evidence suggests that the heretofore proposed ubiquitous driver of wind variability, radiative instabilities, may not be the only mechanism playing a role in the stochastic multiple small-scaled structures seen in the winds of hot luminous stars. In addition to small-scale stochastic behaviour, subsurface convection guided by a global magnetic field with localized emerging loops may also be at the origin of the large-scale corotating interaction regions as seen frequently in O stars and occasionally in the winds of their descendant WR stars.

Inclusion, Contrast and Polysemy in Dictionaries: The Relationship between Theory, Language Use and Lexicographic Practice

Research in Language, DOI: 10.1515/rela-2015-0001, 2014, Vol 12 (4), pp. 319-340

Author(s): Anu Koskela

Keyword(s): Language Use, Lexical Item, British National Corpus, Lexical Items, The Relationship, National Corpus

This paper explores the lexicographic representation of a type of polysemy that arises when the meaning of one lexical item can either include or contrast with the meaning of another, as in the case of dog/bitch, shoe/boot, finger/thumb and animal/bird. A survey of how such pairs are represented in monolingual English dictionaries showed that dictionaries mostly represent as explicitly polysemous those lexical items whose broader and narrower readings are more distinctive and clearly separable in definitional terms. They commonly only represented the broader readings for terms that are in fact frequently used in the narrower reading, as shown by data from the British National Corpus.

A Corpora-Based Analysis of Rely on and Depend on

Journal of Critical Studies in Language and Literature, DOI: 10.46809/jcsll.v3i1.119, 2021, Vol 3 (1), pp. 9-21

Author(s): Namkil Kang

Keyword(s): Comparative Analysis, American English, The Other, Information State, Other Hand, British National Corpus, National Corpus

The ultimate goal of this paper is to provide a comparative analysis of rely on and depend on in the Corpus of Contemporary American English (COCA) and the British National Corpus (BNC). The COCA clearly shows that the expression rely on government is the most preferred by Americans, followed by rely on people and rely on data. The COCA further indicates that the expression depend on slate is the most preferred by Americans, followed by depend on government and depend on people. The BNC shows, on the other hand, that the expression rely on others is the most preferred by the British, followed by rely on people and rely on friends. The BNC further indicates that depend on factors and depend on others are the most preferred by the British, followed by depend on age and depend on food. Finally, in the COCA, the nouns government, luck, welfare, people, information, state, fossil, water, family, oil, food, and things are linked to both rely on and depend on, but many nouns are still not linked to both of them. In the BNC, by contrast, only the nouns state, chance, government, and others are linked to both rely on and depend on, and many nouns are still not linked to both. It can thus be inferred that rely on is slightly different from depend on in its use.

The Correlations Between Combinational Arrangements and Semantic Implications of Utterly in the British National Corpus

The Journal of Humanities and Social sciences 21, DOI: 10.22143/hss21.12.6.25, 2021, Vol 12 (6), pp. 349-360

Author(s): Jungyull Lee

Keyword(s): British National Corpus, National Corpus

The Usage of CAUSE in Three Branches of Science

Higher Education Studies, DOI: 10.5539/hes.v6n2p109, 2016, Vol 6 (2), pp. 109

Author(s): Bei Yang, Bin Chen

Keyword(s): Social Science, Language Learning, Academic Writing, Applied Science, Human Beings, Pure Science, Research Findings, British National Corpus, Semantic Prosody, National Corpus

Semantic prosody is a concept that has been subject to considerable criticism and debate. One big concern is to what extent semantic prosody is domain or register-related. Previous studies reach the agreement that CAUSE has an overwhelmingly negative meaning in general English. Its semantic prosody remains controversial in academic writing, however, because of the size and register of the corpus used in different studies. In order to minimize the role that corpus choice has to play in determining the research findings, this paper uses sub-corpora from the British National Corpus to investigate the usage of CAUSE in different types of scientific writing. The results show that the occurrence of CAUSE is the highest in social science, less frequent in applied science, and the lowest in natural and pure science. Its semantic prosody is overwhelmingly negative in social science and applied science, and mainly neutral in natural and pure science. It seems that the verb CAUSE lacks its normal negative semantic prosody in contexts that do not refer to human beings. The implications of the findings for language learning are also discussed.

Social Differentiation in the Use of English Vocabulary

International Journal of Corpus Linguistics, DOI: 10.1075/ijcl.2.1.07ray, 1997, Vol 2 (1), pp. 133-152, Cited By ~ 62

Author(s): Paul Rayson, Geoffrey N. Leech, Mary Hodges

Keyword(s): Social Group, Geographical Region, Social Differentiation, Future Research, Analysis Tool, Spoken English, Transcription System, Group A, British National Corpus, National Corpus

In this article, we undertake selective quantitative analyses of the demographically-sampled spoken English component of the British National Corpus (for brevity, referred to here as the "Conversational Corpus"). This is a subcorpus of c. 4.5 million words, in which speakers and respondents (see I below) are identified by such factors as gender, age, social group, and geographical region. Using a corpus analysis tool developed at Lancaster, we undertake a comparison of the vocabulary of speakers, highlighting those differences which are marked by a very high χ² value of difference between different sectors of the corpus according to gender, age, and social group. A fourth variable, that of geographical region of the United Kingdom, is not investigated in this article, although it remains a promising subject for future research. (As background we also briefly examine differences between spoken and written material in the British National Corpus [BNC].) This study is illustrative of the potentiality of the Conversational Corpus for future corpus-based research on social differentiation in the use of language. There are evident limitations, including (a) the reliance on vocabulary frequency lists and (b) the simplicity of the transcription system employed for the spoken part of the BNC. The conclusion of the article considers future advances in the research paradigm illustrated here.
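The kind of chi-squared comparison mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming a 2×2 contingency table (one word versus all other words, in two sectors of a corpus) and the standard Pearson statistic. It is not the Lancaster tool, and the function name and counts are hypothetical.

```python
def chi_square_2x2(freq_a, size_a, freq_b, size_b):
    """Pearson chi-squared for one word in two subcorpora.

    freq_a / freq_b: occurrences of the word in sectors A and B.
    size_a / size_b: total word counts of sectors A and B.
    """
    observed = [
        [freq_a, size_a - freq_a],  # sector A: the word, all other words
        [freq_b, size_b - freq_b],  # sector B: the word, all other words
    ]
    total = size_a + size_b
    row_totals = [size_a, size_b]
    col_totals = [freq_a + freq_b, total - freq_a - freq_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the word were equally frequent in both sectors.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical counts: a word occurs 30 times in 100 tokens of sector A
# and 10 times in 100 tokens of sector B.
print(chi_square_2x2(30, 100, 10, 100))
```

A larger value indicates a stronger difference in relative frequency between the two sectors; words could then be ranked by this value to surface the most distinctive vocabulary.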