When missing NPs make double center-embedding sentences acceptable

2021 ◽  
Vol 6 (1) ◽  
pp. 37
Author(s):  
Nick Huang ◽  
Colin Phillips
2019 ◽  
Vol 75 (10) ◽  
pp. 6324-6360 ◽  
Author(s):  
Ameni Hbaieb ◽  
Mahdi Khemakhem ◽  
Maher Ben Jemaa

2007 ◽  
Vol 43 (2) ◽  
pp. 365-392 ◽  
Author(s):  
FRED KARLSSON

A common view in theoretical syntax and computational linguistics holds that there are no grammatical restrictions on multiple center-embedding of clauses. Syntax would thus be characterized by unbounded recursion. An analysis of 119 genuine multiple clausal center-embeddings from seven ‘Standard Average European’ languages (English, Finnish, French, German, Latin, Swedish, Danish) uncovers usage-based regularities, constraints, that run counter to these and several other widely held views, such as that any type of multiple self-embedding (of the same clause type) would be possible, or that self-embedding would be more complex than multiple center-embedding of different clause types. The maximal degree of center-embedding in written language is three. In spoken language, multiple center-embedding is practically absent. Typical center-embeddings of any degree involve relative clauses specifying the referent of the subject NP of the superordinate clause. Only postmodifying clauses, especially relative clauses and that-clauses acting as noun complements, allow central self-embedding. Double relativization of objects (The rat the cat the dog chased killed ate the malt) does not occur. These corpus-based ‘soft constraints’ suggest that full-blown recursion creating multiple clausal center-embedding is not a central design feature of language in use. Multiple center-embedding emerged with the advent of written language, with Aristotle, Cicero, and Livy in the Greek and Latin stylistic tradition of ‘periodic’ sentence composition.
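Karlsson's degree-of-embedding measure can be illustrated with a small sketch. This is not the paper's own methodology: it simply approximates the degree of clausal embedding by bracket nesting depth, assuming embedded clauses have already been marked with square brackets by a prior clause analysis.

```python
# Illustrative sketch (not Karlsson's tooling): measure the degree of clausal
# embedding from a pre-bracketed sentence, where each embedded clause is
# delimited by '[' and ']'. Degree 0 means no embedding; Karlsson's corpus
# finding is that written language tops out at degree three.

def max_embedding_degree(bracketed: str) -> int:
    """Return the maximal nesting depth of bracketed embedded clauses."""
    depth = 0
    max_depth = 0
    for ch in bracketed:
        if ch == "[":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "]":
            depth -= 1
    return max_depth

# The classic double object relativization, bracketed by hand:
example = "The rat [the cat [the dog chased] killed] ate the malt"
print(max_embedding_degree(example))  # 2
```

Note that bracket depth alone does not distinguish center-embedding from left- or right-branching embedding; a fuller analysis would also check that superordinate material flanks the embedded clause on both sides, as in the example above.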


2021 ◽  
Author(s):  
Eric Martinez ◽  
Francis Mollica ◽  
Edward Gibson

Although contracts and other legal documents have long been known to cause processing difficulty in laypeople, the source and nature of this difficulty have remained unclear. To better understand this mismatch, we conducted a corpus analysis (~10 million words) to investigate to what extent difficult-to-process features that are reportedly common in contracts--such as center embedding, low-frequency jargon, passive voice, and non-standard capitalization--are in fact present in contracts relative to normal texts. We found that all of these features were strikingly more prevalent in contracts relative to standard-English texts. We also conducted an experimental study (n = 108 subjects) to determine to what extent such features cause processing difficulties for laypeople of different reading levels. We found that contractual excerpts containing these features were recalled and comprehended at a lower rate than excerpts without these features, even for experienced readers, and that center-embedded clauses led to greater decreases in recall than other features. These findings confirm long-standing anecdotal accounts of the presence of difficult-to-process features in contracts, and show that these features inhibit comprehension and recall of legal content for readers of all levels. Our findings also suggest such difficulties may largely result from working memory costs imposed by complex syntactic features--such as center-embedded clauses--as opposed to a mere lack of understanding of specialized legal concepts, and that removing these features would be both tractable and beneficial for society at large.
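The corpus analysis reports how prevalent each feature is in contracts versus standard texts. As an illustrative sketch (not the authors' pipeline), the simplest of these features, non-standard all-caps capitalization, can be counted as a normalized rate per 1,000 words:

```python
import re

# Illustrative sketch: rate of all-caps words per 1,000 words, one easily
# measurable proxy for the "non-standard capitalization" feature. The example
# strings below are invented, not drawn from the study's corpus.

def all_caps_rate(text: str) -> float:
    """All-caps words (length > 1) per 1,000 words of text."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    caps = [w for w in words if len(w) > 1 and w.isupper()]
    return 1000 * len(caps) / len(words)

contract = "The PARTY OF THE FIRST PART shall INDEMNIFY the other party."
plain = "The first party shall indemnify the other party."
print(all_caps_rate(contract) > all_caps_rate(plain))  # True
```

Features like passive voice or center embedding would require syntactic parsing rather than surface pattern matching, which is why corpus studies of this kind typically combine simple counts with parsed annotations.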


2015 ◽  
Vol 13 (5) ◽  
pp. 1661-1670 ◽  
Author(s):  
Eder Samir Correa ◽  
Luis Alejandro Fletscher ◽  
Juan Felipe Botero

2021 ◽  
Author(s):  
R. Thomas McCoy ◽  
Jennifer Culbertson ◽  
Paul Smolensky ◽  
Géraldine Legendre

Human language is often assumed to make "infinite use of finite means" - that is, to generate an infinite number of possible utterances from a finite number of building blocks. From an acquisition perspective, this assumed property of language is interesting because learners must acquire their languages from a finite number of examples. To acquire an infinite language, learners must therefore generalize beyond the finite bounds of the linguistic data they have observed. In this work, we use an artificial language learning experiment to investigate whether people generalize in this way. We train participants on sequences from a simple grammar featuring center embedding, where the training sequences have at most two levels of embedding, and then evaluate whether participants accept sequences of a greater depth of embedding. We find that, when participants learn the pattern for sequences of the sizes they have observed, they also extrapolate it to sequences with a greater depth of embedding. These results support the hypothesis that the learning biases of humans favor languages with an infinite generative capacity.
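The kind of grammar described here can be sketched minimally. The paper does not publish its exact stimuli in this abstract, so the following assumes the simplest center-embedding pattern, sequences of the form aⁿbⁿ, where each "a" pairs with a "b" in mirror order; training covers depths 1-2 and the extrapolation test probes depth 3:

```python
# Hypothetical sketch of an a^n b^n center-embedding grammar, the simplest
# pattern of this type. Training sequences have at most two levels of
# embedding; the extrapolation item is one level deeper than anything seen.

def generate(depth: int) -> str:
    """Generate the a^n b^n sequence with the given depth of embedding."""
    return "a" * depth + "b" * depth

def accepts(seq: str) -> bool:
    """Recognize a^n b^n (n >= 1): every 'a' closed by a mirrored 'b'."""
    n = len(seq) // 2
    return n >= 1 and seq == "a" * n + "b" * n

training = [generate(d) for d in (1, 2)]  # depths shown to participants
novel = generate(3)                       # deeper than any training item
print(all(accepts(s) for s in training), accepts(novel))  # True True
```

A learner who accepts `novel` despite never seeing depth-3 items has generalized beyond the finite training data, which is the behavior the experiment tests for.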

