Successive Cyclicity and the Syntax of Long-Distance Dependencies

2020 ◽  
Vol 6 (1) ◽  
pp. 111-130
Author(s):  
Coppe van Urk

Every major theoretical approach to syntactic structure incorporates a mechanism for generating unbounded dependencies. In this article, I distinguish between some of the most commonly entertained mechanisms by looking in detail at one of the most fundamental discoveries about long-distance dependencies, the fact that they are successive cyclic. Most of the mechanisms posited in order to generate long-distance dependencies capture this property, but make different predictions about what reflexes of successive cyclicity should be attested across languages. In particular, theories of long-distance dependencies can be distinguished according to whether they propose intermediate occurrences of the moving phrases (movement theories) or whether intermediate heads carry features relevant to displacement (featural theories). I show that a full consideration of the typology of successive cyclicity provides clear evidence that both components are part of the syntax of long-distance dependencies. In addition, reflexes of successive cyclicity are equally distributed across the CP and vP edge, suggesting that these are parallel domains.

2020 ◽  
Vol 34 (05) ◽  
pp. 7554-7561
Author(s):  
Pengxiang Cheng ◽  
Katrin Erk

Recent progress in NLP has witnessed the development of large-scale pre-trained language models (GPT, BERT, XLNet, etc.) based on the Transformer (Vaswani et al. 2017), and in a range of end tasks such models have achieved state-of-the-art results, approaching human performance. This clearly demonstrates the power of the stacked self-attention architecture when paired with a sufficient number of layers and a large amount of pre-training data. However, on tasks that require complex and long-distance reasoning, where surface-level cues are not enough, there is still a large gap between the pre-trained models and human performance. Strubell et al. (2018) recently showed that it is possible to inject knowledge of syntactic structure into a model through supervised self-attention. We conjecture that a similar injection of semantic knowledge, in particular coreference information, into an existing model would improve performance on such complex problems. On the LAMBADA (Paperno et al. 2016) task, we show that a model trained from scratch with coreference as auxiliary supervision for self-attention outperforms the largest GPT-2 model, setting a new state of the art, while containing only a tiny fraction of the parameters of GPT-2. We also conduct a thorough analysis of different variants of model architectures and supervision configurations, suggesting future directions for applying similar techniques to other problems.
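The core idea of coreference as auxiliary supervision for self-attention can be sketched as follows: one attention head is pushed, by an auxiliary loss, to attend from each mention to the other mentions in its coreference cluster. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation; the function name and the uniform-over-cluster target distribution are illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def coref_attention_loss(scores, clusters, n_tokens):
    """Auxiliary loss pushing one attention head to attend from each
    mention to the other mentions in its coreference cluster.

    scores: (n_tokens, n_tokens) raw attention scores of one head.
    clusters: list of clusters, each a list of coreferring token indices.
    """
    attn = softmax(scores, axis=-1)
    # Build the supervision target: uniform mass over cluster-mates.
    target = np.zeros((n_tokens, n_tokens))
    for cluster in clusters:
        for i in cluster:
            others = [j for j in cluster if j != i]
            for j in others:
                target[i, j] = 1.0 / len(others)
    # Only rows that belong to some cluster carry a supervision signal.
    mask = target.sum(axis=-1) > 0
    # Cross-entropy between the target distribution and the attention rows.
    ce = -(target[mask] * np.log(attn[mask] + 1e-9)).sum(axis=-1)
    return ce.mean()
```

In training, this term would be added to the main language-modeling loss; with uniform attention over 5 tokens and one two-mention cluster, the loss is log 5, and it shrinks as the supervised head concentrates on cluster-mates.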


Author(s):  
Rui P. Chaves ◽  
Michael T. Putnam

This book is about one of the most intriguing features of human communication systems: the fact that words which go together in meaning can occur arbitrarily far away from each other. The kind of long-distance dependency that this volume is concerned with has been the subject of intense linguistic and psycholinguistic research for the last half century, and offers a unique insight into the nature of grammatical structures and their interaction with cognition. The constructions in which these unbounded dependencies arise are difficult to model and come with a rather puzzling array of constraints which have defied characterization and proper explanation. For example, there are filler-gap dependencies in which the filler phrase is a plural phrase formed from the combination of each of the extracted phrases, and there are filler-gap constructions in which the filler phrase itself contains a gap that is linked to another filler phrase. What is more, different types of filler-gap dependency can compound in the same sentence. Conversely, not all kinds of filler-gap dependencies are equally licit; some are robustly ruled out by the grammar, whereas others have a less clear status because they have graded acceptability and can be made to improve in ideal contexts and conditions. This work provides a detailed survey of these linguistic phenomena and extant accounts, while also incorporating new experimental evidence to shed light on why the phenomena are the way they are and what important research on this topic lies ahead.


Author(s):  
Tal Linzen ◽  
Emmanuel Dupoux ◽  
Yoav Goldberg

The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture’s grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude that LSTMs can capture a non-trivial amount of grammatical structure given targeted supervision, but stronger architectures may be required to further reduce errors; furthermore, the language modeling signal is insufficient for capturing syntax-sensitive dependencies, and should be supplemented with more direct supervision if such dependencies need to be captured.
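The number-prediction objective described above can be made concrete with a toy example: the model sees only the words to the left of the verb and must predict the verb's grammatical number, which becomes hard when an "attractor" noun with the opposite number intervenes between the subject and the verb. The helper below is a hypothetical illustration (the deliberately tiny verb list is an assumption, not the paper's setup).

```python
# Hypothetical helper constructing a number-prediction example in the
# style of the agreement probing task described above.
def make_number_prediction_example(sentence, verb_index):
    """Truncate the sentence just before the verb; the label is the verb's
    grammatical number, which a model must predict from left context alone."""
    tokens = sentence.split()
    context = tokens[:verb_index]
    verb = tokens[verb_index]
    # Toy verb list for illustration only; real data would use POS tags.
    label = "PLURAL" if verb in {"are", "were", "have", "do"} else "SINGULAR"
    return context, label

# Attractor configuration: the singular head "key" is followed by the
# plural noun "cabinets", which sits closer to the verb than the subject.
ctx, label = make_number_prediction_example(
    "the key to the cabinets is on the table", 5)
```

Here sequential information (the nearest noun, "cabinets", is plural) conflicts with structural information (the subject head, "key", is singular), which is exactly the configuration on which the abstract reports elevated error rates.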


2021 ◽  
Vol 9 (14) ◽  
pp. 1-32
Author(s):  
Im Hong-Pin

This paper argues that syntactic analysis should be based on the lexical information given in the lexicon. For this purpose, the lexical information of a syntactic argument takes a form like [VP NKP, _, DKP, AKP] for the ditransitive verb give in English. The argument structure projects to syntactic structure. The NKP in this structure becomes the VP-subject, but there is another subject, called the S-subject (Sentence-Subject), below the S node. This amounts to a Two-Subject Hypothesis for English. Between these two subjects intervene Conjugation-Like Elements, enriched by close examination of English verbal conjugation. The Two-Subject Hypothesis accounts for the peculiarities of the Expletive There (ET) construction. Restructuring can also explain the so-called long-distance wh-interrogative without introducing wh-movement, and it explains why imperative verbs take base forms. The same principles applied to verbal imperatives also account for the characteristics of adjective imperatives. We also address other subtle problems, with fruitful results. The restructuring approach, we think, provides more convincing explanations than the movement one.


Author(s):  
Adriana Belletti

Phenomena involving the displacement of syntactic units are widespread in human languages. The term displacement refers here to a dependency relation whereby a given syntactic constituent is interpreted simultaneously in two different positions. Only one position is pronounced, in general the hierarchically higher one in the syntactic structure. Consider a wh-question like (1) in English: (1) Whom did you give the book to <whom>? The phrase containing the interrogative wh-word is located at the beginning of the clause, and this guarantees that the clause is interpreted as a question about this phrase; at the same time, whom is interpreted as part of the argument structure of the verb give (the copy, in <> brackets). In current terms, inspired by minimalist developments in generative syntax, the phrase whom is first merged as (one of) the complement(s) of give (External Merge) and then re-merged (Internal Merge, i.e., movement) in the appropriate position in the left periphery of the clause. This peripheral area of the clause hosts operator-type constituents, among which are interrogative ones (yielding the relevant interpretation: for which x, you gave a book to x, for sentence 1). Scope-discourse phenomena, such as the raising of a question as in (1) or the focalization of one constituent as in TO JOHN I gave the book (not to Mary), have the effect that an argument of the verb is fronted in the left periphery of the clause rather than filling its clause-internal complement position, whence the term displacement. Displacement can be to a position relatively close to the one of first merge (the copy), or else it can be to a position farther away. 
In the latter case, the relevant dependency becomes more long-distance than in (1), as in (2)a and even more so (2)b: (2) a. Whom did Mary expect [that you would give the book to <whom>]? b. Whom do you think [that Mary expected [that you would give the book to <whom>]]? Fifty years or so of investigation on locality in formal generative syntax have shown that, despite its potentially very distant realization, syntactic displacement is in fact a local process. The audible position in which a moved constituent is pronounced and the position of its copy inside the clause can be far from each other. However, the long-distance dependency is split into steps through iterated applications of short movements, so that any dependency holding between two occurrences of the same constituent is in fact very local. Furthermore, there are syntactic domains that resist movement out of them, traditionally referred to as islands. Locality is a core concept of syntactic computations. Syntactic locality requires that syntactic computations apply within small domains (cyclic domains), possibly in the mentioned iterated way (successive cyclicity), currently rethought in terms of Phase theory. Furthermore, in the Relativized Minimality tradition, syntactic locality requires that, given X . . . Z . . . Y, the dependency between the relevant constituent in its target position X and its first merge position Y should not be interrupted by any constituent Z which is similar to X in relevant formal features and thus intervenes, blocking the relation between X and Y. Intervention locality has also been shown to allow for an explicit characterization of aspects of children's linguistic development in their capacity to compute complex object dependencies (also relevant in different impaired populations).


Author(s):  
Juliana Goschler

At first glance, subject-verb agreement seems to be straightforward in German: in the case of simplex NPs, the subject always agrees with the verb syntactically in person and number. However, with coordinated NPs in subject position, there is considerable variation in usage. If both conjuncts are singular NPs, the verb may display singular agreement, as would be expected, since coordinated structures inherit their syntactic properties from their individual components, but much more frequently the verb displays plural agreement. On the basis of the LIMAS corpus, a one-million-word corpus of written German, I will show that there is systematic variation between the two options. Among the determining factors are the position of the verb (preceding or following the subject), the type of NP (pronoun, proper name, lexical NP), and the internal syntactic structure of the subject (coordination of full NPs vs. coordination of partial NPs sharing a determiner, and definiteness vs. indefiniteness of the coordinated parts of the subject). I will discuss the results from the perspective of usage-based approaches and argue for an integration of semantic, pragmatic, and frequency factors in any theoretical approach to grammar.
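At its core, a corpus study of this kind reduces to tallying agreement outcomes by condition and comparing rates across factors such as verb position. The sketch below uses invented toy records, not the LIMAS counts; the record format and the `plural_rate` helper are illustrative assumptions.

```python
from collections import Counter

# Toy records of (verb_position, agreement) for coordinated singular
# subjects; illustrative data only, not the actual LIMAS corpus counts.
observations = [
    ("verb_first", "singular"), ("verb_first", "singular"),
    ("verb_first", "plural"),
    ("verb_last", "plural"), ("verb_last", "plural"),
    ("verb_last", "plural"), ("verb_last", "singular"),
]

counts = Counter(observations)

def plural_rate(position):
    """Proportion of plural agreement for a given verb position."""
    pl = counts[(position, "plural")]
    sg = counts[(position, "singular")]
    return pl / (pl + sg)
```

A real analysis would cross all the factors the abstract lists (NP type, internal structure, definiteness) and test the differences statistically, but the per-condition rate is the basic quantity being compared.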


Author(s):  
Shuangzhi Wu ◽  
Ming Zhou ◽  
Dongdong Zhang

Neural Machine Translation (NMT) based on the encoder-decoder architecture has recently achieved state-of-the-art performance. Researchers have shown that extending word-level attention to phrase-level attention by incorporating source-side phrase structure can enhance the attention model and achieve promising improvement. However, the word dependencies that can be crucial to correctly understanding a source sentence are not always consecutive (i.e., contained within a phrase); sometimes they span long distances. Phrase structures are not the best way to explicitly model such long-distance dependencies. In this paper we propose a simple but effective method to incorporate source-side long-distance dependencies into NMT. Our method, based on dependency trees, enriches each source state with global dependency structure, which can better capture the inherent syntactic structure of source sentences. Experiments on Chinese-English and English-Japanese translation tasks show that our proposed method outperforms state-of-the-art SMT and NMT baselines.
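One simple way to realize "enriching each source state with dependency structure" is to pair every token's encoder state with the state of its dependency head, so that syntactically linked but non-adjacent words share information before attention is computed. The sketch below is a minimal NumPy illustration assuming precomputed head indices; the concatenation scheme is a simplification, not the paper's exact architecture.

```python
import numpy as np

def enrich_with_dependencies(states, heads):
    """Concatenate each source state with the state of its dependency head.

    states: (n_tokens, d) array of encoder states.
    heads:  list of head indices per token; -1 marks the root, which is
            padded with zeros since it has no head state to borrow.
    """
    head_states = np.array([
        states[h] if h >= 0 else np.zeros_like(states[0])
        for h in heads
    ])
    # Each enriched state sees both the token and its syntactic head,
    # even when the two words are far apart in the surface string.
    return np.concatenate([states, head_states], axis=-1)
```

In a full model these enriched states would feed the attention layer in place of the plain encoder states; richer variants might sum over all dependents or use a graph encoder instead of simple concatenation.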


2018 ◽  
Vol 3 (1) ◽  
pp. 64
Author(s):  
Paul A. Morris

Meskwaki, like many polysynthetic Algonquian languages, is often analyzed as having a non-configurational structure because it exhibits the three core characteristics of non-configurationality: free word order, discontinuous expressions, and null anaphora (Hale 1983). While free surface form word order is attributed to a preverbal discourse-based hierarchy, non-topic/focus NPs are in a post-verbal, non-hierarchical XP structure (Dahlstrom in progress). This paper posits that Meskwaki has an underlying configurational syntactic structure based on novel and prior data showing (1) discontinuous NP ordering restrictions with locality constraints, (2) superiority effects in multiple wh-phrases, and (3) long-distance movement and island effects.


Author(s):  
Marcos F. Maestre

Recently we have developed a form of polarization microscopy that forms images using optical properties that have previously been limited to macroscopic samples. This has given us a new window into the distribution of structure on a microscopic scale. We have coined the name differential polarization microscopy to identify the images obtained that are due to certain polarization dependent effects. Differential polarization microscopy has its origins in various spectroscopic techniques that have been used to study longer range structures in solution as well as solids. The differential scattering of circularly polarized light has been shown to be dependent on the long range chiral order, both theoretically and experimentally. The same theoretical approach was used to show that images due to differential scattering of circularly polarized light will give images dependent on chiral structures. With large helices (greater than the wavelength of light) the pitch and radius of the helix could be measured directly from these images.


Author(s):  
James Cronshaw

Long distance transport in plants takes place in phloem tissue which has characteristic cells, the sieve elements. At maturity these cells have sieve areas in their end walls with specialized perforations. They are associated with companion cells, parenchyma cells, and in some species, with transfer cells. The protoplast of the functioning sieve element contains a high concentration of sugar, and consequently a high hydrostatic pressure, which makes it extremely difficult to fix mature sieve elements for electron microscopical observation without the formation of surge artifacts. Despite many structural studies which have attempted to prevent surge artifacts, several features of mature sieve elements, such as the distribution of P-protein and the nature of the contents of the sieve area pores, remain controversial.

