probabilistic grammar
Recently Published Documents

TOTAL DOCUMENTS: 40 (FIVE YEARS: 8)
H-INDEX: 8 (FIVE YEARS: 1)

2021
Author(s): Apurva Apurva, Samar Husain

The surprisal metric (Hale, 2001; Levy, 2008) successfully predicts syntactic complexity in a large number of online studies (e.g., Demberg and Keller, 2009; Levy and Keller, 2013). Surprisal assumes a probabilistic grammar that drives the expectation of upcoming linguistic material. Consequently, wrong predictions lead to a processing cost, presumably due to reranking-related computations (Levy, 2013). Critically, surprisal assumes that the predicted parses generated by the probabilistic grammar are grammatical. However, it has been found that syntactic predictions can be ungrammatical (e.g., Apurva & Husain, 2018). Consequently, just as reranking costs arise from incorrect (but grammatical) predictions, a cost should also appear for ungrammatical predictions. Evidence for such a cost during comprehension would not be explained by the surprisal metric. To test the ecological validity of the surprisal metric, it therefore becomes critical to investigate whether ungrammatical predictions incur a cost. In this study, we investigate this issue in Hindi (a verb-final language) using a cloze task followed by a self-paced reading (SPR) study. All analyses were carried out in R using linear mixed models; log RTs (reading times) were used for the RT analyses.

In the cloze study (N = 30), participants were asked to complete sentences such as (1a) and (1b) meaningfully, presented using the SPR paradigm. The two conditions differed in the case markers on the three nouns. 12 sets of experimental items along with 64 fillers were used. Participants' responses were coded for the predicted verb class and for the overall grammaticality of the completion (grammatical vs ungrammatical prediction).

(1a) hari-ne geeta-se umesh-ko ...
     Hari-ERG Geeta-ABL Umesh-ACC ...

(1b) hari-ko geeta-ne umesh-ko ...
     Hari-ACC Geeta-ERG Umesh-ACC ...

Grammaticality analysis of the completion data showed that participants made more ungrammatical completions in condition (b) than in condition (a) (z = 5.25); grammatical completions amounted to 96% in condition (a) and 60% in condition (b). In addition, the verb-class analysis showed that in both conditions participants most frequently completed the sentences with a transitive non-finite verb followed by a ditransitive matrix verb (hereafter T.NF-DT.M). T.NF-DT.M was predicted in 33% of instances in condition (a) and 34% in condition (b) (z = 0.18). Given these similar cloze probabilities, the surprisal metric predicts no RT difference at T.NF-DT.M between the two conditions during online processing (cloze probabilities can be used to compute surprisal; see Levy and Keller, 2013). If the RTs at T.NF-DT.M in condition (a) are lower than in condition (b), that is better explained by a higher cost for the ungrammatical prediction. To test this, we conducted an SPR study (N = 50) using items similar to those of the previous experiment (see 2a and 2b), with T.NF-DT.M as the critical region. 24 sets of items along with 72 fillers were constructed.

(2a) hari-ne geeta-se umesh-ko milne ko kaha ...
     Hari-ERG Geeta-ABL Umesh-ACC meet-INF(T.NF) told(DT.M) ...

(2b) hari-ko geeta-ne umesh-ko milne ko kaha ...
     Hari-ACC Geeta-ERG Umesh-ACC meet-INF(T.NF) told(DT.M) ...

While the prediction of T.NF-DT.M is the same in the two conditions, the proportion of ungrammatical predictions is higher in (b) than in (a). Results show that RTs at the critical region were lower in (a) than in (b) (t = 2.32). This goes against the surprisal metric and shows the cost incurred by ungrammatical predictions.
Our work establishes that the cost of ungrammatical predictions indeed appears during online processing. This processing cost is not predicted by a metric like surprisal, highlighting its limitations. The study also provides evidence against the assumption of robustly grammatical prediction in head-final languages: it suggests that the prediction mechanism in such languages is more nuanced, and it points to the need to study the nature of ungrammatical predictions during processing.
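To make the link between the cloze and SPR analyses concrete, here is a minimal R sketch of the two computations the abstract mentions: deriving surprisal from the reported cloze probabilities, and fitting a linear mixed model on log RTs. This is not the authors' code; the lme4 call, the data frame spr_data, and its column names are assumptions for illustration, since the abstract does not report the exact model specification.

    library(lme4)

    # Cloze probabilities of the T.NF-DT.M completion reported above:
    cloze <- c(cond_a = 0.33, cond_b = 0.34)

    # Surprisal in bits: -log2 P(completion | context). The two values are
    # nearly identical, so surprisal predicts no RT difference at the
    # critical region.
    surprisal <- -log2(cloze)
    round(surprisal, 2)  # cond_a 1.60, cond_b 1.56

    # Log-RT mixed model at the critical region. Random intercepts for
    # participants and items are an assumed (standard) random-effects
    # structure; spr_data with columns rt, condition, subj, item is
    # hypothetical.
    m <- lmer(log(rt) ~ condition + (1 | subj) + (1 | item), data = spr_data)
    summary(m)  # the abstract reports t = 2.32 for the condition effect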


Corpora, 2020, Vol. 15 (1), pp. 77-106
Author(s): Marianne Hundt, Paula Rautionaho, Carolin Strobl

Previous corpus-based research on the progressive (be + V-ing) investigated it from a diachronic point of view or from the angle of World Englishes (WEs). However, factors such as its propensity to occur with animate subjects or its preference for dynamic verbs have not been studied in relation to the choice between progressive and simple aspect. As the progressive has been extended to stative verbs, we argue that a variationist study of the construction in WEs needs to take simple VPs into account systematically, too, and to investigate whether there is interaction between the predictor variables underlying the progressive:simple choice. We use a probabilistic grammar approach to study progressives in newspaper writing across a broad range of WEs. We apply a tree and forest analysis to gauge the relative strength of the predictor variables variety, animacy, tense/modality, verb type and voice. Our results show that the core grammar for the progressive:simple choice is shared across all Englishes. The extension of progressives to stative verbs, in particular, does not result in statistically detectable effects. We argue that these extensions nevertheless serve to give a very 'local' flavour to contact varieties, as they are salient against the backdrop of the shared core grammar.
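A "tree and forest" analysis of this kind can be run in R with the partykit package. The sketch below is an assumed reconstruction, not the authors' code (the abstract does not name its software), with a hypothetical data frame wes holding one row per VP, the binary outcome, and the five predictors listed above.

    library(partykit)

    # wes (hypothetical): aspect (progressive vs simple) plus the predictors
    # variety, animacy, tense_modality, verb_type, voice, coded as factors.
    tree <- ctree(aspect ~ variety + animacy + tense_modality +
                    verb_type + voice, data = wes)
    plot(tree)  # conditional inference tree: splits reveal interactions

    forest <- cforest(aspect ~ variety + animacy + tense_modality +
                        verb_type + voice, data = wes, ntree = 500)
    varimp(forest)  # forest variable importance: relative predictor strength

The tree surfaces interactions between predictors, while the forest's variable importance ranks their relative strength, which is how the abstract frames its comparison across Englishes.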


2020
Author(s): Xiaoying Pu, Matthew Kay

Visualizations depicting probabilities and uncertainty are used everywhere from medical risk communication to machine learning, yet these probabilistic visualizations are difficult to specify, prone to error, and their designs are cumbersome to explore. We propose a Probabilistic Grammar of Graphics (PGoG), an extension to Wilkinson's original framework. Inspired by the success of probabilistic programming languages, PGoG makes probability expressions, such as P(A|B), a first-class citizen in the language. PGoG abstractions also reflect the distinction between probability and frequency framing, a concept from the uncertainty communication literature. It is expressive, encompassing product plots, density plots, icon arrays, and dotplots, among other visualizations. Its coherent syntax ensures correctness (that the proportions of visual elements and their spatial placement reflect the underlying probability distribution) and reduces edit distance between probabilistic visualization specifications, potentially supporting more design exploration. We provide a proof-of-concept implementation of PGoG in R.
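PGoG's own syntax is not reproduced here. As a loose plain-ggplot2 approximation of one visualization type the abstract lists, the sketch below draws a frequency-framed icon array for a conditional probability; the scenario and numbers are invented for illustration.

    library(ggplot2)

    # Icon array: frequency framing shows P(A|B) = 0.3 as 30 of 100 icons
    # rather than as a single probability value. Numbers are made up.
    grid <- expand.grid(x = 1:10, y = 1:10)
    grid$event <- ifelse(seq_len(nrow(grid)) <= 30, "A", "not A")

    ggplot(grid, aes(x, y, colour = event)) +
      geom_point(size = 4) +
      coord_fixed() +
      theme_void() +
      labs(title = "P(A | B) = 0.3, frequency-framed icon array")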


2018, Vol. 77 (21), pp. 28417-28440
Author(s): Dan Li, Disheng Hu, Yuke Sun, Yingsong Hu

Author(s): Mark E. Whiting, Jonathan Cagan, Philip LeDuc

The use of grammars in design and analysis has been set back by the lack of automated ways to induce them from arbitrarily structured datasets. Machine translation methods provide a construct for inducing grammars from coded data, and these have been extended to design through pre-coded design data. This work introduces a four-step process for inducing grammars from un-coded structured datasets spanning a wide variety of data types, including many used in design. The method includes: (1) extracting objects from the data, (2) forming structures from objects, (3) expanding structures into rules based on frequency, and (4) finding rule similarities that lead to consolidation or abstraction. To evaluate this method, grammars are induced from generated data, architectural layouts and three-dimensional design models, demonstrating that the method automatically yields usable grammars that are functionally similar to grammars produced by hand.
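As a toy illustration of the four-step process, the R sketch below induces simple rules from character strings. The corpus, the object granularity, and the consolidation heuristic are all assumptions made for this sketch; the authors' method operates on far richer design data.

    # Step 1: extract objects (here, single characters) from un-coded data.
    corpus  <- c("abab", "ababab", "abcab")   # invented toy dataset
    objects <- strsplit(corpus, "")

    # Step 2: form structures from objects (here, adjacent pairs).
    bigrams <- unlist(lapply(objects, function(o)
      paste0(head(o, -1), tail(o, -1))))

    # Step 3: expand structures into rules based on frequency.
    freq  <- sort(table(bigrams), decreasing = TRUE)
    rules <- names(freq[freq >= 2])   # keep structures attested twice or more
    rules                             # e.g. "ab" and "ba" become rules

    # Step 4: consolidate rules by similarity (here, a crude abstraction
    # over symbols shared by all surviving rules).
    shared <- Reduce(intersect, strsplit(rules, ""))
    if (length(shared) > 0)
      cat("Abstract rule over shared symbols:", shared, "\n")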

