Testing a computational model of causative overgeneralizations: Child judgment and production data from English, Hebrew, Hindi, Japanese and K’iche’

2022, Vol 1, pp. 1
Author(s): Ben Ambridge, Laura Doherty, Ramya Maitreyee, Tomoko Tatsumi, Shira Zicherman, ...

How do language learners avoid producing verb argument structure overgeneralization errors (*The clown laughed the man, cf. The clown made the man laugh), while retaining the ability to apply such generalizations productively when appropriate? This question has long been seen as both particularly central to acquisition research and particularly challenging. Focussing on causative overgeneralization errors of this type, a previous study reported a computational model that learns, on the basis of corpus data and human-derived verb-semantic-feature ratings, to predict adults’ by-verb preferences for less- versus more-transparent causative forms (e.g., *The clown laughed the man vs. The clown made the man laugh) across English, Hebrew, Hindi, Japanese and K’iche’ Mayan. Here, we tested the ability of this model (and an expanded version with multiple hidden layers) to explain binary grammaticality judgment data from children aged 4;0-5;0, and elicited-production data from children aged 4;0-5;0 and 5;6-6;6 (N=48 per language). In general, the model successfully simulated both children’s judgment and production data, with correlations of r=0.5-0.6 and r=0.75-0.85, respectively, and also generalized to unseen verbs. Importantly, learners of all five languages showed some evidence of making the types of overgeneralization errors – in both judgments and production – previously observed in naturalistic studies of English (e.g., *I’m dancing it). Together with previous findings, the present study demonstrates that a simple learning model can explain (a) adults’ continuous judgment data, (b) children’s binary judgment data and (c) children’s production data (with no training on these datasets), and therefore constitutes a plausible mechanistic account of the acquisition of verbs’ argument structure restrictions.

2021, Vol 1, pp. 1
Author(s): Ben Ambridge, Laura Doherty, Ramya Maitreyee, Tomoko Tatsumi, Shira Zicherman, ...

How do language learners avoid producing verb argument structure overgeneralization errors (*The clown laughed the man, cf. The clown made the man laugh), while retaining the ability to apply such generalizations productively when appropriate? This question has long been seen as both particularly central to acquisition research and particularly challenging. Focussing on causative overgeneralization errors of this type, a previous study reported a computational model that learns, on the basis of corpus data and human-derived verb-semantic-feature ratings, to predict adults’ by-verb preferences for less- versus more-transparent causative forms (e.g., *The clown laughed the man vs. The clown made the man laugh) across English, Hebrew, Hindi, Japanese and K’iche’ Mayan. Here, we tested the ability of this model to explain binary grammaticality judgment data from children aged 4;0-5;0, and elicited-production data from children aged 4;0-5;0 and 5;6-6;6 (N=48 per language). In general, the model successfully simulated both children’s judgment and production data, with correlations of r=0.5-0.6 and r=0.75-0.85, respectively, and also generalized to unseen verbs. Importantly, learners of all five languages showed some evidence of making the types of overgeneralization errors – in both judgments and production – previously observed in naturalistic studies of English (e.g., *I’m dancing it). Together with previous findings, the present study demonstrates that a simple discriminative learning model can explain (a) adults’ continuous judgment data, (b) children’s binary judgment data and (c) children’s production data (with no training on these datasets), and therefore constitutes a plausible mechanistic account of the retreat from overgeneralization.
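The general idea of a discriminative model mapping verb-semantic features to causative-form preferences can be sketched very loosely as follows. This is not the authors' model: the features, ratings and verbs below are invented for illustration, and a plain logistic classifier stands in for whatever architecture the study actually used.

```python
# Toy sketch (NOT the paper's model): a logistic classifier that maps
# hypothetical verb-semantic-feature ratings to a preference for the
# lexical causative ("laughed the man") over the periphrastic one
# ("made the man laugh"). All data below are invented.
import math

def predict(features, weights, bias):
    """Logistic score: estimated preference for the lexical causative."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(data, lr=0.5, epochs=2000):
    """Plain per-example gradient descent on (features, label) pairs."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(features, weights, bias) - label
            bias -= lr * error
            for i, f in enumerate(features):
                weights[i] -= lr * error * f
    return weights, bias

# Invented ratings: [external causation, internal causation].
# Verbs high in internal causation (laugh, dance) resist the lexical causative.
data = [
    ([0.9, 0.1], 1),  # e.g. "break": lexical causative acceptable
    ([0.8, 0.2], 1),  # e.g. "melt"
    ([0.1, 0.9], 0),  # e.g. "laugh": periphrastic preferred
    ([0.2, 0.8], 0),  # e.g. "dance"
]
weights, bias = train(data)
print(predict([0.85, 0.15], weights, bias))  # novel "break"-like verb: high
print(predict([0.15, 0.85], weights, bias))  # novel "laugh"-like verb: low
```

Because the classifier is trained only on feature ratings, it generalizes to held-out verbs in the same way the abstract describes: a novel verb's predicted preference follows from its semantics, not from having seen the verb itself.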


2018, Vol 4 (1)
Author(s): Jona Sassenhagen, Ryan Blything, Elena V. M. Lieven, Ben Ambridge

How are verb-argument structure preferences acquired? Children typically receive very little negative evidence, raising the question of how they come to understand the restrictions on grammatical constructions. Statistical learning theories propose that stochastic patterns in the input contain sufficient clues. For example, if a verb is very common but never observed in transitive constructions, this would indicate that transitive usage of that verb is illegal. Ambridge et al. (2008) showed that in offline grammaticality judgments of intransitive verbs used in transitive constructions, low-frequency verbs elicit higher acceptability ratings than high-frequency verbs, as predicted if relative frequency is a cue during statistical learning. Here, we investigate whether the same pattern also emerges in online processing of English sentences. EEG was recorded while healthy adults listened to sentences featuring transitive uses of semantically matched verb pairs of differing frequencies. We replicate the finding that transitive uses of low-frequency intransitive verbs are judged more acceptable than those of high-frequency verbs. Event-related potentials indicate a similar result: early electrophysiological signals distinguish between misuse of high- vs. low-frequency verbs. This indicates that online processing shows a sensitivity to frequency similar to that of offline judgments, consistent with a parser that reflects the original acquisition of grammatical constructions via statistical cues. However, the nature of the observed neural responses was not of the expected, or an easily interpretable, form, motivating further work into the neural correlates of online processing of syntactic constructions.
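The frequency cue described above can be illustrated with a minimal smoothed-probability sketch. This is an assumption-laden toy, not the paper's analysis: with Laplace smoothing, a verb never attested in the transitive receives a lower estimated transitive probability the more often it has been heard overall, so transitive errors with high-frequency verbs should sound worse.

```python
# Toy entrenchment sketch: Laplace-smoothed estimate of P(transitive | verb).
# Counts below are invented; the point is only the direction of the effect.
def transitive_probability(transitive_count, total_count):
    """Add-one smoothed probability that the verb appears transitively."""
    return (transitive_count + 1) / (total_count + 2)

# Two intransitive-only verbs: one rare, one frequent.
rare = transitive_probability(0, 10)        # e.g. a rare verb like "giggle"
frequent = transitive_probability(0, 10000)  # e.g. a frequent verb like "laugh"
print(rare > frequent)  # True: the rare verb leaves more room for generalization
```

The smoothing constant is arbitrary here; any estimator that discounts zero counts less for rarely heard verbs yields the same qualitative prediction as the judgment data.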


Corpora
2007, Vol 2 (2), pp. 157-185
Author(s): Terry Shortall

Corpus linguists have argued that corpora allow us to present lexical and grammatical patterns to language learners as they occur in real language, thereby exposing the learner to authentic target language (Mindt, 1996; Biber et al., 2002; Sinclair, 2004). And there is now a growing body of empirical research into how corpus studies can benefit ELT materials design and development (Ljung, 1990, 1991; Römer, 2004, 2005). This study investigates how the present perfect is represented in a spoken corpus and in ELT textbooks. The objective is to see whether corpus frequency data can make textbook present perfect presentation represent reality more accurately, and also whether there are sometimes pedagogic aims that may override frequency considerations. Results show that textbooks fail to represent adequately how the present perfect interacts with other verb forms to create hybrid tenses such as the present perfect passive. Textbooks also over-represent the frequency of structures such as the present perfect continuous. Adverbs such as yet and already are much more frequent in textbooks than in the corpus. Textbook writers seem to deliberately exaggerate the frequency of such adverbs, and arguably use them as tense markers or flagging devices so that learners will expect to see the present perfect when they see yet and already. This suggests that disregard for natural frequency data may be justifiable if pedagogic considerations of this kind are taken into account. So, while corpus data provide important and useful frequency information for the teaching of grammar, pedagogic objectives may sometimes require that frequency data be disregarded.


Author(s): Jeff MacSwan, Kara T. McAlister

The authors discuss the merits of naturalistic and elicited data in the study of grammatical aspects of codeswitching. Three limitations of naturalistic data are discussed, including the problems of negative evidence, induction, and unidentified performance error. The authors recommend the use of language surveys as a tool for overcoming limitations of elicited grammaticality judgment data.


Author(s): Erla Hallsteinsdóttir

Multiword expressions – i.e. phraseological units – such as idioms and collocations are among the most interesting parts of every language. In this article, I investigate phraseological units from a lexicographical point of view. I discuss the theoretical and methodological basis of phraseography as a discipline that includes aspects of lexicography, phraseology, corpus linguistics and theories of language learning. I demonstrate the importance of corpora as a source for the lexicographer and the use of corpus data. I also discuss the requirements for the lexicographical treatment of phraseological units through the compilation of a phraseological database for language learners, in relation to their assumed needs, which have already been described in detail.


2020, Vol XVI (1), pp. 723-756
Author(s): I. Bagirokova, D. Ryzhova

This paper describes the semantics of falling in Adyghe and Kuban Kabardian from a typological perspective. The analysis is based on corpus data, accompanied by the results of elicitation. Although they represent the same Circassian branch of the Northwest Caucasian family, Adyghe and Kabardian still demonstrate some differences in the way their predicates of falling are lexicalized: while in Adyghe we have a distributive system which includes special lexical means for different types of falling (verbal root -fe- for falling from above, wəḳʷerejə- for losing vertical orientation, -zǝ- for detachment, and verbs from adjacent semantic domains such as -we- ‘beat’ for destruction), there is only one dominant (-xwe-) and several peripheral predicates in the Kabardian language. What is peculiar about these languages, when compared to the available typological data, is that the parameter of orientation to the initial (Source) vs. final point (Goal) of movement is of special importance in lexicalizing cases of falling. In Circassian languages, simultaneous surface expression of Source and Goal of movement within a clause is prohibited for morphosyntactic reasons, and the lexemes denoting falling are divided into Source- vs. Goal-oriented ones. For some verbal roots, this orientation is an intrinsic semantic property (cf. -zǝ-, which is always Source-oriented); in other cases, it is marked with specific affixes (cf. the locative combination je-…-xǝ ‘down’, which marks re-orientation to the Source of falling of the initially Goal-oriented Adyghe verb -fe-). Thus, our analysis of the material may not only contribute to the general typology of falling but may also throw light on a phenomenon in cognitive linguistics known as goal bias: the emphasis on the final point of movement in opposition to the initial point.


2009, Vol 20 (5), pp. 578-585
Author(s): Michael C. Frank, Noah D. Goodman, Joshua B. Tenenbaum

Word learning is a “chicken and egg” problem. If a child could understand speakers' utterances, it would be easy to learn the meanings of individual words, and once a child knows what many words mean, it is easy to infer speakers' intended meanings. To the beginning learner, however, both individual word meanings and speakers' intentions are unknown. We describe a computational model of word learning that solves these two inference problems in parallel, rather than relying exclusively on either the inferred meanings of utterances or cross-situational word-meaning associations. We tested our model using annotated corpus data and found that it inferred pairings between words and object concepts with higher precision than comparison models. Moreover, as the result of making probabilistic inferences about speakers' intentions, our model explains a variety of behavioral phenomena described in the word-learning literature. These phenomena include mutual exclusivity, one-trial learning, cross-situational learning, the role of words in object individuation, and the use of inferred intentions to disambiguate reference.
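The cross-situational side of the problem can be illustrated with a deliberately stripped-down sketch. This is not the Bayesian model the abstract describes: it simply tallies word-object co-occurrences across situations, showing how a word that is ambiguous in any single scene is disambiguated across scenes. The words, objects and situations are invented.

```python
# Toy cross-situational learner (NOT Frank et al.'s intentional model):
# tally word-object co-occurrences and map each word to the object it
# co-occurs with most reliably across situations.
from collections import defaultdict

def learn(situations):
    """situations: list of (words_uttered, objects_present) pairs."""
    cooc = defaultdict(lambda: defaultdict(int))
    word_totals = defaultdict(int)
    for words, objects in situations:
        for w in words:
            word_totals[w] += 1
            for o in objects:
                cooc[w][o] += 1
    # For each word, pick the object with the highest co-occurrence ratio.
    return {w: max(cooc[w], key=lambda o: cooc[w][o] / word_totals[w])
            for w in cooc}

# Invented scenes: each word is ambiguous within a scene,
# but the ambiguity resolves across scenes.
situations = [
    (["ball", "doggie"], ["BALL", "DOG"]),
    (["ball"], ["BALL", "CUP"]),
    (["doggie"], ["DOG", "CUP"]),
]
lexicon = learn(situations)
print(lexicon["ball"], lexicon["doggie"])  # BALL DOG
```

What this sketch lacks is exactly the paper's contribution: it has no representation of the speaker's referential intention, so it cannot capture phenomena such as mutual exclusivity or one-trial learning, which fall out of the joint inference over intentions and meanings.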


PLoS ONE
2015, Vol 10 (4), pp. e0123723
Author(s): Ben Ambridge, Amy Bidgood, Katherine E. Twomey, Julian M. Pine, Caroline F. Rowland, ...
