Simulating phonological and semantic impairment of English tense inflection with linear discriminative learning

2020 · Vol 15 (3)
Author(s): Maria Heitmeier, R. Harald Baayen

Abstract: This study applies the computational theory of the ‘discriminative lexicon’ (Baayen, Chuang & Blevins, 2019) to the modeling of the production of regular and irregular English verbs in aphasic speech. Under semantic impairment, speakers have been reported to have greater difficulties with irregular verbs, whereas speakers with phonological impairment are described as having greater problems with regulars. Joanisse and Seidenberg (1999) were able to model this dissociation, but only by selectively adding noise to the semantic units of their model. We report two simulation studies in which topographically coherent regions of phonological and semantic networks were selectively damaged. Our model replicated the main findings, including the high variability in the consequences of brain lesions for speech production. Importantly, our model generated these results without having to lesion the semantic system more than the phonological system. The model’s success hinges on the use of a corpus-based distributional vector space for representing verbs’ meanings. Joanisse and Seidenberg (1999) used one-hot encoding for their semantic representations, under the assumption that regular and irregular verbs do not differ semantically in ways relevant to impairment in aphasia. However, irregular verbs have denser semantic neighborhoods than regular verbs do (Baayen & Moscoso del Prado Martín, 2005), and in our model this greater density renders irregular verbs more fragile under semantic impairment. These results provide further support for the central idea underlying the discriminative lexicon: that behavioral patterns can, to a considerable extent, be understood as emerging from the distributional properties of a language and basic principles of human learning.
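As a rough illustration of the lesioning logic (not the authors' actual pipeline), the sketch below builds an LDL-style production network from random stand-in vectors and zeroes out a contiguous block of semantic dimensions as a crude analogue of a topographically coherent lesion; all sizes and names are invented.

```python
# Minimal sketch of lesioning an LDL-style production network. All data are
# random stand-ins: real models use corpus-derived semantic vectors and
# n-gram-based form vectors.
import numpy as np

rng = np.random.default_rng(1)
n_words, n_sem, n_form = 200, 50, 80

S = rng.normal(size=(n_words, n_sem))    # semantic vectors (rows = words)
C = rng.normal(size=(n_words, n_form))   # form vectors (rows = words)

# Production mapping G: S @ G ~ C, estimated by least squares
# (the end-state-of-learning solution).
G = np.linalg.pinv(S) @ C

def production_accuracy(S_in, G):
    """Fraction of words whose predicted form vector is closest
    (by cosine similarity) to their own gold form vector."""
    C_hat = S_in @ G
    sims = (C_hat @ C.T) / (
        np.linalg.norm(C_hat, axis=1, keepdims=True)
        * np.linalg.norm(C, axis=1)[None, :] + 1e-12)
    return np.mean(sims.argmax(axis=1) == np.arange(len(S_in)))

print("intact:", production_accuracy(S, G))

# 'Lesion' a coherent region of the semantic network, here approximated
# as a contiguous block of semantic dimensions set to zero.
S_lesioned = S.copy()
S_lesioned[:, 10:25] = 0.0
print("semantic lesion:", production_accuracy(S_lesioned, G))
```

In this toy setup, words whose meanings depend heavily on the zeroed dimensions lose production accuracy first, which is the mechanism by which denser semantic neighborhoods can make some verbs more fragile.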


Author(s): Yu-Ying Chuang, R. Harald Baayen

Naive discriminative learning (NDL) and linear discriminative learning (LDL) are simple computational algorithms for lexical learning and lexical processing. Both NDL and LDL assume that learning is discriminative, driven by prediction error, and that it is this error that calibrates the association strength between input and output representations. Both words’ forms and their meanings are represented by numeric vectors, and mappings between forms and meanings are set up. For comprehension, form vectors predict meaning vectors; for production, meaning vectors map onto form vectors. These mappings can be learned incrementally, approximating how children learn the words of their language. Alternatively, optimal mappings representing the end state of learning can be estimated. The NDL and LDL algorithms are incorporated in a computational theory of the mental lexicon, the ‘discriminative lexicon’. The model performs well with respect to both production and comprehension accuracy, and it predicts aspects of lexical processing, including morphological processing, across a wide range of experiments. Since, mathematically, NDL and LDL implement multivariate multiple regression, the ‘discriminative lexicon’ provides a cognitively motivated statistical modeling approach to lexical processing.
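A minimal sketch of the two estimation routes mentioned above, using toy random vectors in place of real form and meaning representations (all dimensions below are illustrative): the end-state mapping is obtained in closed form by least squares, and an incremental error-driven (Widrow-Hoff) learner approximates it update by update.

```python
# Toy LDL-style mappings; C and S stand in for real form and meaning vectors.
import numpy as np

rng = np.random.default_rng(0)
n_words, n_form, n_sem = 100, 60, 40
C = rng.normal(size=(n_words, n_form))   # form vectors (one row per word)
S = rng.normal(size=(n_words, n_sem))    # meaning vectors

# End-state ('optimal') mappings, estimated by least squares:
F = np.linalg.pinv(C) @ S    # comprehension: form -> meaning
G = np.linalg.pinv(S) @ C    # production:    meaning -> form

# Incremental alternative: error-driven (Widrow-Hoff) updates, word by word,
# gradually approximate the end-state comprehension mapping.
F_inc = np.zeros((n_form, n_sem))
eta = 0.005
for epoch in range(300):
    for i in rng.permutation(n_words):
        c, s = C[i], S[i]
        F_inc += eta * np.outer(c, s - c @ F_inc)  # prediction error drives learning

print("mean |F - F_inc|:", np.abs(F - F_inc).mean())
```

The mathematical link to multivariate multiple regression is visible here: the end-state mapping is exactly the least-squares regression of the meaning matrix on the form matrix.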


2016 · Vol 46 (3) · pp. 779-799
Author(s): Cuihong Yin, X. Sheldon Lin

Abstract: The Erlang mixture model has been widely used in modeling insurance losses due to its desirable distributional properties. In this paper, we consider the problem of efficient estimation of the Erlang mixture model. We present a new thresholding penalty function and a corresponding EM algorithm to estimate model parameters and to determine the order of the mixture. Using simulation studies and a real data application, we demonstrate the efficiency of the EM algorithm.
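As a hedged illustration, the sketch below runs a plain EM for an Erlang mixture with fixed integer shapes and a common scale; the thresholding penalty and automatic order selection proposed in the paper are omitted, and the data are simulated.

```python
# Plain EM for an Erlang mixture (fixed integer shapes, common scale theta).
# Erlang(k, theta) is gamma with integer shape k, so scipy's gamma applies.
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(42)
x = gamma.rvs(a=3, scale=2.0, size=1000, random_state=rng)  # toy 'losses'

shapes = np.array([1, 2, 3, 4, 5])         # fixed Erlang shapes
w = np.full(len(shapes), 1 / len(shapes))  # mixing weights
theta = x.mean() / shapes.mean()           # crude initial common scale

for _ in range(200):
    # E-step: posterior probability that x_i came from component j
    dens = np.stack([gamma.pdf(x, a=k, scale=theta) for k in shapes], axis=1)
    num = dens * w
    z = num / num.sum(axis=1, keepdims=True)
    # M-step: closed-form updates for the weights and the common scale
    w = z.mean(axis=0)
    theta = (z * x[:, None]).sum() / (z * shapes).sum()

print("weights:", np.round(w, 3), "common scale:", round(theta, 3))
```

With a common scale, both M-step updates are closed-form, which is one reason the Erlang mixture is computationally attractive for loss modeling.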


Symmetry · 2021 · Vol 13 (8) · pp. 1343
Author(s): Abbas Mahdavi, Vahid Amirzadeh, Ahad Jamalizadeh, Tsung-I Lin

Multivariate skew-symmetric-normal (MSSN) distributions have been recognized as an appealing tool for modeling data with non-normal features such as asymmetry and heavy tails, rendering them suitable for applications in diverse areas. We introduce a richer class of MSSN distributions based on a scale-shape mixture of (multivariate) flexible skew-symmetric normal distributions, called the SSMFSSN distributions. This very general class of SSMFSSN distributions can capture various shapes of multimodality, skewness, and leptokurtic behavior in the data. We investigate some of its probabilistic characterizations and distributional properties which are useful for further methodological developments. An efficient EM-type algorithm designed under the selection mechanism is advocated to compute the maximum likelihood (ML) estimates of parameters. Simulation studies as well as applications to a real dataset are employed to illustrate the usefulness of the presented methods. Numerical results show the superiority of our proposed model in comparison to several existing competitors.
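To give a flavor of the selection mechanism in its simplest univariate special case (not the paper's full SSMFSSN machinery): a skew-normal variate arises by combining a truncated, i.e. 'selected', normal component with an independent normal one.

```python
# Selection representation of the univariate skew-normal:
# Z = delta*|U0| + sqrt(1 - delta^2)*U1 ~ SN(alpha), delta = alpha/sqrt(1+alpha^2),
# where U0, U1 are independent standard normals.
import numpy as np
from scipy.stats import skewnorm

rng = np.random.default_rng(7)
alpha = 4.0
delta = alpha / np.sqrt(1 + alpha**2)

u0 = np.abs(rng.standard_normal(100_000))  # the truncated (selected) part
u1 = rng.standard_normal(100_000)
z = delta * u0 + np.sqrt(1 - delta**2) * u1

# Empirical mean of the construction vs. the theoretical skew-normal mean
print(z.mean(), skewnorm.mean(alpha))
```

EM-type algorithms for selection-based families exploit exactly this hidden truncated component, treating it as missing data in the E-step.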


2011 · Vol 11 (2) · pp. 295-328
Author(s): R. Harald Baayen

Three classifiers from machine learning (the generalized linear mixed model, memory-based learning, and support vector machines) are compared with a naive discriminative learning classifier, derived from the basic principles of error-driven learning that characterize animal and human learning. Tested on the dative alternation in English, using the Switchboard data from Bresnan, Cueni, Nikitina, and Baayen (2007), naive discriminative learning emerges with state-of-the-art predictive accuracy. Naive discriminative learning offers a unified framework for understanding the learning of probabilistic distributional patterns, for classification, and for a cognitive grounding of distinctive collexeme analysis.
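A minimal sketch of such a discriminative classifier, with invented toy cues standing in for the Switchboard dative features: association strengths between cues and the two dative constructions are learned with error-driven Rescorla-Wagner updates, and a construction is chosen by summed activation.

```python
# Toy naive discriminative learning classifier (Rescorla-Wagner updates).
# Cues and training data are illustrative stand-ins, not the Switchboard set.
import numpy as np

cues = ["verb=give", "verb=send", "recipient=animate", "theme=definite"]
outcomes = ["NP", "PP"]  # double-object vs. prepositional dative

# toy training data: sets of active cues paired with an observed outcome
data = [({"verb=give", "recipient=animate"}, "NP"),
        ({"verb=send", "theme=definite"}, "PP"),
        ({"verb=give", "theme=definite"}, "NP"),
        ({"verb=send", "recipient=animate"}, "PP")] * 100

V = np.zeros((len(cues), len(outcomes)))   # cue-outcome association strengths
idx = {c: i for i, c in enumerate(cues)}
eta, lmax = 0.01, 1.0

for active, out in data:
    rows = [idx[c] for c in active]
    for j, o in enumerate(outcomes):
        target = lmax if o == out else 0.0
        error = target - V[rows, j].sum()  # prediction error over active cues
        V[rows, j] += eta * error          # equal update for all active cues

# classify: the outcome with the highest summed activation wins
test = {"verb=give", "theme=definite"}
act = V[[idx[c] for c in test]].sum(axis=0)
print(dict(zip(outcomes, act.round(3))), "->", outcomes[act.argmax()])
```

Because cues compete for association strength, the learned weights directly reflect the distributional informativity of each feature, which is what grounds collexeme-style analyses.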


2019
Author(s): Fabian Tomaschek, Daniel Duran

The McGurk effect is a well-known perceptual phenomenon in which listeners perceive [da] when presented with visual instances of [ga] syllables combined with the audio of [ba] syllables. However, the underlying cognitive mechanisms are not yet fully understood. In this study, we investigated the McGurk effect from the perspective of two learning theories: Exemplar theory and Discriminative Learning. We hypothesized that the McGurk effect arises from distributional differences between the [ba, da, ga] syllables in the lexicon of a given language. We tested this hypothesis using computational implementations of these theories, simulating learning on the basis of lexica in which we systematically varied the distributions of these syllables. These simulations support our hypothesis. For both learning theories, we found that the probability of observing the McGurk effect in our simulations was greater when lexica contained a larger percentage of [da] and [ba] instances. Crucially, the probability of the McGurk effect was inversely proportional to the percentage of [ga], whatever the percentages of [ba] and [da]. To validate our results, we inspected the distributional properties of [ba, da, ga] in different languages for which the McGurk effect was or was not previously attested. Our results mirror the findings for these languages. Our study shows that the McGurk effect, an instance of multi-modal perceptual integration, arises from distributional properties in different languages and thus depends on language learning.
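A deterministic toy version of the exemplar account (the feature space and all coordinates are invented for illustration, not taken from the study): categories are scored by frequency-weighted similarity to the ambiguous audio-visual stimulus, and in this toy setup the predicted percept flips from the fused [da] to [ga] as the [ga] share of the lexicon grows.

```python
# Toy exemplar-style account of the McGurk effect. The 2-D feature space
# (audio place, visual openness) and all numbers are invented.
import numpy as np

proto = {"ba": (0.0, 0.0),   # labial audio, closed lips
         "da": (1.0, 1.0),   # coronal audio, open lips
         "ga": (1.5, 1.0)}   # dorsal audio, open lips

# McGurk stimulus: ambiguous dubbed audio plus a clearly open visual [ga]
stim = np.array([0.5, 1.0])
s = 2.0  # similarity gradient

def scores(props):
    """Frequency-weighted similarity of each category to the stimulus."""
    return {c: p * np.exp(-s * np.linalg.norm(np.array(xy) - stim))
            for (c, xy), p in zip(proto.items(), props)}

for props in ([0.4, 0.4, 0.2], [0.2, 0.2, 0.6]):   # [ba, da, ga] shares
    sc = scores(props)
    print(props, {c: round(v, 3) for c, v in sc.items()},
          "->", max(sc, key=sc.get))
```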


BMJ Open · 2020 · Vol 10 (12) · pp. e039921
Author(s): Anne-Laure Boulesteix, Rolf H. H. Groenwold, Michal Abrahamowicz, Harald Binder, Matthias Briel, ...

In health research, statistical methods are frequently used to address a wide variety of research questions. For almost every analytical challenge, different methods are available. But how do we choose between different methods, and how do we judge whether the chosen method is appropriate for our specific study? As in any science, experiments can be run in statistics to find out which methods should be used under which circumstances. The main objective of this paper is to demonstrate that simulation studies, that is, experiments investigating synthetic data with known properties, are an invaluable tool for addressing these questions. We aim to provide a first introduction to simulation studies for data analysts or, more generally, for researchers involved at any level in the analysis of health data, who (1) may rely on simulation studies published in the statistical literature to choose their statistical methods and who therefore need to understand the criteria for assessing the validity and relevance of simulation results and their interpretation; and/or (2) need to understand the basic principles of designing statistical simulations in order to collaborate efficiently with more experienced colleagues or to start learning to conduct their own simulations. We illustrate the implementation of a simulation study and the interpretation of its results through a simple example inspired by recent literature, which is completely reproducible using the R script available from online supplemental file 1.
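As a first taste of what such an experiment looks like (a Python stand-in for the paper's R example, with made-up settings): simulate skewed samples whose true median is known, then compare two estimators on bias and root mean squared error across replications.

```python
# Minimal simulation study: synthetic data with a known true value are used
# to compare two estimators of the population median under skewed data.
import numpy as np

rng = np.random.default_rng(2024)
n_rep, n = 5000, 30
true_median = 1.0  # the lognormal(0, 1) distribution has median exp(0) = 1

est_mean, est_median = [], []
for _ in range(n_rep):
    x = rng.lognormal(mean=0.0, sigma=1.0, size=n)  # one skewed sample
    est_mean.append(x.mean())
    est_median.append(np.median(x))

for name, est in [("sample mean", est_mean), ("sample median", est_median)]:
    est = np.array(est)
    bias = est.mean() - true_median
    rmse = np.sqrt(((est - true_median) ** 2).mean())
    print(f"{name}: bias = {bias:.3f}, RMSE = {rmse:.3f}")
```

The three ingredients highlighted by the paper are all visible here: a data-generating mechanism with known properties, competing methods applied to the same replications, and performance criteria computed against the known truth.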


2010 · Vol 20 (3) · pp. 100-105
Author(s): Anne K. Bothe

This article presents some streamlined and intentionally oversimplified ideas about educating future communication disorders professionals to use some of the most basic principles of evidence-based practice. Working from a popular five-step approach, modifications are suggested that may make the ideas more accessible, and therefore more useful, for university faculty, other supervisors, and future professionals in speech-language pathology, audiology, and related fields.


1999 · Vol 15 (2) · pp. 91-98
Author(s): Lutz F. Hornke

Summary: Item parameters for several hundred items were estimated based on empirical data from several thousand subjects. The logistic one-parameter (1PL) and two-parameter (2PL) model estimates were evaluated. However, model fit showed that only a subset of the items complied sufficiently well; these items were assembled into well-fitting item banks. In several simulation studies, 5000 simulated responses were generated in accordance with a computerized adaptive testing (CAT) procedure, along with person parameters. A general reliability of .80, or a standard error of measurement of .44, was used as a stopping rule to end CAT testing. We also recorded how often each item was used across all simulees. Person-parameter estimates based on CAT correlated higher than .90 with the true simulated values. For all 1PL-fitting item banks, most simulees needed more than 20 but fewer than 30 items to reach the preset level of measurement error. Testing based on item banks that fitted the 2PL, however, revealed that on average only 10 items were sufficient to end testing at the same measurement error level. Both results clearly demonstrate the precision and economy of computerized adaptive testing. Empirical evaluations from everyday use will show whether these trends hold up in practice. If so, CAT will become possible and reasonable with some 150 well-calibrated 2PL items.
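A compact sketch of such a CAT simulation under the 2PL (toy item bank, illustrative parameters): items are selected by Fisher information at the current ability estimate, the estimate is updated by EAP on a grid, and testing stops once the posterior standard deviation drops below .44, corresponding to a reliability of about .80.

```python
# Toy 2PL computerized adaptive test with maximum-information item selection
# and a grid-based EAP ability estimate. Item bank parameters are simulated.
import numpy as np

rng = np.random.default_rng(5)
n_items = 150
a = rng.uniform(0.8, 2.0, n_items)   # discriminations
b = rng.normal(0.0, 1.0, n_items)    # difficulties
theta_true = 0.7                     # simulee's true ability

grid = np.linspace(-4, 4, 161)
posterior = np.exp(-0.5 * grid**2)   # standard normal prior

def p_correct(theta, i):
    return 1.0 / (1.0 + np.exp(-a[i] * (theta - b[i])))

used = []
theta_hat, sem = 0.0, np.inf
while sem > 0.44 and len(used) < n_items:
    # 2PL Fisher information I(theta) = a^2 * P * (1 - P) at current estimate
    P = p_correct(theta_hat, np.arange(n_items))
    info = a**2 * P * (1 - P)
    info[used] = -np.inf                     # never reuse an item
    i = int(np.argmax(info))
    used.append(i)
    x = rng.random() < p_correct(theta_true, i)   # simulate a response
    posterior *= p_correct(grid, i) if x else 1 - p_correct(grid, i)
    posterior /= posterior.sum()
    theta_hat = np.sum(grid * posterior)                      # EAP estimate
    sem = np.sqrt(np.sum((grid - theta_hat)**2 * posterior))  # posterior SD

print(f"items used: {len(used)}, theta_hat = {theta_hat:.2f}")
```

Running many simulees through this loop and recording the number of items used per person is exactly the kind of evidence behind the 20-30 item (1PL) versus roughly 10 item (2PL) comparison reported above.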

