Does signal reduction imply predictive coding in models of spoken word recognition?

Author(s):  
Sahil Luthra ◽  
Monica Y. C. Li ◽  
Heejo You ◽  
Christian Brodbeck ◽  
James S. Magnuson

Abstract Pervasive behavioral and neural evidence for predictive processing has led to claims that language processing depends upon predictive coding. Formally, predictive coding is a computational mechanism whereby only deviations from top-down expectations are passed between levels of representation. In many cognitive neuroscience studies, a reduction of signal for expected inputs is taken as diagnostic of predictive coding. In the present work, we show that despite not explicitly implementing prediction, the TRACE model of speech perception exhibits this putative hallmark of predictive coding, with reductions in total lexical activation, total lexical feedback, and total phoneme activation when the input conforms to expectations. These findings may indicate that interactive activation is functionally equivalent to, or an approximation of, predictive coding, or that caution is warranted in interpreting neural signal reduction as diagnostic of predictive coding.
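
To make the mechanism under discussion concrete (this is an illustrative sketch, not the authors' TRACE simulations), the snippet below implements the formal definition of predictive coding given above: only the residual between bottom-up input and top-down expectation is passed forward, so an input that conforms to expectations yields a reduced signal. All names and values are hypothetical.

```python
import numpy as np

def prediction_error(stimulus, prediction):
    """Predictive coding, as defined above: only the deviation of the
    input from the top-down expectation is passed to the next level."""
    return stimulus - prediction

rng = np.random.default_rng(0)
expected = rng.random(10)        # top-down expectation for the next input
conforming = expected.copy()     # input that matches expectations exactly
surprising = rng.random(10)      # input that violates expectations

# The total forward signal (summed absolute residual) is smaller for
# expected input -- the signal reduction at issue in the abstract.
for label, stim in [("conforming", conforming), ("surprising", surprising)]:
    print(label, float(np.abs(prediction_error(stim, expected)).sum()))
```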

2021 ◽  
Vol 11 (12) ◽  
pp. 1628
Author(s):  
Michael S. Vitevitch ◽  
Gavin J. D. Mullin

Cognitive network science is an emerging approach that uses the mathematical tools of network science to map the relationships among representations stored in memory and to examine how that structure might influence processing. In the present study, we used computer simulations to compare a well-known model of spoken word recognition, TRACE, with a cognitive network model that uses a spreading activation-like process, assessing how well each accounts for the findings of several previously published behavioral studies of language processing. In all four simulations, the TRACE model failed to retrieve a sufficient number of words to assess whether it could replicate the behavioral findings. The cognitive network model successfully replicated the behavioral findings in Simulations 1 and 2. In Simulation 3a, however, the cognitive network did not replicate the behavioral findings, perhaps because an additional mechanism was not implemented in the model. In Simulation 3b, when the decay parameter in spreadr was manipulated to model this mechanism, the cognitive network model successfully replicated the behavioral findings. The results suggest that models of cognition need to take into account the multi-scale structure that exists among representations in memory, and how that structure can influence processing.
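
As a rough illustration of the mechanism (this is not the spreadr package or the authors' simulations), the sketch below spreads activation over a toy network: each step, a node retains a fraction of its activation and shares the rest with its neighbors, while a decay parameter, analogous to the one manipulated in Simulation 3b, bleeds activation out of the system. The network, parameter values, and function names are hypothetical.

```python
# Toy spreading-activation sketch over an undirected phonological network.
neighbors = {
    "speech": ["speak", "peach"],
    "speak": ["speech"],
    "peach": ["speech", "pea"],
    "pea": ["peach"],
}

def spread(activation, retention=0.5, decay=0.1, steps=5):
    """Each step: keep `retention` of a node's activation in place, share
    the remainder evenly among neighbors, then decay the whole system."""
    for _ in range(steps):
        nxt = dict.fromkeys(activation, 0.0)
        for node, act in activation.items():
            nxt[node] += retention * act
            share = (1.0 - retention) * act / len(neighbors[node])
            for nb in neighbors[node]:
                nxt[nb] += share
        activation = {node: act * (1.0 - decay) for node, act in nxt.items()}
    return activation

start = dict.fromkeys(neighbors, 0.0)
start["speech"] = 1.0            # seed activation on one word
print(spread(start))
```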


2021 ◽  
Author(s):  
James Magnuson ◽  
Samantha Grubb ◽  
Anne Marie Crinnion ◽  
Sahil Luthra ◽  
Phoebe Gaston

Norris and Cutler (in press) revisit their arguments that (lexical-to-sublexical) feedback cannot improve word recognition performance, based on the assumption that feedback must boost signal and noise equally. They also argue that demonstrations that feedback improves performance (Magnuson, Mirman, Luthra, Strauss, & Harris, 2018) in the TRACE model of spoken word recognition (McClelland & Elman, 1986) were artifacts of converting activations to response probabilities. We first evaluate their claim that feedback in an interactive activation model must boost noise and signal equally. This is not true in a fully interactive activation model such as TRACE, where the feedback signal does not simply mirror the feedforward signal; it is instead shaped by joint probabilities over lexical patterns and by the dynamics of lateral inhibition. Thus, even under high levels of noise, lexical feedback will selectively boost signal more than noise. We demonstrate that feedback promotes faster word recognition and preserves accuracy under noise whether one uses raw activations or response probabilities. We then document that lexical feedback selectively boosts signal (i.e., lexically coherent series of phonemes) more than noise by tracking sublexical (phoneme) activations under noise with and without feedback. Thus, feedback in a model like TRACE does improve word recognition, exactly by selective reinforcement of the lexically coherent signal. We conclude that whether lexical feedback is integral to human speech processing is an empirical question, and we briefly review a growing body of work at behavioral and neural levels that is consistent with feedback and inconsistent with autonomous (non-feedback) architectures.
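
The selective-reinforcement point can be illustrated with a toy two-layer sketch (ours, not the TRACE implementation): a word unit pools evidence from its constituent phonemes and feeds activation back only to them, so a lexically coherent sequence gains more from feedback than noise-driven units do. All units, weights, and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

phonemes = ["k", "ae", "t", "p", "ih", "g"]
cat_idx = [0, 1, 2]        # indices of /k/ /ae/ /t/, the phonemes of "cat"

# Noisy bottom-up input: every unit gets noise; "cat" phonemes also get signal.
bottom_up = rng.normal(0.0, 0.2, len(phonemes))
bottom_up[cat_idx] += 0.5

act = np.clip(bottom_up, 0.0, 1.0)
for _ in range(10):
    # The word unit pools evidence from its constituent phonemes...
    word_act = act[cat_idx].mean()
    # ...and feedback boosts only those phonemes, so the lexically
    # coherent signal is reinforced more than the noise.
    act[cat_idx] = np.clip(act[cat_idx] + 0.1 * word_act, 0.0, 1.0)

for p, a in zip(phonemes, act):
    print(p, round(float(a), 3))
```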


Author(s):  
Christina Blomquist ◽  
Rochelle S. Newman ◽  
Yi Ting Huang ◽  
Jan Edwards

Purpose: Children with cochlear implants (CIs) are more likely to struggle with spoken language than their age-matched peers with normal hearing (NH), and new language processing literature suggests that these challenges may be linked to delays in spoken word recognition. The purpose of this study was to investigate whether children with CIs use language knowledge via semantic prediction to facilitate recognition of upcoming words and help compensate for uncertainties in the acoustic signal.
Method: Five- to 10-year-old children with CIs heard sentences with an informative verb (draws) or a neutral verb (gets) preceding a target word (picture). The target referent was presented on a screen, along with a phonologically similar competitor (pickle). Children's eye gaze was recorded to quantify efficiency of access of the target word and suppression of phonological competition. Performance was compared to both an age-matched group and a vocabulary-matched group of children with NH.
Results: Children with CIs, like their peers with NH, demonstrated use of informative verbs to look more quickly to the target word and look less to the phonological competitor. However, children with CIs demonstrated less efficient use of semantic cues relative to their peers with NH, even when matched for vocabulary ability.
Conclusions: Children with CIs use semantic prediction to facilitate spoken word recognition but do so to a lesser extent than children with NH. Children with CIs experience challenges in predictive spoken language processing above and beyond limitations from delayed vocabulary development. Children with CIs with better vocabulary ability demonstrate more efficient use of lexical-semantic cues. Clinical interventions focusing on building knowledge of words and their associations may support efficiency of spoken language processing for children with CIs.
Supplemental Material: https://doi.org/10.23641/asha.14417627


2020 ◽  
Vol 47 (6) ◽  
pp. 1189-1206
Author(s):  
Félix DESMEULES-TRUDEL ◽  
Charlotte MOORE ◽  
Tania S. ZAMUNER

Abstract Bilingual children cope with a significant amount of phonetic variability when processing speech, and must learn to weigh phonetic cues differently depending on the cues' respective roles in their two languages. For example, vowel nasalization is coarticulatory and contrastive in French, but coarticulatory-only in English. In this study, we extended an investigation of the processing of coarticulation in two- to three-year-old English monolingual children (Zamuner, Moore & Desmeules-Trudel, 2016) to a group of four- to six-year-old English monolingual children and age-matched English–French bilingual children. Using eye tracking, we found that older monolingual children and age-matched bilingual children showed more sensitivity to coarticulation cues than the younger children. Moreover, when comparing the older monolinguals and bilinguals, we found no statistical differences between the two groups. These results offer support for the specification of coarticulation in word representations, and indicate that, in some cases, bilingual children possess language processing skills similar to those of monolinguals.


2019 ◽  
Vol 72 (11) ◽  
pp. 2574-2583 ◽  
Author(s):  
Julie Gregg ◽  
Albrecht W Inhoff ◽  
Cynthia M Connine

Spoken word recognition models incorporate the temporal unfolding of word information by assuming that positional match constrains lexical activation. Recent findings challenge the linearity constraint. In the visual world paradigm, Toscano, Anderson, and McMurray (2013) observed that listeners preferentially viewed a picture of a target word's anadrome competitor (e.g., competitor bus for target sub) compared with phonologically unrelated distractors (e.g., well) or competitors sharing an overlapping vowel (e.g., sun). Toscano et al. concluded that spoken word recognition relies on coarse-grain spectral similarity for mapping spoken input to a lexical representation. Our experiments aimed to replicate the anadrome effect and to test the coarse-grain similarity account using competitors without vowel position overlap (e.g., competitor leaf for target flea). The results confirmed the original effect: anadrome competitor fixation curves diverged from unrelated distractors approximately 275 ms after the onset of the target word. In contrast, the no-vowel-position-overlap competitor did not show an increase in fixations compared with the unrelated distractors. The contrasting results for the anadrome and no-vowel-position-overlap items are discussed in terms of the theoretical implications of sequential-match versus coarse-grain-similarity accounts of spoken word recognition. We also discuss design issues (repetition of stimulus materials and display parameters) concerning the use of the visual world paradigm to make inferences about online spoken word recognition.


2018 ◽  
Vol 13 (3) ◽  
pp. 333-353
Author(s):  
Stéphan Tulkens ◽  
Dominiek Sandra ◽  
Walter Daelemans

Abstract An oft-cited shortcoming of Interactive Activation as a psychological model of word reading is that it lacks the ability to simultaneously represent words of different lengths. We present an implementation of the Interactive Activation model, which we call Metameric, that can simulate words of different lengths, and we show that there is nothing inherent to Interactive Activation that prevents it from simultaneously representing multiple word lengths. We provide an in-depth analysis of which specific factors need to be present, and show that the inclusion of three specific adjustments, all of which have been published in various models before, leads to an Interactive Activation model that is fully capable of representing words of different lengths. Finally, we show that our implementation is fully capable of representing all words between 2 and 11 letters in length from the English Lexicon Project (31,416 words) in a single model. Our implementation is completely open source, heavily optimized, and includes both command line and graphical user interfaces, while remaining agnostic to specific input data or problems. It can therefore be used to simulate a myriad of other models, e.g., models of spoken word recognition. The implementation can be accessed at www.github.com/clips/metameric.
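
For readers who want the core dynamics that implementations like this build on, here is a minimal sketch of the classic interactive-activation update rule of McClelland and Rumelhart (our paraphrase; Metameric's own optimized code lives at the repository above). The parameter values are the conventional defaults and are illustrative.

```python
def iac_update(act, net, rest=-0.1, decay=0.1, a_min=-0.2, a_max=1.0):
    """One step of the interactive-activation rule: excitatory net input
    drives a unit toward a_max, inhibitory net input toward a_min, and
    decay pulls the unit back toward its resting level."""
    if net > 0:
        delta = net * (a_max - act) - decay * (act - rest)
    else:
        delta = net * (act - a_min) - decay * (act - rest)
    return act + delta

# A unit receiving steady excitatory input settles at an equilibrium
# between its resting level and a_max.
a = -0.1
for _ in range(50):
    a = iac_update(a, net=0.2)
print(round(a, 3))   # approaches 0.19 / 0.3, i.e., about 0.633
```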


2014 ◽  
Vol 42 (4) ◽  
pp. 843-872 ◽  
Author(s):  
SUSANNAH V. LEVI

Abstract Research with adults has shown that spoken language processing is improved when listeners are familiar with talkers' voices, an effect known as the familiar talker advantage. The current study explored whether this ability extends to school-age children, who are still acquiring language. Children were familiarized with the voices of three German–English bilingual talkers and were tested on the speech of six bilinguals, three of whom were familiar. Results revealed that children do show improved spoken language processing when they are familiar with the talkers, but this improvement was limited to highly familiar lexical items. This restriction of the familiar talker advantage is attributed to differences in the representation of highly familiar and less familiar lexical items. In addition, children did not exhibit accent-general learning; despite having been exposed to German-accented talkers during training, there was no improvement for novel German-accented talkers.


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Kristin J. Van Engen ◽  
Avanti Dey ◽  
Nichole Runge ◽  
Brent Spehar ◽  
Mitchell S. Sommers ◽  
...  

This study assessed the effects of age, word frequency, and background noise on the time course of lexical activation during spoken word recognition. Participants (41 young adults and 39 older adults) performed a visual world word recognition task while we monitored their gaze position. On each trial, four phonologically unrelated pictures appeared on the screen. A target word was presented auditorily following a carrier phrase (“Click on ________”), at which point participants were instructed to use the mouse to click on the picture that corresponded to the target word. High- and low-frequency words were presented in quiet to half of the participants. The other half heard the words in a low level of noise in which the words were still readily identifiable. Results showed that, even in the absence of phonological competitors in the visual array, high-frequency words were fixated more quickly than low-frequency words by both listener groups. Young adults were generally faster to fixate on targets compared to older adults, but the pattern of interactions among noise, word frequency, and listener age showed that older adults’ lexical activation largely matches that of young adults in a modest amount of noise.


2020 ◽  
Author(s):  
Yingcan Carol Wang ◽  
Ediz Sohoglu ◽  
Rebecca A. Gilbert ◽  
Richard N. Henson ◽  
Matthew H. Davis

Abstract Human listeners achieve quick and effortless speech comprehension through computations of conditional probability using Bayes' rule. However, the neural implementation of Bayesian perceptual inference remains unclear. Competitive-selection accounts (e.g., TRACE) propose that word recognition is achieved through direct inhibitory connections between units representing candidate words that share segments (e.g., hygiene and hijack share /haɪdʒ/). Manipulations that increase lexical uncertainty should increase neural responses associated with word recognition when words cannot be uniquely identified (during the first syllable). In contrast, predictive-selection accounts (e.g., Predictive Coding) propose that spoken word recognition involves comparing heard and predicted speech sounds and using prediction error to update lexical representations. Increased lexical uncertainty in words like hygiene and hijack will increase prediction error, and hence neural activity, only at later time points when different segments are predicted (during the second syllable). We collected MEG data to distinguish these two mechanisms and used a competitor priming manipulation to change the prior probability of specific words. Lexical decision responses showed delayed recognition of target words (hygiene) following presentation of a neighbouring prime word (hijack) several minutes earlier. However, this effect was not observed with pseudoword primes (higent) or targets (hijure). Crucially, MEG responses in the STG showed greater neural responses for word-primed words after the point at which they were uniquely identified (after /haɪdʒ/ in hygiene) but not before, while similar changes were again absent for pseudowords. These findings are consistent with accounts of spoken word recognition in which neural computations of prediction error play a central role.
Significance Statement: Effective speech perception is critical to daily life and involves computations that combine speech signals with prior knowledge of spoken words; that is, Bayesian perceptual inference. This study specifies the neural mechanisms that support spoken word recognition by testing two distinct implementations of Bayesian perceptual inference. Most established theories propose direct competition between lexical units, such that inhibition of irrelevant candidates leads to selection of critical words. Our results instead support predictive-selection theories (e.g., Predictive Coding): by comparing heard and predicted speech sounds, neural computations of prediction error can help listeners continuously update lexical probabilities, allowing for more rapid word identification.
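
In compact form (our notation, not the authors'), both accounts implement the same Bayesian computation but differ in how the posterior over words is realized: competitive selection computes it through lateral inhibition among word units, while predictive selection updates it through a prediction error computed segment by segment.

```latex
% Posterior probability of word w given the speech input s (Bayes' rule):
P(w \mid s) = \frac{P(s \mid w)\,P(w)}{\sum_{w'} P(s \mid w')\,P(w')}

% Predictive selection propagates a prediction error at each segment t:
% the heard segment minus its expectation given the segments heard so far.
\varepsilon_t = s_t - \mathbb{E}\!\left[ s_t \mid s_{1..t-1} \right]
```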


2014 ◽  
Author(s):  
Julia Strand ◽  
Andrea M. Simenstad ◽  
Jeffrey J. Berg ◽  
Joseph A. Slote
