How are visemes and graphemes integrated with speech sounds during spoken word recognition? ERP evidence for supra-additive responses during audiovisual compared to auditory speech processing

2022 ◽  
Vol 225 ◽  
pp. 105058
Author(s):  
Chotiga Pattamadilok ◽  
Marc Sato
2020 ◽  
Author(s):  
Yingcan Carol Wang ◽  
Ediz Sohoglu ◽  
Rebecca A. Gilbert ◽  
Richard N. Henson ◽  
Matthew H. Davis

Abstract: Human listeners achieve quick and effortless speech comprehension through computations of conditional probability using Bayes' rule. However, the neural implementation of Bayesian perceptual inference remains unclear. Competitive-selection accounts (e.g. TRACE) propose that word recognition is achieved through direct inhibitory connections between units representing candidate words that share segments (e.g. hygiene and hijack share /haidʒ/). Manipulations that increase lexical uncertainty should increase neural responses associated with word recognition when words cannot be uniquely identified (during the first syllable). In contrast, predictive-selection accounts (e.g. Predictive-Coding) propose that spoken word recognition involves comparing heard and predicted speech sounds and using prediction error to update lexical representations. Increased lexical uncertainty in words like hygiene and hijack will increase prediction error and hence neural activity only at later time points when different segments are predicted (during the second syllable). We collected MEG data to distinguish these two mechanisms and used a competitor priming manipulation to change the prior probability of specific words. Lexical decision responses showed delayed recognition of target words (hygiene) following presentation of a neighbouring prime word (hijack) several minutes earlier. However, this effect was not observed with pseudoword primes (higent) or targets (hijure). Crucially, MEG responses in the STG showed greater neural responses for word-primed words after the point at which they were uniquely identified (after /haidʒ/ in hygiene) but not before, while similar changes were again absent for pseudowords. These findings are consistent with accounts of spoken word recognition in which neural computations of prediction error play a central role.

Significance Statement: Effective speech perception is critical to daily life and involves computations that combine speech signals with prior knowledge of spoken words; that is, Bayesian perceptual inference. This study specifies the neural mechanisms that support spoken word recognition by testing two distinct implementations of Bayesian perceptual inference. Most established theories propose direct competition between lexical units such that inhibition of irrelevant candidates leads to selection of critical words. Our results instead support predictive-selection theories (e.g. Predictive-Coding): by comparing heard and predicted speech sounds, neural computations of prediction error can help listeners continuously update lexical probabilities, allowing for more rapid word identification.


2000 ◽  
Vol 23 (3) ◽  
pp. 347-347
Author(s):  
Louisa M. Slowiaczek

Hesitations about wholeheartedly accepting Norris et al.'s suggestion to abandon feedback in speech processing models concern (1) whether accounting for all available data justifies additional layers of complexity in the model and (2) whether characterizing Merge as non-interactive is valid. Spoken word recognition studies support the nature of Merge's lexical level and suggest that phonemes should comprise the prelexical level.


2020 ◽  
Author(s):  
Hans Rutger Bosker ◽  
David Peeters

ABSTRACT: Beat gestures – spontaneously produced biphasic movements of the hand – are among the most frequently encountered co-speech gestures in human communication. They are closely temporally aligned to the prosodic characteristics of the speech signal, typically occurring on lexically stressed syllables. Despite their prevalence across speakers of the world’s languages, how beat gestures impact spoken word recognition is unclear. Can these simple ‘flicks of the hand’ influence speech perception? Across six experiments, we demonstrate that beat gestures influence the explicit and implicit perception of lexical stress (e.g., distinguishing OBject from obJECT), and in turn, can influence what vowels listeners hear. Thus, we provide converging evidence for a manual McGurk effect: even the simplest ‘flicks of the hands’ influence which speech sounds we hear.

SIGNIFICANCE STATEMENT: Beat gestures are very common in human face-to-face communication. Yet we know little about their behavioral consequences for spoken language comprehension. We demonstrate that beat gestures influence the explicit and implicit perception of lexical stress, and, in turn, can even shape what vowels we think we hear. This demonstration of a manual McGurk effect provides some of the first empirical support for a recent multimodal, situated psycholinguistic framework of human communication, while challenging current models of spoken word recognition that do not yet incorporate multimodal prosody. Moreover, it has the potential to enrich human-computer interaction and improve multimodal speech recognition systems.


2009 ◽  
Vol 37 (4) ◽  
pp. 817-840 ◽  
Author(s):  
VIRGINIA A. MARCHMAN ◽  
ANNE FERNALD ◽  
NEREYDA HURTADO

ABSTRACT: Research using online comprehension measures with monolingual children shows that speed and accuracy of spoken word recognition are correlated with lexical development. Here we examined speech processing efficiency in relation to vocabulary development in bilingual children learning both Spanish and English (n=26; 2;6). Between-language associations were weak: vocabulary size in Spanish was uncorrelated with vocabulary in English, and children's facility in online comprehension in Spanish was unrelated to their facility in English. Instead, efficiency of online processing in one language was significantly related to vocabulary size in that language, after controlling for processing speed and vocabulary size in the other language. These links between efficiency of lexical access and vocabulary knowledge in bilinguals parallel those previously reported for Spanish and English monolinguals, suggesting that children's ability to abstract information from the input in building a working lexicon relates fundamentally to mechanisms underlying the construction of language.


2007 ◽  
Vol 34 (2) ◽  
pp. 227-249 ◽  
Author(s):  
NEREYDA HURTADO ◽  
VIRGINIA A. MARCHMAN ◽  
ANNE FERNALD

Research on the development of efficiency in spoken language understanding has focused largely on middle-class children learning English. Here we extend this research to Spanish-learning children (n=49; M=2;0; range=1;3–3;1) living in the USA in Latino families from primarily low socioeconomic backgrounds. Children looked at pictures of familiar objects while listening to speech naming one of the objects. Analyses of eye movements revealed developmental increases in the efficiency of speech processing. Older children and children with larger vocabularies were more efficient at processing spoken language as it unfolds in real time, as previously documented with English learners. Children whose mothers had less education tended to be slower and less accurate than children of comparable age and vocabulary size whose mothers had more schooling, consistent with previous findings of slower rates of language learning in children from disadvantaged backgrounds. These results add to the cross-linguistic literature on the development of spoken word recognition and to the study of the impact of socioeconomic status (SES) factors on early language development.


2021 ◽  
Author(s):  
James Magnuson ◽  
Samantha Grubb ◽  
Anne Marie Crinnion ◽  
Sahil Luthra ◽  
Phoebe Gaston

Norris and Cutler (in press) revisit their arguments that (lexical-to-sublexical) feedback cannot improve word recognition performance, based on the assumption that feedback must boost signal and noise equally. They also argue that demonstrations that feedback improves performance (Magnuson, Mirman, Luthra, Strauss, & Harris, 2018) in the TRACE model of spoken word recognition (McClelland & Elman, 1986) were artifacts of converting activations to response probabilities. We first evaluate their claim that feedback in an interactive activation model must boost noise and signal equally. This is not true in a fully interactive activation model such as TRACE, where the feedback signal does not simply mirror the feedforward signal; it is instead shaped by joint probabilities over lexical patterns, and the dynamics of lateral inhibition. Thus, even under high levels of noise, lexical feedback will selectively boost signal more than noise. We demonstrate that feedback promotes faster word recognition and preserves accuracy under noise whether one uses raw activations or response probabilities. We then document that lexical feedback selectively boosts signal (i.e., lexically-coherent series of phonemes) more than noise by tracking sublexical (phoneme) activations under noise with and without feedback. Thus, feedback in a model like TRACE does improve word recognition, exactly by selective reinforcement of lexically-coherent signal. We conclude that whether lexical feedback is integral to human speech processing is an empirical question, and briefly review a growing body of work at behavioral and neural levels that is consistent with feedback and inconsistent with autonomous (non-feedback) architectures.

