COSMO-Onset: A Neurally-Inspired Computational Model of Spoken Word Recognition, Combining Top-Down Prediction and Bottom-Up Detection of Syllabic Onsets

2021 ◽  
Vol 15 ◽  
Author(s):  
Mamady Nabé ◽  
Jean-Luc Schwartz ◽  
Julien Diard

Recent neurocognitive models commonly consider speech perception as a hierarchy of processes, each corresponding to specific temporal scales of collective oscillatory processes in the cortex: 30–80 Hz gamma oscillations in charge of phonetic analysis, 4–9 Hz theta oscillations in charge of syllabic segmentation, 1–2 Hz delta oscillations processing prosodic/syntactic units, and the 15–20 Hz beta channel possibly involved in top-down predictions. Several recent neuro-computational models thus feature theta oscillations, driven by the speech acoustic envelope, to achieve syllabic parsing before lexical access. However, it is unlikely that such syllabic parsing, performed in a purely bottom-up manner from envelope variations, would be fully efficient in all situations, especially in adverse sensory conditions. We present a new probabilistic model of spoken word recognition, called COSMO-Onset, in which syllabic parsing relies on the fusion of top-down, lexical prediction of onset events with bottom-up onset detection from the acoustic envelope. We report preliminary simulations analyzing how the model performs syllabic parsing and phone, syllable, and word recognition. We show that, while purely bottom-up onset detection is sufficient for word recognition in nominal conditions, top-down prediction of syllabic onset events allows the model to overcome challenging adverse conditions, such as a degraded acoustic envelope that leads to spurious or missing onset events in the sensory signal. This suggests a possible computational, functional role for top-down predictive processes during speech recognition, consistent with recent models of neuronal oscillatory processes.
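As a purely illustrative aid (not the authors' published COSMO-Onset implementation), the following minimal Python sketch shows one way such a fusion could be expressed: bottom-up onset evidence is extracted from increases in the acoustic envelope, top-down lexical predictions are expressed as a distribution over expected onset times, and the two distributions are multiplied and renormalized. All function names, parameter values, and the toy envelope are hypothetical.

import numpy as np

# Minimal illustrative sketch (hypothetical, not the published COSMO-Onset code):
# fuse bottom-up onset evidence from the acoustic envelope with top-down,
# lexically predicted onset probabilities.

def bottom_up_onset_probability(envelope, threshold=0.1):
    """Per-frame onset evidence from positive changes in the acoustic envelope."""
    delta = np.diff(envelope, prepend=envelope[0])
    evidence = np.clip(delta - threshold, 0.0, None)
    total = evidence.sum()
    if total > 0:
        return evidence / total
    return np.full_like(envelope, 1.0 / len(envelope))  # uninformative if no evidence

def top_down_onset_probability(n_frames, predicted_onsets, width=3.0):
    """Lexical prediction of onset times, modelled as a mixture of Gaussians."""
    t = np.arange(n_frames)
    prob = np.zeros(n_frames)
    for mu in predicted_onsets:
        prob += np.exp(-0.5 * ((t - mu) / width) ** 2)
    return prob / prob.sum()

def fuse_onset_probabilities(bottom_up, top_down, eps=1e-9):
    """Probabilistic fusion: multiply the two distributions and renormalize."""
    fused = (bottom_up + eps) * (top_down + eps)
    return fused / fused.sum()

# Usage: a degraded envelope with a spurious bump; the top-down prediction
# helps suppress it and keeps the two true syllabic onsets.
envelope = np.zeros(100)
envelope[10:30] = 1.0   # first syllable
envelope[55:75] = 1.0   # second syllable
envelope[85:88] = 0.6   # spurious bump (degraded signal)

bu = bottom_up_onset_probability(envelope)
td = top_down_onset_probability(len(envelope), predicted_onsets=[10, 55])
fused = fuse_onset_probabilities(bu, td)
print("Most likely onset frames:", sorted(np.argsort(fused)[-2:]))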

2000 ◽  
Vol 23 (3) ◽  
pp. 337-338 ◽  
Author(s):  
William D. Marslen-Wilson

Norris et al. argue against using evidence from phonetic decision-making to support top-down feedback in lexical access, on the grounds that phonetic decision-making relies on processes outside the normal access sequence. This leaves open the possibility that bottom-up connectionist models, with some contextual constraints built into the access process, are still the preferred models of spoken-word recognition.


2010 ◽  
Vol 18 (1) ◽  
pp. 136-164 ◽  
Author(s):  
Odette Scharenborg ◽  
Lou Boves

Computational modelling has proven to be a valuable approach in developing theories of spoken-word processing. In this paper, we focus on a particular class of theories in which it is assumed that the spoken-word recognition process consists of two consecutive stages, with an ‘abstract’ discrete symbolic representation at the interface between the stages. In evaluating computational models, it is important to bring in independent arguments for the cognitive plausibility of the algorithms that are selected to compute the processes in a theory. This paper discusses the relation between behavioural studies, theories, and computational models of spoken-word recognition. We explain how computational models can be assessed in terms of the goodness of fit with the behavioural data and the cognitive plausibility of the algorithms. An in-depth analysis of several models provides insights into how computational modelling has led to improved theories and to a better understanding of the human spoken-word recognition process.
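As a hedged illustration of the "goodness of fit" notion mentioned above (not a method taken from the paper), one common approach is to compare simulated and observed behavioural measures, for example per-item recognition times, with a correlation coefficient and an error score. The numbers below are invented for the example.

import numpy as np

# Hypothetical behavioural data (ms) and model output (ms), for illustration only.
observed_rt = np.array([420.0, 510.0, 485.0, 390.0, 600.0])
simulated_rt = np.array([430.0, 495.0, 500.0, 405.0, 580.0])

r = np.corrcoef(observed_rt, simulated_rt)[0, 1]            # Pearson correlation
rmse = np.sqrt(np.mean((observed_rt - simulated_rt) ** 2))  # root-mean-square error
print(f"Pearson r = {r:.3f}, RMSE = {rmse:.1f} ms")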

