Prosody and Spoken-Word Recognition
This chapter outlines a Bayesian model of spoken-word recognition and reviews the role that prosody plays in that model. The review focuses on the information that helps the listener recognize the prosodic structure of an utterance and on how prior knowledge about prosodic structure in turn constrains spoken-word recognition. Recognition is argued to be a process of perceptual inference that makes listening robust to variability in the speech signal. In essence, the listener makes inferences about the segmental content of each utterance, about its prosodic structure (simultaneously at different levels of the prosodic hierarchy), and about the words it contains, and uses these inferences to interpret the utterance. Four characteristics of the proposed prosody-enriched recognition model are discussed: parallel uptake of different information types, high contextual dependency, adaptive processing, and phonological abstraction. Next steps for developing the model are also discussed.
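As a rough illustration of the kind of perceptual inference described above, the following minimal sketch combines a prior over candidate words with segmental and prosodic evidence using Bayes' rule. It is not the chapter's model: the function name `word_posterior`, the candidate words, and all probability values are hypothetical, and the sketch assumes, purely for simplicity, that segmental and prosodic evidence are conditionally independent given the word.

```python
def word_posterior(priors, segmental_lik, prosodic_lik):
    """Return P(word | evidence) for each candidate word.

    Simplifying assumption (not from the chapter): segmental and prosodic
    evidence are conditionally independent given the word, so their
    likelihoods multiply.
    """
    # Unnormalized posterior: prior x segmental likelihood x prosodic likelihood.
    unnormalized = {
        word: priors[word] * segmental_lik[word] * prosodic_lik[word]
        for word in priors
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No candidate word is compatible with the evidence.")
    # Normalize so the posterior probabilities sum to 1.
    return {word: value / total for word, value in unnormalized.items()}


# Hypothetical values: two candidates that overlap segmentally but differ in
# stress pattern. The prior favors "trusty", yet the prosodic evidence
# (a stressed final syllable) favors "trustee".
priors = {"trusty": 0.6, "trustee": 0.4}
segmental_lik = {"trusty": 0.5, "trustee": 0.5}
prosodic_lik = {"trusty": 0.2, "trustee": 0.8}

posterior = word_posterior(priors, segmental_lik, prosodic_lik)
for word, probability in sorted(posterior.items()):
    print(f"{word}: {probability:.3f}")
```

With these made-up numbers the prosodic evidence outweighs the prior, so the less frequent candidate ends up with the higher posterior probability. A fuller treatment would infer prosodic structure at several levels of the prosodic hierarchy at once rather than from a single likelihood term.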