Spoken language processing by machine

Author(s):  
Roger K. Moore

The past twenty-five years have witnessed a steady improvement in the capabilities of spoken language technology, first in the research laboratory and more recently in the commercial marketplace. Progress has reached a point where automatic speech recognition software for dictating documents onto a computer is available as an inexpensive consumer product in most computer stores, text-to-speech synthesis can be heard in public places giving automated voice announcements, and interactive voice response is becoming a familiar option for people paying bills or booking cinema tickets over the telephone. This article looks at the main computational approaches employed in contemporary spoken language processing. It discusses acoustic modelling, language modelling, pronunciation modelling, and noise modelling. The article also considers future prospects in the context of the obvious shortcomings of current technology, and briefly addresses the potential for achieving a unified approach to human and machine spoken language processing.

Author(s):  
Christina Blomquist,
Rochelle S. Newman,
Yi Ting Huang,
Jan Edwards

Purpose: Children with cochlear implants (CIs) are more likely to struggle with spoken language than their age-matched peers with normal hearing (NH), and new language processing literature suggests that these challenges may be linked to delays in spoken word recognition. The purpose of this study was to investigate whether children with CIs use language knowledge via semantic prediction to facilitate recognition of upcoming words and help compensate for uncertainties in the acoustic signal.
Method: Five- to 10-year-old children with CIs heard sentences with an informative verb ("draws") or a neutral verb ("gets") preceding a target word ("picture"). The target referent was presented on a screen, along with a phonologically similar competitor ("pickle"). Children's eye gaze was recorded to quantify efficiency of access of the target word and suppression of phonological competition. Performance was compared to both an age-matched group and a vocabulary-matched group of children with NH.
Results: Children with CIs, like their peers with NH, demonstrated use of informative verbs to look more quickly to the target word and look less to the phonological competitor. However, children with CIs demonstrated less efficient use of semantic cues relative to their peers with NH, even when matched for vocabulary ability.
Conclusions: Children with CIs use semantic prediction to facilitate spoken word recognition but do so to a lesser extent than children with NH. Children with CIs experience challenges in predictive spoken language processing above and beyond limitations from delayed vocabulary development. Children with CIs with better vocabulary ability demonstrate more efficient use of lexical-semantic cues. Clinical interventions focusing on building knowledge of words and their associations may support efficiency of spoken language processing for children with CIs. Supplemental Material: https://doi.org/10.23641/asha.14417627


Author(s):  
Michael K. Tanenhaus

Recently, eye movements have become a widely used response measure for studying spoken language processing in both adults and children, in situations where participants comprehend and generate utterances about a circumscribed “Visual World” while fixation is monitored, typically using a free-view eye-tracker. Psycholinguists now use the Visual World eye-movement method to study both language production and language comprehension, in studies that run the gamut of current topics in language processing. Eye movements are a response measure of choice for addressing many classic questions about spoken language processing in psycholinguistics. This article reviews the burgeoning Visual World literature on language comprehension, highlighting some of the seminal studies and examining how the Visual World approach has contributed new insights to our understanding of spoken word recognition, parsing, reference resolution, and interactive conversation. It considers some of the methodological issues that come to the fore when psycholinguists use eye movements to examine spoken language comprehension.


2004
Author(s):  
Jinyoung Kim,
Jeesun Kim,
Chris Davis

Author(s):  
Andrej Zgank,
Izidor Mlakar,
Uros Berglez,
Danilo Zimsek,
Matej Borko,
...  

The chapter presents an overview of human-computer interfaces, which are a crucial element of any ambient intelligence solution. The focus is on embodied conversational agents, which are needed to communicate with users in the most natural way. Different input and output modalities, together with supporting methods for processing the captured information (e.g., automatic speech recognition, gesture recognition, natural language processing, dialogue processing, text-to-speech synthesis), play a crucial role in providing a high quality of experience to the user. As an example, the use of an embodied conversational agent in the e-Health domain is proposed.

