Spoken language processing by machine

Author(s):  
Roger K. Moore

The past twenty-five years have witnessed a steady improvement in the capabilities of spoken language technology, first in the research laboratory and more recently in the commercial marketplace. Progress has reached a point where automatic speech recognition software for dictating documents onto a computer is available as an inexpensive consumer product in most computer stores, text-to-speech synthesis can be heard in public places giving automated voice announcements, and interactive voice response is becoming a familiar option for people paying bills or booking cinema tickets over the telephone. This article looks at the main computational approaches employed in contemporary spoken language processing. It discusses acoustic modelling, language modelling, pronunciation modelling, and noise modelling. The article also considers future prospects in the context of the obvious shortcomings of current technology, and briefly addresses the potential for achieving a unified approach to human and machine spoken language processing.

Author(s):  
Christina Blomquist,
Rochelle S. Newman,
Yi Ting Huang,
Jan Edwards

Purpose: Children with cochlear implants (CIs) are more likely to struggle with spoken language than their age-matched peers with normal hearing (NH), and new language processing literature suggests that these challenges may be linked to delays in spoken word recognition. The purpose of this study was to investigate whether children with CIs use language knowledge via semantic prediction to facilitate recognition of upcoming words and help compensate for uncertainties in the acoustic signal.
Method: Five- to 10-year-old children with CIs heard sentences with an informative verb ("draws") or a neutral verb ("gets") preceding a target word ("picture"). The target referent was presented on a screen, along with a phonologically similar competitor ("pickle"). Children's eye gaze was recorded to quantify efficiency of access of the target word and suppression of phonological competition. Performance was compared to both an age-matched group and a vocabulary-matched group of children with NH.
Results: Children with CIs, like their peers with NH, demonstrated use of informative verbs to look more quickly to the target word and look less to the phonological competitor. However, children with CIs demonstrated less efficient use of semantic cues relative to their peers with NH, even when matched for vocabulary ability.
Conclusions: Children with CIs use semantic prediction to facilitate spoken word recognition but do so to a lesser extent than children with NH. Children with CIs experience challenges in predictive spoken language processing above and beyond limitations from delayed vocabulary development. Children with CIs with better vocabulary ability demonstrate more efficient use of lexical-semantic cues. Clinical interventions focusing on building knowledge of words and their associations may support efficiency of spoken language processing for children with CIs. Supplemental Material: https://doi.org/10.23641/asha.14417627


Author(s):  
Michael K. Tanenhaus

Recently, eye movements have become a widely used response measure for studying spoken language processing in both adults and children, in situations where participants comprehend and generate utterances about a circumscribed “Visual World” while fixation is monitored, typically using a free-view eye-tracker. Psycholinguists now use the Visual World eye-movement method to study both language production and language comprehension, in studies that run the gamut of current topics in language processing. Eye movements are a response measure of choice for addressing many classic questions about spoken language processing in psycholinguistics. This article reviews the burgeoning Visual World literature on language comprehension, highlighting some of the seminal studies and examining how the Visual World approach has contributed new insights to our understanding of spoken word recognition, parsing, reference resolution, and interactive conversation. It considers some of the methodological issues that come to the fore when psycholinguists use eye movements to examine spoken language comprehension.


2004
Author(s):  
Jinyoung Kim,
Jeesun Kim,
Chris Davis

Author(s):  
Andrej Zgank,
Izidor Mlakar,
Uros Berglez,
Danilo Zimsek,
Matej Borko,
...  

The chapter presents an overview of human-computer interfaces, which are a crucial element of any ambient intelligence solution. The focus is on embodied conversational agents, which are needed to communicate with users in the most natural way. Different input and output modalities, together with supporting methods for processing the captured information (e.g., automatic speech recognition, gesture recognition, natural language processing, dialogue processing, text-to-speech synthesis), play a crucial role in providing a high quality of experience to the user. As an example, the use of an embodied conversational agent in the e-Health domain is proposed.

