Spoken language processing by machine
The past twenty-five years have witnessed a steady improvement in the capabilities of spoken language technology, first in the research laboratory and more recently in the commercial marketplace. Progress has reached a point where automatic speech recognition software for dictating documents onto a computer is available as an inexpensive consumer product in most computer stores, text-to-speech synthesis can be heard in public places giving automated voice announcements, and interactive voice response is becoming a familiar option for people paying bills or booking cinema tickets over the telephone. This article looks at the main computational approaches employed in contemporary spoken language processing. It discusses acoustic modelling, language modelling, pronunciation modelling, and noise modelling. The article also considers future prospects in the context of the obvious shortcomings of current technology, and briefly addresses the potential for achieving a unified approach to human and machine spoken language processing.