Book contents
- Frontmatter
- Contents
- Preface
- PART I BEGINNINGS
- PART II EARLY EXPLORATIONS: 1950S AND 1960S
- PART III EFFLORESCENCE: MID-1960S TO MID-1970S
- PART IV APPLICATIONS AND SPECIALIZATIONS: 1970s TO EARLY 1980s
- 17 Speech Recognition and Understanding Systems
- 18 Consulting Systems
- 19 Understanding Queries and Signals
- 20 Progress in Computer Vision
- 21 Boomtimes
- PART V “NEW-GENERATION” PROJECT
- PART VI ENTR'ACTE
- PART VII THE GROWING ARMAMENTARIUM: FROM THE 1980s ONWARD
- PART VIII MODERN AI: TODAY AND TOMORROW
- Index
- Plate section
17 - Speech Recognition and Understanding Systems
Published online by Cambridge University Press: 05 August 2013
Summary
Speech Processing
The NLP systems I have already described required that their English input be in text format. Yet, there are several instances in which speaking to a computer would be preferable to typing at one. People can generally speak faster than they can type (about three words per second versus about one word per second), and they can speak while they are moving about. Also, speaking does not tie up hands or eyes.
In discussing the problem of computer processing of speech, it is important to make some distinctions. One is the difference between recognizing an isolated spoken word and processing a continuous stream of speech. Most AI research has concentrated on the second and harder of these problems. Another distinction is between speech recognition and speech understanding.
By speech recognition is meant the process of converting an acoustic stream of speech input, as gathered by a microphone and associated electronic equipment, into a text representation of its component words. This process is difficult because many acoustic streams sound similar but are composed of quite different words. (Consider, for example, the spoken versions of “There are many ways to recognize speech,” and “There are many ways to wreck a nice beach.”) Speech understanding, in contrast, requires that what is spoken be understood. An utterance can be said to be understood if it elicits an appropriate action or response, and this might even be possible without recognizing all of its words.
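The ambiguity in the "recognize speech" example can be made concrete with a small sketch. The phoneme transcriptions below are hand-written approximations in an ARPAbet-like notation and are assumptions for illustration only; they do not come from the chapter or from any real pronunciation lexicon. The hypothetical helper `edit_distance` is a plain Levenshtein distance, used here only to compare how far apart the two phrases are as sound sequences versus as word sequences.

```python
# Hypothetical, hand-written phoneme transcriptions (ARPAbet-like symbols).
# They are rough approximations for illustration, not output from a real lexicon.
RECOGNIZE_SPEECH = ["R", "EH", "K", "AH", "G", "N", "AY", "Z", "S", "P", "IY", "CH"]
WRECK_A_NICE_BEACH = ["R", "EH", "K", "AH", "N", "AY", "S", "B", "IY", "CH"]


def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # cost of deleting the first i items of a
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


# Compare the two phrases as sound sequences and as word sequences.
phoneme_distance = edit_distance(RECOGNIZE_SPEECH, WRECK_A_NICE_BEACH)
word_distance = edit_distance("recognize speech".split(),
                              "wreck a nice beach".split())

print("phoneme edits:", phoneme_distance, "of", len(RECOGNIZE_SPEECH), "phonemes")
print("word edits:", word_distance)
```

Under these assumed transcriptions, only a few phonemes separate the two phrases even though no word is shared between them, which is one way to see why an acoustic stream alone may not determine its component words.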
- Type: Chapter
- Information: The Quest for Artificial Intelligence, pp. 209-223. Publisher: Cambridge University Press. Print publication year: 2009