5 - Psychoacoustics
Published online by Cambridge University Press: 05 June 2016
Summary
If there is one topic that has most deeply impacted audio and speech research over the past two decades or so, it is psychoacoustics (by contrast, the deepest impact in audio and speech engineering has probably been big data – explored separately in Chapter 8). We now know that perceived sounds and speech owe just as much to psychology as they do to physiology. The state and activity of the human brain and nervous system have a profound influence on the characteristics of speech and sounds that are perceived by human listeners.
It is beyond the scope of this book to delve deeply into the psychological reasons underpinning psychoacoustics (indeed, much of that detail remains to be discovered), but we will discuss, demonstrate and uncover many interesting and useful psychoacoustic phenomena in this chapter. Extensive experiments by cross-disciplinary researchers over the past two decades have enabled the development of computational models that begin to describe the effects of psychoacoustics. While these models vary in complexity and accuracy, and continue to improve in quality and usefulness, they have already found applications in many areas of daily life. The following sections survey many of the effects that the models can (or could) describe. We will take a fascinating look at auditory scene analysis (which includes a number of auditory-illusion style demonstrations), before building and applying our own psychoacoustic models.
Psychoacoustic processing
The use of psychoacoustic criteria to improve communications systems, or rather to target the available resources towards more subjectively important areas, is now common. Many telephone communications systems use A-law compression, a logarithmic companding scheme. Around 1990, Philips and Sony respectively produced the DCC (digital compact cassette) and the MiniDisc formats, both of which make extensive use of equal-loudness contours and masking information to compress high-quality audio [54]. Whilst neither of these was a runaway market success, they introduced psychoacoustics to the music industry, and paved the way for solid-state music players such as the Creative Technology Zen Micro, Apple iPod and various devices from other innovative companies.
Most of these devices use the popular MP3 compression format, although more recent formats also exist, such as Ogg Vorbis, MP4 and various proprietary alternatives including WMA (refer to Infobox 2.2 on page 15 for descriptions of these).
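As a small illustration of the A-law compression mentioned above, the following is a minimal Python sketch of the continuous A-law companding curve, not the chapter's own MATLAB code. It assumes samples normalised to the range [-1, 1] and the commonly quoted parameter value A = 87.6. The function names `alaw_compress` and `alaw_expand` are hypothetical, and the sketch leaves out the 8-bit segmented quantisation that a real telephony codec applies on top of this curve.

```python
import math

A = 87.6  # assumed compression parameter, the value commonly quoted for A-law


def alaw_compress(x, a=A):
    """Map a sample x in [-1, 1] onto the continuous A-law curve."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom                      # linear segment for very quiet samples
    else:
        y = (1.0 + math.log(a * ax)) / denom    # logarithmic segment for louder samples
    return math.copysign(y, x)


def alaw_expand(y, a=A):
    """Inverse of alaw_compress: recover the linear sample from a companded value."""
    ay = abs(y)
    denom = 1.0 + math.log(a)
    if ay < 1.0 / denom:
        x = ay * denom / a                      # inverse of the linear segment
    else:
        x = math.exp(ay * denom - 1.0) / a      # inverse of the logarithmic segment
    return math.copysign(x, y)


if __name__ == "__main__":
    # Quiet samples are boosted far more than loud ones before quantisation.
    for sample in (0.001, 0.01, 0.1, 0.5, 1.0):
        companded = alaw_compress(sample)
        print(f"{sample:6.3f} -> {companded:6.3f} -> {alaw_expand(companded):6.3f}")
```

The curve gives quiet samples a much larger share of the output range than loud ones, so a fixed number of quantisation levels is spent where small errors are most audible. This is the sense in which companding targets resources towards subjectively important areas.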
- Speech and Audio Processing: A MATLAB-based Approach, pp. 109-139. Publisher: Cambridge University Press. Print publication year: 2016.