3 - The human voice

Ian Vince McLoughlin

doi:10.1017/CBO9781316084205.005

In Chapter 2 we looked at the general handling, processing and visualisation of audio: vectors or sequences of samples, captured at some particular sample rate, which together represent sound.

In this chapter, we will build upon that foundation and use it to begin to analyse speech. There is nothing special about speech from an audio perspective – it is simply a continuous sequence of time-varying amplitudes and tones, just like any other sound. It is only when a human hears it and the brain becomes involved that the sound is interpreted as speech.

A famous experiment demonstrates this using something called sinewave speech: a recording of a sentence made entirely from sinewaves. Initially, the brain of a listener does not consider this to be speech, and so the signal is unintelligible. However, once the listener has heard the corresponding sentence spoken aloud in the normal way, the brain suddenly ‘realises’ that the signal is in fact speech, and from then on it becomes intelligible. After that, the listener does not seem to ‘unlearn’ this ability to understand sinewave speech: subsequent sentences, which may be completely unintelligible to others, are now intelligible to this listener [8]. To listen to some sinewave speech, please go to the book website at http://mcloughlin.eu/sws.
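As a rough illustration of what a signal ‘made from sinewaves’ can look like in practice, the following Python sketch sums a few sinusoids whose frequencies glide over time. Real sinewave speech typically uses sinusoids that follow the formant frequencies of a recorded utterance; the tracks below are invented purely for illustration and will not sound like any sentence. The function name `synthesise_sinewave_tracks`, the 8 kHz sample rate and the track values are all assumptions, not taken from the book or its website.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for illustration)


def synthesise_sinewave_tracks(tracks, duration, sample_rate=SAMPLE_RATE):
    """Sum sinusoids whose frequencies glide linearly over the duration.

    tracks   -- list of (start_hz, end_hz, amplitude), one per sinusoid
    duration -- length of the output in seconds
    Returns a list of float samples.
    """
    n_samples = int(duration * sample_rate)
    phases = [0.0] * len(tracks)  # running phase of each sinusoid, in radians
    samples = []
    for n in range(n_samples):
        frac = n / n_samples  # position within the signal, from 0 up to 1
        value = 0.0
        for i, (start_hz, end_hz, amplitude) in enumerate(tracks):
            # Instantaneous frequency of this track at sample n
            freq = start_hz + (end_hz - start_hz) * frac
            value += amplitude * math.sin(phases[i])
            # Accumulate phase so the tone stays continuous as freq changes
            phases[i] += 2.0 * math.pi * freq / sample_rate
        samples.append(value)
    return samples


# Hypothetical frequency tracks (Hz) standing in for three formants
tracks = [(300.0, 700.0, 0.5), (2200.0, 1200.0, 0.3), (2500.0, 2800.0, 0.2)]
signal = synthesise_sinewave_tracks(tracks, duration=0.5)
```

The result is simply a vector of samples at a known sample rate, so it can be plotted, saved or played back like any other audio. Accumulating the phase sample by sample, rather than computing sin(2πft) directly, avoids clicks when the frequency changes.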

There is a point to sinewave speech. It demonstrates that, while speech is just a structured set of modulated frequencies, the combination of these in a certain way has a special meaning to the brain. Music and some naturally occurring sounds also have some inherently speech-like characteristics, but we do not often mistake music for speech. It is likely that there is some kind of decision process in the human hearing system that sends speech-like sounds to one part of the brain for processing (the part that handles speech), and sends other sounds to different parts of the brain. However, there is a lot hidden inside the human brain that we do not understand, and how it handles speech is just one of those grey areas.

Fortunately, speech itself is much easier to analyse and understand computationally: the speech signal is easy to capture with a microphone and record on a computer. Over the years, speech characteristics have been very well researched, and many specialised analysis, handling and processing methods have been developed for this particular type of audio.
