This paper has two main objectives. First, we propose that cross-linguistic variation in speech rhythm is not phonetically manifested simply as acoustic isochrony, but rather as relative temporal stability of syllables – i.e. the tendency for certain syllables to occur at particular points in time despite other factors that oppose this tendency. Second, we compare two languages believed to be rhythmically distinct, English and Japanese, and demonstrate that they display not only reliable rhythmic differences but also striking similarities in the phonetic manifestation of foot-level structure. Both objectives are pursued using speech cycling – an artificial speaking task in which subjects repeat phrases in time with periodic auditory stimuli.
There has long been an intuition that languages are spoken with different kinds of rhythm. Conventionally, languages have been classified as either ‘stress-timed’ or ‘syllable-timed’ (Jones 1918; Pike 1945; Abercrombie 1967), depending on whether it is the interstress intervals or the intersyllable intervals that are regular. The timing prediction generally taken to follow from the labels ‘stress-timed’, ‘syllable-timed’, or even ‘mora-timed’ (Port, Dalby and O'Dell 1987) is perfect isochrony (equal time intervals) of stresses, syllables or moras. Not surprisingly, however, perfect isochrony has proven to be a difficult test to satisfy, at least in naturalistic speaking styles (Lehiste 1977; Dauer 1983; Couper-Kuhlen 1993).
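To make the isochrony prediction concrete, the following minimal sketch computes the coefficient of variation of the intervals between successive onsets (of stresses, syllables or moras). This is an illustration only, not a measure taken from the studies cited above: the function name `interval_cv` and both sets of onset times are hypothetical. Under perfect isochrony the value is zero; any departure from equal intervals yields a positive value.

```python
from statistics import mean, pstdev


def interval_cv(onsets):
    """Coefficient of variation of the intervals between successive onsets.

    `onsets` is a list of onset times in seconds, in ascending order.
    A value of 0.0 corresponds to perfect isochrony (all intervals equal).
    """
    if len(onsets) < 3:
        raise ValueError("need at least three onsets (two intervals)")
    # Durations between each onset and the next one.
    intervals = [b - a for a, b in zip(onsets, onsets[1:])]
    return pstdev(intervals) / mean(intervals)


# Hypothetical onset times (seconds), invented for illustration only.
perfectly_isochronous = [0.0, 0.5, 1.0, 1.5, 2.0]
naturalistic = [0.0, 0.42, 1.03, 1.48, 2.11]

print(interval_cv(perfectly_isochronous))
print(interval_cv(naturalistic))
```

A strict isochrony test would require a value at or near zero for the relevant unit, which is the criterion that naturalistic speech has generally failed to meet.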
In generative phonology, a way of characterizing the rhythm of so-called stress-accent languages has evolved based on the quasi-periodic alternation of strong and weak syllables (Liberman and Prince 1977; Selkirk 1984).