Speaker-specific processing and local context information: The case of speaking rate

  • EVA REINISCH

Abstract

To deal with variation in the speech signal, listeners rely on local context, such as speaking rate in a carrier sentence directly preceding a target, as well as more global properties of the speech signal, such as speaker-specific pronunciation variants. The present study addressed whether, despite its variability even within one speaker, habitual speaking rate can be tracked as a speaker-specific property and how such speaker-specific tracking of habitual rate would interact with effects of local-rate normalization. In two experiments, listeners were exposed to a 2-min dialogue between a fast and a slow speaker. At test, listeners categorized minimal word pair continua differing in the German /a/–/a:/ duration contrast spoken by the same two speakers. The results showed that listeners responded with /a:/ more often for the fast speaker but only when words were presented in isolation and not when presented with additional local-rate information. That is, despite the general assumption that duration cues and speaking rate are too variable to be used in a speaker-specific fashion, tracking habitual speaking rate may help speech perception. The results are discussed in relation to a belief-updating model of perceptual adaptation and exemplar models.

Corresponding author

ADDRESS FOR CORRESPONDENCE Eva Reinisch, Institute of Phonetics and Speech Processing, Ludwig Maximilian University Munich, Schellingstraße 3, Munich 80799, Germany. E-mail: evarei@phonetik.uni-muenchen.de
