
Syntactic probabilities affect pronunciation variation in spontaneous speech

  • Harry Tily, Susanne Gahl, Inbal Arnon, Neal Snider, Anubha Kothari and Joan Bresnan


Speakers frequently have a choice among multiple ways of expressing one and the same thought. When choosing between syntactic constructions for expressing a given meaning, speakers are sensitive to probabilistic tendencies for syntactic, semantic or contextual properties of an utterance to favor one construction or another. Taken together, such tendencies may align to make one construction overwhelmingly more probable, marginally more probable, or no more probable than another. Here, we present evidence that acoustic features of spontaneous speech reflect these probabilities: when speakers choose a less probable construction, they are more likely to be disfluent, and their fluent words are likely to have a relatively longer duration. Conversely, words in more probable constructions are shorter and spoken more fluently. Our findings suggest that the differing probabilities of a syntactic construction in context are not epiphenomenal, but reflect part of speakers' knowledge of their language.


Corresponding author

Correspondence addresses: Harry Tily, Linguistics, Margaret Jacks Hall, Stanford University, CA 94305, USA. E-mail:




