Section IV - Audition and Perception

Published online by Cambridge University Press: 11 November 2021

Edited by Rachael-Anne Knight and Jane Setter

Rachael-Anne Knight: City, University of London
Jane Setter: University of Reading

Type: Chapter
Information: The Cambridge Handbook of Phonetics, pp. 405–500

DOI: https://doi.org/10.1017/9781108644198

Publisher: Cambridge University Press

Print publication year: 2021

References

16.7 References

Abutalebi, J. (2008). Neural aspects of second language representation and language control. Acta Psychologica, 128(3), 466–78.Google Scholar

Abutalebi, J., Della Rosa, P. A., Green, D. W., Hernandez, M., Scifo, P., Keim, R. et al. (2012). Bilingualism tunes the anterior cingulate cortex for conflict monitoring. Cerebral Cortex, 22(9), 2076–86.Google Scholar

Arantes, M., Arantes, J. & Ferreira, M. A. (2018). Tools and resources for neuroanatomy education: A systematic review. BMC Medical Education, 18(1), 94.CrossRef Google Scholar PubMed

Beaulieu, C. (2002). The basis of anisotropic water diffusion in the nervous system: A technical review. NMR in Biomedicine, 15(7–8), 435–55.Google Scholar

Beaulieu, C. (2014). The biological basis of diffusion anisotropy BT – diffusion MRI: From quantitative measurement to in-vivo neuroanatomy. In Johansen-Berg, H. & Behrens, T. E. J., eds., Diffusion MRI: From Quantitative Measurement to In-vivo Neuroanatomy. San Diego, CA: Academic Press, pp. 155–83.Google Scholar

Bidelman, G. M. (2018). Subcortical sources dominate the neuroelectric auditory frequency-following response to speech. NeuroImage, 175, 56–69.Google Scholar

Binder, J. R. (2015). The Wernicke area: Modern evidence and a reinterpretation. Neurology, 85(24), 2170–5.Google Scholar

Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S., Springer, J. A., Kaufman, J. N. et al. (2000). Human temporal lobe activation by speech and nonspeech sounds. Cerebral Cortex, 10(5), 512–28.Google Scholar

Bopp, K. L. & Verhaeghen, P. (2005). Aging and verbal memory span: A meta-analysis. The Journals of Gerontology. Series B, Psychological Sciences and Social Sciences, 60(5), P223–P233.CrossRef Google Scholar PubMed

Brauer, J., Anwander, A. & Friederici, A. D. (2011). Neuroanatomical prerequisites for language functions in the maturing brain. Cerebral Cortex, 21(2), 459–66.Google Scholar

Burke, D. M. & Shafto, M. A. (2004). Aging and language production. Current Directions in Psychological Science, 13(1), 21–4.Google Scholar

Catani, M. & Mesulam, M. (2008). The arcuate fasciculus and the disconnection theme in language and aphasia: history and current state. Cortex, 44(8), 953–61.CrossRef Google Scholar PubMed

Chandrasekaran, B., Hornickel, J., Skoe, E., Nicol, T. & Kraus, N. (2009). Context-dependent encoding in the human auditory brainstem relates to hearing speech in noise: Implications for developmental dyslexia. Neuron, 64(3), 311–19.Google Scholar

Chandrasekaran, B., Chan, A. H. D. & Wong, P. C. M. (2011). Neural processing of what and who information in speech. Journal of Cognitive Neuroscience, 23(10), 2690–700.CrossRef Google Scholar PubMed

Chartier, J., Anumanchipalli, G. K., Johnson, K. & Chang, E. F. (2018). Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex. Neuron, 98(5), 1042–54.Google Scholar

Chee, M. W. L., Hon, N., Lee, H. L. & Soon, C. S. (2001). Relative language proficiency modulates BOLD signal change when bilinguals perform semantic judgments. NeuroImage, 13(6), 1155–63.Google Scholar

Chodosh, J., Reuben, D. B., Albert, M. S. & Seeman, T. E. (2002). Predicting cognitive impairment in high-functioning community-dwelling older persons: MacArthur studies of successful aging. Journal of the American Geriatrics Society, 50(6), 1051–60.Google Scholar

Coffey, E. B. J., Herholz, S. C., Chepesiuk, A. M. P., Baillet, S. & Zatorre, R. J. (2016). Cortical contributions to the auditory frequency-following response revealed by MEG. Nature Communications, 7, 11070.CrossRef Google Scholar

Cullum, S., Huppert, F. A., Mcgee, M., Dening, T. O. M., Ahmed, A., Paykel, E. S. et al. (2000). Decline across different domains of cognitive function in normal ageing: Results of a longitudinal population-based study using CAMCOG. International Journal of Geriatric Psychiatry, 15(9), 853–62.3.0.CO;2-T>CrossRef Google Scholar PubMed

Davis, M. H. & Johnsrude, I. S. (2003). Hierarchical processing in spoken language comprehension. The Journal of Neuroscience, 23(8), 3423–31.Google Scholar

Diamond, M. C., Scheibel, A. B. & Elson, L. M. (1985). The Human Brain Coloring Book: Coloring Concepts. New York: HarperCollins.Google Scholar

Drachman, D. A. (2006). Aging of the brain, entropy, and Alzheimer disease. Neurology, 67(8), 1340–52.Google Scholar

Eckert, M. A., Keren, N. I., Roberts, D. R., Calhoun, V. D. & Harris, K. C. (2010). Age-related changes in processing speed: Unique contributions of cerebellar and prefrontal cortex. Frontiers in Human Neuroscience, 4, 10.Google Scholar PubMed

Federmeier, K. D., Van Petten, C., Schwartz, T. J. & Kutas, M. (2003). Sounds, words, sentences: Age-related changes across levels of language processing. Psychology and Aging, 18(4), 858–72.Google Scholar

Feng, G., Ingvalson, E. M., Grieco-Calub, T. M., Roberts, M. Y., Ryan, M. E., Birmingham, P. et al. (2018). Neural preservation underlies speech improvement from auditory deprivation in young cochlear implant recipients. Proceedings of the National Academy of Sciences of the United States of America, 115(5), E1022–E1031.Google Scholar

Flinker, A., Korzeniewska, A., Shestyuk, A. Y., Franaszczuk, P. J., Dronkers, N. F., Knight, R. T. et al. (2015). Redefining the role of Broca’s area in speech. Proceedings of the National Academy of Sciences of the United States of America, 112(9), 2871–5.Google Scholar PubMed

Formisano, E., De Martino, F., Bonte, M. & Goebel, R. (2008). ‘Who’ is saying ‘what’? Brain-based decoding of human voice and speech. Science, 322(5903), 970–3.Google Scholar

Friederici, A. D. (2002). Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences, 6(2), 78–84.Google Scholar

Friederici, A. D. (2009). Allocating functions to fiber tracts: Facing its indirectness. Trends in Cognitive Sciences, 13(9), 370–1.Google Scholar

Giles, J. (2010). Clinical neuroscience attachments: A student’s view of ‘neurophobia’. The Clinical Teacher, 7(1), 9–13.CrossRef Google Scholar PubMed

Glasser, M. F. & Rilling, J. K. (2008). DTI tractography of the human brain’s language pathways. Cerebral Cortex, 18(11), 2471–82.Google Scholar

Goebel, R. (2008). Brain Tutor 3D. Retrieved from www.brainvoyager.com.Google Scholar

Golestani, N., Molko, N., Dehaene, S., LeBihan, D. & Pallier, C. (2007). Brain structure predicts the learning of foreign speech sounds. Cerebral Cortex, 17(3), 575–82.Google Scholar

Grady, C. L. & Craik, F. I. (2000). Changes in memory processing with age. Current Opinion in Neurobiology, 10(2), 224–31.Google Scholar

Green, D. W. (2003). Neural basis of lexicon and grammar in L2 acquisition: The convergence hypothesis. In van Hout, R., Hulk, A., Kuiken, F. & Towell, R. J., eds., The Interface Between Syntax and the Lexicon in Second Language Acquisition. Amsterdam: John Benjamins, pp. 197–218.Google Scholar

Greenwood, P. M. (2007). Functional plasticity in cognitive aging: Review and hypothesis. Neuropsychology, 21(6), 657–73.Google Scholar

Hagoort, P. (2005). On Broca, brain, and binding: A new framework. Trends in Cognitive Sciences, 9(9), 416–23.CrossRef Google Scholar

Harrington, D. L. & Haaland, K. Y. (1992). Skill learning in the elderly: Diminished implicit and explicit memory for a motor sequence. Psychology and Aging, 7(3), 425–34.Google Scholar

Hickok, G. & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8(5), 393–402.Google Scholar

Hirayasu, Y., McCarley, R. W., Salisbury, D. F., Tanaka, S., Kwon, J. S., Frumin, M. et al. (2000). Planum temporale and Heschl’s gyrus volume reduction in schizophrenia: A magnetic resonance imaging study of first-episode patients. Archives of General Psychiatry, 57(7), 692–99.Google Scholar

Johnson, K. & Mullennix, J. W. (1997). Talker Variability in Speech Processing. San Diego, CA: Academic Press.Google Scholar

Kraus, N. & Chandrasekaran, B. (2010). Music training for the development of auditory skills. Nature Reviews Neuroscience, 11(8), 599–605.Google Scholar

Krishnan, A., Xu, Y., Gandour, J. & Cariani, P. (2005). Encoding of pitch in the human brainstem is sensitive to language experience. Brain Research. Cognitive Brain Research, 25(1), 161–8.CrossRef Google Scholar PubMed

Leonard, M. K., Desai, M., Hungate, D., Cai, R., Singhal, N. S., Knowlton, R. C. et al. (2019). Direct cortical stimulation of inferior frontal cortex disrupts both speech and music production in highly trained musicians. Cognitive Neuropsychology, 36(3–4), 158–66.Google Scholar

Liberman, A. M. & Mattingly, I. G. (1985). The motor theory of speech perception revisited. Cognition, 21, 1–36.Google Scholar

Mechelli, A., Crinion, J. T., Noppeney, U., O’Doherty, J., Ashburner, J., Frackowiak, R. S. et al. (2004). Structural plasticity in the bilingual brain. Nature, 431(7010), 757–757.Google Scholar

Menjot de Champfleur, N., Lima Maldonado, I., Moritz-Gasser, S., Machi, P., Le Bars, E., Bonafé, A. et al. (2013). Middle longitudinal fasciculus delineation within language pathways: A diffusion tensor imaging study in human. European Journal of Radiology, 82(1), 151–7.Google Scholar

Mitchell, D. B. & Bruss, P. J. (2003). Age differences in implicit memory: Conceptual, perceptual, or methodological? Psychology and Aging, 18(4), 807–22.Google Scholar

Morse, C. K. (1993). Does variability increase with age? An archival study of cognitive measures. Psychology and Aging, 8(2), 156–64.Google Scholar

Neef, N. E., Müller, B., Liebig, J., Schaadt, G., Grigutsch, M., Gunter, T. C. et al. (2017). Dyslexia risk gene relates to representation of sound in the auditory brainstem. Developmental Cognitive Neuroscience, 24, 63–71.Google Scholar

Nelson, D. L., Schreiber, T. A. & McEvoy, C. L. (1992). Processing implicit and explicit representations. Psychological Review, 99(2), 322–48.Google Scholar

Okada, K., Rong, F., Venezia, J., Matchin, W., Hsieh, I.-H., Saberi, K. et al. (2010). Hierarchical organization of human auditory cortex: Evidence from acoustic invariance in the response to intelligible speech. Cerebral Cortex, 20(10), 2486–95.Google Scholar

Otto-Meyer, S., Krizman, J., White-Schwoch, T. & Kraus, N. (2018). Children with autism spectrum disorder have unstable neural responses to sound. Experimental Brain Research, 32(11), 14111–56.Google Scholar

Park, D. C., Lautenschlager, G., Hedden, T., Davidson, N. S., Smith, A. D. & Smith, P. K. (2002). Models of visuospatial and verbal memory across the adult life span. Psychology and Aging, 17(2), 299–320.Google Scholar

Peelle, J. E., Johnsrude, I. S. & Davis, M. H. (2010). Hierarchical processing for speech in human auditory cortex and beyond. Frontiers in Human Neuroscience, 4, 51.Google Scholar PubMed

Peelle, J. E., Troiani, V., Wingfield, A. & Grossman, M. (2010). Neural processing during older adults’ comprehension of spoken sentences: Age differences in resource allocation and connectivity. Cerebral Cortex, 20(4), 773–82.Google Scholar

Perani, D. & Abutalebi, J. (2005). The neural basis of first and second language processing. Current Opinion in Neurobiology, 15(2), 202–6.Google Scholar

Poeppel, D. (2012). The maps problem and the mapping problem: Two challenges for a cognitive neuroscience of speech and language. Cognitive Neuropsychology, 29(1–2), 34–55.Google Scholar

Poeppel, D. (2014). The neuroanatomic and neurophysiological infrastructure for speech and language. Current Opinion in Neurobiology, 28, 142–9.Google Scholar

Price, C. J. (2000). The anatomy of language: Contributions from functional neuroimaging. Journal of Anatomy, 197(3), 335–59.Google Scholar

Pulvermuller, F., Huss, M., Kherif, F., Moscoso del Prado Martin, F., Hauk, O. & Shtyrov, Y. (2006). Motor cortex maps articulatory features of speech sounds. Proceedings of the National Academy of Sciences, 103(20), 7865–70.Google Scholar

Rauschecker, J. P. & Scott, S. K. (2009). Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nature Neuroscience, 12(6), 718–24.Google Scholar

Röder, B., Stock, O., Neville, H., Bien, S. & Rösler, F. (2002). Brain activation modulated by the comprehension of normal and pseudo-word sentences of different processing demands: A functional magnetic resonance imaging study. NeuroImage, 15(4), 1003–14.Google Scholar

Saur, D., Kreher, B. W., Schnell, S., Kümmerer, D., Kellmeyer, P., Vry, M.-S. et al. (2008). Ventral and dorsal pathways for language. Proceedings of the National Academy of Sciences of the United States of America, 105(46), 18035–40.Google Scholar

Saur, D., Schelter, B., Schnell, S., Kratochvil, D., Küpper, H., Kellmeyer, P. et al. (2010). Combining functional and anatomical connectivity reveals brain networks for auditory language comprehension. NeuroImage, 49(4), 3187–97.Google Scholar

Schacter, D. L. (1992). Priming and multiple memory systems: Perceptual mechanisms of implicit memory. Journal of Cognitive Neuroscience, 4(3), 244–56.Google Scholar

Schneider, P., Scherg, M., Dosch, H. G., Specht, H. J., Gutschalk, A. & Rupp, A. (2002). Morphology of Heschl’s gyrus reflects enhanced activation in the auditory cortex of musicians. Nature Neuroscience, 5(7), 688–94.Google Scholar

Scott, S. K., Blank, C. C., Rosen, S. & Wise, R. J. (2000). Identification of a pathway for intelligible speech in the left temporal lobe. Brain, 123(12), 2400–6.CrossRef Google Scholar PubMed

Skoe, E., Chandrasekaran, B., Spitzer, E. R., Wong, P. C. M. & Kraus, N. (2014). Human brainstem plasticity: The interaction of stimulus probability and auditory learning. Neurobiology of Learning and Memory, 109, 82–93.CrossRef Google Scholar PubMed

Slevc, L. R. & Miyake, A. (2006). Individual differences in second-language proficiency. Psychological Science, 17(8), 675–81.Google Scholar

Smith, P. A. (2010). Ageing, auditory distraction, and grammaticality judgement. Aphasiology, 24(11), 1342–53.Google Scholar

Staeren, N., Renvall, H., De Martino, F., Goebel, R. & Formisano, E. (2009). Sound categories are represented as distributed patterns in the human auditory cortex. Current Biology, 19(6), 498–502.Google Scholar

Tremblay, P. & Dick, A. S. (2016). Broca and Wernicke are dead, or moving past the classic model of language neurobiology. Brain and Language, 162, 60–71.Google Scholar

Vaden, K. I., Piquado, T. & Hickok, G. (2011). Sublexical properties of spoken words modulate activity in Broca’s area but not superior temporal cortex: Implications for models of speech recognition. Journal of Cognitive Neuroscience, 23(10), 2665–74.Google Scholar

Veroude, K., Norris, D. G., Shumskaya, E., Gullberg, M. & Indefrey, P. (2010). Functional connectivity between brain regions involved in learning words of a new language. Brain and Language, 113(1), 21–7.CrossRef Google Scholar PubMed

Vigneau, M., Beaucousin, V., Hervé, P. Y., Duffau, H., Crivello, F., Houdé, O. et al. (2006). Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. NeuroImage, 30(4), 1414–32.Google Scholar

Vigneau, M., Beaucousin, V., Hervé, P.-Y., Jobard, G., Petit, L., Crivello, F. et al. (2011). What is right-hemisphere contribution to phonological, lexico-semantic, and sentence processing? Insights from a meta-analysis. NeuroImage, 54(1), 577–93.Google Scholar

Warrier, C., Wong, P. C. M., Penhune, V., Zatorre, R., Parrish, T., Abrams, D. et al. (2009). Relating structure to function: Heschl’s gyrus and acoustic processing. The Journal of Neuroscience, 29(1), 61–9.Google Scholar

Weber, M. J. & Thompson-Schill, S. L. (2010). Functional neuroimaging can support causal claims about brain function. Journal of Cognitive Neuroscience, 22(11), 2415–16.Google Scholar

Weiller, C., Musso, M., Rijntjes, M. & Saur, D. (2009). Please don’t underestimate the ventral pathway in language. Trends in Cognitive Sciences, 13(9), 361–9.CrossRef Google Scholar PubMed

Whalley, L. J., Deary, I. J., Appleton, C. L. & Starr, J. M. (2004). Cognitive reserve and the neurobiology of cognitive aging. Ageing Research Reviews, 3(4), 369–82.Google Scholar

White-Schwoch, T., Woodruff Carr, K., Thompson, E. C., Anderson, S., Nicol, T., Bradlow, A. R. et al. (2015). Auditory processing in noise: A preschool biomarker for literacy. PLOS Biology, 13(7), e1002196.Google Scholar

Wingfield, A., Peelle, J. E. & Grossman, M. (2003). Speech rate and syntactic complexity as multiplicative factors in speech comprehension by young and older adults. Aging, Neuropsychology, and Cognition, 10(4), 310–22.Google Scholar

Wise, R., Chollet, F., Hadar, U., Friston, K., Hoffner, E. & Frackowiak, R. (1991). Distribution of cortical neural networks involved in word comprehension and word retrieval. Brain, 114(4), 1803–17.Google Scholar

Wong, F. C. K., Chandrasekaran, B., Garibaldi, K. & Wong, P. C. M. (2011). White matter anisotropy in the ventral language pathway predicts sound-to-word learning success. The Journal of Neuroscience, 31(24), 8780–5.Google Scholar

Wong, P. C. M., Perrachione, T. K. & Parrish, T. B. (2007). Neural characteristics of successful and less successful speech and word learning in adults. Human Brain Mapping, 28, 995–1006.Google Scholar

Wong, P. C. M., Skoe, E., Russo, N. M., Dees, T. & Kraus, N. (2007). Musical experience shapes human brainstem encoding of linguistic pitch patterns. Nature Neuroscience, 10(4), 420–2.Google Scholar

Wong, P. C. M., Warrier, C. M., Penhune, V. B., Roy, A. K., Sadehh, A., Parrish, T. B. et al. (2008). Volume of left Heschl’s Gyrus and linguistic pitch learning. Cerebral Cortex, 18(4), 828–36.CrossRef Google Scholar PubMed

Wong, P. C. M., Jin, J. X., Gunasekera, G. M., Abel, R., Lee, E. R. & Dhar, S. (2009). Aging and cortical mechanisms of speech perception in noise. Neuropsychologia, 47(3), 693–703.Google Scholar

Yang, J. & Li, P. (2012). Brain networks of explicit and implicit learning. PLOS ONE, 7(8), e42993.Google Scholar

Yetkin, O., Yetkin, F. Z., Haughton, V. M. & Cox, R. W. (1996). Use of functional MR to map language in multilingual volunteers. American Journal of Neuroradiology, 17(3), 473–7.Google Scholar

Zhang, F., Wang, J.-P., Kim, J., Parrish, T. & Wong, P. C. M. (2015). Decoding multiple sound categories in the human temporal cortex using high-resolution fMRI. PloS One, 10(2), e0117303.Google Scholar

17.7 References

Allen, J. S. & Miller, J. L. (2004). Listener sensitivity to individual talker differences in voice-onset-time. Journal of the Acoustical Society of America, 115(6), 3171–83.CrossRef Google Scholar PubMed

Andruski, J. E., Blumstein, S. E. & Burton, M. (1994). The effect of subphonetic differences on lexical access. Cognition, 52(3), 163–87.CrossRef Google Scholar PubMed

Bowers, J. S. (2000). In defense of abstractionist theories of repetition priming and word identification. Psychonomic Bulletin & Review, 7(1), 83–99.Google Scholar

Bradlow, A. R., Nygaard, L. C. & Pisoni, D. B. (1999). Effects of talker, rater, and amplitude variation on recognition memory for spoken words. Perception & Psychophysics, 61(2), 206–19.Google Scholar

Cai, Z. G., Gilbert, R. A., Davis, M. H., Gaskell, M. G., Farrar, L., Adler, S. et al. (2017). Accent modulates access to word meaning: Evidence for a speaker-model account of spoken word recognition. Cognitive Psychology, 98, 73–101.Google Scholar

Campbell-Kibler, K. (2007). Accent, (ING), and the social logic of listener perceptions. American Speech, 82(1), 32–64.Google Scholar

Campbell-Kibler, K. (2009). The nature of sociolinguistic perception. Language Variation and Change, 21, 135–56.Google Scholar

Church, B. A. & Schacter, D. L. (1994). Perceptual specificity of auditory priming: Implicit memory for voice intonation and fundamental frequency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20(3), 521–33.Google Scholar

Clayards, M., Tanenhaus, M., Aslin, R. & Jacobs, R. (2008). Perception of speech reflects optimal use of probabilistic speech cues. Cognition, 108, 804–9.Google Scholar

Connine, C. M. (2004). It’s not what you hear but how often you hear it: On the neglected role of phonological variant frequency in auditory word recognition. Psychonomic Bulletin & Review, 11(6), 1084–9.Google Scholar

Cooper, A., Brouwer, S. & Bradlow, A. R. (2015). Interdependent processing and encoding of speech and concurrent background noise. Attention, Perception & Psychophysics, 77(4), 1342–57.Google Scholar

Creel, S. C., Aslin, R. N. & Tanenhaus, M. K. (2012). Word learning under adverse listening conditions: Context-specific recognition. Language and Cognitive Processes, 27, 1021–38.Google Scholar

Dahan, D., Drucker, S. J. & Scarborough, R. A. (2008). Talker adaptation in speech perception: Adjusting the signal or the representations? Cognition, 108(3), 710–18.CrossRef Google Scholar PubMed

Dilley, L., Wieland, E., Gamache, J., McAuley, J. D. & Redford, M. (2013). Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech. Journal of Speech, Language, and Hearing Research, 56, 159–77.CrossRef Google Scholar PubMed

D’Onofrio, A. (2015). Perceiving personae: Effects of social information on perceptions of TRAP-backing. University of Pennsylvania Working Papers in Linguistics, 21(2), 31–9.Google Scholar

D’Onofrio, A. (in press). Sociolinguistic signs as cognitive representations. In Hall-Lew, L., Podesva, E. &Moore, R. J., eds., Social Meaning in Linguistic Variation: Theorizing the Third Wave. Cambridge: Cambridge University Press.Google Scholar

Dumay, N. & Gaskell, M. G. (2005). Do words go to sleep? Exploring consolidation of spoken forms through direct and indirect measures. Behavioural and Brain Sciences, 28, 69–70.Google Scholar

Dumay, N. & Gaskell, M. G. (2007). Sleep-associated changes in the mental representation of spoken words. Psychological Science, 18, 35–9.Google Scholar

Eckert, P. (2008). Variation and the indexical field. Journal of Sociolinguistics, 12(4), 453–76.Google Scholar

Eckert, P. (2012). Three waves of variation study: The emergence of meaning in the study of sociolinguistic variation. Annual Review of Anthropology, 41(1), 87–100.Google Scholar

Freeman, J. B. & Ambady, N. (2011). A dynamic interactive theory of person construal. Psychological Review, 118(2), 247–79.Google Scholar

Ganong, W. F. (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6(1), 110–25.Google Scholar

Gaskell, M. G. & Marslen-Wilson, W. D. (1996). Phonological variation and inference in lexical access. Journal of Experimental Psychology: Human Perception and Performance, 22(1), 144–58.Google Scholar

Goh, W. D. (2005). Talker variability and recognition memory: Instance-specific and voice-specific effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(1), 40–53.Google Scholar

Goldinger, S. D. (1996). Words and voices: Episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22(5), 1166–83.Google Scholar

Goldinger, S. D. (1998). Echoes of echoes: An episodic theory of lexical access. Psychological Review, 105, 251–79.Google Scholar

Gordon, M., Barthmaier, P. & Sands, K. (2002). A cross-linguistic acoustic study of voiceless fricatives. Journal of the International Phonetic Association, 32(2), 141–74.Google Scholar

Gow, D. W. (2001). Assimilation and anticipation in continuous spoken word recognition. Journal of Memory and Language, 45(1), 133–59.Google Scholar

Gow, D. W. (2002). Does English coronal place assimilation create lexical ambiguity? Journal of Experimental Psychology: Human Perception and Performance, 28(1), 163–79.Google Scholar

Gow, D. W. (2003). Feature parsing: Feature cue mapping in spoken word recognition. Perception & Psychophysics, 65(4), 575–90.CrossRef Google Scholar PubMed

Gow, D. W. & Im, A. M. (2004). A cross-linguistic examination of assimilation context effects. Journal of Memory and Language, 51(2), 279–96.Google Scholar

Grossberg, S. (2013). Adaptive Resonance Theory: How a brain learns to consciously attend, learn, and recognize a changing world. Neural Networks, 37, 1–47.Google Scholar

Halpern, D. F. & Hakel, M. D. (2003). Applying the science of learning to the university and beyond. Change Magazine. July/August, pp. 36–41.Google Scholar

Hay, J., Podlubny, R., Drager, K. & McAuliffe, M. (2017). Car-talk: Location-specific speech production and perception. Journal of Phonetics, 65, 94–109.CrossRef Google Scholar

Howe, M. L., Wimmer, M. C., Gagnon, N. & Plumpton, S. (2009). An associative-activation theory of children’s and adults’ memory illusions. Journal of Memory and Language, 60, 229–51.Google Scholar

Johnson, K. (1997). Speech perception without speaker normalization: An exemplar model. In Johnson, K. & Mullennix, J. W., eds., Talker Variability in Speech Processing. San Diego, CA: Academic Press, pp. 145–65.Google Scholar

Johnson, K. (2005). Speaker normalization in speech perception. In Pisoni, D. B. & Remez, R. E., eds., The Handbook of Speech Perception. Malden, MA: Blackwell, pp. 363–89.Google Scholar

Johnson, K. (2006). Resonance in an exemplar-based lexicon: The emergence of social identity and phonology. Journal of Phonetics, 34, 485–99.Google Scholar

Keating, P. A. (1998). Word-level phonetic variation in large speech corpora. In A. Alexiadou, N. Fuhrop, U. Kleinhenz & P. Law, eds., ZAS Papers in Linguistics, 11, 35–50.Google Scholar

Kim, S. K. (2015). Speech, Variation, and Meaning: The Effects of Emotional Prosody on Word Recognition. PhD thesis, Stanford University.Google Scholar

Kim, S. K. & Sumner, M. (2017). Beyond lexical meaning: The effect of emotional prosody on spoken word recognition. Journal of the Acoustical Society of America, 142(1), EL49–55.Google Scholar

King, E. & Sumner, M. (2015). Voice-specific effects in semantic association. Proceedings of the Annual Meeting of the Cognitive Science Society, 37, 1111–16.Google Scholar

Klatt, D. H. (1979). Speech perception: A model of acoustic-phonetic analysis and lexical access. Journal of Phonetics, 7, 279–312.Google Scholar

Kleinschmidt, D. F. & Jaeger, T. F. (2015). Robust speech perception: Recognize the familiar, generalize to the similar, and adapt to the novel. Psychological Review, 122(2), 148–203.Google Scholar

Kong, E. J., Kang, S. & Seo, M. (2014). Gender difference in the affricate productions of young Seoul Korean speakers. Journal of the Acoustical Society of America, 136(4), EL329–EL335.Google Scholar

Kraljic, T. & Samuel, A. G. (2006). Generalization in perceptual learning for speech. Psychonomic Bulletin & Review, 13(2), 262–8.Google Scholar

Kumaran, D. and McClelland, J. L. (2012). Generalization through the recurrent interaction of episodic memories: A model of the hippocampal system. Psychological Review 119(3), 573–616.CrossRef Google Scholar

Ladefoged, P. & Broadbent, D. E. (1960). Perception of sequence in auditory events. Quarterly Journal of Experimental Psychology, 12(3), 162–70.Google Scholar

Lahiri, A. & Marslen-Wilson, W. (1991). The mental representation of lexical form: A phonological approach to the recognition lexicon. Cognition, 38, 245–94.Google Scholar

Liberman, A. M. & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21(1), 1–36.Google Scholar

Liberman, A. M., Cooper, F. S., Shankweiler, D. P. & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74(6), 431–61.Google Scholar

Lindblom, B. E. & Studdert‐Kennedy, M. (1967). On the role of formant transitions in vowel recognition. Journal of the Acoustical Society of America, 42(4), 830–43.Google Scholar

Lindblom, B. (1990). Explaining phonetic variation: A sketch of the H&H theory. In Hardcastle, W. J. & Marchal, A., eds., Speech Production and Speech Modeling. Dordrecht: Kluwer Academic Publishers, pp. 403–39.Google Scholar

LoCasto, P. C. & Connine, C. M. (2002). Rule-governed missing information in spoken word recognition: Schwa vowel deletion. Perception & Psychophysics, 64(2), 208–19.Google Scholar

LoCasto, P. C. & Connine, C. M. (2011). Processing of no-release variants in connected speech. Language and Speech, 54(2), 181–97.Google Scholar

Luce, P. A. & Lyons, E. (1998). Specificity of memory representation for spoken words. Memory & Cognition, 26, 708–15.CrossRef Google Scholar PubMed

Luce, P. A. & McLennan, C. T. (2005). Spoken word recognition: The challenge of variation. In Pisoni, D. B. & Remez, R. E., eds., The Handbook of Speech Perception. Malden, MA: Wiley.Google Scholar

Maida, C. (2014). Project-based learning: A critical pedagogy for the twenty-first century. Policy Futures in Education, 9, 759–68.Google Scholar

Marslen-Wilson, W. & Warren, P. (1994). Levels of perceptual representation and process in lexical access: Words, phonemes, and features. Psychological Review, 101(4), 653.Google Scholar

Marslen-Wilson, W., Nix, A. & Gaskell, G. (1995). Phonological variation in lexical access: Abstractness, inference and English place assimilation. Language and Cognitive Processes, 10, 285–308.Google Scholar

Mattys, S. L., Davis, M. H., Bradlow, A. R. & Scott, S. K. (2012). Speech recognition in adverse conditions: A review. Language and Cognitive Processes, 27(7–8), 953–78.Google Scholar

Maye, J., Asline, R. N. & Tanenhaus, M. K. (2008). The weckud wetch of the wast: Lexical adaptation to a novel accent. Cognitive Science, 32(3), 543–62.Google Scholar

McClelland, J. L. & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18, 1–86.Google Scholar

McGowan, K. B. & Sumner, M. (2014). The effect of contextual mismatches on lexical activation of phonetic variants. Journal of the Acoustical Society of America, 135(4), 2199.Google Scholar

Niedzielski, N. (1999). The effect of social information on the perception of sociolinguistic variables. Journal of Language and Social Psychology, 18(1), 62–85.CrossRef Google Scholar

Noelle, D. C., Dale, R., Warlaumont, A. S., Yoshimi, J., Matlock, T., Jennings, C. D. et al. (2015). Proceedings of the 37th Annual Meeting of the Cognitive Science Society. Austin, TX: Cognitive Science Society.Google Scholar

Norris, D. (1994). Shortlist: A connectionist model of continuous speech recognition. Cognition, 52(3), 189–234.Google Scholar

Norris, D., McQueen, J. M. & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204–38.CrossRef Google Scholar PubMed

Nygaard, L. C. & Lunders, E. R. (2002). Resolution of lexical ambiguity by emotional tone of voice. Memory & Cognition, 30(4), 583–93.Google Scholar

Nygaard, L. C. & Pisoni, D. B. (1998). Talker-specific learning in speech perception. Perception & Psychophysics, 60(3), 355–76.Google Scholar

Otgaar, H., Peters, M. & Howe, M. L. (2012). Dividing attention lowers children’s, but increases adults’ false memories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(1), 204–10.

Palmeri, T. J., Goldinger, S. D. & Pisoni, D. B. (1993). Episodic encoding of voice attributes and recognition memory for spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19(2), 309–28.

Pierrehumbert, J. B. (2001). Exemplar dynamics: Word frequency, lenition and contrast. In Bybee, J. & Hopper, P., eds., Frequency Effects and the Emergence of Linguistic Structure. Amsterdam: John Benjamins, pp. 137–58.

Pierrehumbert, J. B. (2016). Phonological representation: Beyond abstract versus episodic. Annual Review of Linguistics, 2, 33–52.

Pitt, M. A. (2009). The strength and time course of lexical activation of pronunciation variants. Journal of Experimental Psychology: Human Perception and Performance, 35(3), 896–910.

Pufahl, A. & Samuel, A. G. (2014). How lexical is the lexicon? Evidence for integrated auditory memory representations. Cognitive Psychology, 70, 1–30.

Salverda, A. P., Kleinschmidt, D. & Tanenhaus, M. K. (2014). Immediate effects of anticipatory coarticulation in spoken-word recognition. Journal of Memory and Language, 71(1), 145–63.

Samuel, A. G. & Kraljic, T. (2009). Perceptual learning for speech. Attention, Perception, & Psychophysics, 71(6), 1207–18.

Schacter, D. L. & Church, B. A. (1992). Auditory priming: Implicit and explicit memory for words and voices. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18(5), 915–30.

Steen, S., Bader, C. & Kubrin, C. (1999). Rethinking the graduate seminar. Teaching Sociology, 27(2), 167–73.

Strand, E. A. (2000). Gender Stereotype Effects in Speech Processing. Doctoral dissertation, Ohio State University.

Strori, D., Zaar, J., Cooke, M. & Mattys, S. L. (2018). Sound specificity effects in spoken word recognition: The effect of integrality between words and sounds. Attention, Perception & Psychophysics, 80, 222–41.

Sumner, M. (2015). The social weight of spoken words. Trends in Cognitive Sciences, 19(5), 238–9.

Sumner, M. & Kataoka, R. (2013). Effects of phonetically cued talker variation on semantic-encoding. Journal of the Acoustical Society of America, 134, EL485–91.

Sumner, M. & Samuel, A. G. (2007). Lexical inhibition and sublexical facilitation are surprisingly long lasting. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(4), 769–90.

Sumner, M. & Samuel, A. G. (2009). The effect of experience on the perception and representation of dialect variants. Journal of Memory and Language, 60, 487–501.

Sumner, M., Kurumada, C., Gafter, R. & Casillas, M. (2013). Phonetic variation and the recognition of words with pronunciation variants. Proceedings of the Annual Meeting of the Cognitive Science Society, 35, 3486–91.

Sumner, M., Kim, S. K., King, E. & McGowan, K. B. (2014). The socially weighted encoding of spoken words: A dual-route approach to speech perception. Frontiers in Psychology, 4(January), 1–13.

Toscano, J. C. & McMurray, B. (2010). Cue integration with categories: Weighting acoustic cues in speech using unsupervised learning and distributional statistics. Cognitive Science, 34, 434–64.

Toscano, J. C. & McMurray, B. (2015). The time-course of speaking rate compensation: Effects of sentential rate and vowel length on voicing judgments. Language, Cognition and Neuroscience, 30(5), 529–43.

Vitevitch, M. S. (2003). Change deafness: The inability to detect changes between two voices. Journal of Experimental Psychology: Human Perception and Performance, 29, 333–42.

Warren, P. (2016). Uptalk: The Phenomenon of Rising Intonation. Cambridge: Cambridge University Press.

Zhao, Y. (2009). Statistical Inference in Learning of Novel Phonetic Categories. Doctoral dissertation, Stanford University, CA.

18.7 References

Allopenna, P. D., Magnuson, J. S. & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419–39.

Alsius, A., Navarra, J., Campbell, R. & Soto-Faraco, S. (2005). Audiovisual integration of speech falters under high attention demands. Current Biology, 15(9), 839–43. https://doi.org/10.1016/j.cub.2005.03.046.

Altmann, G. T. M. (2011). Language can mediate eye movement control within 100 milliseconds, regardless of whether there is anything to move the eyes to. Acta Psychologica, 137(2), 190–200. https://doi.org/10.1016/j.actpsy.2010.09.009.

Arnold, J. E. (2008). THE BACON not the bacon: How children and adults understand accented and unaccented noun phrases. Cognition, 108(1), 69–99. https://doi.org/10.1016/j.cognition.2008.01.001.

Barr, D. J. (2008). Analyzing ‘visual world’ eyetracking data using multilevel logistic regression. Journal of Memory and Language, 59(4), 457–74. https://doi.org/10.1016/j.jml.2007.09.002.

Beckman, M. & Hirschberg, J. (1994). The ToBI Annotation Conventions. Columbus, OH: Ohio State University.

Beddor, P. S., McGowan, K. B., Boland, J. E., Coetzee, A. W. & Brasher, A. (2013). The time course of perception of coarticulation. Journal of the Acoustical Society of America, 133(4), 2350–66. https://doi.org/10.1121/1.4794366.

Brouwer, S., Mitterer, H. & Huettig, F. (2012). Can hearing puter activate pupil? Phonological competition and the processing of reduced spoken words in spontaneous conversations. The Quarterly Journal of Experimental Psychology, 65(11), 2193–220. https://doi.org/10.1080/17470218.2012.693109.

Brouwer, S., Mitterer, H. & Huettig, F. (2013). Discourse context and the recognition of reduced and canonical spoken words. Applied Psycholinguistics, 34, 519–39. https://doi.org/10.1017/s0142716411000853.

Brown, M., Salverda, A. P., Dilley, L. C. & Tanenhaus, M. K. (2011). Expectations from preceding prosody influence segmentation in online sentence processing. Psychonomic Bulletin & Review, 18(6), 1189–96. https://doi.org/10.3758/s13423-011-0167-9.

Brown, M., Salverda, A. P., Dilley, L. C. & Tanenhaus, M. K. (2015a). Metrical expectations from preceding prosody influence perception of lexical stress. Journal of Experimental Psychology: Human Perception and Performance, 41(2), 306–23. https://doi.org/10.1037/a0038689.

Brown, M., Salverda, A. P., Gunlogson, C. & Tanenhaus, M. K. (2015b). Interpreting prosodic cues in discourse context. Language, Cognition and Neuroscience, 30(1–2), 149–66. https://doi.org/10.1080/01690965.2013.862285.

Brown-Schmidt, S. & Toscano, J. C. (2017). Gradient acoustic information induces long-lasting referential uncertainty in short discourses. Language, Cognition and Neuroscience, 32(10), 1211–28. https://doi.org/10.1080/23273798.2017.1325508.

Chen, A., den Os, E. & de Ruiter, J. P. (2007). Pitch accent type matters for online processing of information status: Evidence from natural and synthetic speech. The Linguistic Review, 24(2–3), 317–44. https://doi.org/10.1515/TLR.2007.012.

Clayards, M., Niebuhr, O. & Gaskell, M. G. (2015). The time course of auditory and language-specific mechanisms in compensation for sibilant assimilation. Attention, Perception & Psychophysics, 77(1), 311–28. https://doi.org/10.3758/s13414-014-0750-z.

Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language: A new methodology for the real-time investigation of speech perception, memory, and language processing. Cognitive Psychology, 6(1), 84–107. https://doi.org/10.1016/0010-0285(74)90005-X.

Cutler, A., Weber, A. & Otake, T. (2006). Asymmetric mapping from phonetic to lexical representations in second-language listening. Journal of Phonetics, 34, 269–84. https://doi.org/10.1016/j.wocn.2005.06.002.

Dahan, D. & Tanenhaus, M. K. (2004). Continuous mapping from sound to meaning in spoken-language comprehension: Immediate effects of verb-based thematic constraints. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 498–513.

Dahan, D., Tanenhaus, M. K. & Chambers, C. G. (2002). Accent and reference resolution in spoken-language comprehension. Journal of Memory and Language, 47(2), 292–314. https://doi.org/10.1016/S0749-596X(02)00001-3.

Donnelly, S. & Verkuilen, J. (2017). Empirical logit analysis is not logistic regression. Journal of Memory and Language, 94, 28–42. https://doi.org/10.1016/j.jml.2016.10.005.

Escudero, P., Hayes-Harb, R. & Mitterer, H. (2008). Novel second-language words and asymmetric lexical access. Journal of Phonetics, 36, 345–60. https://doi.org/10.1016/j.wocn.2007.11.002.

Gow, D. W. & McMurray, B. (2007). Word recognition and phonology: The case of English coronal place assimilation. In Cole, J. & Hualde, J., eds., Laboratory Phonology 9. New York: Mouton de Gruyter, pp. 173–200.

Hanulíková, A. & Weber, A. (2012). Sink positive: Linguistic experience with th substitutions influences nonnative word recognition. Attention, Perception & Psychophysics, 74(3), 613–29. https://doi.org/10.3758/s13414-011-0259-7.

Heeren, W. F. L., Bibyk, S. A., Gunlogson, C. & Tanenhaus, M. K. (2015). Asking or telling: Real-time processing of prosodically distinguished questions and statements. Language and Speech, 58(4), 474–501. https://doi.org/10.1177/0023830914564452.

Hisanaga, S., Sekiyama, K., Igasaki, T. & Murayama, N. (2016). Language/culture modulates brain and gaze processes in audiovisual speech perception. Scientific Reports, 6, srep35265. https://doi.org/10.1038/srep35265.

Huettig, F. & Altmann, G. T. M. (2007). Visual-shape competition during language-mediated attention is based on lexical input and not modulated by contextual appropriateness. Visual Cognition, 15(8), 985–1018. https://doi.org/10.1080/13506280601130875.

Huettig, F. & McQueen, J. M. (2007). The tug of war between phonological, semantic and shape information in language-mediated visual search. Journal of Memory and Language, 57(4), 460–82. https://doi.org/10.1016/j.jml.2007.02.001.

Huettig, F., Rommers, J. & Meyer, A. S. (2011). Using the visual world paradigm to study language processing: A review and critical evaluation. Acta Psychologica, 137(2), 151–71. https://doi.org/10.1016/j.actpsy.2010.11.003.

Ito, K. & Speer, S. R. (2008). Anticipatory effects of intonation: Eye movements during instructed visual search. Journal of Memory and Language, 58(2), 541–73. https://doi.org/10.1016/j.jml.2007.06.013.

Ito, K. & Speer, S. R. (2011). Semantically-independent but contextually-dependent interpretation of contrastive accent. In Frota, S., Elordieta, G. & Prieto, P., eds., Prosodic Categories: Production, Perception and Comprehension. Dordrecht: Springer, pp. 69–92. https://doi.org/10.1007/978-94-007-0137-3_4.

Jesse, A., Poellmann, K. & Kong, Y.-Y. (2017). English listeners use suprasegmental cues to lexical stress early during spoken-word recognition. Journal of Speech, Language, and Hearing Research, 60(1), 190–8. https://doi.org/10.1044/2016_JSLHR-H-15-0340.

Kingston, J., Levy, J., Rysling, A. & Staub, A. (2016). Eye movement evidence for an immediate Ganong effect. Journal of Experimental Psychology: Human Perception and Performance, 42(12), 1969–88. https://doi.org/10.1037/xhp0000269.

Liberman, A. M., Harris, K. S., Hoffman, H. S. & Griffith, B. C. (1957). The discrimination of speech sounds within and across phoneme boundaries. Journal of Experimental Psychology, 54(5), 358–68. https://doi.org/10.1037/h0044417.

Llompart, M. & Reinisch, E. (2017). Articulatory information helps encode lexical contrasts in a second language. Journal of Experimental Psychology: Human Perception and Performance, 43(5), 1040–56. https://doi.org/10.1037/xhp0000383.

Magnuson, J. S., Dixon, J. A., Tanenhaus, M. K. & Aslin, R. N. (2007). The dynamics of lexical competition during spoken word recognition. Cognitive Science, 31(1), 133–56. https://doi.org/10.1080/03640210709336987.

Malins, J. G. & Joanisse, M. F. (2010). The roles of tonal and segmental information in Mandarin spoken word recognition: An eyetracking study. Journal of Memory and Language, 62(4), 407–20. https://doi.org/10.1016/j.jml.2010.02.004.

McClelland, J. L. & Elman, J. L. (1986). The TRACE model of speech perception. Cognitive Psychology, 18(1), 1–86. https://doi.org/10.1016/0010-0285(86)90015-0.

McMurray, B., Clayards, M. A., Tanenhaus, M. K. & Aslin, R. N. (2008). Tracking the time course of phonetic cue integration during spoken word recognition. Psychonomic Bulletin & Review, 15(6), 1064–71.

McMurray, B., Tanenhaus, M. K. & Aslin, R. N. (2009). Within-category VOT affects recovery from ‘lexical’ garden paths: Evidence against phoneme-level inhibition. Journal of Memory and Language, 60(1), 65–91. https://doi.org/10.1016/j.jml.2008.07.002.

Mirman, D., Dixon, J. A. & Magnuson, J. S. (2008). Statistical and computational models of the visual world paradigm: Growth curves and individual differences. Journal of Memory and Language, 59(4), 475–94. https://doi.org/10.1016/j.jml.2007.11.006.

Mitterer, H. & Ernestus, M. (2006). Listeners recover /t/s that speakers reduce: Evidence from /t/-lenition in Dutch. Journal of Phonetics, 34(1), 73–103. https://doi.org/10.1016/j.wocn.2005.03.003.

Mitterer, H. & McQueen, J. M. (2009). Processing reduced word-forms in speech perception using probabilistic knowledge about speech production. Journal of Experimental Psychology: Human Perception and Performance, 35(1), 244–63. https://doi.org/10.1037/a0012730.

Mitterer, H. & Reinisch, E. (2013). No delays in application of perceptual learning in speech recognition: Evidence from eye tracking. Journal of Memory and Language, 69(4), 527–45. https://doi.org/10.1016/j.jml.2013.07.002.

Mitterer, H. & Reinisch, E. (2015). Letters don’t matter: No effect of orthography on the perception of conversational speech. Journal of Memory and Language, 85, 116–34. https://doi.org/10.1016/j.jml.2015.08.005.

Mitterer, H. & Reinisch, E. (2017). Visual speech influences speech perception immediately but not automatically. Attention, Perception & Psychophysics, 79(2), 660–78. https://doi.org/10.3758/s13414-016-1249-6.

Mitterer, H., Kim, S. & Cho, T. (2013). Compensation for complete assimilation in speech perception: The case of Korean labial-to-velar assimilation. Journal of Memory and Language, 69(1), 59–83. https://doi.org/10.1016/j.jml.2013.02.001.

Nakamura, C., Arai, M. & Mazuka, R. (2012). Immediate use of prosody and context in predicting a syntactic structure. Cognition, 125(2), 317–23. https://doi.org/10.1016/j.cognition.2012.07.016.

Nixon, J. S., van Rij, J., Mok, P., Baayen, R. H. & Chen, Y. (2016). The temporal dynamics of perceptual uncertainty: Eye movement evidence from Cantonese segment and tone perception. Journal of Memory and Language, 90, 103–25. https://doi.org/10.1016/j.jml.2016.03.005.

Quam, C. & Swingley, D. (2014). Processing of lexical stress cues by young children. Journal of Experimental Child Psychology, 123, 73–89. https://doi.org/10.1016/j.jecp.2014.01.010.

Reinisch, E. & Sjerps, M. J. (2013). The uptake of spectral and temporal cues in vowel perception is rapidly influenced by context. Journal of Phonetics, 41(2), 101–16.

Reinisch, E. & Weber, A. (2012). Adapting to suprasegmental lexical stress errors in foreign-accented speech. Journal of the Acoustical Society of America, 132(2), 1165–76.

Reinisch, E., Jesse, A. & McQueen, J. M. (2010). Early use of phonetic information in spoken word recognition: Lexical stress drives eye movements immediately. The Quarterly Journal of Experimental Psychology, 63(4), 772–83.

Reinisch, E., Jesse, A. & McQueen, J. M. (2011). Speaking rate from proximal and distal contexts is used during word segmentation. Journal of Experimental Psychology: Human Perception and Performance, 37(3), 978.

Rossano, F., Brown, P. & Levinson, S. C. (2009). Gaze, questioning and culture. In Sidnell, J., ed., Conversation Analysis: Comparative Perspectives. Cambridge: Cambridge University Press, pp. 187–249.

Salverda, A. P. & Tanenhaus, M. K. (2010). Tracking the time course of orthographic information in spoken-word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 36(5), 1108–17. https://doi.org/10.1037/a0019901.

Salverda, A. P., Dahan, D. & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90(1), 51–89. https://doi.org/10.1016/S0010-0277(03)00139-2.

Salverda, A. P., Dahan, D., Tanenhaus, M. K., Crosswhite, K., Masharov, M. & McDonough, J. (2007). Effects of prosodically-modulated sub-phonetic variation on lexical competition. Cognition, 105(2), 466–76. https://doi.org/10.1016/j.cognition.2006.10.008.

Sedivy, J. C., Tanenhaus, M. K., Chambers, C. G. & Carlson, G. N. (1999). Achieving incremental semantic interpretation through contextual representation. Cognition, 71(2), 109–47. https://doi.org/10.1016/S0010-0277(99)00025-6.

Shatzman, K. B. & McQueen, J. M. (2006a). Prosodic knowledge affects the recognition of newly acquired words. Psychological Science, 17(5), 372–7. https://doi.org/10.1111/j.1467-9280.2006.01714.x.

Shatzman, K. B. & McQueen, J. M. (2006b). Segment duration as a cue to word boundaries in spoken-word recognition. Perception & Psychophysics, 68(1), 1–16. https://doi.org/10.3758/BF03193651.

Shatzman, K. B. & McQueen, J. M. (2006c). The modulation of lexical competition by segment duration. Psychonomic Bulletin & Review, 13(6), 966–71. https://doi.org/10.3758/BF03213910.

Shen, J., Deutsch, D. & Rayner, K. (2013). On-line perception of Mandarin Tones 2 and 3: Evidence from eye movements. Journal of the Acoustical Society of America, 133(5), 3016–29. https://doi.org/10.1121/1.4795775.

Shockey, L. (2003). Sound Patterns of Spoken English. Cambridge, MA: Blackwell.

Snedeker, J. & Trueswell, J. (2003). Using prosody to avoid ambiguity: Effects of speaker awareness and referential context. Journal of Memory and Language, 48(1), 103–30. https://doi.org/10.1016/S0749-596X(02)00519-3.

Somppi, S., Törnqvist, H., Hänninen, L., Krause, C. & Vainio, O. (2012). Dogs do look at images: Eye tracking in canine cognition research. Animal Cognition, 15(2), 163–74. https://doi.org/10.1007/s10071-011-0442-1.

Sulpizio, S. & McQueen, J. M. (2012). Italians use abstract knowledge about lexical stress during spoken-word recognition. Journal of Memory and Language, 66(1), 177–93. https://doi.org/10.1016/j.jml.2011.08.001.

Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M. & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632–4.

Toscano, J. C. & McMurray, B. (2015). The time-course of speaking rate compensation: Effects of sentential rate and vowel length on voicing judgments. Language, Cognition and Neuroscience, 30(5), 529–43. https://doi.org/10.1080/23273798.2014.946427.

van der Heijden, A. H. C. (1992). Selective Attention in Vision. New York: Routledge.

Vatikiotis-Bateson, E., Eigsti, I.-M., Yano, S. & Munhall, K. G. (1998). Eye movement of perceivers during audiovisual speech perception. Perception & Psychophysics, 60(6), 926–40. https://doi.org/10.3758/BF03211929.

Viebahn, M. C., Ernestus, M. & McQueen, J. M. (2015). Syntactic predictability in the recognition of carefully and casually produced speech. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(6), 1684–702. https://doi.org/10.1037/a0039326.

Watson, D. G., Tanenhaus, M. K. & Gunlogson, C. A. (2008). Interpreting pitch accents in online comprehension: H* vs. L+H*. Cognitive Science, 32(7), 1232–44. https://doi.org/10.1080/03640210802138755.

Weber, A. & Cutler, A. (2004). Lexical competition in non-native spoken-word recognition. Journal of Memory and Language, 50, 1–25. https://doi.org/10.1016/S0749-596X(03)00105-0.

Weber, A., Braun, B. & Crocker, M. W. (2006a). Finding referents in time: Eye-tracking evidence for the role of contrastive accents. Language and Speech, 49(3), 367–92. https://doi.org/10.1177/00238309060490030301.

Weber, A., Grice, M. & Crocker, M. W. (2006b). The role of prosody in the interpretation of structural ambiguities: A study of anticipatory eye movements. Cognition, 99(2), B63–B72. https://doi.org/10.1016/j.cognition.2005.07.001.

Westfall, J., Kenny, D. A. & Judd, C. M. (2014). Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli. Journal of Experimental Psychology: General, 143(5), 2020–45. https://doi.org/10.1037/xge0000014.

19.7 References

Allen, J. (1994). How do humans process and recognize speech. IEEE Transactions on Speech and Audio Processing, 2(4), 567–77.

Baker, J. K. (1975). The DRAGON System: An overview. IEEE Transactions on Acoustics, Speech and Signal Processing, 23(1), 24–9.

Bourlard, H. A. & Morgan, N. (1994). Connectionist Speech Recognition: A Hybrid Approach. Berlin: Springer-Verlag.

Chan, W., Jaitly, N., Le, Q. & Vinyals, O. (2016). Listen, attend and spell: A neural network for large vocabulary conversational speech recognition. In Proceedings of International Conference on Acoustics, Speech, and Signal Processing, Shanghai, pp. 4960–4.

Cherry, C. (1968). On Human Communications. Cambridge, MA: MIT Press.

Cohen, M. H., Giangola, J. P. & Balogh, J. (2004). Voice User Interface Design. Boston, MA: Addison-Wesley.

Davis, K. H., Biddulph, R. & Balashek, S. (1952). Automatic recognition of spoken digits. Journal of the Acoustical Society of America, 24(6), 637–42.

Denes, P. E. & Pinson, E. N. (1993). The Speech Chain: The Physics and Biology of Spoken Languages, 2nd ed. Oxford: W. H. Freeman and Company.

Fant, G. (1960). Acoustic Theory of Speech Production. The Hague: Mouton.

Fant, G. (1973). Speech Sounds and Features. Cambridge, MA: MIT Press.

Flanagan, J. L. (1965). Speech Analysis, Synthesis and Perception. Berlin: Springer-Verlag.

Forgie, J. W. & Forgie, C. D. (1959). Results obtained from a vowel recognition computer program. Journal of the Acoustical Society of America, 31(11), 1480–9.

Gold, B. & Morgan, N. (1999). Speech and Audio Signal Processing. New York: Wiley.

Graves, A., Fernández, S., Gomez, F. & Schmidhuber, J. (2006). Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd International Conference on Machine Learning, pp. 369–76.

Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E. et al. (2014). Deep speech: Scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567.

Hinton, G. E. & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–7.

Hinton, G. E., Deng, L., Yu, D., Dahl, G., Mohamed, A. R., Jaitly, N. et al. (2012). Deep neural networks for acoustic modelling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 82–97.

Huang, X., Acero, A. & Hon, H.-W. (2001). Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Upper Saddle River, NJ: Prentice Hall.

Jelinek, F. (1997). Statistical Methods for Speech Recognition. Cambridge, MA: MIT Press.

Juang, B. H. & Furui, S. (2000). Automatic speech recognition and understanding: A first step toward natural human–machine communication. Proceedings of the IEEE, 88(8), 1142–65.

Juneja, A., Deshmukh, O. & Espy-Wilson, C. (2002). An event-based acoustic-phonetic approach to speech segmentation and E-set recognition. In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, 4: IV/4164.

Jurafsky, D. & Martin, J. H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Prentice Hall.

Klatt, D. (1977). Review of the ARPA Speech Understanding Project. Journal of the Acoustical Society of America, 62(6), 1324–66.

Lee, C. H. & Rabiner, L. R. (1989). A frame-synchronous network search algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(11), 1649–58.

Lee, C. H., Soong, F. K. & Paliwal, K. K. (1996). Automatic Speech and Speaker Recognition: Advanced Topics. Dordrecht: Kluwer Academic.

Lee, C.-H. & Huo, Q. (2000). On adaptive decision rules and decision parameter adaptation for automatic speech recognition. Proceedings of the IEEE, 88(8), 1241–69.

Lee, C.-H. & Siniscalchi, S. M. (2013). An information-extraction approach to speech processing: Analysis, detection, verification and recognition. Proceedings of the IEEE, 101(5), 1089–115.

Liu, S. A. (1996). Landmark detection for distinctive feature-based speech recognition. Journal of the Acoustical Society of America, 100(5), 3417–30.

Lippmann, R. P. (1997). Speech recognition by machines and humans. Speech Communication, 22(1), 1–15.

Lowerre, B. (1990). The HARPY speech understanding system. In Lea, W., ed., Trends in Speech Recognition. Upper Saddle River, NJ: Prentice Hall, pp. 576–86.

Manning, C. & Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.

Martin, T. B., Nelson, A. L. & Zadell, H. J. (1964). Speech Recognition by Feature-Abstraction Techniques. Tech Report AL-TDR-64–176, Air Force Avionics Lab.

Mohri, M., Pereira, F. C. N. & Riley, M. (2002). Weighted finite-state transducers in speech recognition. Computer Speech & Language, 16, 69–88.

Nagata, K., Kato, Y. & Chiba, S. (1963). Spoken Digit Recognizer for Japanese Language. NEC Research and Development Laboratories.

Ney, H. & Ortmanns, S. (2000). Progress in dynamic programming search for LVCSR. Proceedings of the IEEE, 88(8), 1224–40.

Olive, J. P., Greenwood, A. & Coleman, J. (1993). Acoustics of American English Speech: A Dynamic Approach. Berlin: Springer-Verlag.

Olson, H. F. & Belar, H. (1956). Phonetic typewriter. Journal of the Acoustical Society of America, 28(6), 1072–81.

O’Shaughnessy, D. (2000). Speech Communications: Human and Machine. Reading, MA: Addison-Wesley.

Ostendorf, M. (1999). Moving beyond the beads-on-a-string model of speech. In Proceedings of IEEE ASRU Automatic Speech Recognition and Understanding, Singapore, pp. 79–84.

Ostendorf, M., Digalakis, V. V. & Kimball, O. A. (1996). From HMM’s to segment models: A unified view of stochastic modeling for speech recognition. IEEE Transactions on Speech and Audio Processing, 4(5), 360–78.

Paul, D. B. & Baker, J. M. (1992). The design for the Wall Street Journal-based CSR Corpus. In Proceedings of the Workshop on Speech and Natural Language, pp. 899–902.

Rabiner, L. R. (1989). A tutorial on Hidden Markov Models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–86.

Rabiner, L. R. & Juang, B.-H. (1993). Fundamentals of Speech Recognition. Upper Saddle River, NJ: Prentice Hall.

Rabiner, L. R. & Schafer, R. W. (2010). Theory and Applications of Digital Speech Processing. Upper Saddle River, NJ: Prentice Hall.

Ramabhadran, B., Chen, N. F., Harper, M. P., Kingsbury, B. & Knill, K. (2017). Introduction to the special issue on end-to-end speech and language processing. IEEE Journal of Selected Topics in Signal Processing, 11(8), 1237–9.

Sainath, T. N., Weiss, R. J., Wilson, K. W., Li, B., Narayanan, A., Variani, E. et al. (2017). Multichannel signal processing with deep neural networks for automatic speech recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25, 965–79.

Sakoe, H. (1979). Two-level DP matching: A dynamic programming-based pattern matching algorithm for connected word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 27, 588–95.

Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27, 379–423 & 623–56.

Siniscalchi, S. M. & Lee, C.-H. (2009). A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition. Speech Communication, 51, 1139–53.

Sproat, R. (1998). Multilingual Text-to-Speech Synthesis: The Bell Labs Approach. Dordrecht: Kluwer Academic.

Stevens, K. (2000). Acoustic Phonetics. Cambridge, MA: MIT Press.

Stork, D. G. (1997). HAL’s Legacy: 2001’s Computer as Dream and Reality. Cambridge, MA: MIT Press.

Sundermeyer, M., Schlüter, R. & Ney, H. (2012). LSTM neural networks for language modelling. In Proceedings of INTERSPEECH, Portland, OR, pp. 194–6.

Taylor, P. (2009). Text-to-Speech Synthesis. Cambridge: Cambridge University Press.

Mikolov, T. (2012). Statistical Language Models Based on Neural Networks. PhD thesis, Brno University of Technology.

Vintsyuk, T. K. (1968). Speech discrimination by dynamic programming. Kibernetika, 4(2), 81–8.

Viterbi, A. J. (1967). Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Transactions on Information Theory, 13(2), 260–9.

Yu, D. & Deng, L. (2014). Automatic Speech Recognition: A Deep Learning Approach. Berlin: Springer-Verlag.