Evaluating Language Acquisition Models: A Utility-Based Look at Bayesian Segmentation

Lisa Pearl; Lawrence Phillips

doi:10.1017/9781316676974.008

8 - Evaluating Language Acquisition Models: A Utility-Based Look at Bayesian Segmentation

from Part III - Data Driven Models

Published online by Cambridge University Press: 30 November 2017

Edited by Thierry Poibeau and Aline Villavicencio

Affiliations:
Lisa Pearl: Departments of Linguistics and Cognitive Sciences, University of California, Irvine, USA
Lawrence Phillips: Department of Cognitive Sciences, University of California, Irvine, USA
Thierry Poibeau: Centre National de la Recherche Scientifique (CNRS), Paris
Aline Villavicencio: Universidade Federal do Rio Grande do Sul, Brazil


Abstract

Computational models of language acquisition often face evaluation issues associated with unsupervised machine learning approaches. These acquisition models are typically meant to capture how children solve language acquisition tasks without relying on explicit feedback, making them similar to other unsupervised learning models. Evaluation issues include uncertainty about the exact form of the target linguistic knowledge, which is exacerbated by a lack of empirical evidence about children's knowledge at different stages of development. Put simply, a model's output may be good enough even if it does not match adult knowledge because children's output at various stages of development also may not match adult knowledge. However, it is not easy to determine what counts as “good enough” model output. We consider this problem using the case study of speech segmentation modeling, where the acquisition task is to segment a fluent stream of speech into useful units like words. We focus on a particular Bayesian segmentation strategy previously shown to perform well on English, and discuss several options for assessing whether a segmentation model's output is good enough, including cross-linguistic utility, the presence of reasonable errors, and downstream evaluation. Our findings highlight the utility of considering multiple metrics for segmentation success, which is likely also true for language acquisition modeling more generally.

Introduction

A core issue in machine learning is how to evaluate unsupervised learning approaches (von Luxburg, Williamson, & Guyon, 2011), since there is no a priori correct answer the way that there is for supervised learning approaches. Computational models of language acquisition commonly face this problem because they attempt to capture how children solve language acquisition tasks without explicit feedback, and so typically use unsupervised learning approaches. Moreover, evaluation is made more difficult by uncertainty about the exact nature of the target linguistic knowledge and a lack of empirical evidence about children's knowledge at specific stages in development. Given this, how do we know that a model's output is “good enough”? How should success be measured? To create informative cognitive models of acquisition that offer insight into how children acquire language, we should consider how to evaluate acquisition models appropriately (Pearl, 2014; Phillips, 2015; Phillips & Pearl, 2015b).
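A common starting point for the question of how success should be measured is to score a model's segmentation against an adult-like gold standard using token precision, recall, and F-score. The following is a minimal sketch of that baseline metric only, not the chapter's own evaluation code; the function names (`token_spans`, `token_f_score`) and the toy utterances are illustrative assumptions.

```python
from typing import List, Set, Tuple


def token_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Return the (start, end) character span of each word in one utterance."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def token_f_score(
    predicted: List[List[str]], gold: List[List[str]]
) -> Tuple[float, float, float]:
    """Token precision, recall, and F-score over a corpus of utterances.

    A predicted token counts as correct only if both of its boundaries
    match a token in the gold segmentation of the same utterance.
    """
    correct = n_pred = n_gold = 0
    for pred_utt, gold_utt in zip(predicted, gold):
        if "".join(pred_utt) != "".join(gold_utt):
            raise ValueError("utterances must cover the same unsegmented string")
        pred_spans = token_spans(pred_utt)
        gold_spans = token_spans(gold_utt)
        correct += len(pred_spans & gold_spans)
        n_pred += len(pred_spans)
        n_gold += len(gold_spans)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy data (hypothetical): the model undersegments the first utterance.
gold = [["the", "kitty"], ["look", "at", "the", "doggie"]]
predicted = [["thekitty"], ["look", "at", "the", "doggie"]]

precision, recall, f_score = token_f_score(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_score:.3f}")
```

In this toy case the undersegmented unit "thekitty" is counted as a plain error, although a child might plausibly treat it as a useful unit. A gold-standard score of this kind therefore cannot tell reasonable errors from unreasonable ones, which is one reason to consider additional metrics.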

Type: Chapter
Information: Language, Cognition, and Computational Models, pp. 185–224

DOI: https://doi.org/10.1017/9781316676974.008

Publisher: Cambridge University Press

Print publication year: 2018

References

Anderson, J. (1990). The adaptive character of thought. Hillsdale, NJ: Erlbaum.

Baldwin, D. A. (1993). Early referential understanding: Infants’ ability to recognize referential acts for what they are. Developmental Psychology, 29(5), 832.

Bergelson, E., & Swingley, D. (2012). At 6–9 months, human infants know the meanings of many common nouns. Proceedings of the National Academy of Sciences, 109(9), 3253–3258.

Bertoncini, J., Bijeljac-Babic, R., Jusczyk, P., Kennedy, L., & Mehler, J. (1988). An investigation of young infants’ perceptual representations of speech sounds. Journal of Experimental Psychology, 117(1), 21–33.

Best, C., McRoberts, G., LaFleur, R., & Silver-Isenstadt, J. (1995). Divergent developmental patterns for infants’ perception of two nonnative consonant contrasts. Infant Behavior and Development, 18, 339–350.

Best, C., McRoberts, G., & Sithole, N. (1988). Examination of perceptual reorganization for nonnative speech contrasts: Zulu click discrimination by English-speaking adults and infants. Journal of Experimental Psychology: Human Perception and Performance, 14(3), 345–360.

Bijeljac-Babic, R., Bertoncini, J., & Mehler, J. (1993). How do 4-day-old infants categorize multisyllabic utterances? Developmental Psychology, 29(4), 711–721.

Blanchard, D., Heinz, J., & Golinkoff, R. (2010). Modeling the contribution of phonotactic cues to the problem of word segmentation. Journal of Child Language, 37, 487–511.

Bonawitz, E., Denison, S., Chen, A., Gopnik, A., & Griffiths, T. (2011). A simple sequential algorithm for approximating Bayesian inference. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 2463–2468).

Bortfeld, H., Morgan, J., Golinkoff, R., & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech-stream segmentation. Psychological Science, 16(4), 298–304.

Brent, M. (1999). An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning, 34, 71–105.

Brent, M., & Siskind, J. (2001). The role of exposure to isolated words in early vocabulary. Cognition, 81, 31–44.

Brown, R. (1973). A first language: The early stages. Harvard University Press.

Carey, S. (1978). The child as word learner. In J. Bresnan, G. Miller, & M. Halle (Eds.), Linguistic theory and psychological reality (pp. 264–293). Cambridge, MA: MIT Press.

Christiansen, M. H., Allen, J., & Seidenberg, M. S. (1998). Learning to segment speech using multiple cues: A connectionist model. Language and Cognitive Processes, 13(2–3), 221–268.

Cole, R., & Jakimik, J. (1980). Perception and production of fluent speech. In R. Cole (Ed.), Perception and production of fluent speech (pp. 133–163). Hillsdale, NJ: Erlbaum.

Cornell, E. H., & Bergstrom, L. I. (1983). Serial-position effects in infants’ recognition memory. Memory & Cognition, 11(5), 494–499.

Davis, S. J., Newport, E. L., & Aslin, R. N. (2011). Probability-matching in 10-month-old infants. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 3011–3015).

Denison, S., Bonawitz, E., Gopnik, A., & Griffiths, T. (2013). Rational variability in children's causal inferences: The Sampling Hypothesis. Cognition, 126, 285–300.

Doyle, G., & Levy, R. (2013). Combining multiple information types in Bayesian word segmentation. In Highlights – North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 117–126).

Eimas, P. (1999). Segmental and syllabic representations in the perception of speech by young infants. Journal of the Acoustical Society of America, 105(3), 1901–1911.

Ferguson, T. (1973). A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1(2), 209–230.

Fernald, A., & Morikawa, H. (1993). Common themes and cultural variations in Japanese and American mothers’ speech to infants. Child Development, 64(3), 637–656.

Fourtassi, A., Börschinger, B., Johnson, M., & Dupoux, E. (2013). Whyisenglishsoeasytosegment. In Proceedings of the Fourth Annual Workshop on Cognitive Modeling and Computational Linguistics (pp. 1–10).

Frank, M., Goodman, N., & Tenenbaum, J. (2009). Using speakers’ referential intentions to model early cross-situational word learning. Psychological Science, 20, 579–585.

Galliers, J., & Jones, K. S. (1993). Evaluating natural language processing systems (Tech. Rep. No. 291). Computer Laboratory, University of Cambridge.

Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6, 721–741.

Gervain, J., & Erra, R. G. (2012). The statistical signature of morphosyntax: A study of Hungarian and Italian infant-directed speech. Cognition, 125(2), 263–287.

Goldwater, S., Griffiths, T., & Johnson, M. (2009). A Bayesian framework for word segmentation. Cognition, 112(1), 21–54.

Goldwater, S., Griffiths, T., & Johnson, M. (2011). Producing power-law distributions and damping word frequencies with two-stage language models. Journal of Machine Learning Research, 12, 2335–2382.

Gulya, M., Rovee-Collier, C., Galluccio, L., & Wilk, A. (1998). Memory processing of a serial list by young infants. Psychological Science, 9(4), 303–307.

Hohne, E., & Jusczyk, P. (1994). Two-month-old infants’ sensitivity to allophonic differences. Perception & Psychophysics, 56(6), 613–623.

Johnson, E., & Jusczyk, P. (2001). Word segmentation by 8-month-olds: When speech cues count more than statistics. Journal of Memory and Language, 44, 548–567.

Johnson, M. (2008). Unsupervised word segmentation for Sesotho using adaptor grammars. In Proceedings of the tenth meeting of the ACL special interest group on computational morphology and phonology (pp. 20–27).

Johnson, M., & Demuth, K. (2010). Unsupervised phonemic Chinese word segmentation using adaptor grammars. In Proceedings of the 23rd international conference on computational linguistics (pp. 528–536).

Johnson, M., Demuth, K., Jones, B., & Black, M. J. (2010). Synergies in learning words and their referents. In Advances in neural information processing systems (pp. 1018–1026).

Jusczyk, P. (1997). The discovery of spoken language. Cambridge, MA: MIT Press.

Jusczyk, P., Cutler, A., & Redanz, N. (1993). Infants’ preference for the predominant stress pattern of English words. Child Development, 64(3), 675–687.

Jusczyk, P., & Derrah, C. (1987). Representation of speech sounds by young infants. Developmental Psychology, 23(5), 648–654.

Jusczyk, P., Hohne, E., & Baumann, A. (1999). Infants’ sensitivity to allophonic cues for word segmentation. Perception & Psychophysics, 61, 1465–1476.

Jusczyk, P., Houston, D., & Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology, 39, 159–207.

Jusczyk, P., Jusczyk, A., Kennedy, L., Schomberg, T., & Koenig, N. (1995). Young infants’ retention of information about bisyllabic utterances. Journal of Experimental Psychology: Human Perception and Performance, 21(4), 822–836.

Kam, C. H., & Newport, E. (2005). Regularizing unpredictable variation: The roles of adult and child learners in language formation and change. Language Learning and Development, 1(2), 151–195.

Kam, C. L. H., & Newport, E. L. (2009). Getting it right by getting it wrong: When learners change languages. Cognitive Psychology, 59(1), 30–66.

Karins, K., MacIntyre, R., Brandmair, M., Lauscher, S., & McLemore, C. (1997). CALLHOME German Lexicon. Linguistic Data Consortium.

Kingsbury, P., Strassel, S., McLemore, C., & MacIntyre, R. (1997). CALLHOME American English Lexicon (PRONLEX). Linguistic Data Consortium.

Kolodny, O., Lotem, A., & Edelman, S. (2015). Learning a generative probabilistic grammar of experience: A process-level model of language acquisition. Cognitive Science, 39, 227–267.

Köpcke, K.-M. (1998). The acquisition of plural marking in English and German revisited: Schemata versus rules. Journal of Child Language, 25(2), 293–319.

Korman, M. (1984). Adaptive aspects of maternal vocalizations in differing contexts at ten weeks. First Language, 5, 44–45.

Kuhl, P., Williams, K., Lacerda, F., Stevens, K., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255, 606–608.

Lignos, C. (2012). Infant word segmentation: An incremental, integrated model. In Proceedings of the 30th West Coast conference on formal linguistics (pp. 237–247).

Lignos, C., & Yang, C. (2010). Recession segmentation: Simpler online word segmentation using limited resources. In Proceedings of the fourteenth conference on computational natural language learning (pp. 88–97).

MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Markman, E. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.

Markman, E., & Wachtel, G. (1988). Children's use of mutual exclusivity to constrain the meanings of words. Cognitive Psychology, 20, 121–157.

Markman, E., Wasow, J., & Hansen, M. (2003). Use of the mutual exclusivity assumption by young word learners. Cognitive Psychology, 47, 241–275.

Markson, L., & Bloom, P. (1997). Evidence against a dedicated system for word learning in children. Nature, 385, 813–815.

Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. New York: Henry Holt and Co. Inc.

Marthi, B., Pasula, H., Russell, S., & Peres, Y. (2002). Decayed MCMC filtering. In Proceedings of 18th UAI (pp. 319–326).

Mattys, S., Jusczyk, P., & Luce, P. (1999). Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology, 38, 465–494.

Pearl, L. (2014). Evaluating learning strategy components: Being fair. Language, 90(3), e107–e114.

Pearl, L., Goldwater, S., & Steyvers, M. (2011). Online learning mechanisms for Bayesian models of word segmentation. Research on Language and Computation, 8(2), 107–132. (Special issue on computational models of language acquisition.)

Peters, A. (1983). The units of language acquisition. New York: Cambridge University Press.

Phillips, L. (2015). The role of empirical evidence in modeling speech segmentation. Unpublished doctoral dissertation, University of California, Irvine.

Phillips, L., & Pearl, L. (2012). ‘Less is More’ in Bayesian word segmentation: When cognitively plausible learners outperform the ideal. In Proceedings of the 34th Annual Conference of the Cognitive Science Society (pp. 863–868).

Phillips, L., & Pearl, L. (2014a). Bayesian inference as a cross-linguistic word segmentation strategy: Always learning useful things. In Proceedings of the computational and cognitive models of language acquisition and language processing workshop (pp. 9–13).

Phillips, L., & Pearl, L. (2014b). Bayesian inference as a viable cross-linguistic word segmentation strategy: It's all about what's useful. In Proceedings of the 36th Annual Conference of the Cognitive Science Society (pp. 2775–2780). Quebec City: Cognitive Science Society.

Phillips, L., & Pearl, L. (2015a). Utility-based evaluation metrics for models of language acquisition: A look at speech segmentation. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics 2015 (pp. 68–78). NAACL.

Phillips, L., & Pearl, L. (2015b). The utility of cognitive plausibility in language acquisition modeling: Evidence from word segmentation. Cognitive Science, 39(8), 1824–1854.

Polka, L., & Werker, J. (1994). Developmental changes in perception of nonnative vowel contrasts. Journal of Experimental Psychology: Human Perception and Performance, 20(2), 421–435.

Rollins, P. (2003). Caregiver contingent comments and subsequent vocabulary comprehension. Applied Psycholinguistics, 24, 221–234.

Rose, S. A., Feldman, J. F., & Jankowski, J. J. (2001). Visual short-term memory in the first year of life: Capacity and recency effects. Developmental Psychology, 37(4), 539–549.

Shi, L., Griffiths, T., Feldman, N., & Sanborn, A. (2010). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review, 17(4), 443–464.

Smith, L., & Yu, C. (2008). Infants rapidly learn word-referent mappings via cross-situational statistics. Cognition, 106(3), 1558–1568.

Swingley, D. (2005). Statistical clustering and the contents of the infant vocabulary. Cognitive Psychology, 50, 86–132.

Teh, Y., Jordan, M., Beal, M., & Blei, D. (2006). Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476), 1566–1581.

Teinonen, T., Fellman, V., Näätänen, R., Alku, P., & Huotilainen, M. (2009). Statistical language learning in neonates revealed by event-related brain potentials. BMC Neuroscience, 10(1), 21.

Thiessen, E., & Saffran, J. (2003). When cues collide: Use of stress and statistical cues to word boundaries by 7- to 9-month-old infants. Developmental Psychology, 39(4), 706–716.

Thiessen, E., & Saffran, J. (2007). Learning to learn: Infants’ acquisition of stress-based strategies for word segmentation. Language Learning and Development, 3(1), 73–100.

Tincoff, R., & Jusczyk, P. W. (1999). Some beginnings of word comprehension in 6-month-olds. Psychological Science, 10(2), 172–175.

Tincoff, R., & Jusczyk, P. W. (2012). Six-month-olds comprehend words that refer to parts of the body. Infancy, 17(4), 432–444.

von Luxburg, U., Williamson, R., & Guyon, I. (2011). Clustering: Science or Art? In JMLR workshop and conference proceedings 27 (pp. 65–79).

Werker, J., & Lalonde, C. (1988). Cross-language speech perception: Initial capabilities and developmental change. Developmental Psychology, 24(5), 672–683.

Werker, J., & Tees, R. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior & Development, 7, 49–63.

Wilson, M. (1988). The MRC psycholinguistic database machine readable dictionary. Behavior Research Methods, Instruments, & Computers, 20, 6–11.

Xu, F. (2002). The role of language in acquiring object kind concepts in infancy. Cognition, 85(3), 223–250.

Yu, C., & Smith, L. B. (2007). Rapid word learning under uncertainty via cross-situational statistics. Psychological Science, 18(5), 414–420.

Yu, C., & Smith, L. B. (2011). What you learn is what you see: Using eye movements to study infant cross-situational word learning. Developmental Science, 14(2), 165–180.
