
A structured distributional model of sentence meaning and processing

Published online by Cambridge University Press:  31 July 2019

E. Chersoni*
Affiliation:
Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hong Kong, China
E. Santus
Affiliation:
Computer Science and Artificial Intelligence Lab, MIT, Cambridge (MA), United States
L. Pannitto
Affiliation:
Department of Philology, Literature and Linguistics, University of Pisa, Pisa, Italy
A. Lenci
Affiliation:
Department of Philology, Literature and Linguistics, University of Pisa, Pisa, Italy
P. Blache
Affiliation:
Laboratoire Parole et Langage, Aix-Marseille University, France
C.-R. Huang
Affiliation:
Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hong Kong, China
*Corresponding author. Email: emmanuelechersoni@gmail.com

Abstract

Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded in extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.

Type
Article
Copyright
© Cambridge University Press 2019 

