A structured distributional model of sentence meaning and processing

E. Chersoni; E. Santus; L. Pannitto; A. Lenci; P. Blache; C.-R. Huang

doi:10.1017/S1351324919000214

Published online by Cambridge University Press: 31 July 2019

Affiliations

E. Chersoni*: Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hong Kong, China
E. Santus: Computer Science and Artificial Intelligence Lab, MIT, Cambridge (MA), United States
L. Pannitto: Department of Philology, Literature and Linguistics, University of Pisa, Pisa, Italy
A. Lenci: Department of Philology, Literature and Linguistics, University of Pisa, Pisa, Italy
P. Blache: Laboratoire Parole et Langage, Aix-Marseille University, France
C.-R. Huang: Department of Chinese and Bilingual Studies, Hong Kong Polytechnic University, Hong Kong, China
*Corresponding author. Email: emmanuelechersoni@gmail.com

Abstract

Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora and encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded on extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge constantly improves performances, even with different types of word embeddings.
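The idea of combining a simple compositional model with event knowledge can be illustrated with a toy sketch. This is not the SDM implementation: the abstract does not give the model's formulas, so the vectors, the graph entries, the weights, and the function names below (`additive_context`, `expected_filler`, `score`) are all assumptions made for illustration. The sketch sums word vectors for the lexical component, builds a weighted centroid of the participants that the cues activate in a small hand-written event graph, and scores a candidate by adding the two cosine similarities.

```python
import math

# Toy 3-dimensional "embeddings"; the numbers are invented for illustration.
VECTORS = {
    "student": [0.9, 0.1, 0.0],
    "read": [0.6, 0.4, 0.1],
    "book": [0.7, 0.6, 0.0],
    "paper": [0.8, 0.5, 0.1],
    "sandwich": [0.0, 0.2, 0.9],
}

# Hypothetical event-knowledge graph: each (word, role) cue activates
# typical fillers of the object role, with made-up association weights.
EVENT_GRAPH = {
    ("student", "subj"): {"book": 2.0, "paper": 1.0},
    ("read", "verb"): {"book": 1.0, "paper": 1.0},
}


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def scale(u, k):
    return [a * k for a in u]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def additive_context(words):
    """Simple compositional model: sum of the word vectors."""
    total = [0.0, 0.0, 0.0]
    for word in words:
        total = add(total, VECTORS[word])
    return total


def expected_filler(cues):
    """Weighted centroid of the fillers activated by the cues, or None."""
    total, weight_sum = [0.0, 0.0, 0.0], 0.0
    for cue in cues:
        for filler, weight in EVENT_GRAPH.get(cue, {}).items():
            total = add(total, scale(VECTORS[filler], weight))
            weight_sum += weight
    return scale(total, 1.0 / weight_sum) if weight_sum else None


def score(candidate, cues):
    """Lexical similarity plus event-knowledge similarity (assumed combination)."""
    lexical = cosine(VECTORS[candidate], additive_context([w for w, _ in cues]))
    prototype = expected_filler(cues)
    event = cosine(VECTORS[candidate], prototype) if prototype is not None else 0.0
    return lexical + event


cues = [("student", "subj"), ("read", "verb")]
ranking = sorted(["book", "sandwich"], key=lambda c: score(c, cues), reverse=True)
```

With these toy values, `ranking` places "book" ahead of "sandwich" as the object of "the student reads", because "book" is close both to the summed context and to the prototype of activated participants. When no cue is found in the graph, the event term is zero and the score falls back to the additive model alone.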

Keywords

distributional semantics; event knowledge; discourse representation theory; word embeddings; sentence processing

Type: Article
Information: Natural Language Engineering, Volume 25, Issue 4, July 2019, pp. 483–502

DOI: https://doi.org/10.1017/S1351324919000214
Copyright: © Cambridge University Press 2019
