
A Semantic Scattering model for the automatic interpretation of English genitives

Published online by Cambridge University Press:  01 April 2009

ADRIANA BADULESCU
Affiliation:
Lymba Corporation, 1701 N. Collins Blvd., Suite 3000, Richardson, TX 75080, USA e-mail: adriana@lymba.com, moldovan@lymba.com
DAN MOLDOVAN
Affiliation:
Lymba Corporation, 1701 N. Collins Blvd., Suite 3000, Richardson, TX 75080, USA e-mail: adriana@lymba.com, moldovan@lymba.com

Abstract

An important problem in knowledge discovery from text is the automatic extraction of semantic relations. This paper addresses the automatic classification of the semantic relations expressed by English genitives. It introduces a learning model based on a statistical analysis of how the semantic relations of genitives are distributed in a corpus. The semantic and contextual features of the genitive's noun phrase constituents play a key role in identifying the semantic relation. The algorithm was trained and tested on a corpus of approximately 20,000 sentences. For of-genitives it achieved an F-measure of 79.80 per cent, far better than the 40.60 per cent obtained with a Decision Trees algorithm, the 50.55 per cent obtained with a Naive Bayes algorithm, or the 72.13 per cent obtained with a Support Vector Machines algorithm on the same corpus with the same features. The results were similar for s-genitives: 78.45 per cent using Semantic Scattering, 47.00 per cent using Decision Trees, 43.70 per cent using Naive Bayes, and 70.32 per cent using Support Vector Machines. The results demonstrate the importance of word sense disambiguation and semantic generalization/specialization for this task. They also show that different patterns (here, the two types of genitive construction) encode different semantic information, so a separate model should be built for each pattern.
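
For readers unfamiliar with the evaluation measure reported above, the following is a minimal sketch of how an F-measure is commonly computed from precision and recall. It assumes the balanced form (beta = 1), which the abstract does not state explicitly; the function name and the sample values are illustrative only and are not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    With beta == 1 this is the balanced F-measure (F1). Whether the paper
    uses exactly this weighting is an assumption of this sketch.
    """
    if precision == 0.0 and recall == 0.0:
        # Avoid division by zero when the classifier gets nothing right.
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision and recall, chosen only for illustration;
# these are not figures reported in the paper.
score = f_measure(0.8, 0.6)
print(f"F-measure: {score * 100:.2f} per cent")
```

Because the harmonic mean is pulled toward the lower of the two inputs, a classifier needs both reasonable precision and reasonable recall to score well on this measure.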

Type
Papers
Copyright
Copyright © Cambridge University Press 2008
