Learning opinions in user-generated web content

M. SOKOLOVA; G. LAPALME

doi:10.1017/S135132491100012X

Published online by Cambridge University Press: 11 March 2011

M. SOKOLOVA: Affiliation:
Department of Pediatrics, Faculty of Medicine, Children's Hospital of Eastern Ontario Research Institute, University of Ottawa, 401 Smyth Rd., Ottawa, Ontario, Canada, K1H 8L1 email: sokolova@uottawa.ca
G. LAPALME: Affiliation:
Département d'informatique et de recherche opérationnelle, Université de Montréal, C.P. 6128, Succ Centre-Ville, Montréal, Quebec, Canada, H3C 3J7 email: lapalme@iro.umontreal.ca

Abstract

User-generated Web content has been intensively analyzed in Information Extraction and Natural Language Processing research. Web-posted reviews of consumer goods are studied to find customer opinions about the products. We hypothesize that descriptions that are not emotionally charged can be used to predict those opinions. Such descriptions may include indicators of product size (tall), commonplace (some), frequency of occurrence (often), and reviewer certainty (maybe). We first construct patterns of how the descriptions are used in consumer-written texts and then represent individual reviews through these patterns. We propose a semantic hierarchy that organizes individual words into opinion types. We run machine learning algorithms on five data sets of user-written product reviews: four are used in classification experiments, and the fifth in both regression and classification. The results support the use of non-emotional descriptions in opinion learning.
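As a rough illustration of representing a review through non-emotional descriptors, the sketch below counts how many words of each descriptor category occur in a review and returns the counts as a feature vector. It is not the authors' pattern construction or semantic hierarchy: the category names, the word lists beyond the four examples in the abstract (tall, some, often, maybe), and the function `descriptor_features` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon. The categories follow the abstract's four
# examples (tall, some, often, maybe); every other word is an assumption
# added only to make the sketch usable.
LEXICON = {
    "size": {"tall", "small", "large"},
    "commonplace": {"some", "many", "few"},
    "frequency": {"often", "rarely", "always"},
    "certainty": {"maybe", "perhaps", "definitely"},
}

# Fixed category order so that every review maps to the same vector layout:
# certainty, commonplace, frequency, size.
CATEGORIES = sorted(LEXICON)


def descriptor_features(review):
    """Return per-category counts of descriptor words found in a review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter()
    for token in tokens:
        for category, words in LEXICON.items():
            if token in words:
                counts[category] += 1
    return [counts[category] for category in CATEGORIES]


if __name__ == "__main__":
    example = "Maybe it is too tall; it often tips over, and some parts rattle."
    print(dict(zip(CATEGORIES, descriptor_features(example))))
```

Vectors of this kind could then be passed to any standard classifier or regressor. Which learners and which features the paper actually uses is not stated in the abstract.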

Type: Articles
Information: Natural Language Engineering, Volume 17, Issue 4, October 2011, pp. 541–567

DOI: https://doi.org/10.1017/S135132491100012X
Copyright: Copyright © Cambridge University Press 2011
