
Extending relational algebra with similarities

Published online by Cambridge University Press:  25 April 2012

MELITA HAJDINJAK
Affiliation:
Faculty of Electrical Engineering, University of Ljubljana, Slovenia Email: Melita.Hajdinjak@fe.uni-lj.si
GAVIN BIERMAN
Affiliation:
Microsoft Research, Cambridge, United Kingdom Email: gmb@microsoft.com

Abstract

In this paper we propose various extensions to the relational model to support similarity-based querying. We build upon the $\mathcal{K}$-relation model, where tuples are assigned values from an arbitrary semiring $\mathcal{K}$, and its associated positive relational algebra $\mathcal{RA}^{+}_{\mathcal{K}}$. We consider a recently proposed extension to $\mathcal{RA}^{+}_{\mathcal{K}}$ using a monus operation on the semiring to support negative queries, and show how, surprisingly, it fails for important 'fuzzy' semirings. Instead, we suggest using a negation operator. We also consider the identities satisfied by the relational algebra $\mathcal{RA}^{+}_{\mathcal{K}}$. We show that moving from a semiring to a particular form of lattice (a De Morgan frame) yields a relational algebra that satisfies all the classical (positive) relational algebra identities. We claim that to support real-world similarity queries realistically, one must move from tuple-level annotations to attribute-level annotations. We show in detail how our De Morgan frame-based model can be extended to support attribute-level annotations and give worked examples of similarity queries in this setting.
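To make the tuple-annotation idea concrete, the following is a minimal sketch of annotated relations over the fuzzy semiring ([0, 1], max, min, 0, 1). It is not the paper's De Morgan frame model. It assumes the usual semiring-annotated semantics in which union adds annotations, a join over a shared schema multiplies them, and projection adds the annotations of tuples that collapse together. The names `Semiring`, `FUZZY`, `union`, `intersect` and `project`, and the hotel data, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple


@dataclass(frozen=True)
class Semiring:
    """A semiring given by its two operations and their neutral elements."""
    plus: Callable[[float, float], float]
    times: Callable[[float, float], float]
    zero: float
    one: float


# Fuzzy semiring on [0, 1]: addition is max, multiplication is min.
FUZZY = Semiring(plus=max, times=min, zero=0.0, one=1.0)

# An annotated relation maps each tuple to its annotation;
# tuples that are absent are implicitly annotated with zero.
Relation = Dict[Tuple, float]


def union(k: Semiring, r: Relation, s: Relation) -> Relation:
    """Annotate each tuple with r(t) + s(t)."""
    return {t: k.plus(r.get(t, k.zero), s.get(t, k.zero))
            for t in r.keys() | s.keys()}


def intersect(k: Semiring, r: Relation, s: Relation) -> Relation:
    """Join of two relations over the same schema: r(t) * s(t)."""
    out: Relation = {}
    for t in r.keys() & s.keys():
        a = k.times(r[t], s[t])
        if a != k.zero:  # keep only tuples with a non-zero annotation
            out[t] = a
    return out


def project(k: Semiring, r: Relation, positions: Sequence[int]) -> Relation:
    """Keep the given positions, adding annotations of tuples that collapse."""
    out: Relation = {}
    for t, a in r.items():
        key = tuple(t[i] for i in positions)
        out[key] = k.plus(out.get(key, k.zero), a)
    return out


# Made-up example data: degrees to which hotels are "cheap" and "central".
cheap: Relation = {("hotel_a", "ljubljana"): 0.9, ("hotel_b", "ljubljana"): 0.4}
central: Relation = {("hotel_a", "ljubljana"): 0.6, ("hotel_c", "cambridge"): 0.8}

cheap_or_central = union(FUZZY, cheap, central)
cheap_and_central = intersect(FUZZY, cheap, central)
cheap_cities = project(FUZZY, cheap, [1])
```

Because the operations only go through the `Semiring` value, swapping in another semiring (for example, natural numbers with ordinary addition and multiplication for bag semantics) requires no change to the three functions.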

Type: Paper
Copyright © Cambridge University Press 2012
