
Extending relational algebra with similarities



In this paper we propose various extensions to the relational model to support similarity-based querying. We build upon the $K$-relation model, where tuples are assigned values from an arbitrary semiring $K$, and its associated positive relational algebra $\mathrm{RA}^+_K$. We consider a recently proposed extension to $\mathrm{RA}^+_K$ that uses a monus operation on the semiring to support negative queries, and show how, surprisingly, it fails for important ‘fuzzy’ semirings. Instead, we suggest using a negation operator. We also consider the identities satisfied by the relational algebra $\mathrm{RA}^+_K$. We show that moving from a semiring to a particular form of lattice (a De Morgan frame) yields a relational algebra that satisfies all the classical (positive) relational algebra identities. We claim that to support real-world similarity queries, one must move from tuple-level annotations to attribute-level annotations. We show in detail how our De Morgan frame-based model can be extended to support attribute-level annotations and give worked examples of similarity queries in this setting.
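The idea of annotating tuples with semiring values can be illustrated with a minimal Python sketch. It is not the paper's formalization: the names `Semiring`, `union`, `join_on_first` and `difference_by_negation` are hypothetical, the fuzzy semiring is taken to be ([0, 1], max, min, 0, 1), and the negation is assumed to be x ↦ 1 − x. Under these assumptions, union adds annotations, join multiplies them, and difference is expressed with negation rather than monus.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Row = Tuple                      # a tuple of attribute values
Relation = Dict[Row, float]      # each tuple mapped to its annotation


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring given by its two constants and two operations."""
    zero: float
    one: float
    add: Callable[[float, float], float]
    mul: Callable[[float, float], float]


# Assumed 'fuzzy' semiring on [0, 1]: addition is max, multiplication is min.
FUZZY = Semiring(0.0, 1.0, max, min)
# Counting (bag) semiring on the natural numbers, for comparison.
COUNTING = Semiring(0, 1, lambda a, b: a + b, lambda a, b: a * b)


def union(k: Semiring, r: Relation, s: Relation) -> Relation:
    """Union adds the annotations of matching tuples."""
    out = {}
    for t in set(r) | set(s):
        v = k.add(r.get(t, k.zero), s.get(t, k.zero))
        if v != k.zero:
            out[t] = v
    return out


def join_on_first(k: Semiring, r: Relation, s: Relation) -> Relation:
    """Join binary relations r(a, b) and s(a, c) on a, multiplying annotations."""
    out = {}
    for (a1, b), v in r.items():
        for (a2, c), w in s.items():
            if a1 == a2:
                x = k.mul(v, w)
                if x != k.zero:
                    out[(a1, b, c)] = x
    return out


def difference_by_negation(r: Relation, s: Relation) -> Relation:
    """Fuzzy difference, assuming annotation min(r(t), 1 - s(t))."""
    out = {}
    for t, v in r.items():
        x = min(v, 1.0 - s.get(t, 0.0))
        if x != 0.0:
            out[t] = x
    return out


r = {("alice", "paris"): 0.9, ("bob", "rome"): 0.4}
s = {("alice", "paris"): 0.5}
print(union(FUZZY, r, r) == r)         # idempotent in the max/min lattice
print(union(COUNTING, {("a",): 2}, {("a",): 2}))
print(difference_by_negation(r, s))
```

In this sketch the classical identity R ∪ R = R holds for the max/min annotations but not for the counting semiring, where multiplicities add up. This gives a small illustration of why a lattice-based annotation structure recovers identities that an arbitrary semiring does not.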



