A linear algebraic approach to datalog evaluation

TAISUKE SATO

doi:10.1017/S1471068417000023

Published online by Cambridge University Press: 22 May 2017

Affiliation: AI research center AIST/National Institute of Informatics, Tokyo, Japan (e-mail: satou.taisuke@aist.go.jp)

Abstract

We propose a fundamentally new approach to Datalog evaluation. Given a linear Datalog program DB written using N constants and binary predicates, we first translate the if-and-only-if completions of the clauses in DB into a set E_q(DB) of matrix equations with a non-linear operation, where the relations in M_DB, the least Herbrand model of DB, are encoded as adjacency matrices. We then translate E_q(DB) into another set of purely linear matrix equations, Ẽ_q(DB). We prove that the least solution of Ẽ_q(DB), in the sense of matrix ordering, can be converted to the least solution of E_q(DB), and that the latter gives M_DB as a set of adjacency matrices. Hence, computing the least solution of Ẽ_q(DB) is equivalent to computing M_DB specified by DB. For a class of tail-recursive programs and for some other types of programs, our approach achieves O(N^3) time complexity irrespective of the number of variables in a clause, since only matrix operations costing O(N^3) or less are used. We conducted two experiments that compute the least Herbrand models of linear Datalog programs. The first computes the transitive closure of artificial data and of real network data taken from the Koblenz Network Collection. The second compares the proposed approach with state-of-the-art symbolic systems, including two Prolog systems and two ASP systems, in terms of computation time for a transitive closure program and the same-generation program. In this experiment, our linear algebraic approach runs 10^1 ~ 10^4 times faster than the symbolic systems when the data is not sparse. Our approach is inspired by the emergence of big knowledge graphs and is expected to contribute to the realization of rich and scalable logical inference for knowledge graphs.
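To make the two translation steps concrete, here is a minimal sketch for the transitive closure program r2(X,Z) <- r1(X,Z) and r2(X,Z) <- r1(X,Y), r2(Y,Z). The abstract does not spell out the equations, so the following are assumptions made for illustration only: the non-linear equation is taken to be R2 = min1(R1 + R1 R2), where min1 clips every entry at 1, and its linear counterpart is taken to be X = eps R1 + eps R1 X with eps = 1/(1 + largest row sum of R1), whose solution is thresholded at zero to recover R2. All function names are hypothetical, and exact fractions with hand-written Gauss-Jordan elimination stand in for a numerical linear algebra library so that the example has no dependencies.

```python
from fractions import Fraction


def matmul(A, B):
    """Plain O(N^3) product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def min1(A):
    """Assumed non-linear operation: clip every entry at 1."""
    return [[min(x, 1) for x in row] for row in A]


def closure_by_iteration(R1):
    """Least solution of R2 = min1(R1 + R1 R2), iterated from the zero matrix."""
    n = len(R1)
    R2 = [[0] * n for _ in range(n)]
    while True:
        P = matmul(R1, R2)
        nxt = min1([[R1[i][j] + P[i][j] for j in range(n)] for i in range(n)])
        if nxt == R2:
            return R2
        R2 = nxt


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [[Fraction(x) for x in A[i]] + [Fraction(x) for x in B[i]]
         for i in range(n)]
    for c in range(n):
        # Pick a row with a non-zero pivot and move it into place.
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def closure_by_linear_solve(R1):
    """Solve (I - eps R1) X = eps R1, then threshold X at zero."""
    n = len(R1)
    # Assumed scaling: eps makes every row sum of eps*R1 strictly below 1,
    # so I - eps*R1 is invertible and X = sum over k >= 1 of (eps*R1)^k.
    eps = Fraction(1, 1 + max(sum(row) for row in R1))
    A = [[(1 if i == j else 0) - eps * R1[i][j] for j in range(n)]
         for i in range(n)]
    B = [[eps * R1[i][j] for j in range(n)] for i in range(n)]
    X = solve(A, B)
    # X[i][j] > 0 exactly when some path of length >= 1 leads from i to j.
    return [[1 if x > 0 else 0 for x in row] for row in X]


# Adjacency matrix of the chain 0 -> 1 -> 2 -> 3.
chain = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(closure_by_iteration(chain) == closure_by_linear_solve(chain))  # True
```

Both routes return the same 0/1 adjacency matrix of the closure. The second route uses a single linear solve, which is the kind of O(N^3) matrix operation the abstract refers to; a practical implementation would more likely use floating-point routines from a numerical library than exact fractions.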

Keywords

Datalog, least model, matrix, vector space

Type: Regular Papers
Information: Theory and Practice of Logic Programming, Volume 17, Issue 3, May 2017, pp. 244-265

DOI: https://doi.org/10.1017/S1471068417000023
Copyright: Copyright © Cambridge University Press 2017
