Swift Markov Logic for Probabilistic Reasoning on Knowledge Graphs

LUIGI BELLOMARINI; ELEONORA LAURENZA; EMANUEL SALLINGER; EVGENY SHERKHONOV

doi:10.1017/S1471068422000412

Published online by Cambridge University Press: 09 November 2022

LUIGI BELLOMARINI: Banca d’Italia, Rome, Italy (e-mail: luigi.bellomarini@bancaditalia.it)
ELEONORA LAURENZA: Banca d’Italia, Rome, Italy (e-mail: eleonora.laurenza@bancaditalia.it)
EMANUEL SALLINGER: TU Wien, Vienna, Austria; University of Oxford, United Kingdom (e-mail: sallinger@dbai.tuwien.ac.at)
EVGENY SHERKHONOV: University of Oxford, United Kingdom (e-mail: evgeny.sherkhonov@cs.ox.ac.uk)

Abstract

We provide a framework for probabilistic reasoning in Vadalog-based Knowledge Graphs (KGs) that satisfies the requirements of ontological reasoning: full recursion, powerful existential quantification, and the expression of inductive definitions. Vadalog is a Knowledge Representation and Reasoning (KRR) language based on Warded Datalog+/–, a logical core language of existential rules with a good balance between computational complexity and expressive power. Handling uncertainty is essential for reasoning with KGs. Yet Vadalog and Warded Datalog+/– are not covered by existing probabilistic logic programming and statistical relational learning approaches for several reasons, including insufficient support for recursion with existential quantification and the inability to express inductive definitions. In this work, we introduce Soft Vadalog, a probabilistic extension to Vadalog that satisfies these desiderata. A Soft Vadalog program induces what we call a Probabilistic Knowledge Graph (PKG), which consists of a probability distribution on a network of chase instances, that is, structures obtained by grounding the rules over a database using the chase procedure. We exploit PKGs for probabilistic marginal inference. We discuss the theory and present MCMC-chase, a Monte Carlo method to use Soft Vadalog in practice. We apply our framework to solve data management and industrial problems and experimentally evaluate it in the Vadalog system.
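
To make the notion of marginal inference by Monte Carlo sampling concrete, the following is a minimal sketch in the general Markov logic style, not the MCMC-chase algorithm of the article. It assumes a fixed, already-grounded set of three atoms and four weighted ground rules (all atom names, rules, and weights are hypothetical and chosen only for illustration), defines the probability of a world as proportional to the exponential of the summed weights of its satisfied rules, and compares an exact marginal with a Metropolis-Hastings estimate. Recursion, existential quantification, and the chase are deliberately left out.

```python
import itertools
import math
import random

# Hypothetical ground atoms of a toy company-ownership graph.
ATOMS = ["own(a,b)", "own(b,c)", "control(a,c)"]

# Hypothetical soft ground rules: (weight, satisfaction test on a world).
# A world is a sequence of booleans, one truth value per atom in ATOMS.
RULES = [
    (1.5, lambda w: w[0]),                            # own(a,b) tends to hold
    (1.0, lambda w: w[1]),                            # own(b,c) tends to hold
    (2.0, lambda w: (not (w[0] and w[1])) or w[2]),   # own(a,b), own(b,c) -> control(a,c)
    (-0.5, lambda w: w[2]),                           # mild prior against control(a,c)
]


def score(world):
    """Sum of the weights of all ground rules satisfied in the world."""
    return sum(weight for weight, satisfied in RULES if satisfied(world))


def exact_marginal(atom_index):
    """Exact marginal of one atom, by enumerating all 2^n worlds."""
    worlds = list(itertools.product([False, True], repeat=len(ATOMS)))
    z = sum(math.exp(score(w)) for w in worlds)
    return sum(math.exp(score(w)) for w in worlds if w[atom_index]) / z


def mcmc_marginal(atom_index, steps=200_000, burn_in=5_000, seed=0):
    """Metropolis-Hastings estimate using symmetric single-atom flips."""
    rng = random.Random(seed)
    world = [False] * len(ATOMS)
    current = score(world)
    hits = 0
    kept = 0
    for step in range(steps):
        i = rng.randrange(len(ATOMS))
        world[i] = not world[i]                # propose flipping one atom
        proposed = score(world)
        # Accept with probability min(1, exp(proposed - current)).
        if proposed >= current or rng.random() < math.exp(proposed - current):
            current = proposed
        else:
            world[i] = not world[i]            # reject: undo the flip
        if step >= burn_in:
            kept += 1
            hits += world[atom_index]
    return hits / kept


if __name__ == "__main__":
    print("exact marginal of control(a,c):", exact_marginal(2))
    print("MCMC estimate of control(a,c): ", mcmc_marginal(2))
```

Exhaustive enumeration is feasible here only because the toy domain has eight worlds; sampling is the option that remains available when the number of ground atoms grows.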

Keywords

knowledge graphs, reasoning, Datalog+/−, Markov logic networks

Type: Original Article
Information: Theory and Practice of Logic Programming, Volume 23, Issue 3: Special Issue on Logic Rules and Reasoning: Selected Papers from the 4th International Joint Conference on Rules and Reasoning (RuleML+RR 2020), May 2023, pp. 507–534

DOI: https://doi.org/10.1017/S1471068422000412
Copyright: © The Author(s), 2022. Published by Cambridge University Press
