Hostname: page-component-8448b6f56d-tj2md Total loading time: 0 Render date: 2024-04-20T01:11:42.278Z Has data issue: false hasContentIssue false

Optimizing large knowledge networks in spatial computers

Published online by Cambridge University Press:  07 December 2016

Dominic Pacher
Affiliation:
Databases and Information Systems, University of Innsbruck, Technikerstrasse 21a, A-6020 Innsbruck, Austria e-mail: domi@tracefog.de, robert.binna@uibk.ac.at, guenther.specht@uibk.ac.at
Robert Binna
Affiliation:
Databases and Information Systems, University of Innsbruck, Technikerstrasse 21a, A-6020 Innsbruck, Austria e-mail: domi@tracefog.de, robert.binna@uibk.ac.at, guenther.specht@uibk.ac.at
Günther Specht
Affiliation:
Databases and Information Systems, University of Innsbruck, Technikerstrasse 21a, A-6020 Innsbruck, Austria e-mail: domi@tracefog.de, robert.binna@uibk.ac.at, guenther.specht@uibk.ac.at

Abstract

This paper presents a novel concept of a Spatially Aware Graph Store, which realizes a Graph Store on top of a spatial computer architecture to manage graphs in one, two or three physical dimensions. In this environment, the physical distance between graph nodes strongly affects graph traversal performance. Consequently, a Spatially Aware Graph Store needs to minimize these distances to operate efficiently. We show that this minimization can be achieved in two ways. First, by increasing the dimensionality of the spatial computer and second by applying optimization methods. For the latter, this work introduces a novel Mid Point Optimization method to quickly optimize large real-world knowledge networks by rearranging nodes in a way that distances between linked nodes are reduced. In addition, a Local Optimization method is subsequently applied to refine the result. Finally, the Node Decomposition method is presented that splits nodes with many edges into several smaller nodes to achieve a further reduction of distances between linked nodes.

Our results show that the overall distances between nodes can be reduced by three orders of magnitude for 3D in comparison to one-dimensional (1D) Spatially Aware Graph Stores. The suggested Mid Point Optimization method achieves a reduction by another order of magnitude. In a 3D spatial computer, Local Optimization is capable of reducing distances by another 20%. However, in 1D and 2D spatial computers it becomes a prohibitive time consuming method. Finally, the Node Decomposition enables an additional distance reduction by 40% in Scale Free Graph Data sets.

Type
Articles
Copyright
© Cambridge University Press, 2016 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Abadi, D. J., Marcus, A., Madden, S. R. & Hollenbach, K. 2007. Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd International Conference on Very Large Data Bases, 411–422. VLDB Endowment.Google Scholar
Abou-Rjeili, A. & Karypis, G. 2006. Multilevel algorithms for partitioning power-law graphs. In Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, IPDPS’06, 10pp. IEEE.CrossRefGoogle Scholar
Binna, R. et al. (2010). SpiderStore: exploiting main memory for efficient RDF graph representation and fast querying. In Proceedings of the Workshop on Semantic Data Management (SemData). VLDB.Google Scholar
Brandt, A., Ron, D. & Amit, D. 1986. Multi-level approaches to discrete-state and stochastic problems. Multigrid Methods II 1228, 6598.CrossRefGoogle Scholar
Broekstra, J., Kampman, A. & van Harmelen, F. 2002. Sesame: a generic architecture for storing and querying RDF and RDF schema. In Semantic Web – ISWC’02, 2342, 54–68. Aidministrator Nederland BV.CrossRefGoogle Scholar
Butera, B. 2002. Programming a paintable computer. PhD thesis, Massachusetts Institute of Technology.Google Scholar
DeHon, A., Giavitto, J.-L. & Gruau, F. 2007. Executive Report. In: Dagstuhl Seminar Proceedings, 06361—Computing Media and Languages for Space-Oriented Computation.Google Scholar
Delorimier, M. & Kapre, N. 2006. GraphStep: a system architecture for sparse-graph algorithms. In Field-Programmable Custom Computing Machines, 2006, FCCM’06, 14th Annual IEEE Symposium on IEEE, 143–151.Google Scholar
Delorimier, M., Kapre, N., Mehta, N. & Dehon, A. (2011). Spatial hardware implementation for sparse graph algorithms in GraphStep. ACM Transactions on Autonomous and Adaptive Systems, 6. 1-20.Google Scholar
Erling, O. 2008. Towards web scale RDF. In The 4th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS).Google Scholar
Erling, O. & Mikhailov, I. 2009. RDF support in the Virtuoso DBMS. Networked Knowledge – Networked Media. Studies in Computational Intelligence. 221, 7–24.Google Scholar
Fishburn, P., Tetali, P. & Winkler, P. 2000. Optimal linear arrangement of arectangular grid. Discrete Mathematics 213, 123139.Google Scholar
Garey, M. R. & Johnson, D. S. 1979. Computers and Intractability; A Guide to theTheory of NP-Completeness. W. H. Freeman & Co.Google Scholar
Harel, D. & Koren, Y. 2004. Graph drawing by high-dimensional embedding. Journal of Graph Algorithms and Applications 8, 195214.Google Scholar
Harris, S., Lamb, N. & Shadbolt, N. 2009. 4store: the design and implementation of a clustered RDF store. In 5th International Workshop on Scalable Semantic Web KnowledgeBase Systems (SSWS2009), 94–109.Google Scholar
Harth, A., Umbrich, J., Hogan, A. & Decker, S. (2007). Yars2: a federated repository for searching and querying graphstructured data. In Proceedings of the 6th International Semantic Web Conference—ISWC 06, 211–224.Google Scholar
Hu, Y. 2006. Efficient, high-quality force-directed graph drawing. The Mathematica Journal 10, 3771.Google Scholar
Huang, J., Abadi, D. J. & Ren, K. 2011. Scalable SPARQL querying of large RDF graphs. In Proceedings of the VLDB Endowment, 4, 1123–1134.Google Scholar
Husain, M. & McGlothlin, J. 2011. Heuristics-based query processing for large RDF graphs using cloud computing. IEEE Transactions on Knowledge and Data Engineering 23, 13121327.Google Scholar
Janik, M. & Kochut, K. 2005. BRAHMS: a work bench RDF store and high performance memory system for Semantic Association Discovery. In Fourth International Semantic Web Conference, 431–445.Google Scholar
Johnson, D. S., Aragon, C. R., Mcgeoch, L. A. & Schevon, C. (1989). Optimization by simulated annealing: an experimental evaluation; part I, graph partitioning. Operations Research 37, 865-892.Google Scholar
Koren, Y. & David, H. 2002. A multi-scale algorithm for the linear arrangement problem. In Graph-Theoretic Concepts in Computer Science, 296–309.Google Scholar
Mondal, J. & Deshpande, A. 2012. Managing large dynamic graphs efficiently. In Proceedings of the 2012 International Conference on Management of Data—SIGMOD 12, 145–156.Google Scholar
Neumann, T. & Weikum, G. 2008. RDF-3X: a RISC-style engine for RDF. In Proceedings of the VLDB Endowment 1, 647–-659.Google Scholar
Nguyen, Q., Eades, P., Hong, S. & Huang, W. (2011). Large crossing angles in circular layouts. In Proceedings of the18th International Conference on Graph Drawing, 397–399.CrossRefGoogle Scholar
Ousterhout, J., Agrawal, P., Erickson, D., Kozyrakis, C., Leverich, J., Mazières, D., Mitra, S., Narayanan, A., Ongaro, D., Parulkar, G., Rosenblum, M., Rumble, S., Stratmann, E. & Stutsman, R. (2011). The case for RAM Cloud. Communications of the ACM 54, 121.CrossRefGoogle Scholar
Petit, J. 2003. Experiments on the minimum linear arrangement problem. Journal of Experimental Algorithmics 8, 112128.Google Scholar
Rodriguez-Tello, E., Hao, J.-K. & Torres-Jimenez, J. 2006. A refined evaluation function for the MinLA problem. In MICAI2006: Advances in Artificial Intelligence, 392–403.Google Scholar
Rodriguez-Tello, E., Hao, J.-K. & Torres-Jimenez, J. 2008. An effective two-stage simulated annealing algorithm for the minimum linear arrangement problem. Computers & Operations Research 35, 33313346.Google Scholar
Safro, I., Ron, D. & Brandt, A. 2006. Graph minimum linear arrangement by multilevel weighted edge contractions. Journal of Algorithms 60, 2441.Google Scholar
Safro, I., Ron, D. & Brandt, A. 2009. Multilevel algorithms for linear ordering problems. Journal of Experimental Algorithmics 13, 1.4:11.4:20.Google Scholar
Shao, B., Wang, H. & Li, Y. 2012. The Trinity Graph Engine. Technical Report, Microsoft Research.Google Scholar
Shvachko, K. & Kuang, H. 2010. The hadoop distributed file system. In IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 1–10.Google Scholar
Walshaw, C. 2003. A multilevel algorithm for force-directed graph drawing. Journal of Graph Algorithms and Applications 7, 253285.CrossRefGoogle Scholar
Wilkinson, K., Sayers, C., Kuno, H. & Reynolds, D. (2003). Efficient RDF storage and retrieval in Jena2. In Proceedings of the 1st International Workshop on Semantic Web and Databases, 35–43.Google Scholar
Wulf, W. A. & McKee, S. A. 1995. Hitting the memory wall: implications of the obvious. ACMSIGARCH Computer Architecture News 23, 2024.Google Scholar
Zeng, K., Yang, J., Wang, H., Shao, B. & Wang, Z. (2013). A distributed graph engine for web scale RDF data. In Proceedings of the 39th International Conference on Very Large Data Bases, 265–276.Google Scholar
Supplementary material: File

Pacher Supplementary Material 1

Pacher Supplementary Material 1

Download Pacher Supplementary Material 1(File)
File 7.8 MB
Supplementary material: File

Pacher Supplementary Material 2

Pacher Supplementary Material 2

Download Pacher Supplementary Material 2(File)
File 7.5 MB