Decentralized Query Processing over Heterogeneous Sources of Knowledge Graphs
Knowledge graphs are increasingly used in scientific and industrial applications. The large number and size of knowledge graphs published as Linked Data in autonomous sources have led to the development of various interfaces to query these knowledge graphs. Effective query processing approaches that enable efficient information retrieval from these knowledge graphs therefore need to address the capabilities and limitations of different Linked Data Fragment interfaces.
This book investigates novel approaches to addressing the challenges that arise in the presence of decentralized, heterogeneous sources of knowledge graphs. The effectiveness of these approaches is empirically evaluated throughout using various real-world and synthetic large-scale knowledge graphs. First, a sample-based approach for generating fine-grained performance profiles is proposed, and it is demonstrated how the information from such profiles can be leveraged in cost model-based query planning. In addition, a sample-based data distribution profiling approach is advocated, which aims to estimate the statistical profile features of large knowledge graphs, and the applicability of these estimates in federated query processing is demonstrated.
The remainder of the book focuses on techniques to devise efficient query processing approaches when heterogeneous interfaces need to be queried but no fine-grained statistics are available. Robust techniques to support efficient query processing in these circumstances are investigated, and results are shared to demonstrate how these techniques can outperform state-of-the-art approaches. Finally, the author describes a framework for federated query processing over heterogeneous federations of Linked Data Fragments, which exploits the capabilities of different sources by defining interface-aware approaches.