On the Impact of Data Distribution in Federated SPARQL Queries
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | With the growing number of publicly available SPARQL endpoints, federated queries are becoming increasingly attractive and feasible. Compared to queries against a single endpoint, queries that range over a number of endpoints pose new challenges, ranging from the type and number of datasets involved to the data distribution across the datasets. Existing research focuses on the data distribution in a central store and is mainly concerned with adopting well-known, traditional database techniques. In this work we investigate the impact of the data distribution in the context of federated SPARQL queries. We perform a number of experiments with four federation frameworks (Sesame Alibaba, Splendid, FedX, and Darq) against an RDF dataset, Dailymed, that we partition by graph and class. Our preliminary results confirm the intuition that the more datasets are involved in query processing, the worse the performance of federated queries becomes, and that the data distribution significantly influences performance. |
DOI: | 10.1109/ICSC.2012.72 |