Loading…
Top-k graph pattern matching over large graphs
There exist many graph-based applications including bioinformatics, social science, link analysis, citation analysis, and collaborative work. All need to deal with a large data graph. Given a large data graph, in this paper, we study finding top-k answers for a graph pattern query (kGPM), and in par...
Saved in:
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | There exist many graph-based applications including bioinformatics, social science, link analysis, citation analysis, and collaborative work. All need to deal with a large data graph. Given a large data graph, in this paper, we study finding top-k answers for a graph pattern query (kGPM), and in particular, we focus on top-k cyclic graph queries where a graph query is cyclic and can be complex. The capability of supporting kGPM provides much more flexibility for a user to search graphs. And the problem itself is challenging. In this paper, we propose a new framework of processing kGPM with on-the-fly ranked lists based on spanning trees of the cyclic graph query. We observe a multidimensional representation for using multiple ranked lists to answer a given kGPM query. Under this representation, we propose a cost model to estimate the least number of tree answers to be consumed in each ranked list for a given kGPM query. This leads to a query optimization approach for kGPM processing, and a top-k algorithm to process kGPM with the optimal query plan. We conducted extensive performance studies using a synthetic dataset and a real dataset, and we confirm the efficiency of our proposed approach. |
---|---|
ISSN: | 1063-6382 2375-026X |
DOI: | 10.1109/ICDE.2013.6544895 |