Workshop Paper, ICDE Workshop on Graph Data Management, 2011, Hannover, Germany
Authors: Matthias Broecheler, Andrea Pugliese, VS Subrahmanian
Direct link to paper
As social network and RDF data grow dramatically in size to billions of edges, the ability to scalably answer queries posed over graph datasets becomes increasingly important.
In this paper, we consider subgraph matching queries which are often posed to social networks and RDF databases - for such queries, we want to find all matching instances in a graph database. Past work on subgraph matching queries uses static cost models which can be very inaccurate due to long-tailed degree distributions commonly found in real world networks. We propose the BudgetMatch query answering algorithm. BudgetMatch costs and recosts query parts adaptively as it executes and learns more about the search space. We show that using this strategy, BudgetMatch can quickly answer complex subgraph queries on very large graph data. Specifically, on a real world social media data set consisting of 1.12 billion edges, we can answer complex subgraph queries in under one second and significantly outperform existing subgraph matching algorithms.
Where is the knowledge we have lost in information?
- T.S. Elliot, The Rock
We are drowning in data - exabytes of it. My research explores technologies that can help us organize, structure, and efficiently search huge amounts of information as well as automatically deduce actionable pieces of knowledge from it. Learn more
I was a PhD student at the University of Maryland with research interests in databases, artificial intelligence, and machine learning. Learn more