COSI: Cloud Oriented Subgraph Identification in Massive Social Networks

In: Publication

1 Aug 2010

Conference Paper, Proceedings of the 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark
Authors: Matthias Bröcheler, Andrea Pugliese and V.S. Subrahmanian
Direct link to paper

3rd Best Paper Award

Subgraph matching is a key operation on graph data. Social network (SN) providers may want to find all subgraphs within their social network that “match” certain query graph patterns. Unfortunately, subgraph matching is NP-complete, making its application to massive SNs a major challenge. Past work has shown how to implement subgraph matching on a single processor when the graph has 10-25M edges. In this paper, we show how to use cloud computing in conjunction with such existing single processor methods to efficiently match complex subgraphs on graphs as large as 778M edges. A cloud consists of one “master” compute node and k “slave” compute nodes. We first develop a probabilistic method to estimate probabilities that a vertex will be retrieved by a random query and that a pair of vertices will be successively retrieved by a random query. We use these probability estimates to define edge weights in an SN and to compute minimal edge cuts to partition the graph amongst k slave nodes. We develop algorithms for both master and slave nodes that try to minimize communication overhead. The resulting COSI system can answer complex queries over real-world SN data containing over 778M edges very efficiently.

Presentation at ASONAM 2010

View more presentations from Matthias Broecheler.
Share and Enjoy:
  • Digg
  • Technorati
  • description
  • MisterWong
  • Netvouz
  • ThisNext
  • StumbleUpon

Comment Form

About this blog

Where is the knowledge we have lost in information?
- T.S. Elliot, The Rock

We are drowning in data - exabytes of it. My research explores technologies that can help us organize, structure, and efficiently search huge amounts of information as well as automatically deduce actionable pieces of knowledge from it. Learn more

I was a PhD student at the University of Maryland with research interests in databases, artificial intelligence, and machine learning. Learn more

My Homepage:
Also on Twitter: @MBroecheler