Abstract
Finding dense subgraphs is an important problem in graph mining and has
many practical applications. At the same time, while large real-world
networks are known to have many communities that are not well-separated,
the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs.
One major challenge in addressing this question is how to handle
overlaps: eliminating overlaps completely is one option, but this may
lead to extracting subgraphs not as dense as it would be possible by
allowing a limited amount of overlap. Furthermore, overlaps are
desirable as in most real-world graphs there are vertices that belong to
more than one community, and thus, to more than one densest subgraph.
In this paper we study the problem of finding top-koverlapping densest subgraphs,
and we present a new approach that improves over the existing
techniques, both in theory and practice. First, we reformulate the
problem definition in a way that we are able to obtain an algorithm with
constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification
problem, which however, we need to extend in order to make them
applicable to our setting. Second, we evaluate our algorithm on a
collection of benchmark datasets and show that it convincingly
outperforms the previous methods, both in terms of quality and
efficiency.
Original language | English |
---|---|
Pages (from-to) | 1134–1165 |
Number of pages | 31 |
Journal | Data Mining and Knowledge Discovery |
Volume | 30 |
Issue number | 5 |
DOIs | |
Publication status | Published - Sept 2016 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Approximation algorithm
- Community detection
- Dense subgraphs
- Diverse subgraphs
- Overlapping communities
- Social network analysis