TY - JOUR

T1 - Top-k overlapping densest subgraphs

AU - Galbrun, Esther

AU - Gionis, Aristides

AU - Tatti, Nikolaj

PY - 2016/9

Y1 - 2016/9

N2 - Finding dense subgraphs is an important problem in graph mining and has
many practical applications. At the same time, while large real-world
networks are known to have many communities that are not well-separated,
the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs.
One major challenge in addressing this question is how to handle
overlaps: eliminating overlaps completely is one option, but this may
lead to extracting subgraphs not as dense as it would be possible by
allowing a limited amount of overlap. Furthermore, overlaps are
desirable as in most real-world graphs there are vertices that belong to
more than one community, and thus, to more than one densest subgraph.
In this paper we study the problem of finding top-koverlapping densest subgraphs,
and we present a new approach that improves over the existing
techniques, both in theory and practice. First, we reformulate the
problem definition in a way that we are able to obtain an algorithm with
constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification
problem, which however, we need to extend in order to make them
applicable to our setting. Second, we evaluate our algorithm on a
collection of benchmark datasets and show that it convincingly
outperforms the previous methods, both in terms of quality and
efficiency.

AB - Finding dense subgraphs is an important problem in graph mining and has
many practical applications. At the same time, while large real-world
networks are known to have many communities that are not well-separated,
the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs.
One major challenge in addressing this question is how to handle
overlaps: eliminating overlaps completely is one option, but this may
lead to extracting subgraphs not as dense as it would be possible by
allowing a limited amount of overlap. Furthermore, overlaps are
desirable as in most real-world graphs there are vertices that belong to
more than one community, and thus, to more than one densest subgraph.
In this paper we study the problem of finding top-koverlapping densest subgraphs,
and we present a new approach that improves over the existing
techniques, both in theory and practice. First, we reformulate the
problem definition in a way that we are able to obtain an algorithm with
constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification
problem, which however, we need to extend in order to make them
applicable to our setting. Second, we evaluate our algorithm on a
collection of benchmark datasets and show that it convincingly
outperforms the previous methods, both in terms of quality and
efficiency.

KW - Approximation algorithm

KW - Community detection

KW - Dense subgraphs

KW - Diverse subgraphs

KW - Overlapping communities

KW - Social network analysis

UR - http://www.scopus.com/inward/record.url?scp=84969983810&partnerID=8YFLogxK

U2 - 10.1007/s10618-016-0464-z

DO - 10.1007/s10618-016-0464-z

M3 - Article

AN - SCOPUS:84969983810

SN - 1384-5810

VL - 30

SP - 1134

EP - 1165

JO - Data Mining and Knowledge Discovery

JF - Data Mining and Knowledge Discovery

IS - 5

ER -