TY - JOUR
T1 - Top-k overlapping densest subgraphs
AU - Galbrun, Esther
AU - Gionis, Aristides
AU - Tatti, Nikolaj
PY - 2016/9
Y1 - 2016/9
N2 - Finding dense subgraphs is an important problem in graph mining and has
many practical applications. At the same time, while large real-world
networks are known to have many communities that are not well-separated,
the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs.
One major challenge in addressing this question is how to handle
overlaps: eliminating overlaps completely is one option, but this may
lead to extracting subgraphs not as dense as it would be possible by
allowing a limited amount of overlap. Furthermore, overlaps are
desirable as in most real-world graphs there are vertices that belong to
more than one community, and thus, to more than one densest subgraph.
In this paper we study the problem of finding top-koverlapping densest subgraphs,
and we present a new approach that improves over the existing
techniques, both in theory and practice. First, we reformulate the
problem definition in a way that we are able to obtain an algorithm with
constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification
problem, which however, we need to extend in order to make them
applicable to our setting. Second, we evaluate our algorithm on a
collection of benchmark datasets and show that it convincingly
outperforms the previous methods, both in terms of quality and
efficiency.
AB - Finding dense subgraphs is an important problem in graph mining and has
many practical applications. At the same time, while large real-world
networks are known to have many communities that are not well-separated,
the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs.
One major challenge in addressing this question is how to handle
overlaps: eliminating overlaps completely is one option, but this may
lead to extracting subgraphs not as dense as it would be possible by
allowing a limited amount of overlap. Furthermore, overlaps are
desirable as in most real-world graphs there are vertices that belong to
more than one community, and thus, to more than one densest subgraph.
In this paper we study the problem of finding top-koverlapping densest subgraphs,
and we present a new approach that improves over the existing
techniques, both in theory and practice. First, we reformulate the
problem definition in a way that we are able to obtain an algorithm with
constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification
problem, which however, we need to extend in order to make them
applicable to our setting. Second, we evaluate our algorithm on a
collection of benchmark datasets and show that it convincingly
outperforms the previous methods, both in terms of quality and
efficiency.
KW - Approximation algorithm
KW - Community detection
KW - Dense subgraphs
KW - Diverse subgraphs
KW - Overlapping communities
KW - Social network analysis
UR - http://www.scopus.com/inward/record.url?scp=84969983810&partnerID=8YFLogxK
U2 - 10.1007/s10618-016-0464-z
DO - 10.1007/s10618-016-0464-z
M3 - Article
AN - SCOPUS:84969983810
SN - 1384-5810
VL - 30
SP - 1134
EP - 1165
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 5
ER -