Top-k overlapping densest subgraphs

Esther Galbrun, Aristides Gionis, Nikolaj Tatti*

*Corresponding author for this work

Research output: Contribution to journalArticleScientificpeer-review

29 Citations (Scopus)

Abstract

Finding dense subgraphs is an important problem in graph mining and has many practical applications. At the same time, while large real-world networks are known to have many communities that are not well-separated, the majority of the existing work focuses on the problem of finding a single densest subgraph. Hence, it is natural to consider the question of finding the top-kdensest subgraphs. One major challenge in addressing this question is how to handle overlaps: eliminating overlaps completely is one option, but this may lead to extracting subgraphs not as dense as it would be possible by allowing a limited amount of overlap. Furthermore, overlaps are desirable as in most real-world graphs there are vertices that belong to more than one community, and thus, to more than one densest subgraph. In this paper we study the problem of finding top-koverlapping densest subgraphs, and we present a new approach that improves over the existing techniques, both in theory and practice. First, we reformulate the problem definition in a way that we are able to obtain an algorithm with constant-factor approximation guarantee. Our approach relies on using techniques for solving the max-sum diversification problem, which however, we need to extend in order to make them applicable to our setting. Second, we evaluate our algorithm on a collection of benchmark datasets and show that it convincingly outperforms the previous methods, both in terms of quality and efficiency.
Original languageEnglish
Pages (from-to)1134–1165
Number of pages31
JournalData Mining and Knowledge Discovery
Volume30
Issue number5
DOIs
Publication statusPublished - Sep 2016
MoE publication typeA1 Journal article-refereed

Keywords

  • Approximation algorithm
  • Community detection
  • Dense subgraphs
  • Diverse subgraphs
  • Overlapping communities
  • Social network analysis

Cite this