Cluster ensemble selection with constraints

Research output: Journal article

Researchers

Organisations

  • Xiamen University
  • Nanjing University of Posts and Telecommunications
  • Florida International University

Description

Clustering ensembles have emerged as an important tool for data analysis, producing a more robust and accurate consensus clustering. Empirical studies suggest that better ensembles are obtained by simultaneously considering the quality of the ensemble and the diversity among its members. However, little research effort has been devoted to incorporating prior background knowledge. In this paper, we first provide a theoretical analysis of the effect of the diversity and quality of the ensemble members. We then propose a unified framework for solving the constraint-based clustering ensemble selection problem, in which instance-level must-link and cannot-link constraints are given as prior knowledge or background information. We formalize this problem as a combinatorial optimization problem in terms of the consistency under the constraints, the diversity among ensemble members, and the overall quality of the ensemble. The proposed framework brings together two distinct yet interrelated themes from clustering: ensemble clustering and semi-supervised clustering. We study different techniques for searching for high-quality solutions. Experiments on benchmark datasets demonstrate the effectiveness of the framework.
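The selection problem described in the abstract can be illustrated with a small sketch. This is not the authors' formulation: the additive objective, the weights alpha, beta and gamma, the use of the Rand index as the agreement measure, the quality proxy (mean agreement with the whole library) and the greedy search are all assumptions chosen only to make the three criteria (consistency, quality, diversity) concrete.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of instance pairs that two labelings treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def consistency(labels, must_link, cannot_link):
    """Fraction of must-link / cannot-link constraints a labeling satisfies."""
    total = len(must_link) + len(cannot_link)
    if total == 0:
        return 1.0
    ok = sum(labels[i] == labels[j] for i, j in must_link)
    ok += sum(labels[i] != labels[j] for i, j in cannot_link)
    return ok / total


def score(subset, library, must_link, cannot_link,
          alpha=1.0, beta=1.0, gamma=1.0):
    """Hypothetical objective: weighted sum of the three criteria."""
    members = [library[i] for i in subset]
    cons = sum(consistency(m, must_link, cannot_link)
               for m in members) / len(members)
    # Quality proxy (assumption): mean agreement with every library member.
    qual = sum(rand_index(m, other) for m in members
               for other in library) / (len(members) * len(library))
    # Diversity: mean pairwise disagreement inside the selected subset.
    if len(members) < 2:
        div = 0.0
    else:
        pairs = list(combinations(members, 2))
        div = sum(1.0 - rand_index(a, b) for a, b in pairs) / len(pairs)
    return alpha * cons + beta * qual + gamma * div


def greedy_select(library, k, must_link, cannot_link, **weights):
    """Greedy forward selection of k base clusterings (one possible search)."""
    selected = []
    remaining = list(range(len(library)))
    while len(selected) < k and remaining:
        best = max(remaining,
                   key=lambda i: score(selected + [i], library,
                                       must_link, cannot_link, **weights))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Four toy base clusterings of four instances (made-up data).
    library = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1], [0, 0, 0, 1]]
    print(greedy_select(library, 2, must_link=[(0, 1)], cannot_link=[(0, 3)]))
```

Greedy selection is only one way to search the combinatorial space; any other search procedure could be plugged in against the same score function.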

Details

Original language: English
Pages: 59-70
Number of pages: 12
Journal: Neurocomputing
Volume: 235
Status: Published - 26 April 2017
OKM publication type: A1 Journal article, refereed

ID: 10957158