Explaining mixture models through semantic pattern mining and banded matrix visualization

Prem Raj Adhikari*, Anže Vavpetič, Jan Kralj, Nada Lavrač, Jaakko Hollmén

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

8 Citations (Scopus)

Abstract

This paper presents an approach to semi-automated data analysis, supported by tools for pattern construction, exploration and explanation. The proposed three-part methodology for multiresolution 0–1 data analysis consists of data clustering with mixture models, extraction of rules from clusters, and data and rule visualization using banded matrices. The results of the three-part process (clusters, rules from clusters, and the banded structure of the data matrix) are finally merged in a unified visual banded matrix display. The incorporation of multiresolution data is enabled by the supporting ontology, which describes the relationships between the different resolutions and is used as background knowledge in the semantic pattern mining process of descriptive rule induction. The presented experimental use case highlights the usefulness of the proposed methodology for analyzing complex DNA copy number amplification data, studied in previous research, for which we provide new insights in terms of induced semantic patterns and cluster/pattern visualization. The methodology is successfully evaluated on four other publicly available data sets, which further demonstrates the utility of the proposed approach.
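As an illustration of the first step only (clustering 0–1 data with a mixture model), the following is a minimal sketch under stated assumptions, not the authors' implementation. It fits a mixture of multivariate Bernoulli distributions with EM on a toy matrix, then sorts the rows by cluster label as a crude stand-in for a cluster-ordered matrix display. The function name `fit_bernoulli_mixture`, its parameters, the smoothing constants, and the toy data are all hypothetical; rule extraction and the actual banded-matrix reordering are not shown.

```python
import numpy as np


def fit_bernoulli_mixture(X, n_components, n_iter=50, seed=0):
    """Fit a mixture of multivariate Bernoulli distributions to 0-1 data with EM.

    Hypothetical helper for illustration. Returns (weights, theta, resp), where
    resp holds the responsibilities computed in the final E-step.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    weights = np.full(n_components, 1.0 / n_components)
    # Random initial success probabilities, kept away from 0 and 1.
    theta = rng.uniform(0.25, 0.75, size=(n_components, d))
    eps = 1e-9  # guards the logarithms against log(0)

    for _ in range(n_iter):
        # E-step: unnormalised log posterior of each component for each row.
        log_p = (X @ np.log(theta + eps).T
                 + (1.0 - X) @ np.log(1.0 - theta + eps).T
                 + np.log(weights + eps))
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixing weights and per-component probabilities.
        nk = resp.sum(axis=0)
        weights = nk / n
        # Light smoothing (an assumption) keeps theta strictly inside (0, 1).
        theta = (resp.T @ X + 1e-3) / (nk[:, None] + 2e-3)

    return weights, theta, resp


# Toy 0-1 matrix (made up for this sketch): two groups of rows with
# complementary patterns of ones.
X = np.array([[1, 1, 1, 0, 0, 0]] * 10 + [[0, 0, 0, 1, 1, 1]] * 10)

weights, theta, resp = fit_bernoulli_mixture(X, n_components=2)
labels = resp.argmax(axis=1)  # hard cluster assignment per row

# Group rows of the same cluster together for display.
order = np.argsort(labels, kind="stable")
X_sorted = X[order]
print(X_sorted)
```

Sorting rows by cluster label only groups similar rows; it does not by itself produce a banded structure, which also involves permuting columns.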

Original language: English
Pages (from-to): 3-39
Number of pages: 36
Journal: Machine Learning
Volume: 105
Issue number: 1
Early online date: 10 Jun 2016
DOIs
Publication status: Published - Oct 2016
MoE publication type: A1 Journal article-refereed

Keywords

  • Banded matrix
  • Clustering
  • Mixture models
  • Pattern visualization
  • Semantic pattern mining
