Skopus: Mining top-k sequential patterns under leverage

François Petitjean*, Tao Li, Nikolaj Tatti, Geoffrey I. Webb

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

18 Citations (Scopus)

Abstract

This paper presents a framework for exact discovery of the top-k sequential patterns under leverage. It combines (1) a novel definition of the expected support for a sequential pattern, a concept on which most interestingness measures directly rely, with (2) Skopus: a new branch-and-bound algorithm for the exact discovery of top-k sequential patterns under a given measure of interest. Our interestingness measure employs the partition approach. A pattern is interesting to the extent that it is more frequent than can be explained by assuming independence between any of the pairs of patterns from which it can be composed. The larger the support compared to the expectation under independence, the more interesting the pattern. We build on these two elements to exactly extract the k sequential patterns with the highest leverage, consistent with our definition of expected support. We conduct experiments on both synthetic data with known patterns and real-world datasets; both experiments confirm the consistency and relevance of our approach with regard to the state of the art.
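
The idea of scoring a pattern by how far its support exceeds its expectation under independence can be illustrated with a small Python sketch. This is not the paper's method. The expected support used here (the product of the supports of a prefix and a suffix, maximised over all split points) is a deliberately simplified stand-in for the paper's definition, and the ranking is a brute-force scan over a given candidate list, not the Skopus branch-and-bound search. The function names `contains`, `support`, `leverage` and `top_k` are hypothetical and chosen for illustration only.

```python
import heapq


def contains(sequence, pattern):
    """Return True if `pattern` occurs in `sequence` as a subsequence
    (items in order, not necessarily contiguous)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(db, pattern):
    """Fraction of sequences in `db` that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in db) / len(db)


def leverage(db, pattern):
    """Support minus a simplified expected support.

    ASSUMPTION: the expectation is the largest product of the supports of
    a prefix and a suffix over all split points. The paper's definition of
    expected support is more involved; this only mirrors the partition idea.
    Requires a pattern of length >= 2.
    """
    expected = max(
        support(db, pattern[:i]) * support(db, pattern[i:])
        for i in range(1, len(pattern))
    )
    return support(db, pattern) - expected


def top_k(db, candidates, k):
    """Brute-force top-k by leverage over an explicit candidate list
    (no pruning, unlike a branch-and-bound search)."""
    return heapq.nlargest(k, candidates, key=lambda p: leverage(db, p))


# Toy database: 'a' is always followed by 'b', and 'c' by 'd'.
db = ["ab", "ab", "cd", "cd"]
print(leverage(db, "ab"))                      # observed 0.5 vs expected 0.25
print(top_k(db, ["ab", "ba", "cd", "ac"], 2))
```

In the toy database, "ab" occurs in half of the sequences, whereas independent occurrences of "a" and "b" would account for only a quarter, so the pattern scores a positive leverage under this simplified expectation.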

Original language: English
Pages (from-to): 1086–1111
Number of pages: 26
Journal: Data Mining and Knowledge Discovery
Volume: 30
Issue number: 5
Early online date: 14 Jun 2016
Publication status: Published - Sep 2016
MoE publication type: A1 Journal article-refereed

Keywords

  • Data mining
  • Exact discovery
  • Interestingness measures
  • Pattern mining
  • Sequential data
