Discovering bursts revisited: Guaranteed optimization of the model parameters

Nikolaj Tatti*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionScientificpeer-review

Abstract

One of the classic data mining tasks is to discover bursts, time intervals, where events occur at abnormally high rate. In this paper we revisit Kleinberg's seminal work, where bursts are discovered by using exponential distribution with a varying rate parameter: the regions where it is more advantageous to set the rate higher are deemed bursty. The model depends on two parameters, the initial rate and the change rate. The initial rate, that is, the rate that is used when there are no burstiness was set to the average rate over the whole sequence. The change rate is provided by the user. We argue that these choices are suboptimal: it leads to worse likelihood, and may lead to missing some existing bursts. We propose an alternative problem setting, where the model parameters are selected by optimizing the likelihood of the model. While this tweak is trivial from the problem definition point of view, this changes the optimization problem greatly. To solve the problem in practice, we propose efficient (1+ϵ) approximation schemes. Finally, we demonstrate empirically that with this setting we are able to discover bursts that would have otherwise be undetected.

Original languageEnglish
Title of host publicationProceedings of the 17th SIAM International Conference on Data Mining, SDM 2017
Pages831-839
Number of pages9
ISBN (Electronic)9781611974874
DOIs
Publication statusPublished - 2017
MoE publication typeA4 Article in a conference publication
EventSIAM International Conference on Data Mining - Houston, United States
Duration: 27 Apr 201729 Apr 2017
Conference number: 17

Conference

ConferenceSIAM International Conference on Data Mining
Abbreviated titleSDM
CountryUnited States
CityHouston
Period27/04/201729/04/2017

Fingerprint Dive into the research topics of 'Discovering bursts revisited: Guaranteed optimization of the model parameters'. Together they form a unique fingerprint.

Cite this