Abstract
Exploring high-dimensional real-valued data is a fundamental task in exploratory data analysis (EDA). Existing methods use predefined criteria to choose how the data are represented. There is a lack of methods that elicit what the user has learned from the data and show patterns the user does not yet know. We provide a theoretical model in which the user can input the patterns she has learned as knowledge. This background knowledge is used to find a maximum entropy (MaxEnt) distribution of the data, and the user is shown maximally informative projections, in which the MaxEnt distribution and the data differ the most. We provide an interactive open-source EDA system, study its performance, and present use cases on real data.
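The idea of comparing the data against a MaxEnt background distribution can be illustrated with a minimal sketch. This is not the system described in the publication: it assumes, purely for illustration, that the user's background knowledge consists only of a mean and a covariance, in which case the MaxEnt distribution is a Gaussian. The function name `informative_projection` and its parameters are hypothetical. The sketch whitens the data with respect to the background and scores each direction by how far the data's spread departs from the unit variance the background predicts.

```python
import numpy as np

def informative_projection(X, bg_mean, bg_cov, k=2):
    """Hypothetical helper: find k directions in which the data X deviate
    most from a Gaussian background N(bg_mean, bg_cov).

    Assumption: the background knowledge is only a mean and a covariance,
    so the MaxEnt distribution is Gaussian.
    """
    # Whiten with respect to the background: if the data followed the
    # background distribution, the whitened data would be roughly N(0, I).
    L = np.linalg.cholesky(bg_cov)
    Y = np.linalg.solve(L, (X - bg_mean).T).T

    # Second moment of the whitened data about the background mean.
    S = Y.T @ Y / len(Y)
    eigvals, eigvecs = np.linalg.eigh(S)
    eigvals = np.maximum(eigvals, 1e-12)  # guard against log(0)

    # Score each direction by the KL divergence between N(0, v) and N(0, 1);
    # it is zero when the data match the background (v == 1).
    scores = 0.5 * (eigvals - 1.0 - np.log(eigvals))
    order = np.argsort(scores)[::-1][:k]

    W = eigvecs[:, order]          # directions in whitened coordinates
    return W, scores[order], Y @ W  # directions, scores, projected data
```

Under these assumptions, updating `bg_mean` and `bg_cov` to reflect what the user has already seen drives the corresponding scores toward zero, so later projections highlight structure that is still unexplained.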
Original language | English |
---|---|
Title of host publication | Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE 2018) |
Publisher | IEEE |
Pages | 1212-1215 |
Number of pages | 4 |
ISBN (Electronic) | 9781538655207 |
Publication status | Published - 24 Oct 2018 |
MoE publication type | A4 Conference publication |
Event | International Conference on Data Engineering, Paris, France. Duration: 16 Apr 2018 → 19 Apr 2018. Conference number: 34 |
Conference
Conference | International Conference on Data Engineering |
---|---|
Abbreviated title | ICDE |
Country/Territory | France |
City | Paris |
Period | 16/04/2018 → 19/04/2018 |
Keywords
- Dimensionality reduction
- Exploratory data analysis
- Information theory
- Subjective interestingness