SuperSketch: A Multi-Dimensional Reversible Data Structure for Super Host Identification

Xuyang Jing, Hui Han, Zheng Yan, Witold Pedrycz

    Research output: Contribution to journal › Article › Scientific › peer-review

    23 Citations (Scopus)
    10 Downloads (Pure)

    Abstract

    Facing big network traffic data, effective data compression becomes crucially important and urgently needed for estimating host cardinalities and identifying super hosts. However, the current literature confronts several challenges: incapability of simultaneously measuring various types of host cardinalities and inability to efficiently reconstruct super host addresses. To address these challenges, in this paper, we propose a novel sketch data structure, named SuperSketch, to simultaneously measure multiple types of host cardinalities with the purpose of efficiently identifying super hosts. SuperSketch has two significant characteristics: multi-dimensionality and reversibility. The multi-dimensionality makes SuperSketch capable of simultaneously measuring Source Cardinality, Destination Cardinality and Destination Port Cardinality. The reversibility allows SuperSketch to accurately and quickly reconstruct the original addresses of super hosts once they are identified. We conduct both theoretical analysis and performance evaluation based on real-world network traffic. Experimental results show that SuperSketch achieves outstanding performance for multi-cardinality measurement, super host identification and host address reconstruction.
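
    As a rough illustration of the quantities named in the abstract, the following sketch computes exact host cardinalities with plain sets and flags super hosts by a threshold. It is not the SuperSketch algorithm, which compresses this state into a sketch. It only shows the ground truth that such a sketch would approximate. The function names, the packet tuple layout, the threshold, and the exact per-host definitions of the three cardinalities are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def exact_cardinalities(packets):
    """Exact baseline (hypothetical helper, not from the paper).

    packets: iterable of (src_ip, dst_ip, dst_port) tuples.
    Assumed definitions:
      - Source Cardinality of a destination: distinct sources contacting it.
      - Destination Cardinality of a source: distinct destinations it contacts.
      - Destination Port Cardinality of a source: distinct destination ports it contacts.
    """
    src_per_dst = defaultdict(set)
    dst_per_src = defaultdict(set)
    port_per_src = defaultdict(set)
    for src, dst, dport in packets:
        src_per_dst[dst].add(src)
        dst_per_src[src].add(dst)
        port_per_src[src].add(dport)
    source_card = {host: len(s) for host, s in src_per_dst.items()}
    dest_card = {host: len(s) for host, s in dst_per_src.items()}
    dest_port_card = {host: len(s) for host, s in port_per_src.items()}
    return source_card, dest_card, dest_port_card


def super_hosts(cardinality, threshold):
    """Hosts whose cardinality meets or exceeds an assumed threshold."""
    return {host for host, count in cardinality.items() if count >= threshold}


packets = [
    ("10.0.0.1", "10.0.0.2", 80),
    ("10.0.0.1", "10.0.0.3", 80),
    ("10.0.0.1", "10.0.0.3", 443),
    ("10.0.0.4", "10.0.0.2", 80),
]
sc, dc, dpc = exact_cardinalities(packets)
scanners = super_hosts(dc, threshold=2)
```

    The exact approach keeps one set per host, so its memory grows with the traffic. That cost is the motivation for a compressed structure, and because the sets here store host addresses directly, no address reconstruction step is needed, unlike in a sketch.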

    Original language: English
    Pages (from-to): 2741-2754
    Number of pages: 14
    Journal: IEEE Transactions on Dependable and Secure Computing
    Volume: 19
    Issue number: 4
    Early online date: 2021
    Publication status: Published - Jul 2022
    MoE publication type: A1 Journal article-refereed

    Keywords

    • Host Cardinality
    • Network Traffic Measurement
    • Reversible Sketch
    • Super Host Identification
