PhishStorm - phishing / legitimate URL dataset



A dataset of URLs, together with the features built and used for evaluation in the paper "PhishStorm: Detecting Phishing with Streaming Analytics", published in IEEE TNSM.
The dataset contains 96,018 URLs: 48,009 legitimate URLs and 48,009 phishing URLs.

The dataset is a CSV file. The "domain" column provides a unique identifier for each entry (which is actually a URL). The "label" column gives the status of each entry: 0 for legitimate, 1 for phishing.
Other columns provide computed values for features introduced in [1].
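
As a quick sanity check of the layout described above, the following minimal sketch counts entries per class from the "label" column. It relies only on the two column names documented here ("domain" and "label"). The function name count_labels and the sample rows are made up for illustration and are not part of the dataset.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file):
    """Count rows per class using the "label" column (0 = legitimate, 1 = phishing)."""
    names = {"0": "legitimate", "1": "phishing"}
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Assumption: labels may be written as "0"/"1" or "0.0"/"1.0",
        # so normalise through float -> int before looking up the name.
        key = str(int(float(row["label"].strip())))
        counts[names.get(key, key)] += 1
    return counts


# Illustrative in-memory sample (hypothetical rows, not taken from the dataset).
sample = io.StringIO(
    "domain,label\n"
    "example.com/index.html,0\n"
    "login.example.test/verify/account,1\n"
    "example.org/about,0\n"
)
print(count_labels(sample))
```

To run it on the real file, pass an open file handle instead of the sample, for example open(path, newline="") where path is wherever the CSV was saved. On the full dataset the two counts should each come to 48,009 if the file matches the description above.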

Please refer to the following publication when citing this dataset:
[1] S. Marchal, J. Francois, R. State, and T. Engel. PhishStorm: Detecting Phishing with Streaming Analytics. IEEE Transactions on Network and Service Management (TNSM), 11(4):458-471, 2014.
Date made available: 2014
Publisher: Aalto University
Date of data production: 2014

Dataset Licences

  • Not specified
