TY - JOUR
T1 - Category tree distance: a taxonomy-based transaction distance for web user analysis
AU - Zhang, Yinjia
AU - Zhao, Qinpei
AU - Shi, Yang
AU - Li, Jiangfeng
AU - Rao, Weixiong
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.
PY - 2023/1
Y1 - 2023/1
AB - With the emergence of webpage services, huge amounts of customer transaction data are flooding cyberspace and becoming increasingly useful for profiling users and making recommendations. Because web user transaction data are usually multi-modal, heterogeneous, and large-scale, traditional data analysis methods face new challenges. One of these challenges is defining a distance between two transactions or two web users. The distance definition plays an important role in subsequent analysis, such as cluster analysis or k-nearest neighbor queries. In this paper, we introduce a category tree distance, which uses product taxonomy information to convert user transaction data into vectors. The similarity between web users can then be evaluated using the vectors derived from their transaction data. Properties of the distance, such as its upper and lower bounds, and a complexity analysis are also given. To investigate the performance of the proposal, we conduct experiments on real web user transaction data. The results show that the proposed distance outperforms the other distances in user transaction analysis.
KW - Cluster analysis
KW - Distance metric
KW - k-nearest neighbor query
KW - Taxonomy
KW - Transaction data
KW - Tree structure
UR - http://www.scopus.com/inward/record.url?scp=85139880070&partnerID=8YFLogxK
U2 - 10.1007/s10618-022-00874-9
DO - 10.1007/s10618-022-00874-9
M3 - Article
AN - SCOPUS:85139880070
SN - 1384-5810
VL - 37
SP - 39
EP - 66
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 1
ER -