TY - CONF
T1 - Analyzing and visualizing web server access log file
AU - Nguyen, Tri
AU - Diep, Thanh Dang
AU - Tran, Hoang Vinh
AU - Nakajima, Takuma
AU - Thoai, Nam
PY - 2018/11/28
Y1 - 2018/11/28
AB - Websites have multiplied over recent decades, and the number of visitors has kept pace, generating huge volumes of data. These data are believed to contain hidden knowledge worth considering in activities related to e-Business, e-CRM, e-Services, e-Newspapers, e-Government, Digital Libraries, and so on. To extract knowledge from web data efficiently, a process called web usage mining is applied. In this paper, we use that process to uncover interesting patterns in a web server access log file gathered from Ho Chi Minh City University of Technology (HCMUT) in Vietnam. Moreover, we propose a novel model that constructs and adds new attributes, namely country, province (or city), and Internet Service Provider (ISP), from the existing IP attribute. The model is a form of attribute construction (or feature construction), one of the strategies of data transformation, which is itself a data pre-processing technique. By applying the mining process, we obtain broad knowledge about user access patterns for every country, province, and ISP. Such knowledge can be leveraged to optimize system performance and enhance personalization. Furthermore, it can help in choosing reasonable caching policies for web proxies.
U2 - https://doi.org/10.1007/978-3-030-03192-3_27
DO - https://doi.org/10.1007/978-3-030-03192-3_27
M3 - Paper
SP - 349
EP - 367
ER -