Adaptive Cache Policy Optimization Through Deep Reinforcement Learning in Dynamic Cellular Networks

Ashvin Srinivasan*, Mohsen Amidzade, Junshan Zhang, Olav Tirkkonen

*Corresponding author of this work

Research output: Contribution to journal › Article › Scientific › peer-reviewed


Abstract

We explore the use of caching both at the network edge and within User Equipment (UE) to alleviate the traffic load of wireless networks. We develop a joint cache placement and delivery policy that maximizes Quality of Service (QoS) while simultaneously minimizing backhaul load and UE power consumption, in the presence of unknown, time-varying file popularity. Because file requests in a time slot are affected by download success in the previous slot, the caching system becomes a non-stationary Partially Observable Markov Decision Process (POMDP). We solve the problem in a deep reinforcement learning framework based on the Advantage Actor-Critic (A2C) algorithm, comparing Feedforward Neural Networks (FFNN) with a Long Short-Term Memory (LSTM) approach designed to exploit the correlation of the file popularity distribution across time slots. Simulation results show that LSTM-based A2C outperforms FFNN-based A2C in both sample efficiency and optimality, demonstrating superior performance on the non-stationary POMDP problem. For caching at the UEs, we provide a distributed algorithm that reaches the objectives dictated by the agent controlling the network with minimum energy consumption at the UEs and minimum communication overhead.
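To make the LSTM-based A2C idea concrete, the following is a minimal sketch in PyTorch. It is an illustration only, not the authors' implementation: the observation (a sequence of per-file request counts), the single-file cache action, the network sizes, and the placeholder returns are all assumptions; the paper's actual state, action, and reward design for joint placement and delivery is richer.

```python
# Minimal sketch: LSTM-based Advantage Actor-Critic (A2C) for a caching agent.
# All dimensions, observations, and rewards below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMActorCritic(nn.Module):
    def __init__(self, n_files: int, hidden: int = 64):
        super().__init__()
        # The LSTM summarizes the history of per-file request observations,
        # exploiting the temporal correlation of file popularity across slots.
        self.lstm = nn.LSTM(input_size=n_files, hidden_size=hidden, batch_first=True)
        self.policy_head = nn.Linear(hidden, n_files)  # logits: which file to cache
        self.value_head = nn.Linear(hidden, 1)         # state-value estimate

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, n_files) request-count observations
        out, state = self.lstm(obs_seq, state)
        h = out[:, -1]  # last hidden state summarizes the observed history
        return self.policy_head(h), self.value_head(h).squeeze(-1), state

def a2c_loss(logits, values, actions, returns, entropy_coef=0.01):
    # Standard A2C objective: policy gradient weighted by the advantage,
    # a value-regression term, and an entropy bonus for exploration.
    dist = torch.distributions.Categorical(logits=logits)
    advantage = (returns - values).detach()
    policy_loss = -(dist.log_prob(actions) * advantage).mean()
    value_loss = F.mse_loss(values, returns)
    return policy_loss + 0.5 * value_loss - entropy_coef * dist.entropy().mean()

if __name__ == "__main__":
    n_files, T = 20, 8
    net = LSTMActorCritic(n_files)
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    # One illustrative update on random data standing in for an environment rollout.
    obs = torch.rand(4, T, n_files)   # observed request histories (placeholder)
    logits, values, _ = net(obs)
    actions = torch.distributions.Categorical(logits=logits).sample()
    returns = torch.rand(4)           # placeholder discounted returns
    loss = a2c_loss(logits, values, actions, returns)
    opt.zero_grad(); loss.backward(); opt.step()
```

An FFNN-based variant would replace the LSTM with fully connected layers over a flattened window of observations, which is what makes the comparison in the abstract meaningful: the recurrent state lets the agent track the drifting popularity distribution in the non-stationary POMDP.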

Original language: English
Pages: 81-99
Number of pages: 19
Journal: Intelligent and Converged Networks
Volume: 5
Issue: 2
DOI
Status: Published - 2024
OKM publication type: A1 Original article in a scientific journal
