Abstract
We consider radio resource scheduling in a network
of multiple non-coordinated in-X subnetworks which move with
respect to each other. Each subnetwork is controlled by an
independent agent, scheduling resources to devices within the
subnetwork. The only information about the decisions of other agents
comes from interference measurements, which are non-stationary
due to subnetwork mobility and fast fading effects. Each agent
aims to serve the devices in its subnetwork at a fixed data
rate with high reliability. The problem is cast as a multi-agent
non-stationary Markov Decision Process (MDP), with unknown
transition functions. We approach the problem via Multi-Agent
Deep Reinforcement Learning (DRL), leveraging Long Short-Term
Memory (LSTM) networks to handle the non-stationarity
and Deep Deterministic Policy Gradient (DDPG) to manage high-dimensional
continuous action spaces. Candidate actions given
by DRL are quantized to discrete actions by a novel binary tree
search method subject to reliability constraints. Simulation results
indicate that the proposed LSTM-based DRL scheduling strategy
outperforms strategies based on Feed Forward Neural Networks,
Centralized Training with Decentralized Execution approaches
found in the literature, and conventional heuristic approaches.
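
The abstract does not spell out how the binary tree search works, so the following is only an illustrative sketch of the general idea of quantizing a continuous candidate action into a discrete one under a constraint. It is not the authors' algorithm. It assumes (hypothetically) that a discrete action is a bit vector saying which resources are allocated, that the DRL actor outputs one value in [0, 1] per resource, and that feasibility is given by a caller-supplied predicate; the names `quantize_action` and `is_feasible` are invented for this example.

```python
from typing import Callable, List, Optional, Sequence


def quantize_action(
    candidate: Sequence[float],
    is_feasible: Callable[[List[int]], bool],
) -> Optional[List[int]]:
    """Depth-first search over a binary tree of allocation bits.

    At depth k the search fixes bit k, trying first the value obtained by
    rounding candidate[k] and then its complement. It returns the first
    leaf (complete bit vector) accepted by is_feasible, or None if no
    leaf is feasible. Worst-case cost is exponential in len(candidate).
    """
    n = len(candidate)

    def descend(prefix: List[int]) -> Optional[List[int]]:
        if len(prefix) == n:
            return prefix if is_feasible(prefix) else None
        preferred = 1 if candidate[len(prefix)] >= 0.5 else 0
        for bit in (preferred, 1 - preferred):
            leaf = descend(prefix + [bit])
            if leaf is not None:
                return leaf
        return None

    return descend([])


# Hypothetical stand-in for a reliability constraint:
# at least two resources must be allocated.
def at_least_two(bits: List[int]) -> bool:
    return sum(bits) >= 2


example = quantize_action([0.9, 0.2, 0.4], at_least_two)
print(example)  # [1, 0, 1]
```

Plain rounding of the candidate would give `[1, 0, 0]`, which violates the stand-in constraint; the search backtracks on the last bit and returns `[1, 0, 1]` instead. A practical method would need pruning or a bounded search depth, which this sketch omits.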
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE Vehicular Technology Conference |
Publisher | IEEE |
Number of pages | 7 |
Status | Accepted/In press - 2024 |
MoE publication type | A4 Article in conference proceedings |
Fingerprint
Dive into the research topics of 'Multi-Agent Reinforcement Learning Approach Scheduling for In-X Subnetworks'. Together they form a unique fingerprint.
Projects
- 1 Active
- RILREW: Reinforcement Learning for Real-time Wireless Scheduling and Edge Caching: Theory and Algorithm Design
  Tirkkonen, O. (Principal investigator), Amidzade, M. (Project member), Srinivasan, A. (Project member), Singh, U. (Project member), Shaikh, B. (Project member) & Al-Tous, H. (Project member)
  01/01/2022 → 31/12/2024
  Project: Academy of Finland: Other research funding