Abstract
We consider radio resource scheduling in a network
of multiple non-coordinated in-X subnetworks that move with
respect to each other. Each subnetwork is controlled by an
independent agent, which schedules resources to the devices within
the subnetwork. Information about the decisions of other agents
is available only through interference measurements, which are
non-stationary due to subnetwork mobility and fast fading effects.
The agents' aim is to serve the devices in their subnetwork with a
fixed data rate and high reliability. The problem is cast as a
multi-agent non-stationary Markov Decision Process (MDP) with unknown
transition functions. We approach the problem via Multi-Agent
Deep Reinforcement Learning (DRL), leveraging Long Short-Term
Memory (LSTM) networks to handle the non-stationarity
and Deep Deterministic Policy Gradient (DDPG) to manage
high-dimensional continuous action spaces. Candidate actions given
by DRL are quantized to discrete actions by a novel binary tree
search method subject to reliability constraints. Simulation results
indicate that the proposed LSTM-based DRL scheduling strategy
outperforms strategies based on Feed Forward Neural Networks,
Centralized Training with Decentralized Execution approaches
found in the literature, and conventional heuristic approaches.
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE Vehicular Technology Conference |
Publisher | IEEE |
Number of pages | 7 |
Publication status | Accepted/In press - 2024 |
MoE publication type | A4 Conference publication |
Fingerprint
Dive into the research topics of 'Multi-Agent Reinforcement Learning Approach Scheduling for In-X Subnetworks'. Together they form a unique fingerprint.
Projects
- 1 Active
- RILREW: Reinforcement Learning for Real-time Wireless Scheduling and Edge Caching: Theory and Algorithm Design
Tirkkonen, O. (Principal investigator), Amidzade, M. (Project Member), Srinivasan, A. (Project Member), Singh, U. (Project Member), Shaikh, B. (Project Member) & Al-Tous, H. (Project Member)
01/01/2022 → 31/12/2024
Project: Academy of Finland: Other research funding