Abstract
We address radio resource scheduling in a network of multiple in-X subnetworks providing wireless Ultra-Reliable Low-Latency Communication (URLLC) service. Each subnetwork is controlled by an agent responsible for scheduling resources to its devices. Agents rely solely on interference measurements for information about other agents, with no explicit coordination. Subnetwork mobility and fast-fading effects create a non-stationary environment, adding to the complexity of the scheduling problem. This scenario is modeled as a multi-agent Markov Decision Process (MDP). To address the problem, we propose a Multi-Agent Deep Reinforcement Learning (MADRL) approach under URLLC constraints, which integrates Long Short-Term Memory (LSTM) with the Deep Deterministic Policy Gradient (DDPG) algorithm to manage non-stationarity and high-dimensional action spaces. We apply an asynchronous update strategy in which only one agent updates its policy at a time. This reduces learning variability, resolves policy conflicts, and improves the interpretability of the MADRL approach. Simulation results demonstrate that the asynchronous update mechanism outperforms synchronous updates and baseline methods, achieving superior reliability, resource utilization, and explainability.
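The round-robin asynchronous update idea can be illustrated with a short sketch. The Python code below is a minimal illustration, not the authors' implementation: the `Agent` class, the `DummyEnv` collision environment, and the episode-level round-robin schedule are assumptions introduced here to show how freezing all but one agent's policy keeps the environment quasi-stationary for the agent that is learning.

```python
import random

class Agent:
    """Stand-in for an LSTM-DDPG agent; act() and update() are stubs."""
    def __init__(self, agent_id, n_channels=4):
        self.agent_id = agent_id
        self.n_channels = n_channels
        self.replay_buffer = []

    def act(self, observation):
        # Placeholder policy: pick a random resource (channel) index.
        return random.randrange(self.n_channels)

    def store(self, transition):
        self.replay_buffer.append(transition)

    def update(self):
        # In the full method this would be a DDPG actor/critic gradient
        # step on histories encoded by an LSTM.
        pass

class DummyEnv:
    """Toy stand-in for the subnetwork interference environment."""
    def __init__(self, n_agents, horizon=10):
        self.n, self.horizon = n_agents, horizon

    def reset(self):
        self.t = 0
        return [0.0] * self.n  # one scalar observation per agent

    def step(self, actions):
        self.t += 1
        # Agents choosing the same channel interfere: 0 if unique, negative otherwise.
        rewards = [1 - actions.count(a) for a in actions]
        obs = [float(r) for r in rewards]  # observe own interference outcome
        return obs, rewards, self.t >= self.horizon

def train(env, agents, episodes):
    for episode in range(episodes):
        # Asynchronous schedule: only one agent updates per episode,
        # chosen round-robin; the others keep their policies fixed.
        learner = agents[episode % len(agents)]
        obs = env.reset()
        done = False
        while not done:
            actions = [a.act(o) for a, o in zip(agents, obs)]
            next_obs, rewards, done = env.step(actions)
            for a, o, act_, r, no in zip(agents, obs, actions, rewards, next_obs):
                a.store((o, act_, r, no))
            learner.update()  # only the scheduled agent learns this episode
            obs = next_obs

agents = [Agent(i) for i in range(3)]
train(DummyEnv(3), agents, episodes=6)
```

A synchronous variant would call `update()` on every agent in the inner loop; serializing the updates as above is what the abstract credits with reduced learning variability and fewer policy conflicts.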
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE 101st Vehicular Technology Conference |
Publisher | IEEE |
Number of pages | 6 |
Publication status | Accepted/In press - 2025 |
MoE publication type | A4 Conference publication |
Event | IEEE Vehicular Technology Conference (number 101), Oslo, Norway, 17 Jun 2025 → 20 Jun 2025 |
Conference
Conference | IEEE Vehicular Technology Conference |
---|---|
Abbreviated title | VTC |
Country/Territory | Norway |
City | Oslo |
Period | 17/06/2025 → 20/06/2025 |
Projects
- 6G-eMTC: Extreme Machine Type Communications for 6G
  Jäntti, R. (Principal investigator)
  01/01/2023 → 31/12/2024
  Project: BF Other
- RILREW: Reinforcement Learning for Real-time Wireless Scheduling and Edge Caching: Theory and Algorithm Design
  Tirkkonen, O. (Principal investigator)
  01/01/2022 → 31/12/2024
  Project: RCF Other