Asynchronous Multi-Agent Reinforcement Learning for Scheduling in Subnetworks

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review


Abstract

We address radio resource scheduling in a network of multiple in-X subnetworks providing wireless Ultra-Reliable Low-Latency Communication (URLLC) service. Each subnetwork is controlled by an agent responsible for scheduling resources to its devices. Agents rely solely on interference measurements for information about other agents, with no explicit coordination. Subnetwork mobility and fast-fading effects create a non-stationary environment, adding to the complexity of the scheduling problem. This scenario is modeled as a multi-agent Markov Decision Process (MDP). To address the problem, we propose a Multi-Agent Deep Reinforcement Learning (MADRL) approach under URLLC constraints, which integrates Long Short-Term Memory (LSTM) with the Deep Deterministic Policy Gradient (DDPG) algorithm to manage non-stationarity and high-dimensional action spaces. We apply an asynchronous update strategy in which only one agent updates at a time. This reduces learning variability, resolves policy conflicts, and improves the interpretability of the MADRL approach. Simulation results demonstrate that the asynchronous update mechanism outperforms synchronous updates and baseline methods, achieving superior reliability, resource utilization, and explainability.
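The asynchronous update strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Agent` class, its `act`/`update` methods, and the round-robin update order are all illustrative assumptions standing in for the paper's LSTM-DDPG agents; the key point shown is that every agent acts each step, but only one agent refines its policy per step.

```python
import random

class Agent:
    """Toy stand-in for one subnetwork's scheduler (illustrative only;
    a DDPG+LSTM policy would replace the placeholder logic below)."""
    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.updates = 0  # count of policy-update steps this agent received

    def act(self, observation):
        # Placeholder policy: pick a resource based on the local observation
        # (in the paper, observations are interference measurements).
        return random.choice(observation)

    def update(self, experience):
        # Placeholder for one gradient step on this agent's experience.
        self.updates += 1

def run_asynchronous_training(agents, steps):
    """All agents act every step, but only one agent updates per step
    (round-robin), so the environment seen by the others changes
    one policy at a time rather than all at once."""
    for step in range(steps):
        observations = [[0, 1, 2] for _ in agents]  # dummy resource indices
        actions = [a.act(obs) for a, obs in zip(agents, observations)]
        # Asynchronous update: exactly one agent learns this step.
        agents[step % len(agents)].update(actions)
    return [a.updates for a in agents]
```

With three agents trained for six steps, each agent receives exactly two update steps, while a synchronous scheme would have updated all three agents at every step.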
Original language: English
Title of host publication: Proceedings of the IEEE 101st Vehicular Technology Conference
Publisher: IEEE
Number of pages: 6
Publication status: Accepted/In press - 2025
MoE publication type: A4 Conference publication
Event: IEEE Vehicular Technology Conference - Oslo, Norway
Duration: 17 Jun 2025 – 20 Jun 2025
Conference number: 101

Conference

Conference: IEEE Vehicular Technology Conference
Abbreviated title: VTC
Country/Territory: Norway
City: Oslo
Period: 17/06/2025 – 20/06/2025

