Abstract
We address radio resource scheduling in a network of
multiple in-X subnetworks providing wireless Ultra-Reliable Low-
Latency Communication (URLLC) service. Each subnetwork is
controlled by an agent responsible for scheduling resources to its
devices. Agents rely solely on interference measurements for information
about other agents, with no explicit coordination. Subnetwork
mobility and fast-fading effects create a non-stationary
environment, adding to the complexity of the scheduling problem.
This scenario is modeled as a multi-agent Markov Decision
Process (MDP). To address the problem, we propose a Multi-
Agent Deep Reinforcement Learning (MADRL) approach under
URLLC constraints, which integrates Long Short-Term Memory
(LSTM) with the Deep Deterministic Policy Gradient (DDPG)
algorithm to manage non-stationarity and high-dimensional action
spaces. We apply an asynchronous update strategy in which
only one agent updates its policy at a time. This reduces learning variability,
resolves policy conflicts, and improves the interpretability of the
MADRL approach. Simulation results demonstrate that the asynchronous
update mechanism outperforms synchronous updates
and baseline methods, achieving superior reliability, resource
utilization, and explainability.
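The asynchronous update strategy described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `Agent` class, the round-robin learner selection, and the episode/step counts are all illustrative assumptions standing in for the actual LSTM-DDPG learners.

```python
import random

class Agent:
    """Toy stand-in for a per-subnetwork DDPG-style learner (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.updates = 0

    def act(self, observation):
        # Placeholder policy: a random resource-allocation score.
        return random.random()

    def update(self):
        # Placeholder for a gradient step on actor/critic networks.
        self.updates += 1

def train(agents, episodes):
    """All agents act at every step, but only one agent's networks are
    updated per episode (round-robin), mimicking the asynchronous scheme."""
    for episode in range(episodes):
        learner = agents[episode % len(agents)]  # one learner at a time
        for step in range(10):
            actions = [a.act(observation=step) for a in agents]
            learner.update()  # the other agents' policies stay fixed
    return agents

agents = train([Agent(f"subnet-{i}") for i in range(3)], episodes=6)
print([a.updates for a in agents])  # each agent learned in 2 of 6 episodes
```

Freezing all but one policy per round keeps the environment approximately stationary from the learner's perspective, which is the intuition behind the reduced learning variability claimed above.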
Original language | English |
---|---|
Title | Proceedings of the IEEE 101st Vehicular Technology Conference |
Publisher | IEEE |
Number of pages | 6 |
Status | Accepted/In press - 2025 |
OKM publication type | A4 Article in a conference publication |
Event | IEEE Vehicular Technology Conference - Oslo, Norway Duration: 17 Jun 2025 → 20 Jun 2025 Conference number: 101 |
Conference
Conference | IEEE Vehicular Technology Conference |
---|---|
Abbreviated title | VTC |
Country/Territory | Norway |
City | Oslo |
Period | 17/06/2025 → 20/06/2025 |
Fingerprint
Dive into the research topics of 'Asynchronous Multi-Agent Reinforcement Learning for Scheduling in Subnetworks'. Together they form a unique fingerprint.
Projects
- 2 Finished
-
6G-eMTC: Extreme Machine Type Communications for 6G
Jäntti, R. (Principal Investigator)
01/01/2023 → 31/12/2024
Project: BF Other
-
RILREW: Reinforcement Learning for Real-time Wireless Scheduling and Edge Caching: Theory and Algorithm Design
Tirkkonen, O. (Principal Investigator)
01/01/2022 → 31/12/2024
Project: RCF Other