Evaluation of reinforcement learning and model predictive control for apartment heating with heat pump and water storage tank

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

Optimising apartment heating systems is becoming increasingly important due to the growing use of heat pumps and heat storage. This study compares reinforcement learning (RL) and model predictive control (MPC) for optimising and controlling a simulated apartment heating system. We explore various RL designs, including binary and continuous action spaces, different reward functions, and three storage sizes. We find that MPC outperforms RL when the optimisation period aligns with the actual problem: MPC achieves lower electricity costs than RL with the small and base storage sizes. However, the two methods yield similar electricity costs with the large storage size. We also find that the RL design significantly affects its performance and robustness. The RL model with binary actions and a reward function promoting active use of the storage and profit maximisation outperforms the other RL configurations. In contrast, a reward function representing the actual problem of cost minimisation was found to be ineffective for agent training. Future studies that compare the two methods for optimising heating systems with long-term storage could focus on extending the prediction horizon or enhancing the terminal cost term in MPC.
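
The role of the prediction horizon and the terminal term mentioned in the abstract can be illustrated with a toy receding-horizon controller. The sketch below is not the authors' model: the storage dynamics, the prices, the demand, and every name and parameter (`plan_cost`, `mpc_action`, `simulate`, `charge`, `terminal_value`) are assumptions made for illustration. It enumerates all binary on/off plans over a short horizon, applies the first action, and re-plans at each step.

```python
from itertools import product

def plan_cost(soc, actions, prices, demand, capacity, charge, terminal_value):
    """Cost of a binary on/off plan, or None if the storage runs empty."""
    cost = 0.0
    for on, price, load in zip(actions, prices, demand):
        soc = soc + charge * on - load   # heat pump adds heat, demand removes it
        if soc < 0:                      # demand cannot be met: plan is infeasible
            return None
        soc = min(soc, capacity)         # heat above capacity is wasted (assumed)
        cost += price * on               # one unit of electricity per on-step (assumed)
    # Terminal term: credit for heat left in storage at the end of the horizon.
    return cost - terminal_value * soc

def mpc_action(soc, prices, demand, horizon, capacity, charge, terminal_value):
    """Return the first action of the cheapest feasible plan over the horizon."""
    n = min(horizon, len(prices))
    best_cost, best_first = None, 1      # fall back to "on" if no plan is feasible
    for actions in product((0, 1), repeat=n):
        c = plan_cost(soc, actions, prices[:n], demand[:n],
                      capacity, charge, terminal_value)
        if c is not None and (best_cost is None or c < best_cost):
            best_cost, best_first = c, actions[0]
    return best_first

def simulate(prices, demand, horizon, soc=1.0, capacity=4.0, charge=2.0,
             terminal_value=0.0):
    """Run the controller in closed loop and return the total electricity cost."""
    total = 0.0
    for t in range(len(prices)):
        on = mpc_action(soc, prices[t:], demand[t:], horizon,
                        capacity, charge, terminal_value)
        soc = min(soc + charge * on - demand[t], capacity)
        total += prices[t] * on
    return total

# Hypothetical price and demand series, chosen only to make the effect visible.
prices = [1.0, 5.0, 5.0, 1.0, 5.0, 5.0]
demand = [1.0] * 6

print(simulate(prices, demand, horizon=1))                      # myopic controller
print(simulate(prices, demand, horizon=6))                      # full-horizon controller
print(simulate(prices, demand, horizon=1, terminal_value=2.0))  # myopic with terminal term
```

On this toy data the one-step controller without a terminal term pays 11.0, while both the full-horizon controller and the one-step controller with a terminal value of 2.0 pay 7.0. This mirrors, in a very simplified way, why extending the horizon or improving the terminal term matters when storage spans more than the planning window.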

Original language: English
Article number: 120509
Number of pages: 11
Journal: Journal of Energy Storage
Volume: 152
Publication status: Published - 30 Mar 2026
MoE publication type: A1 Journal article-refereed

Funding

This project has received funding from the European Union – NextGenerationEU instrument and is funded by the Research Council of Finland under grant number 353299.

Keywords

  • Apartment space heating
  • Heat pump
  • Heat storage
  • Model predictive control
  • Proximal policy optimisation
  • Reinforcement learning
