A partially observable multi-ship collision avoidance decision-making model based on deep reinforcement learning

Kangjie Zheng, Xinyu Zhang*, Chengbo Wang*, Mingyang Zhang, Hao Cui

*Corresponding author for this work

Research output: Journal article (Article, Scientific), peer-reviewed

31 Citations (Scopus)

Abstract

Unmanned ships have drawn widespread attention for their potential to enhance navigational safety, reduce human error, and improve shipping efficiency. Nevertheless, the complexity and uncertainty of mixed obstacle environments pose significant challenges to their development, particularly in collision avoidance decision-making. This paper formulates collision avoidance decision-making for autonomous ships in mixed obstacle environments as a Partially Observable Markov Decision Process (POMDP), which addresses the environment's complexity and uncertainty and improves decision accuracy. An image-state observation method is proposed, as images can provide more accurate, rich, and reliable information. A dense reward function is designed to address the issue of sparse rewards when training the algorithm. The Proximal Policy Optimization (PPO) algorithm is used for model training. Building on this, a route guidance method, PPO for POMDP with guidelines under dense reward (G-IPOMDP-PPO), is proposed to improve training efficiency. Simulations are conducted in various mixed obstacle environments, and the results are compared with conventional algorithms. The results show that the proposed model can make collision avoidance decisions safely and efficiently in complex and uncertain environments. This research provides a new solution and theoretical foundation for developing autonomous ships and can be extended to dynamic interactive collision avoidance in mixed obstacle environments.
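For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes during training. It is not the authors' implementation: the function name `ppo_clipped_objective`, the list-based inputs, and the default `clip_eps=0.2` are assumptions chosen for illustration, and the abstract does not describe the paper's network architecture, reward terms, or hyperparameters.

```python
import math


def ppo_clipped_objective(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    All names and the default clip_eps are illustrative assumptions,
    not values taken from the paper.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(log_probs_new, log_probs_old, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # One sample whose action became twice as likely, with positive advantage:
    # the clipped term limits the incentive to move further from the old policy.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to push the policy far from the one that collected the data when the advantage is positive, while the minimum keeps the full penalty when the advantage is negative.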

Original language: English
Article number: 106689
Number of pages: 13
Journal: Ocean and Coastal Management
Volume: 242
Status: Published - 1 Aug 2023
Ministry of Education and Culture (OKM) publication type: A1 Original article in a scientific journal
