Multi-ship collision avoidance via route exchange mechanism: strategy learning and game-theoretic decision making

Journal of ZheJiang University (Engineering Science)

2026, Vol. 60, Issue (5): 964-976. DOI: 10.3785/j.issn.1008-973X.2026.05.006

Traffic Engineering

Yang WANG1,2,3, Hongchao LIU1,2,3, Chi TIAN4, Bing WU1,2,3, Di ZHANG1,2,3,*

1. State Key Laboratory of Maritime Technology and Safety, Wuhan University of Technology, Wuhan 430063, China
2. Intelligent Transportation Systems Research Center, Wuhan University of Technology, Wuhan 430063, China
3. School of Transportation and Logistics Engineering, Wuhan University of Technology, Wuhan 430063, China
4. CSSC Pride (Nanjing) Technology Group Co., Ltd, Nanjing 211106, China


Abstract:

To address multi-ship collision avoidance as onboard intelligence continues to improve, a cooperative collision avoidance game model based on multi-agent reinforcement learning was built on the route exchange mechanism, which enables ships to share and negotiate their intended routes in real time. Because each ship decides and acts independently and is driven jointly by rationality and economy, the multi-ship collision avoidance decision was formulated as a multi-agent cooperative game. Each ship aims to optimize route convenience, minimize collision risk, and comply with the give-way rules. The multi-agent deep deterministic policy gradient (MADDPG) algorithm was used within a centralized-training, decentralized-execution framework to optimize the collision avoidance strategies, progressively approaching the Pareto-optimal solution. Simulation results show that the optimized routes, obtained through reasonable heading adjustments, effectively avoid collision zones and improve navigational efficiency while maintaining safety and rule compliance. The model, which integrates multi-agent reinforcement learning with game theory, offers a feasible solution for collision avoidance decision-making by intelligent ships under E-navigation.

Key words: waterway transportation; multi-ship collision avoidance; route exchange; multi-agent reinforcement learning; game theory
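The three objectives named in the abstract (route convenience, collision risk, rule compliance) can be illustrated with a small per-ship reward sketch. This is not the paper's reward function, which is not given on this page. The function names `closest_approach` and `ship_reward`, the weights, and the one-nautical-mile `safe_distance` are all hypothetical, and the closest-approach calculation assumes constant velocities rather than the waypoint-based routes used in route exchange.

```python
import math


def closest_approach(p1, v1, p2, v2):
    """Return (t_cpa, d_cpa) for two ships moving at constant velocity.

    p1, p2: (x, y) positions in metres; v1, v2: (vx, vy) velocities in m/s.
    Constant velocity is a simplifying assumption of this sketch.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]  # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]  # relative velocity
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        # Identical velocities: the separation never changes.
        t_cpa = 0.0
    else:
        # Time that minimises the separation, clamped to the future.
        t_cpa = max(0.0, -(rx * vx + ry * vy) / speed_sq)
    d_cpa = math.hypot(rx + vx * t_cpa, ry + vy * t_cpa)
    return t_cpa, d_cpa


def ship_reward(route_deviation, d_cpa, rule_violation,
                safe_distance=1852.0, weights=(1.0, 5.0, 2.0)):
    """Negative weighted sum of three penalties (illustrative weights only)."""
    w_route, w_risk, w_rule = weights
    # Risk rises linearly from 0 at safe_distance to 1 at zero separation.
    risk = max(0.0, 1.0 - d_cpa / safe_distance)
    penalty = (w_route * route_deviation
               + w_risk * risk
               + w_rule * float(rule_violation))
    return -penalty


# Head-on encounter: two ships 10 km apart closing at 5 m/s each.
t_cpa, d_cpa = closest_approach((0.0, 0.0), (5.0, 0.0),
                                (10000.0, 0.0), (-5.0, 0.0))
reward = ship_reward(route_deviation=0.0, d_cpa=d_cpa, rule_violation=False)
```

In a multi-agent setting each ship would evaluate such a reward against every other ship's shared route, so that the weights trade off detour length against risk and rule compliance.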

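Received: 2025-05-29 Published: 2026-05-06

CLC: U 675.96

Funding: National Natural Science Foundation of China (52425210, 52372320); National Key Research and Development Program of China (2023YFB4301800, 2023YFC3010803).

Corresponding author: Di ZHANG E-mail: wangyang.itsc@whut.edu.cn; zhangdi@whut.edu.cn

About the author: Yang WANG (born 1976), male, research fellow, works on waterway traffic safety. orcid.org/0000-0003-1997-3956. E-mail: wangyang.itsc@whut.edu.cn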


Cite this article:

Yang WANG, Hongchao LIU, Chi TIAN, Bing WU, Di ZHANG. Multi-ship collision avoidance via route exchange mechanism: strategy learning and game-theoretic decision making[J]. Journal of ZheJiang University (Engineering Science), 2026, 60(5): 964-976.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.05.006 or https://www.zjujournals.com/eng/CN/Y2026/V60/I5/964

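Fig. 1 Principle of route exchange

Fig. 2 Closest spatio-temporal distance between ships

Fig. 3 Avoidance actions of ships in typical encounter scenarios

Fig. 4 Interpretation of stand-on/give-way relationships in ship encounters

Fig. 5 Waypoint adjustment range for route modification

Fig. 6 Effect of waypoint information loss on the closest spatio-temporal distance under different route curvatures

Fig. 7 Effect of wind, wave, and current conditions on ship trajectories

Fig. 8 Framework of the multi-agent deep deterministic policy gradient algorithm

Fig. 9 Effect of reward weight combinations on key parameters of collision avoidance decisions

Tab. 1 Parameters of the multi-agent deep deterministic policy gradient algorithm

Fig. 10 Network design of the multi-agent deep deterministic policy gradient algorithm

Tab. 2 Waypoint information of the four ships

Fig. 11 Route planning schemes at different time steps under the multi-agent deep deterministic policy gradient algorithm

Fig. 12 Pairwise distances between ships over time

Fig. 13 Total reward and average route deviation versus training episode

Fig. 14 Initial scenario of the multi-ship encounter

Fig. 15 Pareto-optimal routes obtained by the multi-agent deep deterministic policy gradient algorithm

Fig. 16 Performance comparison of different collision avoidance algorithms

Fig. 17 Schematic of the velocity obstacle algorithm

Fig. 18 Comparison of routes generated by different collision avoidance algorithms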

