3D underwater AUV path planning method integrating adaptive potential field method and deep reinforcement learning

doi:10.3785/j.issn.1008-973X.2025.07.013

Journal of ZheJiang University (Engineering Science)

2025, Vol. 59, Issue (7): 1451-1461


Kun HAO, Xuan MENG, Xiaofang ZHAO*, Zhisheng LI

School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384


Abstract

To address the low path quality and poor dynamic obstacle avoidance of existing AUV path planning methods in complex marine environments, a new 3D underwater AUV path planning method (IADQN) was proposed. To compensate for the limited obstacle recognition and avoidance ability of AUVs in unknown underwater environments, an adaptive potential field method was proposed to improve the efficiency of AUV action selection. To address the low sample selection efficiency of the experience replay strategy in the traditional deep Q network (DQN), a prioritized experience replay strategy was adopted, which selects the samples that contribute most to training from the experience pool and thereby improves training efficiency. The AUV dynamically adjusts the reward function according to its current state to accelerate the convergence of IADQN during training. Simulation results show that, compared with the DQN scheme, IADQN efficiently plans a time-saving, collision-free path in a real ocean environment: the AUV running time is reduced by 6.41 s, and the maximum angle with the ocean current is reduced by 10.39°.

Key words: path planning, deep reinforcement learning, adaptive potential field method, autonomous underwater vehicle (AUV), dynamic reward function
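The prioritized experience replay idea named in the abstract can be illustrated with a minimal sketch. It assumes the common proportional variant, in which a transition is sampled with probability proportional to its absolute TD error raised to a power `alpha`, and importance weights controlled by `beta` correct the resulting bias. The abstract does not state which variant or which hyperparameters IADQN uses, so the class name `PrioritizedReplayBuffer`, the defaults `alpha=0.6` and `beta=0.4`, and the list-based storage are all hypothetical choices for illustration, not the authors' implementation.

```python
import random


class PrioritizedReplayBuffer:
    """Proportional prioritized replay (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha    # assumed value; 0 would give uniform sampling
        self.eps = eps        # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0          # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current maximum priority so that each one
        # has a good chance of being replayed before its TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        # Sampling probability P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance weights (N * P(i))^-beta, normalized by the batch maximum.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update(self, idxs, td_errors):
        # After a training step, priorities follow the new absolute TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DQN training loop, `sample` would supply the mini-batch and the per-sample loss weights, and `update` would be called with the TD errors computed for that batch. A production implementation would typically replace the lists with a sum tree so that sampling costs O(log N) rather than O(N).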

Received: 21 June 2024 Published: 25 July 2025

CLC: TP 18; TP 242

Fund: National Natural Science Foundation of China (61902273); Chunhui Program of the Ministry of Education of China (HZKY20220590).

Corresponding Authors: Xiaofang ZHAO E-mail: kunhao@tcu.edu.cn;xfzhao@tcu.edu.cn


Cite this article:

Kun HAO, Xuan MENG, Xiaofang ZHAO, Zhisheng LI. 3D underwater AUV path planning method integrating adaptive potential field method and deep reinforcement learning. Journal of ZheJiang University (Engineering Science), 2025, 59(7): 1451-1461.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.07.013 OR https://www.zjujournals.com/eng/Y2025/V59/I7/1451


Fig.1 3D underwater terrain and ocean current simulation

Fig.2 Six-degree-of-freedom model of AUV

Tab.1 Parameters of AUV six-degree-of-freedom model

Fig.3 AUV motion direction in 3D grid environment

Fig.4 Framework diagram of 3D underwater AUV path planning method integrating adaptive potential field method and deep reinforcement learning

Fig.5 Local minimum problem for obstacles

Fig.6 Schematic diagram of virtual target point selection

Fig.7 Schematic diagram of “no barriers to progress” rule

Fig.8 Resultant force of artificial potential field

Tab.2 Parameter value for performance comparison experiments of AUV path planning methods

Fig.9 3D underwater environment

Fig.10 Comparison of paths generated by different methods in 3D underwater environment

Tab.3 Comparison of performance indicators of different path planning methods in 3D underwater environment

Fig.11 Simulation diagram of local seabed environment

Fig.12 Comparison of paths generated by different methods in local seabed environment

Tab.4 Comparison of performance indicators of different path planning methods in local seabed environment

Fig.13 Dynamic obstacle avoidance of proposed path planning method

Fig.14 Comparison of paths generated by two methods in dynamic environments

Fig.15 Comparison of convergence speeds of two path planning methods in dynamic environments



