Path planning of agricultural robots based on improved deep reinforcement learning algorithm
Wei ZHAO 1,2; Wanzhi ZHANG 1,2,*; Jialin HOU 1,2; Rui HOU 3; Yuhua LI 1,4; Lejun ZHAO 1,4; Jin CHENG 1,2
1. College of Mechanical and Electronic Engineering, Shandong Agricultural University, Taian 271018, China
2. Shandong Engineering Research Center of Agricultural Equipment Intelligentization, Taian 271018, China
3. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
4. Shandong Key Laboratory of Intelligent Production Technology and Equipment for Facility Horticulture, Taian 271018, China
Abstract To address the difficulty of locating target points, sparse rewards, and slow convergence when deep reinforcement learning is used for path planning of agricultural robots, a path-planning method based on a multi-target point navigation integrated improved deep Q-network algorithm (MPN-DQN) was proposed. Laser simultaneous localization and mapping (SLAM) was used to scan the global environment and build a prior map, on which the walking-row and crop-row areas were delineated; the map boundaries were then expanded and fitted to form a forward bow-shaped operation corridor. Intermediate target points were used to segment the global environment, dividing the complex environment into multi-stage short-range navigation tasks and thereby simplifying the target point search. The deep Q-network algorithm was improved in three respects (action space, exploration strategy, and reward function) to alleviate reward sparsity, accelerate convergence, and raise the navigation success rate. Experimental results showed that agricultural robots equipped with the MPN-DQN algorithm had a total of 1 collision, an average navigation time of 104.27 s, an average navigation distance of 16.58 m, and an average navigation success rate of 95%.
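The staged navigation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the waypoint coordinates, reward constants, arrival radius, and epsilon schedule below are all assumptions chosen for illustration, and every function and class name is hypothetical. The sketch shows how intermediate target points can turn one long-range goal into a sequence of short-range goals, how a progress-based shaping term can make a sparse reward denser, and how an exploration rate can be decayed over episodes.

```python
import math
import random

# Hypothetical constants; the abstract does not specify any of these values.
ARRIVAL_RADIUS = 0.2       # m, distance at which a target counts as reached
REWARD_GOAL = 100.0        # bonus for reaching the current target
REWARD_COLLISION = -100.0  # penalty for a collision
SHAPING_GAIN = 10.0        # weight of the progress-based shaping term


def distance(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def shaped_reward(prev_pos, pos, target, collided):
    """Dense reward: progress toward the current target plus terminal terms."""
    if collided:
        return REWARD_COLLISION
    if distance(pos, target) <= ARRIVAL_RADIUS:
        return REWARD_GOAL
    # Positive when the robot moved closer to the target, negative otherwise.
    return SHAPING_GAIN * (distance(prev_pos, target) - distance(pos, target))


def epsilon(episode, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Exponentially decayed exploration rate, floored at eps_end."""
    return max(eps_end, eps_start * decay ** episode)


def select_action(q_values, eps, rng=random):
    """Epsilon-greedy choice over a discrete action set."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


class WaypointSequencer:
    """Serves intermediate targets one at a time until the last is reached."""

    def __init__(self, waypoints):
        self.waypoints = list(waypoints)
        self.index = 0

    @property
    def done(self):
        return self.index >= len(self.waypoints)

    def current(self):
        return self.waypoints[self.index]

    def update(self, pos):
        """Advance to the next target when the current one is reached."""
        if not self.done and distance(pos, self.current()) <= ARRIVAL_RADIUS:
            self.index += 1
        return self.done
```

In a training loop under these assumptions, the agent would query `WaypointSequencer.current()` for the active short-range goal, score each step with `shaped_reward`, and call `update` after every move, so that each stage poses a shorter search problem than the full route.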
Received: 04 September 2024
Published: 25 July 2025
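Fund: Key Research and Development Program of Shandong Province (Major Scientific and Technological Innovation Project) (2022CXGC020703); Agricultural Machinery Post Expert Project of the Shandong Province Tuber Crops Industry Technology System (SDAIT-16-10); Key Research and Development Program of Shandong Province (Rural Revitalization Science and Technology Innovation Boosting Action Plan) (2022TZXD006).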
Corresponding author: Wanzhi ZHANG
E-mail: zhao868250709@163.com; zhangwanzhi@163.com
Keywords: deep reinforcement learning, agricultural robot, intermediate target point, multi-target point navigation integrated improved deep Q-network algorithm (MPN-DQN), path planning