Computer Technology

Safe hierarchical reinforcement learning framework for dynamic UAV navigation
Yiming SHANG, Changping DU*, Rui YANG, Tianrui FANG, Ze’an DU, Yao ZHENG
School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China
Cite this article:
Yiming SHANG, Changping DU, Rui YANG, Tianrui FANG, Ze’an DU, Yao ZHENG. Safe hierarchical reinforcement learning framework for dynamic UAV navigation[J]. Journal of Zhejiang University (Engineering Science), 2026, 60(6): 1240-1250.
Link to this article:
https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.06.011
or
https://www.zjujournals.com/eng/CN/Y2026/V60/I6/1240