Vision-driven end-to-end maneuvering object tracking of UAV
Xia HUA, Xin-qing WANG*, Ting RUI, Fa-ming SHAO, Dong WANG
College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China
Abstract An end-to-end active object tracking control method with continuous action output was proposed to address autonomous motion control of a UAV tracking a maneuvering object. An end-to-end decision-making control model based on visual perception and deep reinforcement learning was designed. Consecutive frames of visual images observed by the UAV were taken as the input state, and continuous control quantities for the UAV flight actions were output. An efficient transfer learning strategy based on task decomposition and pre-training was proposed to improve the generalization ability of the control model. Simulation results show that the method adaptively adjusts the UAV attitude across a variety of maneuvering object tracking tasks and enables the UAV to track a moving object stably in the air. The generalization ability and training efficiency of the UAV tracking controller in unknown environments were significantly improved.
Received: 22 June 2021
Published: 26 July 2022
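Fund: National Key Research and Development Program of China (2016YFC0802904); National Natural Science Foundation of China (61671470); Natural Science Foundation of Jiangsu Province (BK20161470); China Postdoctoral Science Foundation, 62nd General Program (2017M623423)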
Corresponding Author: Xin-qing WANG
E-mail: huaxia120888@163.com; 17626039818@163.com
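The input/output structure described in the abstract (consecutive image frames in, continuous control quantities out) can be illustrated with a minimal sketch. This is not the authors' network: the frame-stack length, the pooling grid, the three-component action (forward speed, vertical speed, yaw rate), and every name below are assumptions made for illustration, and a single untrained linear layer stands in for the deep model.

```python
import math
import random
from collections import deque

FRAME_STACK = 4   # assumed number of consecutive frames forming one state
GRID = 4          # assumed pooling grid: GRID x GRID cells per frame
ACTION_DIM = 3    # assumed action: (forward speed, vertical speed, yaw rate)


def pool_frame(frame, grid=GRID):
    """Average-pool a 2D grayscale frame (list of rows) into grid*grid features."""
    h, w = len(frame), len(frame[0])
    ch, cw = h // grid, w // grid
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            total = 0.0
            for y in range(gy * ch, (gy + 1) * ch):
                for x in range(gx * cw, (gx + 1) * cw):
                    total += frame[y][x]
            feats.append(total / (ch * cw))
    return feats


class ContinuousPolicy:
    """Toy linear policy: stacked-frame features -> tanh-bounded actions."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        in_dim = FRAME_STACK * GRID * GRID
        # Random, untrained weights; a real controller would learn these.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(ACTION_DIM)]
        self.bias = [0.0] * ACTION_DIM
        self.frames = deque(maxlen=FRAME_STACK)

    def reset(self, first_frame):
        """Fill the frame buffer with copies of the first observation."""
        self.frames.clear()
        for _ in range(FRAME_STACK):
            self.frames.append(first_frame)

    def act(self, frame):
        """Push the newest frame and return one action in [-1, 1]^ACTION_DIM."""
        self.frames.append(frame)  # the oldest frame is dropped automatically
        state = [v for f in self.frames for v in pool_frame(f)]
        return [math.tanh(sum(w * s for w, s in zip(row, state)) + b)
                for row, b in zip(self.weights, self.bias)]


def make_frame(rng, size=16):
    """Generate a random size x size grayscale frame as a stand-in observation."""
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def demo():
    """Run one control step on synthetic frames and return the action."""
    rng = random.Random(1)
    policy = ContinuousPolicy()
    policy.reset(make_frame(rng))
    return policy.act(make_frame(rng))
```

Calling `demo()` returns a list of three values, each bounded to [-1, 1] by the `tanh` squashing; in a real system these normalized outputs would be scaled to the vehicle's actual speed and yaw-rate limits.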