Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (7): 1464-1472    DOI: 10.3785/j.issn.1008-973X.2022.07.022
    
Vision-driven end-to-end maneuvering object tracking of UAV
Xia HUA, Xin-qing WANG*, Ting RUI, Fa-ming SHAO, Dong WANG
College of Field Engineering, Army Engineering University of PLA, Nanjing 210007, China

Abstract  

An end-to-end active object tracking control method with continuous action output was proposed to address the autonomous motion control of a UAV tracking a maneuvering object. An end-to-end decision-making control model was designed based on visual perception and a deep reinforcement learning strategy: the consecutive image frames observed by the UAV are taken as the input state, and continuous control quantities for the UAV flight actions are output. To improve the generalization ability of the control model, an efficient transfer learning strategy based on task decomposition and pre-training was proposed. Simulation results show that the method adaptively adjusts the UAV attitude across a variety of maneuvering object tracking tasks, so that the UAV stably tracks the moving object in the air. The generalization ability and training efficiency of the UAV tracking controller in unknown environments were significantly improved.
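The input/output contract described in the abstract (a stack of consecutive frames in, bounded continuous control quantities out) can be illustrated with a deliberately tiny sketch. This is not the authors' model: the paper's figures refer to a U-Net segmentation network and Actor/Critic networks, whereas the code below substitutes a toy linear policy. The frame size, the number of frames, the three action dimensions, and the names `make_policy` and `act` are all assumptions made for illustration.

```python
import math
import random


def make_policy(n_inputs, n_actions, seed=0):
    """Build a toy linear policy with small random weights (illustrative only)."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_inputs)]
               for _ in range(n_actions)]
    biases = [0.0] * n_actions
    return weights, biases


def act(policy, frames):
    """Map a stack of grayscale frames to bounded continuous commands."""
    weights, biases = policy
    # Flatten the stack of frames (a list of 2-D lists) into one state vector.
    state = [px for frame in frames for row in frame for px in row]
    # tanh squashes each output into (-1, 1), a common way to bound
    # continuous actions; the paper's actual output layer is not given here.
    return [math.tanh(sum(w * s for w, s in zip(w_row, state)) + b)
            for w_row, b in zip(weights, biases)]


# Hypothetical input: 4 consecutive 8x8 frames with pixel values in [0, 1].
frames = [[[0.5] * 8 for _ in range(8)] for _ in range(4)]
policy = make_policy(n_inputs=4 * 8 * 8, n_actions=3)
action = act(policy, frames)
print(action)
```

In a real controller the three outputs would be rescaled to physical command ranges, and the linear map would be replaced by a trained network.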



Key words: deep reinforcement learning, machine vision, autonomous UAV, transfer learning, object tracking
Received: 22 June 2021      Published: 26 July 2022
CLC:  TP 242  
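Fund:  National Key Research and Development Program of China (2016YFC0802904); National Natural Science Foundation of China (61671470); Natural Science Foundation of Jiangsu Province (BK20161470); China Postdoctoral Science Foundation, 62nd batch of general funding (2017M623423)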
Corresponding Authors: Xin-qing WANG     E-mail: huaxia120888@163.com;17626039818@163.com
Cite this article:

Xia HUA,Xin-qing WANG,Ting RUI,Fa-ming SHAO,Dong WANG. Vision-driven end-to-end maneuvering object tracking of UAV. Journal of ZheJiang University (Engineering Science), 2022, 56(7): 1464-1472.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.07.022     OR     https://www.zjujournals.com/eng/Y2022/V56/I7/1464


Chinese title: 视觉感知的无人机端到端目标跟踪控制技术

Fig.1 Decision model block diagram
Fig.2 UAV control block diagram
Fig.3 Structure of U-Net segmentation network
Fig.4 Structure of Actor network and Critic network
Fig.5 Two-dimensional local coordinate system for object tracking
Fig.6 Tracking task transfer learning based on task decomposition and dual training
Fig.7 AirSim UAV virtual test scene
Stage          $C_{\rm a}$   $N_{\rm b}$   $l_1$    $l_2$    $e_{\max}$   $s_{\rm i}$   $R_{\rm u}$   $T_{\rm b}$
Pre-training   10000         64            0.001    0.001    1000         500           0.03          20
Training       2000          32            0.0001   0.0002   1000         500           0.05          20
Tab.1 Experimental parameter settings of deep reinforcement learning experiment
Fig.8 Tracking effects of vehicles and personnel in virtual test scenarios
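Environment                                           $t_{\rm p}$/ms
Low-speed straight-line vehicle tracking              66.3
Low-speed straight-line personnel tracking            63.1
Complex-curve high-speed vehicle tracking             68.5
Complex-curve personnel tracking                      63.6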
Tab.2 Processing efficiency of single-frame image
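The per-frame processing times in Tab.2 can be converted into an approximate frame rate. The sketch below assumes frames are processed strictly one at a time with no pipelining, which the page does not state; the dictionary keys are English labels for the four test environments.

```python
# Per-frame processing times t_p in milliseconds, copied from Tab.2.
processing_ms = {
    "low-speed straight-line vehicle": 66.3,
    "low-speed straight-line personnel": 63.1,
    "complex-curve high-speed vehicle": 68.5,
    "complex-curve personnel": 63.6,
}


def frames_per_second(t_ms):
    """Frame rate implied by a per-frame time, assuming no pipelining."""
    return 1000.0 / t_ms


fps = {task: frames_per_second(t) for task, t in processing_ms.items()}
for task, rate in fps.items():
    print(f"{task}: {rate:.1f} frames/s")
```

Under that assumption the controller runs at roughly 14.6 to 15.8 frames per second across the four tasks.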
Fig.9 Ablation experiment of designed strategy
Fig.10 Comparison results of reward with other advanced models
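Environment                         AR
                                    MIL      Meanshift   KCF      TLD      Proposed method
Low-speed straight-line vehicle     -432.3   -455.6      -407.3   -489.2   458.2
Low-speed straight-line personnel   -358.2   -367.8      -330.2   -355.7   646.3
High-speed curved vehicle           -595.6   -653.6      -638.6   -599.7   285.9
High-speed curved personnel         -495.7   -503.7      -525.3   -498.7   373.4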
Tab.3 AR comparison results of different models in different tracking tasks and scenarios
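Environment                         EL
                                    MIL    Meanshift   KCF    TLD    Proposed method
Low-speed straight-line vehicle     64.7   55.8        40.7   69.2   168.3
Low-speed straight-line personnel   91.2   67.2        53.2   69.2   186.5
High-speed curved vehicle           33.6   31.2        38.6   69.2   165.9
High-speed curved personnel         35.7   33.8        37.3   69.2   172.4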
Tab.4 EL comparison results of different models in different tracking tasks and scenarios
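To put the Tab.4 figures in proportion, the EL of the proposed method can be divided by the best baseline EL in each environment. The sketch below copies the values from Tab.4 and assumes that a larger EL is better, which this page does not state explicitly; the helper name `gain_over_best_baseline` is illustrative.

```python
# EL values copied from Tab.4; "proposed" is the paper's method.
el = {
    "low-speed straight-line vehicle":
        {"MIL": 64.7, "Meanshift": 55.8, "KCF": 40.7, "TLD": 69.2, "proposed": 168.3},
    "low-speed straight-line personnel":
        {"MIL": 91.2, "Meanshift": 67.2, "KCF": 53.2, "TLD": 69.2, "proposed": 186.5},
    "high-speed curved vehicle":
        {"MIL": 33.6, "Meanshift": 31.2, "KCF": 38.6, "TLD": 69.2, "proposed": 165.9},
    "high-speed curved personnel":
        {"MIL": 35.7, "Meanshift": 33.8, "KCF": 37.3, "TLD": 69.2, "proposed": 172.4},
}


def gain_over_best_baseline(row):
    """Ratio of the proposed method's EL to the highest baseline EL."""
    best_baseline = max(v for name, v in row.items() if name != "proposed")
    return row["proposed"] / best_baseline


gains = {env: gain_over_best_baseline(row) for env, row in el.items()}
for env, g in gains.items():
    print(f"{env}: {g:.2f}x the best baseline")
```

With these values the proposed method reaches roughly 2.0 to 2.5 times the best baseline EL in every environment.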