Attention aggregation siamese network with anchor-free scheme for UAV object tracking
Hai-jun WANG 1, Wen-lai MA 2, Sheng-yan ZHANG 1
1. Key Laboratory of Aviation Information and Control in University of Shandong, Binzhou University, Binzhou 256603, China
2. College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract: A real-time UAV object tracker based on a lightweight siamese network with attention aggregation and an anchor-free scheme was proposed, aiming at the problems of viewpoint change, deformation and interference from similar surrounding objects in UAV tracking tasks. Considering that the object occupies only a small number of pixels when viewed from a high-altitude UAV platform, an efficient channel attention module was introduced into the two template branches, so that semantic information and detail information can be effectively extracted. After fusing the responses of the two layers, a spatial attention module was constructed to effectively aggregate the attention features and enlarge the field of view of the model. An anchor-free mechanism was built to directly classify each pixel and regress the object box, which simplifies the model and reduces the computational cost. The proposed method was evaluated on three public UAV benchmarks, UAV123@10fps, UAV20L and DTB70, and compared with other state-of-the-art tracking algorithms. The experimental results show that the proposed method can track the target effectively in many challenging scenes at an average speed of 155.2 frames per second on the three UAV benchmarks.
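To make the described pipeline concrete, the following is a minimal PyTorch sketch of two of the components named in the abstract: an ECA-style channel attention block (corresponding to the efficient channel attention scheme applied to the template branches) and an anchor-free head that classifies and regresses a box at every pixel of the fused response. Module names, channel sizes and kernel sizes are illustrative assumptions for a sketch, not the authors' released implementation.

```python
# Minimal sketch: ECA-style channel attention + anchor-free per-pixel head.
# Channel counts, kernel sizes and module names are illustrative assumptions.
import torch
import torch.nn as nn


class ECAAttention(nn.Module):
    """Efficient channel attention (ECA-Net style): global average pooling,
    a 1-D convolution across channels, and a sigmoid gate."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> channel descriptor (B, C, 1, 1)
        y = self.avg_pool(x)
        # 1-D conv over the channel dimension models local cross-channel
        # interaction without dimensionality reduction.
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
        return x * y  # re-weight channels


class AnchorFreeHead(nn.Module):
    """Per-pixel classification and box regression on the fused response,
    in the spirit of the anchor-free scheme described above."""

    def __init__(self, in_channels: int = 256):
        super().__init__()
        self.cls_head = nn.Conv2d(in_channels, 2, kernel_size=3, padding=1)
        # Four offsets (left, top, right, bottom) predicted at each location.
        self.reg_head = nn.Conv2d(in_channels, 4, kernel_size=3, padding=1)

    def forward(self, response: torch.Tensor):
        cls_score = self.cls_head(response)          # (B, 2, H, W)
        box_offsets = self.reg_head(response).exp()  # (B, 4, H, W), positive
        return cls_score, box_offsets


if __name__ == "__main__":
    feat = torch.randn(1, 256, 25, 25)        # hypothetical fused response map
    feat = ECAAttention(kernel_size=3)(feat)  # channel re-weighting
    cls_score, box_offsets = AnchorFreeHead(256)(feat)
    print(cls_score.shape, box_offsets.shape)
```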
Received: 09 December 2022
Published: 18 October 2023
Fund: Natural Science Foundation of Shandong Province (ZR2020MF142); Doctoral Research Start-up Fund of Binzhou University (2021Y04); Major Scientific Research Fund of Binzhou University (2019ZD03)
Keywords: UAV, object tracking, anchor-free, siamese network, channel attention