Attention aggregation siamese network with anchor-free scheme for UAV object tracking
Hai-jun WANG 1, Wen-lai MA 2, Sheng-yan ZHANG 1
1. Key Laboratory of Aviation Information and Control in University of Shandong, Binzhou University, Binzhou 256603, China
2. College of Civil Aviation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Abstract: A real-time UAV object tracker based on a lightweight siamese network with attention aggregation and an anchor-free scheme was proposed, aiming at the problems of viewpoint change, deformation and interference from similar surrounding objects in UAV tracking tasks. Considering that the object occupies only a small number of pixels when viewed from a high-altitude UAV platform, an efficient channel attention module was introduced into the two template branches, so that semantic information and detail information can be effectively extracted. After fusing the responses of the two layers, a spatial attention module was constructed to effectively aggregate the attention features and enlarge the field of view of the model. An anchor-free mechanism was built to directly classify each pixel and regress the object box, which simplifies the model and reduces the computational cost. The proposed method was evaluated on three public UAV benchmarks, UAV123@10fps, UAV20L and DTB70, and compared with other state-of-the-art tracking algorithms. The experimental results show that the proposed method can track the target effectively in many challenging scenes at an average speed of 155.2 frames per second on the three UAV benchmarks.
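To make the described pipeline concrete, the following is a minimal PyTorch sketch of two of the components named in the abstract: an ECA-style channel attention block (corresponding to the efficient channel attention scheme applied to the template branches) and an anchor-free head that classifies and regresses a box at every pixel of the fused response. Module names, channel sizes and kernel sizes are illustrative assumptions for a sketch, not the authors' released implementation.

```python
# Minimal sketch: ECA-style channel attention + anchor-free per-pixel head.
# Channel counts, kernel sizes and module names are illustrative assumptions.
import torch
import torch.nn as nn


class ECAAttention(nn.Module):
    """Efficient channel attention (ECA-Net style): global average pooling,
    a 1-D convolution across channels, and a sigmoid gate."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> channel descriptor (B, C, 1, 1)
        y = self.avg_pool(x)
        # 1-D conv over the channel dimension models local cross-channel
        # interaction without dimensionality reduction.
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
        return x * y  # re-weight channels


class AnchorFreeHead(nn.Module):
    """Per-pixel classification and box regression on the fused response,
    in the spirit of the anchor-free scheme described above."""

    def __init__(self, in_channels: int = 256):
        super().__init__()
        self.cls_head = nn.Conv2d(in_channels, 2, kernel_size=3, padding=1)
        # Four offsets (left, top, right, bottom) predicted at each location.
        self.reg_head = nn.Conv2d(in_channels, 4, kernel_size=3, padding=1)

    def forward(self, response: torch.Tensor):
        cls_score = self.cls_head(response)          # (B, 2, H, W)
        box_offsets = self.reg_head(response).exp()  # (B, 4, H, W), positive
        return cls_score, box_offsets


if __name__ == "__main__":
    feat = torch.randn(1, 256, 25, 25)        # hypothetical fused response map
    feat = ECAAttention(kernel_size=3)(feat)  # channel re-weighting
    cls_score, box_offsets = AnchorFreeHead(256)(feat)
    print(cls_score.shape, box_offsets.shape)
```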
Received: 09 December 2022
Published: 18 October 2023
Fund: Natural Science Foundation of Shandong Province (ZR2020MF142); Doctoral Research Start-up Fund of Binzhou University (2021Y04); Major Scientific Research Fund of Binzhou University (2019ZD03)
Keywords: UAV, object tracking, anchor-free, siamese network, channel attention