Journal of ZheJiang University (Engineering Science)  2021, Vol. 55 Issue (5): 966-975    DOI: 10.3785/j.issn.1008-973X.2021.05.017
    
An adaptive siamese network tracking algorithm based on global feature channel recognition
Peng SONG, De-dong YANG*, Chang LI, Chang GUO
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300130, China

Abstract  

The siamese network target tracking algorithm relies solely on a feature extraction network, which leads to tracking failures under occlusion, rotation, illumination change and scale change. An adaptive siamese network tracking algorithm with global feature channel recognition was proposed. An efficient channel attention module was introduced into the ResNet22 siamese network to improve the discriminative power of the features. A global feature recognition function was used to compute global information and extract richer semantic information, improving the accuracy of the tracking algorithm. Meanwhile, an adaptive template update mechanism was introduced to solve the template degradation caused by occlusion and long-term tracking. To verify the effectiveness of the proposed method, it was tested on the public datasets OTB2015, VOT2016 and VOT2018, and compared with other tracking algorithms. Results show that the proposed algorithm performs well in accuracy and success rate, and remains stable under background clutter, rotation, illumination change and scale change.
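As an illustrative sketch (not the authors' implementation) of the efficient channel attention idea mentioned above: features are globally average-pooled per channel, a lightweight 1-D convolution is run across channels with an adaptively chosen kernel size, and a sigmoid gate reweights the channels. The uniform averaging weights below are placeholders for the weights a tracker would learn.

```python
import numpy as np

def eca_attention(x, gamma=2, b=1):
    """ECA-style channel attention on a single feature map.

    x: array of shape (C, H, W).
    Returns the channel-reweighted feature map of the same shape.
    """
    C = x.shape[0]
    # Adaptive 1-D kernel size from the channel count (nearest odd value).
    t = int(abs((np.log2(C) + b) / gamma))
    k = t if t % 2 else t + 1
    # Global average pooling over spatial dims -> per-channel descriptor.
    y = x.mean(axis=(1, 2))
    # 1-D convolution across channels with "same"-style edge padding.
    w = np.ones(k) / k  # placeholder weights; learned in practice
    y = np.convolve(np.pad(y, k // 2, mode='edge'), w, mode='valid')
    # Sigmoid gate in (0, 1), broadcast back over H and W.
    a = 1.0 / (1.0 + np.exp(-y))
    return x * a[:, None, None]
```

Because the gate lies strictly in (0, 1), every channel is attenuated rather than amplified; the cross-channel convolution is what lets strongly activated neighbor channels keep their gates close to 1.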



Key words: visual tracking; siamese network; global feature recognition; channel attention; template update
Received: 10 September 2020      Published: 10 June 2021
CLC:  TP 391.4  
Fund: Natural Science Foundation of Hebei Province (F2017202009); Hebei Province Innovation Capability Improvement Program (18961604H)
Corresponding Authors: De-dong YANG     E-mail: spgoup@foxmail.com;ydd12677@163.com
Cite this article:

Peng SONG,De-dong YANG,Chang LI,Chang GUO. An adaptive siamese network tracking algorithm based on global feature channel recognition. Journal of ZheJiang University (Engineering Science), 2021, 55(5): 966-975.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2021.05.017     OR     http://www.zjujournals.com/eng/Y2021/V55/I5/966


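The adaptive template update mechanism described in the abstract can be sketched generically as a confidence-gated linear blend: when the tracker's response is weak (likely occlusion), the template is frozen to avoid drift; otherwise the current frame's feature is mixed in slowly. The threshold `tau` and learning rate `eta` below are illustrative values, not taken from the paper.

```python
import numpy as np

def update_template(template, new_feat, score, tau=0.6, eta=0.05):
    """Confidence-gated linear template update (generic sketch).

    template: current template feature, new_feat: feature from the
    current frame, score: tracker response confidence in [0, 1].
    """
    if score < tau:  # low confidence: keep the template unchanged
        return template
    # High confidence: exponential moving average toward the new feature.
    return (1.0 - eta) * template + eta * new_feat
```

Freezing under low confidence is what addresses the occlusion case, while the small `eta` bounds how fast long-term appearance drift can corrupt the template.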
Fig.1 Framework of siamese network
Fig.2 Schematic diagram of adaptive siamese network tracking algorithm based on global feature channel recognition
Layer | Kernel size | Template size | Search size
Input | − | $127 \times 127$ | $255 \times 255$
Conv1 | $7 \times 7$, 64 | $64 \times 64$ | $128 \times 128$
Crop | − | $60 \times 60$ | $124 \times 124$
Maxpool | $2 \times 2$ | $30 \times 30$ | $62 \times 62$
Conv2_x+Crop×3 | $\left[ {\begin{array}{*{20}{c}} {1 \times 1,64} \\ {3 \times 3,64} \\ {1 \times 1,256} \end{array}} \right]$ | $24 \times 24$ | $56 \times 56$
Conv3_1+Crop | $\left[ {\begin{array}{*{20}{c}} {1 \times 1,128} \\ {3 \times 3,128} \\ {1 \times 1,512} \end{array}} \right]$ | $22 \times 22$ | $54 \times 54$
Maxpool | $2 \times 2$ | $11 \times 11$ | $27 \times 27$
Conv3_x+Crop×3 | $\left[ {\begin{array}{*{20}{c}} {1 \times 1,128} \\ {3 \times 3,128} \\ {1 \times 1,512} \end{array}} \right]$ | $5 \times 5$ | $21 \times 21$
Tab.1 Network parameters of ResNet22
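The spatial sizes in Tab.1 can be sanity-checked with the standard convolution size formula. The sketch below assumes Conv1 uses stride 2 with padding 3, the stem crop removes a 2-pixel border, max pooling is 2×2 with stride 2, and each residual block's unpadded 3×3 convolution plus crop shrinks the map by 2 pixels; these assumptions are inferred from the table, not stated explicitly in it.

```python
def conv(s, k, stride=1, pad=0):
    # Standard convolution/pooling output size.
    return (s + 2 * pad - k) // stride + 1

def trace_sizes(s):
    """Trace spatial sizes through the ResNet22 stem and stages of Tab.1."""
    sizes = [s]
    s = conv(s, 7, stride=2, pad=3)  # Conv1, 7x7 stride 2, pad 3 (assumed)
    sizes.append(s)
    s -= 4                           # Crop: 2 pixels per side (assumed)
    sizes.append(s)
    s = conv(s, 2, stride=2)         # Maxpool 2x2, stride 2
    sizes.append(s)
    s -= 2 * 3                       # Conv2_x + Crop x3: -2 per block
    sizes.append(s)
    s -= 2                           # Conv3_1 + Crop
    sizes.append(s)
    s = conv(s, 2, stride=2)         # Maxpool 2x2, stride 2
    sizes.append(s)
    s -= 2 * 3                       # Conv3_x + Crop x3
    sizes.append(s)
    return sizes
```

Under these assumptions, the template branch traces 127 → 64 → 60 → 30 → 24 → 22 → 11 → 5 and the search branch 255 → 128 → 124 → 62 → 56 → 54 → 27 → 21, matching the table column by column.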
Fig.3 Mechanism diagram of efficient channel attention
Fig.4 Schematic diagram of global feature recognition network
Fig.5 Results of ten tracking algorithms on OTB2015 dataset
Algorithm | IV | IPR | LR | OCC | OPR | OV | SV | FM | BC | MB | DEF
CFNet | 0.706 | 0.768 | 0.760 | 0.703 | 0.741 | 0.536 | 0.727 | 0.716 | 0.734 | 0.633 | 0.696
SiamFC | 0.741 | 0.742 | 0.847 | 0.726 | 0.756 | 0.669 | 0.738 | 0.743 | 0.690 | 0.705 | 0.693
SiamTri | 0.752 | 0.774 | 0.897 | 0.730 | 0.763 | 0.723 | 0.752 | 0.763 | 0.715 | 0.727 | 0.683
LMCF | 0.795 | 0.755 | 0.679 | 0.736 | 0.760 | 0.693 | 0.723 | 0.730 | 0.822 | 0.730 | 0.729
DSiamM | 0.805 | 0.807 | 0.857 | 0.794 | 0.829 | 0.684 | 0.778 | 0.759 | 0.792 | 0.721 | 0.761
Staple | 0.787 | 0.770 | 0.631 | 0.721 | 0.730 | 0.661 | 0.715 | 0.697 | 0.766 | 0.707 | 0.743
ECO-HC | 0.792 | 0.783 | 0.798 | 0.806 | 0.811 | 0.737 | 0.805 | 0.792 | 0.824 | 0.780 | 0.818
DeepSRDCF | 0.786 | 0.818 | 0.708 | 0.822 | 0.835 | 0.781 | 0.817 | 0.814 | 0.841 | 0.823 | 0.779
SiamDW | 0.854 | 0.841 | 0.882 | 0.786 | 0.842 | 0.782 | 0.842 | 0.808 | 0.800 | 0.842 | 0.831
Ours | 0.910 | 0.898 | 0.913 | 0.846 | 0.915 | 0.792 | 0.888 | 0.866 | 0.898 | 0.875 | 0.883
Tab.2 Accuracy of ten tracking algorithms on eleven attributes of OTB (IV: illumination variation; IPR: in-plane rotation; LR: low resolution; OCC: occlusion; OPR: out-of-plane rotation; OV: out of view; SV: scale variation; FM: fast motion; BC: background clutter; MB: motion blur; DEF: deformation)
Algorithm | IV | IPR | LR | OCC | OPR | OV | SV | FM | BC | MB | DEF
CFNet | 0.551 | 0.572 | 0.576 | 0.542 | 0.547 | 0.423 | 0.552 | 0.558 | 0.565 | 0.514 | 0.510
SiamFC | 0.574 | 0.557 | 0.592 | 0.547 | 0.558 | 0.506 | 0.556 | 0.568 | 0.523 | 0.550 | 0.510
SiamTri | 0.585 | 0.580 | 0.634 | 0.554 | 0.563 | 0.543 | 0.567 | 0.585 | 0.542 | 0.567 | 0.504
LMCF | 0.601 | 0.543 | 0.450 | 0.554 | 0.553 | 0.539 | 0.519 | 0.551 | 0.606 | 0.561 | 0.525
DSiamM | 0.608 | 0.599 | 0.606 | 0.583 | 0.599 | 0.509 | 0.576 | 0.579 | 0.589 | 0.562 | 0.544
Staple | 0.596 | 0.552 | 0.418 | 0.545 | 0.531 | 0.481 | 0.518 | 0.537 | 0.574 | 0.546 | 0.552
ECO-HC | 0.615 | 0.567 | 0.562 | 0.605 | 0.594 | 0.549 | 0.599 | 0.614 | 0.618 | 0.616 | 0.601
DeepSRDCF | 0.624 | 0.589 | 0.475 | 0.603 | 0.607 | 0.553 | 0.607 | 0.628 | 0.627 | 0.642 | 0.567
SiamDW | 0.656 | 0.611 | 0.607 | 0.598 | 0.615 | 0.588 | 0.625 | 0.627 | 0.596 | 0.659 | 0.608
Ours | 0.666 | 0.633 | 0.585 | 0.618 | 0.642 | 0.582 | 0.636 | 0.648 | 0.636 | 0.669 | 0.638
Tab.3 Success rate of ten tracking algorithms on eleven attributes of OTB (IV: illumination variation; IPR: in-plane rotation; LR: low resolution; OCC: occlusion; OPR: out-of-plane rotation; OV: out of view; SV: scale variation; FM: fast motion; BC: background clutter; MB: motion blur; DEF: deformation)
Fig.6 Ten algorithms' tracking results on four video sequences
Fig.7 EAO score results of seven tracking algorithms on VOT2016 dataset
Tracker | EAO | Tracker | EAO
Ours | 0.3482 | DeepSRDCF | 0.2763
Staple | 0.2952 | MDNET_N | 0.2572
SiamDW | 0.2885 | SiamAN | 0.2352
SiamRN | 0.2766 | − | −
Tab.4 Performance evaluation (expected average overlap, EAO) of seven tracking algorithms on VOT2016 dataset
Fig.8 EAO score results of seven tracking algorithms on VOT2018 dataset
Tracker | EAO | Tracker | EAO
Ours | 0.2610 | SiamFC | 0.1876
UpdateNet | 0.2431 | Staple | 0.1685
SiamDW | 0.2262 | DeepSRDCF | 0.1540
DSiam | 0.1959 | − | −
Tab.5 Performance evaluation (expected average overlap, EAO) of seven tracking algorithms on VOT2018 dataset
[1]   TANG S Y, ANDRILUKA M, ANDRES B, et al. Multiple people tracking by lifted multicut and person re-identification [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 3539-3548.
[2]   LEE K H, HWANG J N. On-road pedestrian tracking across multiple driving recorders [J]. IEEE Transactions on Multimedia, 2015, 17(9): 1429-1438.
doi: 10.1109/TMM.2015.2455418
[3]   TEUTSCH M, KRUGER W. Detection, segmentation, and tracking of moving objects in uav videos [C]// 2012 IEEE Ninth International Conference on Advanced Video and Signal-based Surveillance. Beijing: IEEE, 2012: 313-318.
[4]   SMEULDERS A W M, CHU D M, CUCCHIARA R, et al. Visual tracking: an experimental survey [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(7): 1442-1468.
doi: 10.1109/TPAMI.2013.230
[5]   QI Y, ZHANG S, LEI Q, et al. Hedged deep tracking [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 4303-4311.
[6]   DANELLJAN M, HAGER G, SHAHBAZ K F, et al. Convolutional features for correlation filter based visual tracking [C]// 2015 IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 58-66.
[7]   DANELLJAN M, ROBINSON A, KHAN F S, et al. Beyond correlation filters: learning continuous convolution operators for visual tracking [C]// Computer Vision-ECCV 2016. Cham: Springer, 2016: 472-488.
[8]   DANELLJAN M, BHAT G, KHAN F S, et al. ECO: efficient convolution operators for tracking [C]// Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 21-26.
[9]   BERTINETTO L, VALMADRE J, HENRIQUES J F, et al. Fully-convolutional siamese networks for object tracking [C]// Computer Vision-ECCV 2016. Cham: Springer, 2016: 850-865.
[10]   VALMADRE J, BERTINETTO L, HENRIQUES J, et al. End-to-end representation learning for correlation filter based tracking [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 5000-5008.
[11]   LI B, YAN J J, WU W, et al. High performance visual tracking with siamese region proposal network [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8971-8980.
[12]   WANG Q, TENG Z, XING J L, et al. Learning attentions: residual attentional siamese network for high performance online visual tracking [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4854-4863.
[13]   WU Y, LIM J, YANG M H. Object tracking benchmark [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848.
doi: 10.1109/TPAMI.2014.2388226
[14]   KRISTAN M, LEONARDIS A, MATAS J, et al. The visual object tracking VOT2016 challenge results [C]// 14th European Conference on Computer Vision. Amsterdam: Springer, 2016, 9914: 777-823.
[15]   ZHANG Z P, PENG H W. Deeper and wider siamese networks for real-time visual tracking [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4591-4600.
[16]   WANG Q L, WU B G, ZHU P F, et al. ECA-Net: efficient channel attention for deep convolutional neural networks [C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 11531-11539.
[17]   CAO Y, XU J R, LIN S, et al. GCNet: non-local networks meet squeeze-excitation networks and beyond [C]// 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul: IEEE, 2019: 1971-1980.
[18]   WANG M, LIU Y, HUANG Z. Large margin object tracking with circulant feature maps [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 4800-4808.
[19]   RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252.
doi: 10.1007/s11263-015-0816-y
[20]   HUANG L, ZHAO X, HUANG K. GOT-10k: a large high-diversity benchmark for generic object tracking in the wild [EB/OL]. [2020-05-18]. https://arxiv.org/abs/1810.11981.
[21]   GUO Q, FENG W, ZHOU C, et al. Learning dynamic siamese network for visual object tracking [C]// 2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 1781-1789.
[22]   BERTINETTO L, VALMADRE J, GOLODETZ S, et al. Staple: complementary learners for real-time tracking [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1401-1409.
[23]   DONG X P, SHEN J B. Triplet loss in siamese network for object tracking [C]// Proceedings of European Conference on Computer Vision. Munich: Springer, 2018: 459–474.
[24]   WANG M M, LIU Y, HUANG Z Y. Large margin object tracking with circulant feature maps [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 4021-4029.
[25]   NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 4293-4302.