Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (7): 1307-1316    DOI: 10.3785/j.issn.1008-973X.2023.07.005
Automation Technology
Multi branch Siamese network target tracking based on double attention mechanism
Xiao-yan LI, Peng WANG*, Jia GUO, Xue LI, Meng-yu SUN
School of Electronic and Information Engineering, Xi’an Technological University, Xi’an 710021, China
Abstract:

To address the inaccurate localization of the SiamRPN++ single-target tracking algorithm when the target is briefly occluded or its appearance changes drastically, a multi-branch Siamese network target tracking algorithm based on a dual attention mechanism is proposed. SiamRPN++ with a lightweight backbone network serves as the base algorithm, and lightweight channel and spatial attention mechanisms are added to improve resistance to interference under occlusion. A template branch built from the previous frame is added to dynamically update the target's appearance, and a triplet loss is used to strengthen the discrimination between foreground and background during tracking. The search region is locally enlarged according to the target's movement speed, so that the target can still be tracked promptly and accurately after short-term occlusion. Experimental results show that, compared with the original algorithm, the improved algorithm raises the success rate and precision on the OTB100 dataset by 2.4% and 1.6%, respectively, reduces the average center location error by 28.97 pixels, and raises the average overlap rate by 14.5%.

Key words: Siamese network; attention mechanism; template update; local enlargement
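
The abstract names a triplet loss for separating foreground from background but does not give its formula here. As a rough illustration only, the following sketch shows the standard margin-based triplet loss on plain feature vectors. The function names, the Euclidean distance, and the margin value are assumptions for illustration and may differ from the formulation used in the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor   : template feature (hypothetical input)
    positive : feature of a foreground (target) sample
    negative : feature of a background sample
    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Background sample far from the template: no penalty.
    print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
    # Background sample closer than the foreground sample: penalized.
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
```

Minimizing such a loss pulls target features toward the template and pushes background features away, which is the discriminative effect the abstract describes.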
Received: 2022-06-20    Published: 2023-07-17
CLC:  TP 391  
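Funding: National Natural Science Foundation of China (62171360); Key Research and Development Program of the Shaanxi Provincial Department of Science and Technology (2022GY-110); 2023 Shaanxi University Engineering Research Center Project; 2022 Shaanxi University Youth Innovation Team Project; Shandong Key Laboratory of Smart Transportation (in preparation)
Corresponding author: Peng WANG     E-mail: 76469715@qq.com;wang_peng@xatu.edu.cn
About the author: Xiao-yan LI (born 1982), female, associate professor, engaged in research on target recognition and tracking. orcid.org/0000-0002-1459-428X. E-mail: 76469715@qq.com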
Cite this article:

Xiao-yan LI, Peng WANG, Jia GUO, Xue LI, Meng-yu SUN. Multi branch Siamese network target tracking based on double attention mechanism[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(7): 1307-1316.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.07.005
https://www.zjujournals.com/eng/CN/Y2023/V57/I7/1307

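Fig. 1  Overall structure of the proposed algorithm model
Fig. 2  Structure of the RPN combined with dual attention
Fig. 3  Structure of the channel attention module
Fig. 4  Structure of the global context attention module
Fig. 5  Schematic diagram of the template update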
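Video sequence    Sequence attributes
Suv               Short-term full occlusion, in-plane rotation, out of view
Skating1          Illumination variation, scale variation, occlusion, deformation, out-of-plane rotation, background clutter
Girl2             Short-term occlusion, rotation, scale variation, deformation, fast motion
Ironman           Illumination variation, scale variation, complex partial and full occlusion, in-plane and out-of-plane rotation, background clutter, fast motion, motion blur, out of view, low resolution
Table 1  Tracking video sequences and their attributes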
Fig. 6  Qualitative analysis results
Fig. 7  Center location error curves
Fig. 8  Overlap rate curves
Algorithm        l_avg    O_avg
DSiamRPNMEG++    13.22    0.69
SiamRPNMEG++     16.57    0.63
SiamRPNM++       50.32    0.57
DaSiamRPN        22.56    0.59
Table 2  Average center location error and overlap rate over 10 tracking video sequences
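
The two quantities in Table 2 are conventionally computed per frame and then averaged over a sequence. The sketch below assumes boxes given as (x, y, w, h) with a top-left origin, and takes the overlap rate to be intersection over union. The function names are illustrative and are not taken from the paper.

```python
import math


def center_error(pred, gt):
    """Euclidean distance in pixels between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    gx, gy = gt[0] + gt[2] / 2, gt[1] + gt[3] / 2
    return math.hypot(px - gx, py - gy)


def overlap_rate(pred, gt):
    """Intersection over union of two (x, y, w, h) boxes."""
    # Width and height of the intersection rectangle, clamped at zero.
    ix = max(0.0, min(pred[0] + pred[2], gt[0] + gt[2]) - max(pred[0], gt[0]))
    iy = max(0.0, min(pred[1] + pred[3], gt[1] + gt[3]) - max(pred[1], gt[1]))
    inter = ix * iy
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0


def sequence_averages(preds, gts):
    """Return (average center location error, average overlap rate) over all frames."""
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    overlaps = [overlap_rate(p, g) for p, g in zip(preds, gts)]
    return sum(errors) / len(errors), sum(overlaps) / len(overlaps)


if __name__ == "__main__":
    preds = [(0, 0, 10, 10), (3, 4, 10, 10)]  # hypothetical tracker output
    gts = [(0, 0, 10, 10), (0, 0, 10, 10)]    # hypothetical ground truth
    l_avg, o_avg = sequence_averages(preds, gts)
    print(l_avg, o_avg)
```

A lower l_avg and a higher O_avg both indicate tighter tracking, which is the direction of improvement shown for DSiamRPNMEG++ in Table 2.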
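Fig. 9  Success rate and precision curves on the OTB100 dataset
Fig. 10  Overall success rate curves on OTB100 sequences with occlusion and deformation attributes
Fig. 11  Overall precision curves on OTB100 sequences with occlusion and deformation attributes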
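
In the usual OTB evaluation protocol, the curves in Figs. 9 to 11 are built from per-frame overlap rates and center location errors. The sketch below follows that convention: a success curve over overlap thresholds, summarized by its mean, and precision at a 20-pixel error threshold. The threshold grid and the function names are assumptions, and the page does not confirm the exact settings used in the paper.

```python
def success_rate(overlaps, threshold=0.5):
    """Fraction of frames whose overlap rate exceeds the threshold."""
    return sum(o > threshold for o in overlaps) / len(overlaps)


def success_auc(overlaps, steps=21):
    """Mean success rate over `steps` thresholds evenly spaced in [0, 1].

    This approximates the area under the success curve.
    """
    thresholds = [i / (steps - 1) for i in range(steps)]
    return sum(success_rate(overlaps, t) for t in thresholds) / steps


def precision(errors, threshold=20.0):
    """Fraction of frames whose center location error is within the threshold (pixels)."""
    return sum(e <= threshold for e in errors) / len(errors)


if __name__ == "__main__":
    overlaps = [0.9, 0.6, 0.4, 0.0]   # hypothetical per-frame overlap rates
    errors = [5.0, 19.9, 20.0, 35.0]  # hypothetical per-frame errors in pixels
    print(success_rate(overlaps), success_auc(overlaps), precision(errors))
```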