Visual object tracking based on policy gradient

doi:10.3785/j.issn.1008-973X.2020.10.008

Journal of Zhejiang University (Engineering Science)

2020, Vol. 54, Issue 10: 1923-1928

Computer Technology

Visual object tracking based on policy gradient

Kang-hao WANG, Hai-bing YIN*, Xiao-feng HUANG

College of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China

Abstract:

To address occlusion, deformation, and fast motion in object tracking, an object tracking method based on policy gradient is proposed. A policy network is trained with the policy gradient algorithm. The policy network makes action decisions according to the reliability of the current tracking result, so that incorrect template updates are avoided and a lost target can be re-detected. During decision making, the robustness and accuracy of the current tracking result are both analyzed by computing the weighted confidence margin, which helps the policy network evaluate the tracking result more accurately. For re-detection, an efficient method is proposed that filters out a large number of search regions, which greatly improves search efficiency; the decision-making module then checks the re-detected result to ensure its accuracy. The proposed algorithm was evaluated on the OTB and LaSOT datasets. The experimental results show that it improves performance by 2.5%-4.0% over the original (baseline) tracking algorithm.

Key words: visual object tracking; decision making; policy gradient; re-detection; template update
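
The abstract does not specify the network architecture, the state features, or the reward used for training, so the following is only a minimal sketch of the general policy-gradient (REINFORCE) idea it refers to, not the authors' implementation. It assumes a single scalar confidence feature `x` in [0, 1], a linear-softmax policy, a hypothetical three-way action set, and a made-up reward rule (`toy_reward`); all of these names and choices are illustrative assumptions.

```python
import math
import random

# Hypothetical action set; the abstract only says the policy decides between
# template update and re-detection, so these names are assumptions.
ACTIONS = ("update_template", "keep_template", "re_detect")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(params, x):
    """Linear-softmax policy: one (weight, bias) pair per action."""
    return softmax([w * x + b for w, b in params])


def reinforce_step(params, x, action, advantage, lr=0.1):
    """One REINFORCE update: theta += lr * advantage * grad log pi(action | x)."""
    probs = action_probs(params, x)
    new_params = []
    for k, (w, b) in enumerate(params):
        # d log pi(a|x) / d logit_k = 1[k == a] - pi(k|x)
        g = (1.0 if k == action else 0.0) - probs[k]
        new_params.append((w + lr * advantage * g * x, b + lr * advantage * g))
    return new_params


def toy_reward(x, action):
    """Made-up reward: high confidence -> update, low -> re-detect, else keep."""
    best = 0 if x > 0.6 else (2 if x < 0.3 else 1)
    return 1.0 if action == best else -1.0


def train(steps=2000, lr=0.5, seed=0):
    """Train the toy policy on single-step episodes (a contextual bandit)."""
    rng = random.Random(seed)
    params = [(0.0, 0.0) for _ in ACTIONS]
    baseline = 0.0  # running mean reward, used to reduce gradient variance
    for _ in range(steps):
        x = rng.random()  # stand-in for a tracking-confidence feature
        probs = action_probs(params, x)
        action = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        reward = toy_reward(x, action)
        params = reinforce_step(params, x, action, reward - baseline, lr)
        baseline += 0.05 * (reward - baseline)
    return params
```

Calling `action_probs(train(), 0.9)` returns the learned action distribution for a high-confidence frame under these toy assumptions. A real tracker would replace the scalar feature with the response-map statistics and the toy reward with a measure such as overlap with ground truth, neither of which is given in the abstract.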

Received: 2019-09-05  Published: 2020-10-28

CLC: TP 391

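Fund: National Natural Science Foundation of China (61572449, 61972123, 61901150); Key Research and Development Project of the Ministry of Science and Technology (2018YFC0830106); Zhejiang Provincial Natural Science Foundation (Q19F010030)

Corresponding author: Hai-bing YIN  E-mail: wangkh@hdu.edu.cn; yhb@hdu.edu.cn

About the author: Kang-hao WANG (born 1995), male, master's student, working on computer vision. orcid.org/0000-0001-6127-2059. E-mail: wangkh@hdu.edu.cn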

Cite this article:

Kang-hao WANG, Hai-bing YIN, Xiao-feng HUANG. Visual object tracking based on policy gradient[J]. Journal of Zhejiang University (Engineering Science), 2020, 54(10): 1923-1928.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2020.10.008 or http://www.zjujournals.com/eng/CN/Y2020/V54/I10/1923

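Fig. 1 Overall framework of the proposed algorithm

Fig. 2 Markov decision process

Fig. 3 Comparison of OPE results on the OTB dataset

Table 1 Performance comparison of the algorithm under different settings

Fig. 4 Tracking results on video sequences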
