Journal of Zhejiang University (Engineering Science), 2021, Vol. 55, Issue 6: 1056-1064    DOI: 10.3785/j.issn.1008.973X.2021.06.005
Traffic Engineering, Civil Engineering
Multi-target tracking of vehicles based on optimized DeepSort
Li-sheng JIN1,2, Qiang HUA3, Bai-cang GUO1, Xian-yi XIE1,*, Fu-gang YAN3, Bo-tao WU4
1. School of Vehicle and Energy, Yanshan University, Qinhuangdao 066004, China
2. Hebei Key Laboratory of Special Delivery Equipment, Yanshan University, Qinhuangdao 066004, China
3. Transportation College, Jilin University, Changchun 130022, China
4. Department of Automotive Engineering, Hebei Institute of Mechanical and Electrical Technology, Xingtai 054000, China
Abstract:

A multi-target tracking algorithm for preceding vehicles based on an optimized DeepSort was proposed in order to improve the environmental perception capability of autonomous vehicles. The Gaussian YOLO v3 model was adopted as the front-end object detector and trained on the DarkNet-53 backbone network, yielding Gaussian YOLO v3-Vehicle, a detector specialized for vehicles that improved vehicle detection accuracy by 3%. To overcome the shortcoming that conventional pre-trained models are not targeted at the vehicle class, re-identification pre-training on the augmented VeRi dataset was proposed. A new loss function combining the center loss and the cross-entropy loss was proposed, which gives the target features extracted by the network better intra-class aggregation and inter-class separability. For the experiments, real road videos were collected in different environments, and performance was evaluated with the CLEAR MOT metrics. The results showed a 1% increase in tracking accuracy and a 4% reduction in identity switches compared with the baseline DeepSort YOLO v3.
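The combined loss described above adds a center-loss term to the standard cross-entropy classification loss. A minimal numpy sketch of that combination follows; the weighting coefficient `lam` and the toy shapes are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def softmax(z):
    """Numerically stable row-wise softmax."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def combined_loss(features, logits, labels, centers, lam=0.1):
    """Cross-entropy loss plus center loss (Wen et al., ECCV 2016).

    features: (N, D) embeddings from the re-identification network
    logits:   (N, C) classification scores
    labels:   (N,)   integer class labels
    centers:  (C, D) per-class feature centers
    lam:      weight of the center-loss term (illustrative value)
    """
    n = features.shape[0]
    probs = softmax(logits)
    ce = -np.log(probs[np.arange(n), labels]).mean()
    # Center loss: pull each feature toward its own class center,
    # which encourages intra-class aggregation of the embeddings.
    diff = features - centers[labels]
    center = 0.5 * (diff ** 2).sum(axis=1).mean()
    return ce + lam * center
```

When a feature coincides with its class center, the center term vanishes and only the cross-entropy term remains; features that drift from their center are penalized quadratically.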

Key words: autonomous vehicle    environment perception    deep learning    optimized DeepSort algorithm    object tracking
Received: 2020-07-24    Published: 2021-07-30
CLC: U462
Funding: National Key Research and Development Program of China (2018YFB1600501); National Natural Science Foundation of China (52072333); NSFC Regional Innovation and Development Joint Fund (U19A2069); Hebei Provincial Science and Technology Program (20310801D, E2020203092, F2021203107)
Corresponding author: Xian-yi XIE    E-mail: jinls@ysu.edu.cn; xiexianyi123@126.com
About the author: Li-sheng JIN (1975—), male, professor and doctoral supervisor, engaged in research on autonomous driving and environment perception. orcid.org/0000-0002-3086-1333. E-mail: jinls@ysu.edu.cn
Cite this article:


Li-sheng JIN, Qiang HUA, Bai-cang GUO, Xian-yi XIE, Fu-gang YAN, Bo-tao WU. Multi-target tracking of vehicles based on optimized DeepSort[J]. Journal of Zhejiang University (Engineering Science), 2021, 55(6): 1056-1064.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008.973X.2021.06.005        https://www.zjujournals.com/eng/CN/Y2021/V55/I6/1056

Fig. 1  Flowchart of the DeepSort algorithm
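In the DeepSort pipeline sketched in Fig. 1, tracks and detections are associated by fusing a Mahalanobis motion gate (from the Kalman filter prediction) with a cosine distance between appearance embeddings. A simplified sketch of that cost construction follows; the chi-square gate value 9.4877 is the 95% quantile for a 4-dimensional measurement as used in the original DeepSort, and the fusion weight `lam` and toy inputs are illustrative assumptions.

```python
import numpy as np

def cosine_distance(track_feats, det_feats):
    """1 - cosine similarity between L2-normalized track and detection
    appearance embeddings; rows index tracks, columns index detections."""
    a = track_feats / np.linalg.norm(track_feats, axis=1, keepdims=True)
    b = det_feats / np.linalg.norm(det_feats, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def association_cost(app_dist, maha_dist, gate=9.4877, lam=0.0):
    """DeepSort-style fused cost matrix: a weighted sum of motion and
    appearance distances, with pairs whose Mahalanobis distance exceeds
    the chi-square gate marked infeasible for the Hungarian matcher."""
    cost = lam * maha_dist + (1.0 - lam) * app_dist
    cost[maha_dist > gate] = 1e5  # gated out: effectively forbidden
    return cost
```

With `lam=0` the cost is purely appearance-based, and the motion model acts only as a hard gate, which is the configuration the original DeepSort paper recommends for scenes with camera motion.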
Layer         Filter size/stride   Output size
Conv 1        3×3/1                32×128×64
Conv 2        3×3/1                32×128×64
Max Pool 3    3×3/2                32×64×32
Residual 4    3×3/1                32×64×32
Residual 5    3×3/1                32×64×32
Residual 6    3×3/2                64×32×16
Residual 7    3×3/1                64×32×16
Residual 8    3×3/2                128×16×8
Residual 9    3×3/1                128×16×8
Dense 10      —                    128
BN            —                    128
表 1  重识别网络结构 (Table 1  Structure of the re-identification network)
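Table 1 can be read as a shape trace: every convolution and residual block uses 'same' padding, so only the stride-2 layers halve the spatial dimensions, and the final dense layer projects to a 128-dimensional embedding. The following sketch does shape bookkeeping only (it is not a trained model); layer names and channel/stride values are taken directly from the table.

```python
# (layer name, output channels, stride) as listed in Table 1
LAYERS = [
    ("Conv 1", 32, 1), ("Conv 2", 32, 1), ("Max Pool 3", 32, 2),
    ("Residual 4", 32, 1), ("Residual 5", 32, 1), ("Residual 6", 64, 2),
    ("Residual 7", 64, 1), ("Residual 8", 128, 2), ("Residual 9", 128, 1),
]

def trace_shapes(h=128, w=64):
    """Propagate a 128×64 input crop through the layers of Table 1,
    returning {layer name: (C, H, W)}. With 'same' padding only the
    stride changes the spatial size; Dense 10 then flattens the final
    128×16×8 feature map into a 128-dimensional embedding."""
    shapes = {}
    for name, channels, stride in LAYERS:
        h, w = h // stride, w // stride
        shapes[name] = (channels, h, w)
    return shapes
```

Running the trace reproduces each output size listed in the table, which is a quick way to sanity-check a re-implementation against the published architecture.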
图 2  Fig. 2  Comparison of image data before and after quality enhancement
图 3  Fig. 3  YOLO v3 network structure
图 4  Fig. 4  Classification accuracy during vehicle re-identification training
图 5  Fig. 5  Loss curves of Gaussian YOLO v3
图 6  Fig. 6  Visualization of a shaded traffic scene
图 7  Fig. 7  Visualization of continuous changes in vehicle shape
图 8  Fig. 8  Visualization of a crowded traffic scene
Algorithm              MOTA   MOTP
Unsup Track[4]         61.7   78.3
Lif-T[4]               60.5   79.0
ISE-MOT17R[4]          60.1   78.2
msot[5]                59.2   78.0
EAMTT[27]              52.5   78.8
POI[27]                66.1   77.1
SORT[27]               59.8   79.6
Baseline DeepSort[27]  61.4   79.1
Proposed algorithm     62.5   78.9
表 2  多目标跟踪算法的MOTA与MOTP对比 (Table 2  Comparison of MOTA and MOTP of multi-target tracking algorithms)
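The MOTA and MOTP values in Table 2 follow the CLEAR MOT definitions: MOTA penalizes misses, false positives, and identity switches against the number of ground-truth objects, while MOTP averages the localization overlap over matched pairs. A minimal computation follows; the counts in the usage example are toy numbers chosen for illustration, not the paper's evaluation data.

```python
def mota(misses, false_positives, id_switches, num_gt):
    """CLEAR MOT tracking accuracy as a percentage:
    MOTA = 1 - (FN + FP + IDSW) / GT."""
    return 100.0 * (1.0 - (misses + false_positives + id_switches) / num_gt)

def motp(total_overlap, num_matches):
    """CLEAR MOT tracking precision as a percentage: mean localization
    overlap (e.g. bounding-box IoU) over all matched pairs."""
    return 100.0 * total_overlap / num_matches
```

For example, 150 misses, 100 false positives, and 50 identity switches over 1000 ground-truth objects give a MOTA of 70.0, which makes clear why the 4% drop in identity switches reported in the abstract translates directly into a MOTA gain.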
1 WANG Shi-feng, DAI Xiang, XU Ning. Overview of driverless car environment perception technology[J]. Journal of Changchun University of Science and Technology: Natural Science Edition, 2017, 40(1): 1-6.
2 LI Xi, ZHA Yu-fei, ZHANG Tian-zhu. Overview of deep learning target tracking algorithms[J]. Chinese Journal of Image and Graphics, 2019, 24(12): 2057-2080. doi: 10.11834/jig.190372
3 CHU Qi. Research on video multi-target tracking algorithm based on deep learning[D]. Hefei: University of Science and Technology of China, 2019.
4 KIM C, LI F, CIPTADI A, et al. Multiple hypothesis tracking revisited[C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 4696–4704.
5 HU H, ZHOU L, GUAN Q, et al. An automatic tracking method for multiple cells based on multi-feature fusion[J]. IEEE Access, 2018, 6: 69782-69793. doi: 10.1109/ACCESS.2018.2880563
6 LEAL-TAIXÉ L, FERRER C C, SCHINDLER K. Learning by tracking: siamese CNN for robust target association[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Las Vegas: IEEE, 2016: 33–40.
7 ZHOU H, OUYANG W, CHENG J, et al. Deep continuous conditional random fields with asymmetric inter-object constraints for online multi-object tracking[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2018, 29(4): 1011-1022.
8 ULLAH M, MOHAMMED A K, CHEIKH F A, et al. A hierarchical feature model for multi-target tracking[C]// Proceedings of the 2017 IEEE International Conference on Image Processing. Beijing: IEEE, 2017: 2612–2616.
9 SHARMA S, ANSARI J A, MURTHY J K, et al. Beyond pixels: Leveraging geometry and shape cues for online multi-object tracking[C]// Proceedings of the 2018 IEEE International Conference on Mechatronics, Robotics and Automation. Brisbane: IEEE, 2018: 3508–3515.
10 CHU Q, OUYANG W, LI H, et al. Online multi-object tracking using CNN-based single object tracker with spatial-temporal attention mechanism[C]// 2017 IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 4836–4845.
11 MILAN A, REZATOFIGHI S H, DICK A, et al. Online multi-target tracking using recurrent neural networks[C]// National Conference on Artificial Intelligence. San Francisco: AAAI Press, 2017: 4225-4232.
12 MA C, YANG C, YANG F, et al. Trajectory factory: tracklet cleaving and re-connection by deep Siamese bi-GRU for multiple object tracking[C]// 2018 IEEE International Conference on Multimedia and Expo. San Diego: IEEE, 2018: 1-6.
13 ZHU J, YANG H, LIU N, et al. Online multi-object tracking with dual matching attention networks[C]// 2018 European Conference on Computer Vision. Munich: [s. n.], 2018: 366–382.
14 WOJKE N, BEWLEY A, PAULUS D. Simple online and realtime tracking with a deep association metric[C]// 2017 IEEE International Conference on Image Processing. [S. l.]: IEEE, 2017: 3645-3649.
15 XIE Yun-yu. Research on monocular vision trajectory tracking method based on extended Kalman filter[D]. Beijing: North China Electric Power University, 2017.
16 BISHOP C M. Pattern recognition and machine learning (information science and statistics)[M]. New York: Springer, 2006.
17 WOJKE N, BEWLEY A. Deep cosine metric learning for person re-identification[C]// IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 748-756.
18 KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
19 ZAGORUYKO S, KOMODAKIS N. Wide residual networks [C]// 2016 Proceedings of the British Machine Vision Conference. York: DBLP, 2016: 1-15.
20 WEN Y, ZHANG K, LI Z, et al. A discriminative feature learning approach for deep face recognition[C]// European Conference on Computer Vision. Amsterdam: Springer, 2016: 499-515.
21 AGGARWAL C C. Neural networks and deep learning[M]. Cham: Springer, 2018: 48-51.
22 LIU Xin-chen. Research on key technologies of vehicle search in urban video surveillance network[D]. Beijing: Beijing University of Posts and Telecommunications, 2018.
23 REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
24 LIN T Y, DOLLAR P, GIRSHICK R. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 936-944.
25 REDMON J, FARHADI A. YOLO v3: an incremental improvement [EB/OL]. [2020-05-31]. https://arxiv.org/abs/1804.02767.
26 CHOI J, CHUN D, KIM H. Gaussian YOLO v3: an accurate and fast object detector using localization uncertainty for autonomous driving[C]// International Conference on Computer Vision. Seoul: IEEE, 2019: 502-511.