Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (3): 503-509    DOI: 10.3785/j.issn.1008-973X.2022.03.009
    
Context-aware knowledge distillation network for object detection
Jing-hui CHU, Li-dong SHI, Pei-guang JING, Wei LV*
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China

Abstract  

A context-aware knowledge distillation network (CAKD Net) was proposed for object detection, addressing the difficulty that existing knowledge distillation methods for object detection can hardly exploit the feature information of the context region surrounding the detected object. The context information of the object was fully exploited, and the gap between the teacher network and the student network was eliminated by performing information perception along the spatial and channel domains simultaneously. CAKD Net consists of a context-aware region modified module (CARM) and an adaptive channel attention module (ACAM). In CARM, the context information was used to adaptively generate a fine-grained mask of the salient region, and the difference between the feature responses of the teacher and student networks within that region was precisely eliminated. In ACAM, a spatial-channel attention mechanism was introduced to further optimize the objective function and thus improve the performance of the student network. Experimental results show that the proposed method improves the mean average precision by more than 2.9%.
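The abstract describes CARM as imitating teacher features only inside a context-enlarged salient region. The following is a minimal PyTorch sketch of that idea under simplifying assumptions, not the authors' implementation: the mask here is built by dilating ground-truth boxes, whereas CARM generates its fine-grained mask adaptively, and matching teacher/student channel dimensions are assumed.

import torch

def context_masked_imitation_loss(t_feat, s_feat, mask):
    """MSE between teacher and student feature maps, restricted to the mask.
    t_feat, s_feat: (N, C, H, W); mask: (N, 1, H, W) with values in [0, 1]."""
    diff = (t_feat - s_feat) ** 2          # per-element squared difference
    masked = diff * mask                   # keep only the salient/context region
    return masked.sum() / (mask.sum() * t_feat.size(1) + 1e-6)

def dilated_box_mask(boxes, feat_hw, stride, dilate=0.25):
    """Coarse stand-in for CARM's adaptive fine-grained mask: project the
    ground-truth boxes onto the feature map and enlarge them by `dilate`
    so that the surrounding context is also imitated (single-image mask
    kept for simplicity)."""
    h, w = feat_hw
    mask = torch.zeros(1, 1, h, w)
    for x1, y1, x2, y2 in boxes:           # boxes in image coordinates
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        bw, bh = (x2 - x1) * (1 + dilate), (y2 - y1) * (1 + dilate)
        xa = max(int((cx - bw / 2) / stride), 0)
        xb = min(int((cx + bw / 2) / stride) + 1, w)
        ya = max(int((cy - bh / 2) / stride), 0)
        yb = min(int((cy + bh / 2) / stride) + 1, h)
        mask[0, 0, ya:yb, xa:xb] = 1.0
    return mask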



Key words: knowledge distillation; channel attention; model compression; object detection; deep learning
Received: 07 September 2021      Published: 29 March 2022
CLC:  TP 37  
Fund: Tianjin Science and Technology Plan Project (18ZXJMTG00020); Natural Science Foundation of Tianjin (20JCQNJC01210)
Corresponding Authors: Wei LV     E-mail: cjh@tju.edu.cn;luwei@tju.edu.cn
Cite this article:

Jing-hui CHU,Li-dong SHI,Pei-guang JING,Wei LV. Context-aware knowledge distillation network for object detection. Journal of ZheJiang University (Engineering Science), 2022, 56(3): 503-509.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.03.009     OR     https://www.zjujournals.com/eng/Y2022/V56/I3/503


Context-aware knowledge distillation network for object detection

Aiming at the difficulty that existing knowledge distillation methods for object detection can hardly exploit the feature information of the context region around the object, a context-aware knowledge distillation network (CAKD Net) for object detection was proposed. The method makes full use of the context information of the detected object, performs information perception along the spatial and channel domains simultaneously, and eliminates the differences between the teacher network and the student network. The method consists of a context-aware region modified module (CARM) and an adaptive channel attention module (ACAM). CARM uses the context information to adaptively generate a fine-grained mask of the salient region and precisely eliminates the difference between the feature responses of the teacher and student networks within that region; ACAM introduces a spatial-channel attention mechanism to further optimize the objective function and improve the performance of the student network. Experimental results show that the proposed method improves the detection precision of the model by more than 2.9%.
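ACAM is described as re-weighting the distillation objective along the channel dimension. The sketch below illustrates that idea; the squeeze-and-excitation style gate and the use of teacher-derived weights are assumptions for illustration, not the paper's exact ACAM design.

import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    """Squeeze-and-excitation style channel attention, used here as a
    stand-in for the adaptive channel attention in ACAM."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat):                           # feat: (N, C, H, W)
        w = feat.mean(dim=(2, 3))                      # global average pooling -> (N, C)
        return self.fc(w).unsqueeze(-1).unsqueeze(-1)  # channel weights (N, C, 1, 1)

def channel_weighted_imitation(t_feat, s_feat, gate):
    """Imitation loss with each channel's error scaled by attention weights
    derived from the teacher feature map, so informative channels dominate."""
    attn = gate(t_feat)                                # (N, C, 1, 1), broadcast over H, W
    return (attn * (t_feat - s_feat) ** 2).mean()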


Key words: knowledge distillation; channel attention; model compression; object detection; deep learning
Fig.1 Overview of context-aware knowledge distillation network
Fig.2 Structure diagram of context-aware region modified module
Fig.3 Structure diagram of adaptive channel attention module
Model                      mAP    AP
                                  Aeroplane  Bike   Bird   Boat   Bus    Chair  Table  Mbike  Person  Train
VGG16 (teacher)            70.4   70.9       78.0   67.8   55.1   79.6   48.7   63.5   74.5   77.0    76.0
VGG11 (student)            59.6   67.3       71.4   56.6   44.3   68.8   37.7   51.6   70.0   71.9    62.9
VGG11 (proposed method)    68.5   74.4       77.6   65.3   55.6   77.4   46.2   63.4   76.8   76.3    75.0
Tab.1 Experiment results of VGG16-VGG11 as teacher-student network on VOC07 test set (unit: %)
Model                      mAP    AP
                                  Aeroplane  Bike   Bird   Boat   Bus    Chair  Table  Mbike  Person  Train
ResNet101 (teacher)        74.4   77.8       78.9   77.5   63.2   79.2   54.5   68.7   77.8   78.6    78.8
ResNet50 (student)         69.1   68.9       79.0   67.0   54.1   78.6   49.7   62.6   72.5   77.2    75.0
ResNet50 (proposed method) 72.4   75.8       79.0   71.7   58.1   80.8   51.5   69.1   77.8   78.3    81.5
Tab.2 Experiment results of ResNet101-ResNet50 as teacher-student network on VOC07 test set (unit: %)
Model                      mAP    AP
                                  Car    Cyclist  Pedestrian
ResNet101 (teacher)        63.4   78.5   54.6     57.1
ResNet50 (student)         52.5   77.7   35.4     44.2
ResNet50 (proposed method) 56.4   79.3   38.2     51.7
VGG16 (teacher)            62.6   79.3   52.1     56.4
VGG11 (student)            58.7   77.7   45.4     53.1
VGG11 (proposed method)    62.3   79.8   50.1     57.0
Tab.3 Experimental results of Faster R-CNN detector on KITTI test set (unit: %)
Group  CARM  ACAM  mAP/%
1      ×     ×     58.7
2      √     ×     62.0
3      ×     √     60.7
4      √     √     62.3
Tab.4 Results of ablation experiments using Faster R-CNN detector on KITTI test set
Group  ρ     mAP/%
1      0     61.6
2      0.25  62.3
3      0.50  61.9
4      0.75  61.7
Tab.5 Experimental results of different filter thresholds
Model (mAP/%)        Hinton  CD    FitNets  DOD   Task  LD    CAKD Net
Teacher              74.4    74.4  74.4     74.4  74.4  74.4  74.4
Student              69.1    69.1  69.1     69.1  70.0  69.1  69.1
After distillation   69.7    70.1  69.3     72.0  72.4  70.3  72.4
Tab.6 Results of comparative experiments using Faster R-CNN detector on VOC07 test set
Fig.4 Visual diagram of optimization objectives of different methods
Group  Model    ACAM  mAP/%
1      FitNets  ×     59.0
2      FitNets  √     59.7
3      DOD      ×     60.7
4      DOD      √     61.4
Tab.7 Experiment results of universality verification of adaptive channel attention module
[1]   HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[2]   ZHANG Yan-nan, HUANG Xiao-hong, MA Yan, et al. Method with recording text classification based on deep learning [J]. Journal of Zhejiang University: Engineering Science, 2020, 54(7): 1264-1271.
[3]   HONG Yan-jia, MENG Tie-bao, LI Hao-jiang, et al. Deep segmentation method of tumor boundaries from MR images of patients with nasopharyngeal carcinoma using multi-modality and multi-dimension fusion [J]. Journal of Zhejiang University: Engineering Science, 2020, 54(3): 566-573.
[4]   TIAN Y, KRISHNAN D, ISOLA P. Contrastive representation distillation [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/1910.10699v2.pdf.
[5]   HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network [J]. Computer Science, 2015, 14(7): 38-39.
[6]   CHEN G, CHOI W, YU X, et al. Learning efficient object detection models with knowledge distillation [C]// Proceedings of the Annual Conference on Neural Information Processing Systems. Long Beach: [s. n.], 2017: 742–751.
[7]   TAN X, REN Y, HE D, et al. Multilingual neural machine translation with knowledge distillation [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/1902.10461v3.pdf.
[8]   SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/1409.1556.pdf.
[9]   ROMERO A, BALLAS N, KAHOU S E, et al. FitNets: hints for thin deep nets [C]// Proceedings of the International Conference on Learning Representations. San Diego: [s.n.], 2015: 1–13.
[10]   WANG T, YUAN L, ZHANG X, et al. Distilling object detectors with fine-grained feature imitation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4928–4937.
[11]   REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/1506.01497.pdf.
[12]   REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[13]   HU J, SHEN L, SUN G, et al. Squeeze-and-excitation networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023.
doi: 10.1109/TPAMI.2019.2913372
[14]   WANG Q, WU B, ZHU P, et al. ECA-Net: efficient channel attention for deep convolutional neural networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Los Angeles: IEEE, 2020: 11531–11539.
[15]   HOU Y, MA Z, LIU C, et al. Learning lightweight lane detection CNNs by self attention distillation [C]// Proceedings of the IEEE International Conference on Computer Vision. Seoul: IEEE, 2019: 1013-1021.
[16]   EVERINGHAM M, ESLAMI S M A, VAN GOOL L, et al. The PASCAL visual object classes challenge: a retrospective [J]. International Journal of Computer Vision, 2015, 111(1): 98-136.
doi: 10.1007/s11263-014-0733-5
[17]   GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Providence: IEEE, 2012: 3354-3361.
[18]   MAO J, XIAO T, JIANG Y, et al. What can help pedestrian detection? [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 3127-3136.
[19]   SUN R, TANG F, ZHANG X, et al. Distilling object detectors with task adaptive regularization [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/2006.13108.pdf.
[20]   ZHENG Z, YE R, WANG P, et al. Localization distillation for object detection [EB/OL]. [2021-09-07]. https://arxiv.org/pdf/2102.12252v3.pdf.
[15] Ying-jie NIU,Yan-chen SU,Dun-cheng CHENG,Jia LIAO,Hai-bo ZHAO,Yong-qiang GAO. High-speed rail contact network U-holding nut fault detection algorithm[J]. Journal of ZheJiang University (Engineering Science), 2021, 55(10): 1912-1921.