Journal of Zhejiang University (Engineering Science), 2023, Vol. 57, Issue 6: 1195-1204. DOI: 10.3785/j.issn.1008-973X.2023.06.015
Computer and Control Engineering
Lightweight object detection based on split attention and linear transformation
Yan ZHANG (1,2), Jing-xue SUN (1), Ye-mei SUN (1,2,*), Shu-dong LIU (1,2), Chuan-qi WANG (3)
1. School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
2. Tianjin Intelligent Elderly Care and Health Service Engineering Research Center, Tianjin 300384, China
3. Tianjin Keyvia Electric Limited Company, Tianjin 300392, China
Abstract:

To meet the real-time and lightweight requirements of object detection while improving detection accuracy, the feature fusion module of YOLOv5 was optimized and a lightweight object detection algorithm, PG-YOLOv5, based on pyramid split attention and linear transformation was proposed. The pyramid split attention module was used to capture the spatial information of feature maps at different scales and enrich the feature space, which improved the multi-scale feature representation ability of the network and the detection accuracy. The GhostBottleNeck module based on linear transformation combined a small number of original feature maps with feature maps obtained by linear transformation, which effectively reduced the number of model parameters. The mean average precision increased from 81.2% for YOLOv5L to 85.7% for PG-YOLOv5, and PG-YOLOv5 had 36% fewer parameters than YOLOv5L. PG-YOLOv5 was deployed on a Jetson TX2 and object detection software was developed. Experimental results showed that the Jetson TX2-based object detection system processed images at 262.1 ms/frame and that the mean average precision of PG-YOLOv5 was 85.2%. Compared with the original YOLOv5L model, PG-YOLOv5 is more suitable for edge deployment.

Key words: object detection; pyramid split attention; linear transformation; lightweight; YOLO
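The abstract attributes the parameter reduction to combining a few original feature maps with maps produced by cheap linear transformations. The sketch below illustrates why that saves weights, using the general Ghost module parameter count from GhostNet [14]. The channel sizes, the kernel sizes `k` and `d`, the ratio `s`, and the function names are illustrative assumptions, not the configuration used in PG-YOLOv5; bias terms are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of an ordinary k x k convolution (no bias)."""
    return c_in * c_out * k * k


def ghost_params(c_in: int, c_out: int, k: int, s: int = 2, d: int = 3) -> int:
    """Weight count of a Ghost-style module (no bias).

    Assumed layout, following the GhostNet formulation:
      * a primary k x k convolution produces c_out / s "original" maps;
      * each original map yields (s - 1) extra maps through a cheap
        d x d depthwise (per-channel linear) transformation.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by the ratio s")
    m = c_out // s                      # number of original feature maps
    primary = conv_params(c_in, m, k)   # ordinary convolution on fewer channels
    cheap = m * (s - 1) * d * d         # one d x d filter per generated map
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernels.
ordinary = conv_params(64, 128, 3)
ghost = ghost_params(64, 128, 3, s=2, d=3)
print(ordinary, ghost, round(ordinary / ghost, 2))
```

For this hypothetical layer the count falls from 73,728 to 37,440 weights, close to the factor `s` = 2. The 36% reduction reported for the whole model is smaller, which fits the abstract's statement that only the feature fusion module was modified.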
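Received: 2022-06-26    Published: 2023-06-30
CLC: TP 391.41
Funding: National Key Research and Development Program of China (2021YFB3301600); Tianjin Science and Technology Plan Project (22YDTPJC00840)
Corresponding author: Ye-mei SUN. E-mail: zhangyan@tcu.edu.cn; sunyemei1216@163.com
About the author: Yan ZHANG (born 1982), female, associate professor, Ph.D., works on machine vision and image processing. orcid.org/0000-0003-0692-3028. E-mail: zhangyan@tcu.edu.cn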
Cite this article:

Yan ZHANG, Jing-xue SUN, Ye-mei SUN, Shu-dong LIU, Chuan-qi WANG. Lightweight object detection based on split attention and linear transformation. Journal of Zhejiang University (Engineering Science), 2023, 57(6): 1195-1204.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.06.015
https://www.zjujournals.com/eng/CN/Y2023/V57/I6/1195

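Fig. 1  Network structure of PG-YOLOv5
Fig. 2  Network structure of the pyramid split attention module
Fig. 3  Network structure of the GhostBottleNeck module
Fig. 4  Comparison of feature map generation in ordinary convolution and the Ghost module
Fig. 5  Three feature fusion modules with different structures
Fig. 6  Design flowchart of the object detection algorithm
Fig. 7  Interface of the object detection software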
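Model            mAP/%   rR/%    tI/ms   nr/10^6   GFLOPS
Original model   81.2    75.0    3.6     44.53     114.4
Model 1          84.8    78.5    21.1    44.39     113.9
Model 2          85.6    82.3    9.6     36.47     107.4
Model 3          86.8    82.0    8.4     36.47     107.4
Proposed model   85.7    81.8    8.1     28.49     100.6
(The header also lists the columns PSA and feature fusion modules 1, 2 and 3; their per-model entries are missing from the source.)
Table 1  Ablation results of the improved YOLOv5L-based models on the Pascal VOC dataset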
Algorithm  Backbone     AP                                                                      mAP
                        bike    bird    bottle  bus     car     cat     person  sheep   tv
DSSD       ResNet-101   84.9    80.5    53.9    85.6    86.2    88.9    79.7    78.0    79.4    78.6
DES        VGG-16       86.0    78.1    53.4    87.9    87.3    88.6    80.8    80.2    79.5    79.7
RefineDet  VGG-16       85.4    81.4    60.2    86.4    88.1    89.1    82.6    82.7    79.4    80.0
R-FCN      ResNet-101   87.2    81.5    69.8    86.8    88.5    89.8    81.2    81.8    79.9    80.5
SPANDet    VGG-16       87.5    83.3    69.7    88.7    89.2    89.1    84.7    85.6    81.5    82.6
Proposed   CSPDarknet   94.0    86.0    80.0    91.4    93.2    92.2    90.2    89.5    84.0    85.7
Table 2  Comparison of detection results of different object detection algorithms on the Pascal VOC dataset
Fig. 8  Object detection results of three algorithms on the Pascal VOC dataset
Algorithm              mAP/%   rR/%    tI/ms   nr/10^6   GFLOPS
YOLOv3                 80.20   74.70   4.5     58.66     155.1
YOLOv4                 80.70   75.90   5.2     61.12     136.3
YOLOv5L                81.20   75.00   3.6     44.53     114.4
YOLOv5X                82.80   75.80   6.4     83.21     217.5
Proposed (PG-YOLOv5)   85.70   81.80   8.1     28.49     100.6
Table 3  Comparison of detection evaluation metrics of different object detection algorithms on the Pascal VOC dataset
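As a quick consistency check, the relative changes quoted in the abstract can be recomputed from the YOLOv5L and proposed-model rows of Table 3. This is only an arithmetic sketch: the dictionary keys and the helper `relative_drop` are illustrative names, and the numbers are copied from the table.

```python
# Values copied from Table 3 (Pascal VOC).
# "map" is mAP in %, "params_m" is nr in millions, "gflops" is the GFLOPS column.
yolov5l = {"map": 81.2, "params_m": 44.53, "gflops": 114.4}
pg_yolov5 = {"map": 85.7, "params_m": 28.49, "gflops": 100.6}


def relative_drop(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


param_drop = relative_drop(yolov5l["params_m"], pg_yolov5["params_m"])
gflops_drop = relative_drop(yolov5l["gflops"], pg_yolov5["gflops"])
map_gain = pg_yolov5["map"] - yolov5l["map"]  # percentage points

print(f"parameter reduction: {param_drop:.1f}%")
print(f"GFLOPS reduction:    {gflops_drop:.1f}%")
print(f"mAP gain:            {map_gain:.1f} points")
```

The parameter reduction comes out at about 36.0%, matching the figure in the abstract. The GFLOPS column drops by about 12.1% and mAP rises by 4.5 percentage points.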
Fig. 9  Detection results of the YOLOv5L and PG-YOLOv5 algorithms
Algorithm           mAP/%
Faster-RCNN [21]    85.0
YOLOv3 [22]         88.5
CenterNet [23]      90.0
Duan et al. [20]    92.0
Proposed            95.7
Table 4  Mean average precision of different object detection algorithms on the SHWD dataset
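Fig. 10  Detection results of PG-YOLOv5 on the SHWD dataset
Fig. 11  Software detection interface showing safety helmet wearing on a construction site
Fig. 12  Detection results of the object detection algorithm on the Jetson TX2 display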
[1] ZHANG De-xiang, WANG Jun, YUAN Pei-cheng. Object detection method for multi-scale full-scene surveillance based on attention mechanism [J]. Journal of Electronics and Information Technology, 2022, 44(9): 3249-3257.
[2] YUAN Yi-qin, HE Guo-jin, WANG Gui-zhou, et al. A background subtraction and frame subtraction combined method for moving vehicle detection in satellite video data [J]. Journal of University of Chinese Academy of Sciences, 2018, 35(1): 50-58.
[3] ZHU J, ZOU H, ROSSET S, et al. Multi-class AdaBoost [J]. Statistics and its Interface, 2009, 2: 349-360. doi: 10.4310/SII.2009.v2.n3.a8
[4] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587.
[5] GIRSHICK R. Fast R-CNN [C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1440-1448.
[6] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[7] REDMON J, FARHADI A. YOLO9000: better, faster, stronger [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 7263-7271.
[8] WANG Li-hui, YANG Xian-zhao, LIU Hui-kang, et al. Pedestrian detection and tracking algorithm based on GhostNet and attention mechanism [J]. Journal of Data Acquisition and Processing, 2022, 37(1): 108-121. doi: 10.16337/j.1004-9037.2022.01.009
[9] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision. [S.l.]: Springer, 2018: 3-19.
[10] ZHANG H, ZU K, LU J, et al. EPSANet: an efficient pyramid squeeze attention block on convolutional neural network [C]// Proceedings of the Asian Conference on Computer Vision, 2022: 1161-1177.
[11] TAN M, LE Q V. EfficientNet: rethinking model scaling for convolutional neural networks [C]// Proceedings of the 36th International Conference on Machine Learning. [S.l.]: PMLR, 2019: 6105-6114.
[12] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
[13] ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: an extremely efficient convolutional neural network for mobile devices [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6848-6856.
[14] HAN K, WANG Y, TIAN Q, et al. GhostNet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 1580-1589.
[15] FU C Y, LIU W, RANGA A, et al. DSSD: deconvolutional single shot detector [EB/OL]. (2017-01-23). https://arxiv.org/pdf/1701.06659.pdf.
[16] ZHANG Z, QIAO S, XIE C, et al. Single-shot object detection with enriched semantics [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 5813-5821.
[17] ZHANG S, WEN L, BIAN X, et al. Single-shot refinement neural network for object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4203-4212.
[18] DAI J, LI Y, HE K, et al. R-FCN: object detection via region-based fully convolutional networks [C]// Proceedings of the 30th International Conference on Neural Information Processing Systems. [S.l.]: CAI, 2016: 379-387.
[19] HAO S, WANG Z, SUN F. Stacked pyramid attention network for object detection [J]. Neural Processing Letters, 2022, 54: 2759-2782. doi: 10.1007/s11063-021-10505-x
[20] DUAN Q, PING K, LI F, et al. Method of safety helmet wearing detection based on key-point estimation without anchor [C]// 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing. Chengdu: IEEE, 2020: 93-96.
[21] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017: 1137-1149.
[22] REDMON J, FARHADI A. YOLOv3: an incremental improvement [EB/OL]. (2018-04-08). https://arxiv.org/pdf/1804.02767.