Lightweight object detection based on split attention and linear transformation

doi:10.3785/j.issn.1008-973X.2023.06.015

Journal of ZheJiang University (Engineering Science)

2023, Vol. 57

Issue (6): 1195-1204

Yan ZHANG1,2, Jing-xue SUN1, Ye-mei SUN1,2,*, Shu-dong LIU1,2, Chuan-qi WANG3

1. School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
2. Tianjin Intelligent Elderly Care and Health Service Engineering Research Center, Tianjin 300384, China
3. Tianjin Keyvia Electric Limited Company, Tianjin 300392, China

Abstract

To meet the real-time and lightweight-model requirements of object detection while improving detection accuracy, a lightweight object detection algorithm, PG-YOLOv5, based on pyramid split attention and linear transformation was proposed. PG-YOLOv5 optimizes the feature fusion module of YOLOv5. First, a pyramid split attention module was used to capture the spatial information of feature maps at different scales and enrich the feature space, which improved the multi-scale feature representation ability of the network and the detection accuracy. Then, a GhostBottleNeck module based on linear transformation was used to combine a small number of original feature maps with those obtained by linear transformation, which effectively reduced the number of model parameters. The mean average precision increased from 81.2% for YOLOv5L to 85.7% for PG-YOLOv5, and PG-YOLOv5 had 36% fewer parameters than YOLOv5L. PG-YOLOv5 was deployed on a Jetson TX2, and object detection software was developed. Experimental results showed that the Jetson TX2-based object detection system processed images at 262.1 ms/frame, with a mean average precision of 85.2% for PG-YOLOv5. Compared with the original YOLOv5L model, PG-YOLOv5 is more suitable for edge deployment.
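The parameter saving from the linear-transformation idea can be illustrated with a small calculation. The sketch below follows the Ghost module formulation in the GhostNet paper: a primary convolution produces a fraction of the output maps, and cheap depthwise operations generate the rest. The function names, the layer size (256 channels, 3 x 3 kernel), the ratio `s = 2`, and the cheap-kernel size `d = 3` are illustrative assumptions, not values reported for PG-YOLOv5. It compares a single layer only and does not reproduce the 36% whole-model figure.

```python
def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weight count of a Ghost module that outputs c_out feature maps.

    A primary k x k convolution produces c_out // s intrinsic maps.
    Each intrinsic map then yields (s - 1) extra "ghost" maps through
    a cheap d x d depthwise (per-channel linear) operation.
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s in this sketch")
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k   # ordinary convolution part
    cheap = intrinsic * (s - 1) * d * d  # depthwise linear transformations
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    c_in, c_out, k = 256, 256, 3
    ordinary = conv_params(c_in, c_out, k)
    ghost = ghost_params(c_in, c_out, k)
    print("ordinary:", ordinary)
    print("ghost:   ", ghost)
    print("ratio:   ", round(ordinary / ghost, 2))
```

With `s = 2`, the per-layer weight count falls to roughly half, because the depthwise term is small next to the primary convolution. The saving for a full network depends on how many layers are replaced.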

Key words: object detection, pyramid split attention, linear transformation, lightweight, YOLO

Received: 26 June 2022 Published: 30 June 2023

CLC:

TP 391.41

Fund: National Key Research and Development Program of China (2021YFB3301600); Tianjin Science and Technology Program (22YDTPJC00840)

Corresponding Authors: Ye-mei SUN E-mail: zhangyan@tcu.edu.cn;sunyemei1216@163.com

Cite this article:

Yan ZHANG,Jing-xue SUN,Ye-mei SUN,Shu-dong LIU,Chuan-qi WANG. Lightweight object detection based on split attention and linear transformation. Journal of ZheJiang University (Engineering Science), 2023, 57(6): 1195-1204.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.06.015 OR https://www.zjujournals.com/eng/Y2023/V57/I6/1195

Fig.1 Overall network structure of PG-YOLOv5

Fig.2 Structure of pyramid split attention module

Fig.3 Structure of GhostBottleNeck module

Fig.4 Comparison of feature map generation in ordinary convolution and Ghost module

Fig.5 Feature fusion modules with three different structures

Fig.6 Object detection algorithm design flow chart

Fig.7 Interface of object detection software

Tab.1 Ablation results on Pascal VOC dataset based on YOLOv5L model

Tab.2 Comparison of experimental results of different target detection algorithms on Pascal VOC dataset (%)

Fig.8 Object detection results of three algorithms on Pascal VOC dataset

Tab.3 Comparison of detection evaluation indicators for different target detection algorithms on Pascal VOC dataset

Fig.9 Detection results of YOLOv5L and PG-YOLOv5

Tab.4 Mean average precisions of different target detection algorithms on SHWD dataset

Fig.10 Detection result of PG-YOLOv5 on SHWD dataset

Fig.11 Software interface for detecting safety helmet wearing on construction sites

Fig.12 Detection results of target detection algorithm on Jetson TX2


