Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (8): 1622-1632    DOI: 10.3785/j.issn.1008-973X.2022.08.016
Computer and Control Engineering
Lightweight underwater biological detection algorithm based on improved Mobilenet-YOLOv3
Kun HAO1, Kuo WANG1, Bei-bei WANG2,*
1. School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
2. School of Control and Mechanical Engineering, Tianjin Chengjian University, Tianjin 300384, China
Abstract:

In underwater biological detection, classical object detection models are too large and have too many parameters to run on micro and small underwater hardware, while existing lightweight models struggle to balance detection accuracy against real-time performance. To address this problem, a lightweight detection algorithm, CPM-YOLOv3, was proposed based on an improved Mobilenet-YOLOv3. A regular channel pruning algorithm was used to prune Mobilenet-YOLOv3, and the squeeze-and-excitation (SE) modules in the feature extraction network were replaced with convolutional block attention modules (CBAM) to compress the network model. In addition, two CBAMs were added to each of the detection layers of different sizes, which improved the model's ability to attend to target feature information with almost no increase in model size. Experimental results showed that the CPM-YOLOv3 model was only 4.86 MB, 94.7% smaller than the original model, with an average detection precision of 87.0% and a speed of 5.1 ms/frame. Compared with other network models, CPM-YOLOv3 is better suited to deployment on micro and small underwater devices.

Key words: underwater biological detection    lightweight model    channel pruning    attention mechanism    deep learning
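
The size reduction quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes that the "original model" is the unpruned Mobilenet-YOLOv3 at 91.08 MB, the figure listed in the model comparison tables on this page; the function name is illustrative only.

```python
# Model sizes in MB. 4.86 MB is quoted in the abstract; 91.08 MB is the
# unpruned Mobilenet-YOLOv3 size from the comparison tables (assumed baseline).
ORIGINAL_MB = 91.08
COMPRESSED_MB = 4.86


def size_reduction_percent(before_mb: float, after_mb: float) -> float:
    """Relative reduction in model size, expressed in percent."""
    return (1.0 - after_mb / before_mb) * 100.0


reduction = size_reduction_percent(ORIGINAL_MB, COMPRESSED_MB)
print(f"size reduction: {reduction:.1f}%")
```

Under that assumption the result rounds to the 94.7% stated in the abstract.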
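Received: 2021-08-11    Published: 2022-08-30
CLC:  TP 391.4
Funding: National Natural Science Foundation of China (61902273)
Corresponding author: Bei-bei WANG     E-mail: kunhao@tcu.edu.cn; wbbking@163.com
About the author: Kun HAO (b. 1979), female, professor, works on underwater sensor networks and computer vision. orcid.org/0000-0002-5627-7151. E-mail: kunhao@tcu.edu.cn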

Cite this article:

Kun HAO, Kuo WANG, Bei-bei WANG. Lightweight underwater biological detection algorithm based on improved Mobilenet-YOLOv3. Journal of Zhejiang University (Engineering Science), 2022, 56(8): 1622-1632.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.08.016
https://www.zjujournals.com/eng/CN/Y2022/V56/I8/1622

Fig. 1  Schematic diagram of depthwise separable convolution
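
Depthwise separable convolution, the subject of Fig. 1, is what makes MobileNet-style backbones small. As a rough illustration, the sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. The layer dimensions are hypothetical, not taken from the paper, and biases and batch-norm parameters are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 72 input channels, 72 output channels.
k, c_in, c_out = 3, 72, 72
std = standard_conv_params(k, c_in, c_out)
dws = depthwise_separable_params(k, c_in, c_out)
ratio = dws / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {dws}, ratio: {ratio:.3f}")
```

The ratio simplifies to 1/c_out + 1/k², so for 3×3 kernels and wide layers the separable form needs roughly one ninth of the weights.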
Fig. 2  Principle of channel pruning
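
This page does not spell out the rules of the paper's "regular" channel pruning, so the following is only a generic sketch in the spirit of network slimming: channels are ranked globally by the absolute value of their batch-norm scale factor, and the weakest fraction is dropped. The function name, the `min_keep` safeguard, and the toy scale factors are all assumptions made for illustration.

```python
def prune_channels(gammas_per_layer, prune_ratio, min_keep=1):
    """Select channels to keep using one global threshold on |gamma|.

    gammas_per_layer: dict mapping layer name -> list of BN scale factors.
    prune_ratio: fraction of all channels to remove (0 <= ratio < 1).
    Returns (kept_indices_per_layer, threshold).
    """
    all_abs = sorted(abs(g) for gs in gammas_per_layer.values() for g in gs)
    n_prune = int(len(all_abs) * prune_ratio)
    # Smallest |gamma| that survives; anything strictly below it is pruned.
    threshold = all_abs[min(n_prune, len(all_abs) - 1)]

    kept = {}
    for name, gs in gammas_per_layer.items():
        idx = [i for i, g in enumerate(gs) if abs(g) >= threshold]
        if len(idx) < min_keep:
            # Never empty a layer: fall back to its strongest channels.
            ranked = sorted(range(len(gs)), key=lambda i: abs(gs[i]), reverse=True)
            idx = sorted(ranked[:min_keep])
        kept[name] = idx
    return kept, threshold


# Toy scale factors (hypothetical values).
gammas = {
    "conv1": [0.9, 0.02, 0.5, 0.01],
    "conv2": [0.03, 0.04, 0.05, 0.8],
}
kept, thr = prune_channels(gammas, prune_ratio=0.5)
print(kept, thr)
```

A global threshold lets layers shrink by different amounts, which matches the uneven channel counts (73, 89, 43, 14) in the pruned backbone, although the paper's exact criterion may differ.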
Fig. 3  Schematic diagram of CBAM
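
For readers unfamiliar with CBAM, the sketch below walks through its two stages on a tiny feature map using plain Python lists: channel attention (a shared MLP applied to average-pooled and max-pooled channel descriptors) followed by spatial attention (a convolution over the channel-wise mean and max maps). It follows the general CBAM design rather than the authors' code; biases and batch norm are omitted, and all weights and sizes are placeholders.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mlp(vec, w1, w2):
    """Shared two-layer MLP: C -> C/r with ReLU, then C/r -> C (no biases)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(x, w1, w2):
    """x is C x H x W. Returns one weight in (0, 1) per channel."""
    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    return [sigmoid(a + m) for a, m in zip(mlp(avg, w1, w2), mlp(mx, w1, w2))]


def spatial_attention(x, kernel):
    """kernel is 2 x k x k, applied to the [channel-mean, channel-max] maps."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(kernel[0])
    pad = k // 2
    mean_map = [[sum(x[ci][i][j] for ci in range(c)) / c for j in range(w)] for i in range(h)]
    max_map = [[max(x[ci][i][j] for ci in range(c)) for j in range(w)] for i in range(h)]
    maps = [mean_map, max_map]
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            s = 0.0
            for m in range(2):
                for di in range(k):
                    for dj in range(k):
                        ii, jj = i + di - pad, j + dj - pad
                        if 0 <= ii < h and 0 <= jj < w:  # zero padding
                            s += kernel[m][di][dj] * maps[m][ii][jj]
            row.append(sigmoid(s))
        out.append(row)
    return out


def cbam(x, w1, w2, kernel):
    """Apply channel attention, then spatial attention; shape is unchanged."""
    ca = channel_attention(x, w1, w2)
    x = [[[v * ca[ci] for v in row] for row in ch] for ci, ch in enumerate(x)]
    sa = spatial_attention(x, kernel)
    return [[[ch[i][j] * sa[i][j] for j in range(len(ch[0]))]
             for i in range(len(ch))] for ch in x]


# Placeholder input: 2 channels, 2 x 2 spatial size, all-zero weights.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
w1 = [[0.0, 0.0]]        # C=2 -> C/r=1
w2 = [[0.0], [0.0]]      # C/r=1 -> C=2
kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]  # 3 x 3 for brevity
refined = cbam(feature, w1, w2, kernel)
print(refined)
```

With all-zero weights both attention maps equal sigmoid(0) = 0.5, so every activation is scaled by 0.25, which makes a convenient sanity check. The original CBAM design uses a 7×7 spatial kernel; 3×3 is used here only to keep the example small.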
Fig. 4  CPM-YOLOv3 network model
Input     Module        Number of kernels  Output  CBAM  Activation  Stride
416²×3    conv2d, 3×3   ?                  16      ?     HS          2
208²×16   bneck, 3×3    16                 16      ?     RE          1
208²×16   bneck, 3×3    64                 24      ?     RE          2
104²×24   bneck, 3×3    72                 24      ?     RE          1
104²×24   bneck, 5×5    72                 40            RE          2
52²×40    bneck, 5×5    120                40            RE          1
52²×40    bneck, 5×5    120                40            RE          1
52²×40    bneck, 3×3    240                73      ?     HS          2
26²×73    bneck, 3×3    200                73      ?     HS          1
26²×73    bneck, 3×3    184                73      ?     HS          1
26²×73    bneck, 3×3    184                73      ?     HS          1
26²×73    bneck, 3×3    480                89            HS          1
26²×89    bneck, 3×3    672                89            HS          1
26²×89    bneck, 5×5    672                43            HS          2
13²×43    bneck, 5×5    960                43            HS          1
13²×43    bneck, 5×5    960                43            HS          1
13²×43    conv2d, 1×1   ?                  14      ?     HS          1
Table 1  Feature extraction network structure of CPM-YOLOv3
Fig. 5  Output information of the prediction layers
Fig. 6  Sample images from the grouper dataset
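Image processing       N (non-natural environment)  N (natural environment, unoccluded)  N (natural environment, occluded)
Original image         100                          415                                  150
Rotated 180°           100                          415                                  150
Horizontal flip        100                          415                                  150
Vertical flip          100                          415                                  150
Random noise added     100                          415                                  150
Brightness increased   100                          415                                  150
Total                  600                          2490                                 900
Table 2  Number of images in the grouper dataset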
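
The five augmentations in Table 2 are simple to express. The sketch below applies them to a tiny grayscale image stored as nested lists and recomputes the table totals. The noise distribution, noise strength, and brightness factor are assumptions, since the page gives no values, and a real detection pipeline would also have to transform the bounding-box labels for the flips and the rotation.

```python
import random


def hflip(img):
    """Mirror left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate180(img):
    """Rotate by 180 degrees (a horizontal flip plus a vertical flip)."""
    return [row[::-1] for row in img[::-1]]


def add_noise(img, sigma=10.0, seed=0):
    """Add Gaussian noise clipped to [0, 255]; noise type and sigma are assumed."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]


def brighten(img, factor=1.3):
    """Scale pixel values upward, clipped to 255; the factor is assumed."""
    return [[min(255, round(p * factor)) for p in row] for row in img]


def augment(img):
    """Return the original image plus the five augmented variants."""
    return {
        "original": [row[:] for row in img],
        "rotate180": rotate180(img),
        "hflip": hflip(img),
        "vflip": vflip(img),
        "noise": add_noise(img),
        "brighter": brighten(img),
    }


image = [[10, 20, 30], [40, 50, 60]]
variants = augment(image)

# Original image counts per category, from the first row of Table 2.
originals = {"non_natural": 100, "natural_unoccluded": 415, "natural_occluded": 150}
totals = {name: n * len(variants) for name, n in originals.items()}
print(totals, sum(totals.values()))
```

Six variants per source image reproduce the totals in the table (600, 2490, and 900), for 3990 images overall.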
Fig. 7  Loss curve of CPM-YOLOv3 during training
Network model       Para/MB  M/MB
Mobilenet-YOLOv3    22.68    91.08
Prune60%            4.84     19.66
Prune70%            3.62     14.78
Prune80%            2.90     11.88
Prune90%            2.22     9.08
Prune95%            1.69     7.02
Table 3  Models at different pruning ratios
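
A quick way to read Table 3 is to compare its two columns row by row. The sketch below computes the ratio of model size to the Para column and the size reduction relative to the unpruned model. Interpreting Para as millions of parameters stored as 32-bit floats is an assumption (the table labels the column Para/MB), but it would explain a ratio close to 4.

```python
# (Para, model size in MB) per row, copied from Table 3.
table3 = {
    "Mobilenet-YOLOv3": (22.68, 91.08),
    "Prune60%": (4.84, 19.66),
    "Prune70%": (3.62, 14.78),
    "Prune80%": (2.90, 11.88),
    "Prune90%": (2.22, 9.08),
    "Prune95%": (1.69, 7.02),
}

baseline_mb = table3["Mobilenet-YOLOv3"][1]

ratios = {}
reductions = {}
for name, (para, size_mb) in table3.items():
    ratios[name] = size_mb / para                          # about 4 if float32
    reductions[name] = (1.0 - size_mb / baseline_mb) * 100.0
    print(f"{name:18s} size/para = {ratios[name]:.2f}  "
          f"size reduction = {reductions[name]:.1f}%")
```

Every row gives a ratio between 4.0 and 4.2, with the small excess plausibly coming from file overhead that does not shrink with pruning.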
Fig. 8  Test results at different pruning ratios
Fig. 9  Comparison of channel numbers before and after pruning
Network model                                         Para/MB  M/MB  R/%   AP/%  AP50/%  AP75/%  T/ms
Mobilenet-YOLOv3 Prune90%                             2.22     9.08  89.3  85.8  86.8    85.5    4.5
+ CBAM replacing SE                                   1.14     4.84  89.6  86.0  87.1    85.9    4.9
+ CBAM added to prediction layers                     1.14     4.86  89.7  87.0  88.0    87.0    5.1
Mobilenet-YOLOv3 Prune90% (adjusted reduction ratio)  0.80     3.46  88.2  85.2  85.2    84.3    4.3
Table 4  Effect of CBAM on model size and detection accuracy
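
The AP50 and AP75 columns presumably follow the common convention in which a detection counts as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5 or 0.75, respectively. The sketch below shows the IoU computation and how one prediction can pass the looser threshold but fail the stricter one. The box coordinates are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes, in pixels.
predicted = (10, 10, 110, 110)
ground_truth = (30, 10, 130, 110)

score = iou(predicted, ground_truth)
print(f"IoU = {score:.3f}")
print("match at IoU 0.50:", score >= 0.50)
print("match at IoU 0.75:", score >= 0.75)
```

This is why AP75 is never higher than AP50 for the same model: it demands tighter localization from the same set of predictions.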
Fig. 10  Comparison of detection results before and after the CBAM improvement
Network model       Para/MB  M/MB    T/ms
EfficientDet-D1     6.60     25.64   22.6
SSD                 23.75    90.61   31.0
YOLOv3              61.52    235.10  9.9
Mobilenet-YOLOv3    22.68    91.08   5.9
Tiny YOLOv3         8.67     33.17   2.6
Tiny YOLOv4         5.60     22.50   4.8
YOLO Nano           2.85     11.22   53.5
CPM-YOLOv3          1.14     4.86    5.1
Table 5  Comparison of detection results of different algorithms
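
To make the size and speed trade-off in Table 5 easier to see, the sketch below converts per-frame latency to frames per second and lists the models that no other model beats on both size and latency at once. It uses only the two columns copied from the table and says nothing about accuracy, which Table 5 does not report.

```python
# (model size in MB, inference time in ms per frame), copied from Table 5.
table5 = {
    "EfficientDet-D1": (25.64, 22.6),
    "SSD": (90.61, 31.0),
    "YOLOv3": (235.10, 9.9),
    "Mobilenet-YOLOv3": (91.08, 5.9),
    "Tiny YOLOv3": (33.17, 2.6),
    "Tiny YOLOv4": (22.50, 4.8),
    "YOLO Nano": (11.22, 53.5),
    "CPM-YOLOv3": (4.86, 5.1),
}


def fps(ms_per_frame):
    """Convert latency in milliseconds per frame to frames per second."""
    return 1000.0 / ms_per_frame


def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


# Models not dominated by any other model on (size, latency).
pareto = sorted(
    name for name, metrics in table5.items()
    if not any(dominates(other, metrics) for other in table5.values())
)

for name, (size_mb, t_ms) in table5.items():
    print(f"{name:18s} {size_mb:7.2f} MB  {fps(t_ms):6.1f} FPS")
print("not dominated on size and latency:", pareto)
```

On these two axes the only models faster than CPM-YOLOv3 are Tiny YOLOv3 and Tiny YOLOv4, and both are several times larger.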
Fig. 11  Grouper detection results on the normal and occluded test sets
Fig. 12  Detection results of different algorithms