Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (8): 1622-1632    DOI: 10.3785/j.issn.1008-973X.2022.08.016
    
Lightweight underwater biological detection algorithm based on improved Mobilenet-YOLOv3
Kun HAO1, Kuo WANG1, Bei-bei WANG2,*
1. School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China
2. School of Control and Mechanical Engineering, Tianjin Chengjian University, Tianjin 300384, China

Abstract  

In underwater biological detection, classical object detection models are too large and have too many parameters to run on small underwater hardware, while existing lightweight models struggle to balance detection accuracy and real-time performance. To address this problem, a lightweight detection algorithm, CPM-YOLOv3, was proposed based on an improved Mobilenet-YOLOv3. A regular channel pruning algorithm was used to prune Mobilenet-YOLOv3, and the squeeze-and-excitation (SE) modules in the feature extraction network were replaced with convolutional block attention modules (CBAM) to compress the network model. In addition, two CBAMs were added to the detection layers of different sizes to improve the model's ability to attend to target feature information with almost no increase in model size. Experimental results showed that the CPM-YOLOv3 model was only 4.86 MB, a reduction of 94.7% compared with the original model. The average detection precision was 87.0%, and the detection speed was 5.1 ms/frame. Compared with other network models, CPM-YOLOv3 is more suitable for deployment on micro underwater equipment.
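The channel pruning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a network-slimming-style criterion, in which channels are ranked by the magnitude of their batch-normalization scale factors and a single global threshold is applied. The abstract does not spell out the criterion, so the function names, the `min_channels` safeguard, and the toy scale factors below are all hypothetical.

```python
# Hypothetical sketch of global channel selection for pruning.
# Assumption: channels are ranked by |gamma| (batch-norm scale factors),
# as in network slimming; this page does not confirm the exact criterion.

def global_threshold(gammas_per_layer, prune_ratio):
    """Return the |gamma| value below which channels are pruned."""
    magnitudes = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    cut = min(int(len(magnitudes) * prune_ratio), len(magnitudes) - 1)
    return magnitudes[cut]


def keep_masks(gammas_per_layer, prune_ratio, min_channels=1):
    """Return one boolean keep-mask per layer."""
    threshold = global_threshold(gammas_per_layer, prune_ratio)
    masks = []
    for layer in gammas_per_layer:
        mask = [abs(g) >= threshold for g in layer]
        if sum(mask) < min_channels:
            # Keep the strongest channels so that no layer disappears entirely.
            ranked = sorted(range(len(layer)),
                            key=lambda i: abs(layer[i]), reverse=True)
            for i in ranked[:min_channels]:
                mask[i] = True
        masks.append(mask)
    return masks


# Toy scale factors for three layers (illustrative values only).
gammas = [
    [0.90, 0.02, 0.40, 0.01],
    [0.05, 0.03, 0.04, 0.02],
    [0.70, 0.60, 0.01, 0.30],
]
masks = keep_masks(gammas, prune_ratio=0.5)
kept_per_layer = [sum(m) for m in masks]
print(kept_per_layer)
```

A global threshold lets layers with many weak channels shrink more than others, which is one plausible reason pruned channel counts end up irregular.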



Key words: underwater biological detection, lightweight model, channel pruning, attention mechanism, deep learning
Received: 11 August 2021      Published: 30 August 2022
CLC:  TP 391.4  
Fund:  National Natural Science Foundation of China (61902273)
Corresponding Authors: Bei-bei WANG     E-mail: kunhao@tcu.edu.cn;wbbking@163.com
Cite this article:

Kun HAO,Kuo WANG,Bei-bei WANG. Lightweight underwater biological detection algorithm based on improved Mobilenet-YOLOv3. Journal of ZheJiang University (Engineering Science), 2022, 56(8): 1622-1632.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.08.016     OR     https://www.zjujournals.com/eng/Y2022/V56/I8/1622


Fig.1 Schematic diagram of depthwise separable convolution
Fig.2 Schematic diagram of channel pruning
Fig.3 Schematic diagram of CBAM
Fig.4 CPM-YOLOv3 network model
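| Input | Module | Number of kernels | Output | CBAM | Activation | Stride |
|---|---|---|---|---|---|---|
| 416²×3 | conv2d, 3×3 | - | 16 | - | HS | 2 |
| 208²×16 | bneck, 3×3 | 16 | 16 | - | RE | 1 |
| 208²×16 | bneck, 3×3 | 64 | 24 | - | RE | 2 |
| 104²×24 | bneck, 3×3 | 72 | 24 | - | RE | 1 |
| 104²×24 | bneck, 5×5 | 72 | 40 | √ | RE | 2 |
| 52²×40 | bneck, 5×5 | 120 | 40 | √ | RE | 1 |
| 52²×40 | bneck, 5×5 | 120 | 40 | √ | RE | 1 |
| 52²×40 | bneck, 3×3 | 240 | 73 | - | HS | 2 |
| 26²×73 | bneck, 3×3 | 200 | 73 | - | HS | 1 |
| 26²×73 | bneck, 3×3 | 184 | 73 | - | HS | 1 |
| 26²×73 | bneck, 3×3 | 184 | 73 | - | HS | 1 |
| 26²×73 | bneck, 3×3 | 480 | 89 | √ | HS | 1 |
| 26²×89 | bneck, 3×3 | 672 | 89 | √ | HS | 1 |
| 26²×89 | bneck, 5×5 | 672 | 43 | √ | HS | 2 |
| 13²×43 | bneck, 5×5 | 960 | 43 | √ | HS | 1 |
| 13²×43 | bneck, 5×5 | 960 | 43 | √ | HS | 1 |
| 13²×43 | conv2d, 1×1 | - | 14 | - | HS | 1 |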
Tab.1 CPM-YOLOv3 feature extraction network structure
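As a consistency check on Tab.1, the sketch below traces the feature-map resolution through the listed strides, reading the input column as squared resolutions (416², 208², and so on). It assumes padding such that each layer outputs ceil(input / stride); the helper name `trace_sizes` is hypothetical.

```python
# (input resolution, stride) for each row of Tab.1, read from the table.
rows = [
    (416, 2), (208, 1), (208, 2), (104, 1), (104, 2), (52, 1), (52, 1),
    (52, 2), (26, 1), (26, 1), (26, 1), (26, 1), (26, 1), (26, 2),
    (13, 1), (13, 1), (13, 1),
]


def trace_sizes(input_size, strides):
    """Return the resolution before each layer plus the final output.

    Assumption: padding is chosen so that output = ceil(input / stride).
    """
    sizes = [input_size]
    for stride in strides:
        sizes.append(-(-sizes[-1] // stride))  # ceiling division
    return sizes


sizes = trace_sizes(416, [stride for _, stride in rows])

# Every traced resolution should match the input column of the table.
consistent = all(sizes[i] == rows[i][0] for i in range(len(rows)))
total_downsampling = rows[0][0] // sizes[-1]
print(consistent, sizes[-1], total_downsampling)
```

The five stride-2 layers give an overall downsampling factor of 32, so a 416×416 input ends as a 13×13 feature map.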
Fig.5 Output information display diagram of prediction layer
Fig.6 Display of rockfish dataset
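| Image processing | N: non-natural environment | N: natural environment, unoccluded | N: natural environment, occluded |
|---|---|---|---|
| Original image | 100 | 415 | 150 |
| Rotated 180° | 100 | 415 | 150 |
| Horizontal flip | 100 | 415 | 150 |
| Vertical flip | 100 | 415 | 150 |
| Added random noise | 100 | 415 | 150 |
| Increased brightness | 100 | 415 | 150 |
| Total | 600 | 2490 | 900 |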
Tab.2 Quantitative information on rockfish dataset
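The five augmentations listed in Tab.2 can be sketched on a tiny grayscale image stored as nested lists. The table does not give the noise distribution, noise strength, or brightness factor, so Gaussian noise with `sigma=5.0` and a factor of `1.2` are placeholder assumptions, and all function names are hypothetical.

```python
import random


def rotate_180(img):
    return [row[::-1] for row in img[::-1]]


def flip_horizontal(img):
    return [row[::-1] for row in img]


def flip_vertical(img):
    return [row[:] for row in img[::-1]]


def add_noise(img, sigma=5.0, seed=0):
    # Assumed Gaussian noise, clipped to the 8-bit range.
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]


def brighten(img, factor=1.2):
    # Assumed multiplicative brightening, clipped at 255.
    return [[min(255, round(p * factor)) for p in row] for row in img]


image = [[10, 20], [30, 250]]
variants = [image, rotate_180(image), flip_horizontal(image),
            flip_vertical(image), add_noise(image), brighten(image)]

# Original image counts per category, taken from the first row of Tab.2.
originals = {"non-natural": 100,
             "natural, unoccluded": 415,
             "natural, occluded": 150}
totals = {name: count * len(variants) for name, count in originals.items()}
print(totals, sum(totals.values()))
```

Each original image yields six variants including itself, which reproduces the totals row of the table.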
Fig.7 Loss curve of training process for CPM-YOLOv3
| Network model | Para/MB | M/MB |
|---|---|---|
| Mobilenet-YOLOv3 | 22.68 | 91.08 |
| Prune60% | 4.84 | 19.66 |
| Prune70% | 3.62 | 14.78 |
| Prune80% | 2.90 | 11.88 |
| Prune90% | 2.22 | 9.08 |
| Prune95% | 1.69 | 7.02 |
Tab.3 Models with different pruning ratios
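The model sizes in Tab.3 can be converted into size reductions relative to the unpruned Mobilenet-YOLOv3. The sketch below does that arithmetic and applies the same formula to the final 4.86 MB CPM-YOLOv3 size quoted in the abstract; variable names are illustrative.

```python
baseline_mb = 91.08  # unpruned Mobilenet-YOLOv3 model size (M/MB in Tab.3)

# Pruning ratio (%) -> model size in MB, as listed in Tab.3.
pruned_mb = {60: 19.66, 70: 14.78, 80: 11.88, 90: 9.08, 95: 7.02}


def size_reduction(size_mb, reference_mb=baseline_mb):
    """Percentage by which size_mb is smaller than reference_mb."""
    return 100.0 * (1.0 - size_mb / reference_mb)


reduction = {ratio: size_reduction(mb) for ratio, mb in pruned_mb.items()}
for ratio, value in reduction.items():
    print(f"Prune{ratio}%: {value:.1f}% smaller")

# Final CPM-YOLOv3 size from the abstract.
final_reduction = size_reduction(4.86)
print(f"CPM-YOLOv3: {final_reduction:.1f}% smaller")
```

The last line agrees with the 94.7% reduction stated in the abstract. The size reduction does not scale linearly with the nominal pruning ratio.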
Fig.8 Graph of test results of different pruning rates
Fig.9 Comparison of number of channels before and after pruning
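| Network model | Para/MB | M/MB | R/% | AP/% | AP50/% | AP75/% | T/ms |
|---|---|---|---|---|---|---|---|
| Mobilenet-YOLOv3 Prune90% | 2.22 | 9.08 | 89.3 | 85.8 | 86.8 | 85.5 | 4.5 |
| + CBAM replacing SE | 1.14 | 4.84 | 89.6 | 86.0 | 87.1 | 85.9 | 4.9 |
| + CBAM added to prediction layers | 1.14 | 4.86 | 89.7 | 87.0 | 88.0 | 87.0 | 5.1 |
| Mobilenet-YOLOv3 Prune90% (adjusted reduction ratio) | 0.80 | 3.46 | 88.2 | 85.2 | 85.2 | 84.3 | 4.3 |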
Tab.4 Effect of CBAM on model size and detection accuracy
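One plausible reason replacing SE with CBAM shrinks the model in Tab.4 is the reduction ratio of the attention bottleneck. The sketch below counts weights under assumed defaults from the original module designs: a ratio of 4 for SE as used in MobileNetV3, and a ratio of 16 with a 7×7 spatial kernel for CBAM. This page does not state the ratios actually used, biases are ignored, and the 960-channel example is only illustrative.

```python
def se_weights(channels, reduction):
    """Weights in an SE block: two fully connected layers (biases ignored)."""
    hidden = channels // reduction
    return 2 * channels * hidden


def cbam_weights(channels, reduction, kernel_size=7):
    """Weights in a CBAM block (biases ignored).

    Channel attention: one two-layer MLP shared by the average-pooled and
    max-pooled descriptors. Spatial attention: one k x k convolution over
    the two channel-pooled maps.
    """
    hidden = channels // reduction
    channel_attention = 2 * channels * hidden
    spatial_attention = 2 * kernel_size * kernel_size
    return channel_attention + spatial_attention


channels = 960                              # illustrative channel count
se = se_weights(channels, reduction=4)      # assumed SE ratio
cbam = cbam_weights(channels, reduction=16)  # assumed CBAM ratio
print(se, cbam, round(cbam / se, 3))
```

Under these assumptions the CBAM block needs about a quarter of the SE block's weights, because the spatial branch adds only 98 weights regardless of channel count.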
Fig.10 Comparison of detection results before and after CBAM improvement
| Network model | Para/MB | M/MB | T/ms |
|---|---|---|---|
| EfficientDet-D1 | 6.60 | 25.64 | 22.6 |
| SSD | 23.75 | 90.61 | 31.0 |
| YOLOv3 | 61.52 | 235.10 | 9.9 |
| Mobilenet-YOLOv3 | 22.68 | 91.08 | 5.9 |
| Tiny YOLOv3 | 8.67 | 33.17 | 2.6 |
| Tiny YOLOv4 | 5.60 | 22.50 | 4.8 |
| YOLO Nano | 2.85 | 11.22 | 53.5 |
| CPM-YOLOv3 | 1.14 | 4.86 | 5.1 |
Tab.5 Comparison of detection results of different algorithms
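To read Tab.5 as a size versus speed trade-off, the sketch below converts per-frame time to frames per second and lists the models that no other model beats on both model size and inference time. It uses only the M/MB and T/ms columns, so accuracy is not part of this comparison; the helper names are hypothetical.

```python
# Model name -> (model size in MB, time per frame in ms), from Tab.5.
models = {
    "EfficientDet-D1": (25.64, 22.6),
    "SSD": (90.61, 31.0),
    "YOLOv3": (235.10, 9.9),
    "Mobilenet-YOLOv3": (91.08, 5.9),
    "Tiny YOLOv3": (33.17, 2.6),
    "Tiny YOLOv4": (22.50, 4.8),
    "YOLO Nano": (11.22, 53.5),
    "CPM-YOLOv3": (4.86, 5.1),
}


def frames_per_second(ms_per_frame):
    return 1000.0 / ms_per_frame


def dominates(a, b):
    """True if a is no worse than b on both axes and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


pareto = sorted(
    name for name, stats in models.items()
    if not any(dominates(other, stats) for other in models.values())
)
cpm_fps = frames_per_second(models["CPM-YOLOv3"][1])
print(pareto, round(cpm_fps, 1))
```

On these two axes only Tiny YOLOv3 and Tiny YOLOv4 are faster than CPM-YOLOv3, and both are several times larger.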
Fig.11 Rockfish test results under normal and occluded test sets
Fig.12 Detection effect diagram of different algorithms