Journal of Zhejiang University (Engineering Science), 2022, Vol. 56, Issue 1: 16-25. DOI: 10.3785/j.issn.1008-973X.2022.01.002
Computer Technology, Information and Electronic Engineering
Multi-scale object detection algorithm for recycled objects based on rotating block positioning
Hong-zhao DONG, Hao-jie FANG, Nan ZHANG
ITS Joint Research Institute, Zhejiang University of Technology, Hangzhou 310014, China
Abstract:

Conventional object detection algorithms do not account for the diversity of object shapes and scales in real sorting scenes, and they cannot recover rotation angle information. To address this, an improved algorithm based on YOLOv5, MR2-YOLOv5, is proposed. An angle prediction branch is added and the circular smooth label (CSL) angle classification method is introduced to detect rotation angles accurately. An extra detection layer is added to improve the model's ability to detect objects at different scales, and a Transformer attention mechanism at the end of the backbone network assigns different weights to each channel to strengthen feature extraction. Feature maps from different levels of the backbone are fed into a BiFPN structure for multi-scale feature fusion. Experiments show that MR2-YOLOv5 reaches a mean average precision (mAP) of 90.56% on a self-built dataset, 5.36 percentage points higher than the YOLOv5s baseline with only the angle prediction branch added. Both category and rotation angle are recognized for occluded, transparent and deformed objects. The detection time per frame is 0.02-0.03 s, which meets the performance requirements of object detection in sorting scenes.

Key words: recycled object detection; YOLOv5; rotated bounding box detection; circular smooth label; feature pyramid; attention mechanism
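The abstract describes detections as rotated boxes, that is, a centre, a size and an angle rather than an axis-aligned rectangle. As a minimal sketch of what such a box means geometrically, the hypothetical helper below converts (cx, cy, w, h, angle) into four corner points. The angle convention used here (degrees, standard rotation matrix, angle measured from the x-axis to the side of length w) is an assumption for illustration; the paper's own convention is defined in its Fig. 4, which is not reproduced on this page, and may differ.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w-by-h box centred at (cx, cy), rotated by angle_deg.

    Assumption (not taken from the paper): a standard 2-D rotation matrix is used.
    In image coordinates, where y points down, a positive angle appears clockwise.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset, then translate it to the box centre.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


# Example: a 40 x 20 box rotated by 90 degrees around (100, 50).
corners = rotated_box_corners(100.0, 50.0, 40.0, 20.0, 90.0)
```

A sorting gripper would typically need the angle itself rather than the corners, which are mainly useful for drawing the box or for computing overlap between boxes.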
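Received: 2021-11-25    Published: 2022-01-05
CLC: TP 391
Funding: National Natural Science Foundation of China (61773347); Zhejiang Public Welfare Technology Research Project (LGF19F030001)
About the author: DONG Hong-zhao (born 1969), male, professor, works on intelligent transportation and intelligent electromechanical systems. orcid.org/0000-0001-5905-597X. E-mail: its@zjut.edu.cn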

Cite this article:

Hong-zhao DONG, Hao-jie FANG, Nan ZHANG. Multi-scale object detection algorithm for recycled objects based on rotating block positioning. Journal of Zhejiang University (Engineering Science), 2022, 56(1): 16-25.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.01.002
https://www.zjujournals.com/eng/CN/Y2022/V56/I1/16

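Fig. 1  Network structure of MR2-YOLOv5
Fig. 2  Module structures of MR2-YOLOv5
Fig. 3  Schematic of CSL
Fig. 4  Schematic of the angle definition
Fig. 5  Schematic of rotated box annotation (the dot marks the annotation starting point)
Fig. 6  Label distribution of the dataset
Fig. 7  Input image after Mosaic data augmentation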
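| Model | mAP/% (r = 0) | mAP/% (r = 2) | mAP/% (r = 4) | mAP/% (r = 6) | mAP/% (r = 8) |
| --- | --- | --- | --- | --- | --- |
| YOLOv5s | 68.79 | 82.47 | 84.20 | 81.20 | 79.94 |

Table 1  Detection performance under different window radii r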
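To make the window radius r in Table 1 concrete, here is a minimal sketch of how a circular smooth label could be built for one ground-truth angle. The choices below are assumptions for illustration and are not taken from the paper: 180 one-degree angle bins, a Gaussian window whose standard deviation equals the radius, and truncation at the radius. The function name `csl_encode` is hypothetical. With `radius=0` the label degenerates to a plain one-hot vector, which is how the r = 0 column is read here.

```python
import numpy as np


def csl_encode(angle_bin, num_bins=180, radius=4):
    """Build a circular smooth label centred on angle_bin.

    Assumptions (illustrative only): 180 one-degree bins, a Gaussian window
    with standard deviation equal to `radius`, truncated beyond `radius`.
    """
    bins = np.arange(num_bins)
    diff = np.abs(bins - angle_bin)
    # Circular distance: bin 0 and bin 179 are neighbours.
    dist = np.minimum(diff, num_bins - diff)
    if radius == 0:
        # No smoothing: ordinary one-hot classification target.
        return (dist == 0).astype(np.float64)
    label = np.exp(-dist.astype(np.float64) ** 2 / (2.0 * radius ** 2))
    label[dist > radius] = 0.0  # zero outside the window
    return label


# A target near the wrap-around point spreads onto both ends of the vector.
label = csl_encode(179, radius=4)
```

Because the distance is circular, a prediction of bin 0 for a target in bin 179 is treated as a near miss rather than a 179-bin error, which is the property that motivates CSL over a plain one-hot angle label.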
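| Multi-scale fusion network | mAP/% | M/10^6 |
| --- | --- | --- |
| FPN | 81.05 | 6.52 |
| PANet | 84.20 | 7.55 |
| BiFPN | 85.36 | 7.83 |

Table 2  Performance comparison of multi-scale feature fusion networks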
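One feature that distinguishes BiFPN from FPN and PANet is that each fusion node learns a weight per input feature map instead of summing inputs equally. The sketch below illustrates the "fast normalized fusion" rule from the original BiFPN design, using NumPy arrays in place of network tensors. Whether MR2-YOLOv5 uses exactly this rule is an assumption; the function name, the `eps` value and the toy inputs are illustrative.

```python
import numpy as np


def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with non-negative, normalised weights.

    out = sum_i (w_i / (eps + sum_j w_j)) * feature_i, with w_i clipped at 0.
    """
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)  # ReLU keeps weights >= 0
    w = w / (w.sum() + eps)  # normalise so the weights sum to (almost) 1
    return sum(wi * f for wi, f in zip(w, features))


# Two toy 2 x 2 "feature maps" fused with equal weights.
low_level = np.ones((2, 2))
high_level = 3.0 * np.ones((2, 2))
fused = fast_normalized_fusion([low_level, high_level], [1.0, 1.0])
```

In a real network the weights would be learnable parameters, and inputs from different pyramid levels would first be resized to a common resolution.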
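| Angle prediction branch (CSL) | Detection layers | Transformer | Fusion network | M/10^6 | FLOPs/10^9 | AP (Cs) | AP (Gb) | AP (Pb) | mAP/% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No | 3 | No | PANet | 7.28 | 17.1 | 91.7 | 90.1 | 85.4 | 89.05 |
| No | 4 | No | PANet | 8.17 | 28.2 | 94.5 | 92.4 | 87.6 | 91.34 |
| No | 4 | Yes | PANet | 8.17 | 28.0 | 95.4 | 94.1 | 90.2 | 93.23 |
| No | 4 | Yes | BiFPN | 9.25 | 29.6 | 95.2 | 94.7 | 93.4 | 94.46 |
| Yes | 3 | No | PANet | 7.55 | 18.0 | 90.5 | 82.5 | 79.1 | 85.20 |
| Yes | 4 | No | PANet | 8.73 | 33.4 | 92.1 | 85.3 | 82.2 | 86.53 |
| Yes | 4 | Yes | PANet | 9.91 | 32.6 | 93.8 | 87.5 | 83.9 | 88.44 |
| Yes | 4 | Yes | BiFPN | 10.99 | 34.2 | 96.5 | 91.2 | 84.2 | 90.56 |

Table 3  Ablation experiments based on the YOLOv5s model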
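The 5.36-point gain quoted in the abstract can be re-derived from the four Table 3 rows that include the angle prediction branch. The short sketch below does that arithmetic, and also shows how much each added component contributes. The mAP values are copied from the table; the assignment of rows to configurations assumes that the baseline is the row with three detection layers, no Transformer and PANet (mAP 85.20), and that the full model is the last row (mAP 90.56). The step labels are descriptive names chosen here.

```python
# mAP/% of the configurations with the angle prediction branch, in the order
# the components are added (values copied from Table 3).
steps = [
    ("baseline: 3 layers, PANet", 85.20),
    ("+ 4th detection layer", 86.53),
    ("+ Transformer", 88.44),
    ("+ BiFPN (full model)", 90.56),
]

# Gain contributed by each added component, in percentage points.
step_gains = [
    (name, round(cur - prev, 2))
    for (_, prev), (name, cur) in zip(steps, steps[1:])
]

# Overall gain of the full model over the baseline.
total_gain = round(steps[-1][1] - steps[0][1], 2)

for name, gain in step_gains:
    print(f"{name}: +{gain:.2f}")
print(f"total: +{total_gain:.2f}")
```

The differences are absolute percentage points of mAP, not relative improvements.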
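Fig. 8  Training curves of each model (without angle prediction branch)
Fig. 9  Training curves of each model (with angle prediction branch)
Fig. 10  Angle loss curves of each model on the validation set
Fig. 11  Detection results for objects at different scales
Fig. 12  Detection results with multiple occluded objects
Fig. 13  Detection results for highly transparent objects
Fig. 14  Detection results for deformed objects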