浙江大学学报(工学版)  2024, Vol. 58 Issue (3): 599-610    DOI: 10.3785/j.issn.1008-973X.2024.03.017
Mechanical Engineering
轻量化机器人抓取位姿实时检测算法
宋明俊,严文,邓益昭,张俊然,涂海燕*
1. 四川大学 电气工程学院,四川 成都 610065
Light-weight algorithm for real-time robotic grasp detection
Mingjun SONG, Wen YAN, Yizhao DENG, Junran ZHANG, Haiyan TU*
1. College of Electrical Engineering, Sichuan University, Chengdu 610065, China
Full text: PDF (4675 KB)   HTML
摘要:

针对机器人对形状、大小、种类变化不一的未知物体的抓取,提出轻量化的抓取位姿实时检测算法RTGN,以进一步提升抓取检测准确率及检测速度. 设计多尺度空洞卷积模块,以构建轻量化的特征提取主干网络;设计混合注意力模块,以加强网络对重要抓取特征的关注;引入金字塔池化模块融合多层级特征,以提升网络对物体的抓取感知能力. 在Cornell抓取数据集上进行测试,RTGN检测速度为142帧/s,在图像拆分和对象拆分上的检测准确率分别为98.26%和97.65%;在实际抓取环境下进行抓取实验,机器人对20类未知物体进行400次抓取,抓取成功率为96.0%. 实验结果表明,RTGN的检测准确率和检测速度较现有方法有明显提升,对物体的位置和姿态变化具有较强的适应性,并且能够有效地泛化到形状、大小、种类等变化不一的未知物体的抓取检测中.

关键词: 机器人抓取; 抓取检测; 注意力机制; 卷积神经网络; 深度学习; 非结构化环境
Abstract:

A light-weight, real-time approach named RTGN (real-time grasp net) was proposed to improve the accuracy and speed of robotic grasp detection for novel objects of diverse shapes, types and sizes. Firstly, a multi-scale dilated convolution module was designed to construct a light-weight feature extraction backbone. Secondly, a mixed attention module was designed to help the network focus on meaningful grasp features. Finally, a pyramid pooling module was deployed to fuse the multi-level features extracted by the network, improving its grasp perception of objects. On the Cornell grasping dataset, RTGN generated grasps at a speed of 142 frames per second and attained accuracy rates of 98.26% and 97.65% on image-wise and object-wise splits, respectively. In real-world robotic grasping experiments, RTGN achieved a success rate of 96.0% over 400 grasping attempts on 20 novel objects. Experimental results demonstrate that RTGN outperforms existing methods in both detection accuracy and detection speed, shows strong adaptability to variations in object position and pose, and generalizes effectively to novel objects of diverse shapes, types and sizes.
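As a rough illustration of the heatmap-style grasp representation the prediction head outputs (per-pixel quality, angle, and width maps, following [16]), a grasp pose can be decoded by reading off the highest-quality pixel. This is a minimal sketch, not the paper's code; the function and array names are illustrative.

```python
import numpy as np

def decode_grasp(quality, angle, width):
    """Return (x, y, theta, w) at the pixel with the highest grasp quality."""
    y, x = np.unravel_index(np.argmax(quality), quality.shape)
    return int(x), int(y), float(angle[y, x]), float(width[y, x])

# Toy 4x4 maps with one confident grasp at pixel (x=1, y=2).
q = np.zeros((4, 4))
q[2, 1] = 0.9
ang = np.full((4, 4), 0.5)   # grasp angle in radians
w = np.full((4, 4), 30.0)    # gripper opening width in pixels
print(decode_grasp(q, ang, w))  # → (1, 2, 0.5, 30.0)
```

In practice the angle is usually regressed as (sin 2θ, cos 2θ) to avoid wrap-around, but the argmax-and-read-off decoding step is the same.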

Key words: robotic grasping; grasp detection; attention mechanism; convolutional neural networks; deep learning; unstructured environment
Received: 2023-08-22  Published: 2024-03-05
CLC:  TP 242  
Foundation: National Natural Science Foundation of China (12126606); Sichuan Science and Technology Program (23ZDYF2913); Deyang Science and Technology (Open Competition) Program (2021JBJZ007); Emergency Key Project of Sichuan Provincial Key Laboratory of Smart Grid (020IEPG-KL-20YJ01).
Corresponding author: Haiyan TU   E-mail: mingjun_s@foxmail.com; haiyantu@163.com
About the author: SONG Mingjun (1999—), male, master's degree candidate, engaged in robotic grasping research. orcid.org/0009-0001-2417-8562. E-mail: mingjun_s@foxmail.com

Cite this article:

宋明俊,严文,邓益昭,张俊然,涂海燕. 轻量化机器人抓取位姿实时检测算法[J]. 浙江大学学报(工学版), 2024, 58(3): 599-610.

Mingjun SONG, Wen YAN, Yizhao DENG, Junran ZHANG, Haiyan TU. Light-weight algorithm for real-time robotic grasp detection. Journal of Zhejiang University (Engineering Science), 2024, 58(3): 599-610.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.03.017        https://www.zjujournals.com/eng/CN/Y2024/V58/I3/599

Fig. 1  Schematic of the five-dimensional grasp representation [6]
Fig. 2  Heatmap representation of four-dimensional grasp parameters [16]
Fig. 3  Overall architecture of the RTGN grasp detection algorithm
Fig. 4  Structure of the multi-scale dilated convolution module
Fig. 5  3×3 dilated convolution kernels under different dilation rates
Fig. 6  Structure of the channel attention module
Fig. 7  Structure of the coordinate attention module
Fig. 8  Structure of the mixed attention module
Fig. 9  Structure of the pyramid pooling module
Fig. 10  Structure of the prediction output head
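The dilated kernels of Figure 5 enlarge the receptive field without adding parameters: a k×k kernel with dilation rate d covers the same area as an ordinary kernel of size k + (k−1)(d−1). A quick check of this standard relation for the 3×3 kernels shown:

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Spatial extent covered by a k x k convolution kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# 3x3 kernels at dilation rates 1, 2, 3 cover 3x3, 5x5, 7x7 regions,
# while keeping the same 9 learnable weights.
print([effective_kernel_size(3, d) for d in (1, 2, 3)])  # → [3, 5, 7]
```

Stacking branches with different rates, as the multi-scale module does, thus captures both local and wider context at low parameter cost.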
Method | A (Image-wise)/% | A (Object-wise)/% | v/ms
Jiang et al. [5] | 60.50 | 58.30 | 5000
Lenz et al. [6] | 73.90 | 75.60 | 1350
Redmon et al. [10] | 88.00 | 87.10 | 76
Kumra et al. [11] | 89.21 | 88.96 | 103
Guo et al. [13] | 93.20 | 89.10 | —
Chu et al. [14] | 96.00 | 96.10 | 120
Zhou et al. [15] | 97.74 | 96.61 | 118
Xia et al. [8] | 93.80 | 91.30 | 57
Yu et al. [9] | 94.10 | 93.30 | —
Zhang et al. [7] | 95.71 | 94.01 | 17
Morrison et al. [16] | 73.00 | 69.00 | 19
Kumra et al. [17] | 97.70 | 96.60 | 20
Cheng et al. [18] | 98.00 | 97.00 | 73
Wang et al. [19] | 97.99 | 96.70 | 41.6
RTGN | 98.26 | 97.65 | 7
Table 1  Comparison of different algorithms on the Cornell grasping dataset
Fig. 11  Visualization of RTGN grasp detection results on the Cornell dataset
Fig. 12  Incomplete annotations in the Cornell dataset
Architecture | A (Image-wise)/% | A (Object-wise)/% | v/ms
MDM-Backbone | 97.73 | 96.80 | 5.29
+CBAM | 97.86 (+0.13) | 96.90 (+0.10) | 6.42
+MAM | 97.91 (+0.18) | 97.00 (+0.20) | 6.60
+PPM | 97.95 (+0.22) | 97.03 (+0.23) | 5.64
+CBAM+PPM | 98.08 (+0.35) | 97.18 (+0.38) | 6.74
+MAM+PPM | 98.26 (+0.53) | 97.65 (+0.85) | 6.96
Table 2  Module ablation results on the Cornell dataset
Method | A (Image-wise)/% | A (Object-wise)/% | v/ms | P/M | F/G
Kumra et al. [11] | 89.21 | 88.96 | 103 | >32 | —
Chu et al. [14] | 96.00 | 96.10 | 120 | 28.18 | —
Zhou et al. [15] | 97.74 | 96.61 | 118 | >30 | —
Morrison et al. [16] | 73.00 | 69.00 | 19 | 0.062 | —
Xia et al. [8] | 93.80 | 91.30 | 57 | >46 | —
Zhang et al. [7] | 95.71 | 94.01 | 17 | >12 | —
RTGN | 98.26 | 97.65 | 7 | 1.66 | 8.00
Table 3  Comparison of model performance and parameter size of different methods
Fig. 13  Comparison of grasp detection results of RTGN and TF-Grasp [19] on single novel objects
Fig. 14  Visualization of RTGN grasp detection results on multiple novel objects
Fig. 15  Robotic grasping experiment platform
Fig. 16  Objects used in the robotic grasping experiments
Fig. 17  Robotic grasping of novel objects
Object | A_s | Object | A_s
Orange | 100% (20/20) | Candy | 100% (20/20)
Biscuit | 100% (20/20) | Plastic plate | 100% (20/20)
Mouse | 85% (17/20) | Plastic bowl | 95% (19/20)
Paper cup | 90% (18/20) | Umbrella | 80% (16/20)
Alcohol spray bottle | 90% (18/20) | Adhesive tape | 100% (20/20)
AA battery | 100% (20/20) | Cylindrical block | 100% (20/20)
Screwdriver | 100% (20/20) | Milk carton | 95% (19/20)
Toothpaste box | 100% (20/20) | Toothpaste | 90% (18/20)
Laundry detergent bottle | 100% (20/20) | Brush | 100% (20/20)
Facial cleanser | 95% (19/20) | Toner bottle | 100% (20/20)
Table 4  Statistics of robotic grasping results
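The overall 96.0% success rate reported in the abstract can be checked directly from the per-object counts in Table 4:

```python
# Successful grasps out of 20 attempts per object, read row by row from Table 4.
successes = [20, 20, 20, 20, 17, 19, 18, 16, 18, 20,
             20, 20, 20, 19, 20, 18, 20, 20, 19, 20]
attempts = 20 * len(successes)      # 20 objects x 20 attempts = 400
rate = sum(successes) / attempts
print(f"{sum(successes)}/{attempts} = {rate:.1%}")  # → 384/400 = 96.0%
```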
1 刘亚欣, 王斯瑶, 姚玉峰, 等. 机器人抓取检测技术的研究现状[J]. 控制与决策, 2020, 35(12): 2817-2828.
LIU Yaxin, WANG Siyao, YAO Yufeng, et al. Recent researches on robot autonomous grasp technology[J]. Control and Decision, 2020, 35(12): 2817-2828.
2 BOHG J, MORALES A, ASFOUR T, et al. Data-driven grasp synthesis: a survey[J]. IEEE Transactions on Robotics, 2014, 30(2): 289-309.
doi: 10.1109/TRO.2013.2289018
3 仲训杲, 徐敏, 仲训昱, 等. 基于多模特征深度学习的机器人抓取判别方法[J]. 自动化学报, 2016, 42(7): 1022-1029.
ZHONG Xungao, XU Min, ZHONG Xunyu, et al. Multimodal features deep learning for robotic potential grasp recognition[J]. Acta Automatica Sinica, 2016, 42(7): 1022-1029.
4 杜学丹, 蔡莹皓, 鲁涛, 等. 一种基于深度学习的机械臂抓取方法[J]. 机器人, 2017, 39(6): 820-828.
DU Xuedan, CAI Yinghao, LU Tao, et al. A robotic grasping method based on deep learning[J]. Robot, 2017, 39(6): 820-828.
5 JIANG Y, MOSESON S, SAXENA A. Efficient grasping from RGBD images: learning using a new rectangle representation[C]// IEEE International Conference on Robotics and Automation. Shanghai: IEEE, 2011: 3304-3311.
6 LENZ I, LEE H, SAXENA A. Deep learning for detecting robotic grasps[J]. The International Journal of Robotics Research, 2015, 34(4/5): 705-724.
7 张云洲, 李奇, 曹赫, 等. 基于多层级特征的机械臂单阶段抓取位姿检测[J]. 控制与决策, 2021, 36(8): 1815-1824.
ZHANG Yunzhou, LI Qi, CAO He, et al. Single-stage grasp pose detection of manipulator based on multi-level features[J]. Control and Decision, 2021, 36(8): 1815-1824.
8 夏晶, 钱堃, 马旭东, 等. 基于级联卷积神经网络的机器人平面抓取位姿快速检测[J]. 机器人, 2018, 40(6): 794-802.
XIA Jing, QIAN Kun, MA Xudong, et al. Fast planar grasp pose detection for robot based on cascaded deep convolutional neural networks[J]. Robot, 2018, 40(6): 794-802.
9 喻群超, 尚伟伟, 张驰. 基于三级卷积神经网络的物体抓取检测[J]. 机器人, 2018, 40(5): 762-768.
YU Qunchao, SHANG Weiwei, ZHANG Chi. Object grasp detecting based on three-level convolution neural network[J]. Robot, 2018, 40(5): 762-768.
10 REDMON J, ANGELOVA A. Real-time grasp detection using convolutional neural networks[C]// IEEE International Conference on Robotics and Automation. Seattle: IEEE, 2015: 1316-1322.
11 KUMRA S, KANAN C. Robotic grasp detection using deep convolutional neural networks[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Vancouver: IEEE, 2017: 769-776.
12 HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016: 770-778.
13 GUO D, SUN F C, LIU H P, et al. A hybrid deep architecture for robotic grasp detection[C]// IEEE International Conference on Robotics and Automation. Singapore: IEEE, 2017: 1609-1614.
14 CHU F J, XU R N, VELA P A. Real-world multiobject, multigrasp detection[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 3355-3362.
doi: 10.1109/LRA.2018.2852777
15 ZHOU X W, LAN X G, ZHANG H B, et al. Fully convolutional grasp detection network with oriented anchor box[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid: IEEE, 2018: 7223-7230.
16 MORRISON D, CORKE P, LEITNER J. Closing the loop for robotic grasping: a real-time, generative grasp synthesis approach[EB/OL]. (2018-05-15) [2023-02-06]. https://arxiv.org/abs/1804.05172v2.
17 KUMRA S, JOSHI S, SAHIN F. Antipodal robotic grasping using generative residual convolutional neural network[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Las Vegas: IEEE, 2020: 9626-9633.
18 CHENG H, WANG Y Y, MENG M Q H. Grasp pose detection from a single RGB image[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Prague: IEEE, 2021: 4686-4691.
19 WANG S C, ZHOU Z L, KAN Z. When transformer meets robotic grasping: exploits context for efficient grasp detection[J]. IEEE Robotics and Automation Letters, 2022, 7(3): 8170-8177.
doi: 10.1109/LRA.2022.3187261
20 ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 6230-6239.
21 WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]// IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe: IEEE, 2018: 1451-1460.
22 SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
23 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 3-19.
24 HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 13708-13717.