浙江大学学报(工学版)  2024, Vol. 58 Issue (3): 599-610    DOI: 10.3785/j.issn.1008-973X.2024.03.017
Mechanical Engineering
轻量化机器人抓取位姿实时检测算法
宋明俊,严文,邓益昭,张俊然,涂海燕*
1. 四川大学 电气工程学院,四川 成都 610065
Light-weight algorithm for real-time robotic grasp detection
Mingjun SONG, Wen YAN, Yizhao DENG, Junran ZHANG, Haiyan TU*
1. College of Electrical Engineering, Sichuan University, Chengdu 610065, China
Full text: PDF (4675 KB)   HTML
摘要:

针对机器人对形状、大小、种类变化不一的未知物体的抓取,提出轻量化的抓取位姿实时检测算法RTGN,以进一步提升抓取检测准确率及检测速度. 设计多尺度空洞卷积模块,以构建轻量化的特征提取主干网络;设计混合注意力模块,以加强网络对重要抓取特征的关注;引入金字塔池化模块融合多层级特征,以提升网络对物体的抓取感知能力. 在Cornell抓取数据集上进行测试,RTGN检测速度为142帧/s,在图像拆分和对象拆分上的检测准确率分别为98.26%和97.65%;在实际抓取环境下进行抓取实验,机器人对20类未知物体进行400次抓取,抓取成功率为96.0%. 实验结果表明,RTGN的检测准确率和检测速度较现有方法有明显提升,对物体的位置和姿态变化具有较强的适应性,并且能够有效地泛化到形状、大小、种类等变化不一的未知物体的抓取检测中.

关键词: 机器人抓取; 抓取检测; 注意力机制; 卷积神经网络; 深度学习; 非结构化环境
Abstract:

A light-weight, real-time approach named RTGN (real-time grasp net) was proposed to improve the accuracy and speed of robotic grasp detection for novel objects of diverse shapes, types and sizes. Firstly, a multi-scale dilated convolution module was designed to construct a light-weight feature extraction backbone. Secondly, a mixed attention module was designed to help the network focus on meaningful grasp features. Finally, a pyramid pooling module was deployed to fuse the multi-level features extracted by the network, improving its grasp perception of objects. On the Cornell grasping dataset, RTGN generated grasps at a speed of 142 frames per second and attained accuracy rates of 98.26% and 97.65% on image-wise and object-wise splits, respectively. In real-world robotic grasping experiments, RTGN achieved a success rate of 96.0% over 400 grasping attempts on 20 novel objects. Experimental results demonstrate that RTGN outperforms existing methods in both detection accuracy and detection speed, shows strong adaptability to variations in object position and pose, and generalizes effectively to novel objects of diverse shapes, types and sizes.
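As a rough illustration of the heatmap-style grasp representation the prediction head outputs (per-pixel quality, angle, and width maps, following [16]), a grasp pose can be decoded by reading off the highest-quality pixel. This is a minimal sketch, not the paper's code; the function and array names are illustrative.

```python
import numpy as np

def decode_grasp(quality, angle, width):
    """Return (x, y, theta, w) at the pixel with the highest grasp quality."""
    y, x = np.unravel_index(np.argmax(quality), quality.shape)
    return int(x), int(y), float(angle[y, x]), float(width[y, x])

# Toy 4x4 maps with one confident grasp at pixel (x=1, y=2).
q = np.zeros((4, 4))
q[2, 1] = 0.9
ang = np.full((4, 4), 0.5)   # grasp angle in radians
w = np.full((4, 4), 30.0)    # gripper opening width in pixels
print(decode_grasp(q, ang, w))  # → (1, 2, 0.5, 30.0)
```

In practice the angle is usually regressed as (sin 2θ, cos 2θ) to avoid wrap-around, but the argmax-and-read-off decoding step is the same.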

Key words: robotic grasping; grasp detection; attention mechanism; convolutional neural networks; deep learning; unstructured environment
Received: 2023-08-22  Published: 2024-03-05
CLC:  TP 242  
Foundation: National Natural Science Foundation of China (12126606); Sichuan Science and Technology Program (23ZDYF2913); Deyang Science and Technology (Open Competition) Program (2021JBJZ007); Emergency Key Project of Sichuan Provincial Key Laboratory of Smart Grid (020IEPG-KL-20YJ01).
Corresponding author: Haiyan TU   E-mail: mingjun_s@foxmail.com; haiyantu@163.com
About the author: SONG Mingjun (1999—), male, master's degree candidate, engaged in robotic grasping research. orcid.org/0009-0001-2417-8562. E-mail: mingjun_s@foxmail.com

Cite this article:

宋明俊,严文,邓益昭,张俊然,涂海燕. 轻量化机器人抓取位姿实时检测算法[J]. 浙江大学学报(工学版), 2024, 58(3): 599-610.

Mingjun SONG, Wen YAN, Yizhao DENG, Junran ZHANG, Haiyan TU. Light-weight algorithm for real-time robotic grasp detection. Journal of Zhejiang University (Engineering Science), 2024, 58(3): 599-610.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.03.017        https://www.zjujournals.com/eng/CN/Y2024/V58/I3/599

Fig. 1  Schematic of the five-dimensional grasp representation [6]
Fig. 2  Heatmap representation of four-dimensional grasp parameters [16]
Fig. 3  Overall architecture of the RTGN grasp detection algorithm
Fig. 4  Structure of the multi-scale dilated convolution module
Fig. 5  3×3 dilated convolution kernels under different dilation rates
Fig. 6  Structure of the channel attention module
Fig. 7  Structure of the coordinate attention module
Fig. 8  Structure of the mixed attention module
Fig. 9  Structure of the pyramid pooling module
Fig. 10  Structure of the prediction output head
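The dilated kernels of Figure 5 enlarge the receptive field without adding parameters: a k×k kernel with dilation rate d covers the same area as an ordinary kernel of size k + (k−1)(d−1). A quick check of this standard relation for the 3×3 kernels shown:

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Spatial extent covered by a k x k convolution kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)

# 3x3 kernels at dilation rates 1, 2, 3 cover 3x3, 5x5, 7x7 regions,
# while keeping the same 9 learnable weights.
print([effective_kernel_size(3, d) for d in (1, 2, 3)])  # → [3, 5, 7]
```

Stacking branches with different rates, as the multi-scale module does, thus captures both local and wider context at low parameter cost.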
Method | A (Image-wise)/% | A (Object-wise)/% | v/ms
Jiang et al. [5] | 60.50 | 58.30 | 5000
Lenz et al. [6] | 73.90 | 75.60 | 1350
Redmon et al. [10] | 88.00 | 87.10 | 76
Kumra et al. [11] | 89.21 | 88.96 | 103
Guo et al. [13] | 93.20 | 89.10 | —
Chu et al. [14] | 96.00 | 96.10 | 120
Zhou et al. [15] | 97.74 | 96.61 | 118
Xia et al. [8] | 93.80 | 91.30 | 57
Yu et al. [9] | 94.10 | 93.30 | —
Zhang et al. [7] | 95.71 | 94.01 | 17
Morrison et al. [16] | 73.00 | 69.00 | 19
Kumra et al. [17] | 97.70 | 96.60 | 20
Cheng et al. [18] | 98.00 | 97.00 | 73
Wang et al. [19] | 97.99 | 96.70 | 41.6
RTGN | 98.26 | 97.65 | 7
Table 1  Comparison of different algorithms on the Cornell grasping dataset
Fig. 11  Visualization of RTGN grasp detection results on the Cornell dataset
Fig. 12  Incomplete annotations in the Cornell dataset
Architecture | A (Image-wise)/% | A (Object-wise)/% | v/ms
MDM-Backbone | 97.73 | 96.80 | 5.29
+CBAM | 97.86 (+0.13) | 96.90 (+0.10) | 6.42
+MAM | 97.91 (+0.18) | 97.00 (+0.20) | 6.60
+PPM | 97.95 (+0.22) | 97.03 (+0.23) | 5.64
+CBAM+PPM | 98.08 (+0.35) | 97.18 (+0.38) | 6.74
+MAM+PPM | 98.26 (+0.53) | 97.65 (+0.85) | 6.96
Table 2  Module ablation results on the Cornell dataset
Method | A (Image-wise)/% | A (Object-wise)/% | v/ms | P/M | F/G
Kumra et al. [11] | 89.21 | 88.96 | 103 | >32 | —
Chu et al. [14] | 96.00 | 96.10 | 120 | 28.18 | —
Zhou et al. [15] | 97.74 | 96.61 | 118 | >30 | —
Morrison et al. [16] | 73.00 | 69.00 | 19 | 0.062 | —
Xia et al. [8] | 93.80 | 91.30 | 57 | >46 | —
Zhang et al. [7] | 95.71 | 94.01 | 17 | >12 | —
RTGN | 98.26 | 97.65 | 7 | 1.66 | 8.00
Table 3  Comparison of model performance and parameter size of different methods
Fig. 13  Comparison of grasp detection results of RTGN and TF-Grasp [19] on single novel objects
Fig. 14  Visualization of RTGN grasp detection results on multiple novel objects
Fig. 15  Robotic grasping experiment platform
Fig. 16  Objects used in the robotic grasping experiments
Fig. 17  Robotic grasping of novel objects
Object | A_s | Object | A_s
Orange | 100% (20/20) | Candy | 100% (20/20)
Biscuit | 100% (20/20) | Plastic plate | 100% (20/20)
Mouse | 85% (17/20) | Plastic bowl | 95% (19/20)
Paper cup | 90% (18/20) | Umbrella | 80% (16/20)
Alcohol spray bottle | 90% (18/20) | Adhesive tape | 100% (20/20)
AA battery | 100% (20/20) | Cylindrical block | 100% (20/20)
Screwdriver | 100% (20/20) | Milk carton | 95% (19/20)
Toothpaste box | 100% (20/20) | Toothpaste | 90% (18/20)
Laundry detergent bottle | 100% (20/20) | Brush | 100% (20/20)
Facial cleanser | 95% (19/20) | Toner bottle | 100% (20/20)
Table 4  Statistics of robotic grasping results
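The overall 96.0% success rate reported in the abstract can be checked directly from the per-object counts in Table 4:

```python
# Successful grasps out of 20 attempts per object, read row by row from Table 4.
successes = [20, 20, 20, 20, 17, 19, 18, 16, 18, 20,
             20, 20, 20, 19, 20, 18, 20, 20, 19, 20]
attempts = 20 * len(successes)      # 20 objects x 20 attempts = 400
rate = sum(successes) / attempts
print(f"{sum(successes)}/{attempts} = {rate:.1%}")  # → 384/400 = 96.0%
```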
1 刘亚欣, 王斯瑶, 姚玉峰, 等. 机器人抓取检测技术的研究现状[J]. 控制与决策, 2020, 35(12): 2817-2828.
LIU Yaxin, WANG Siyao, YAO Yufeng, et al. Recent researches on robot autonomous grasp technology[J]. Control and Decision, 2020, 35(12): 2817-2828.
2 BOHG J, MORALES A, ASFOUR T, et al. Data-driven grasp synthesis: a survey[J]. IEEE Transactions on Robotics, 2014, 30(2): 289-309.
doi: 10.1109/TRO.2013.2289018
3 仲训杲, 徐敏, 仲训昱, 等. 基于多模特征深度学习的机器人抓取判别方法[J]. 自动化学报, 2016, 42(7): 1022-1029.
ZHONG Xungao, XU Min, ZHONG Xunyu, et al. Multimodal features deep learning for robotic potential grasp recognition[J]. Acta Automatica Sinica, 2016, 42(7): 1022-1029.
4 杜学丹, 蔡莹皓, 鲁涛, 等. 一种基于深度学习的机械臂抓取方法[J]. 机器人, 2017, 39(6): 820-828.
DU Xuedan, CAI Yinghao, LU Tao, et al. A robotic grasping method based on deep learning[J]. Robot, 2017, 39(6): 820-828.
5 JIANG Y, MOSESON S, SAXENA A. Efficient grasping from RGBD images: learning using a new rectangle representation[C]// IEEE International Conference on Robotics and Automation. Shanghai: IEEE, 2011: 3304-3311.
6 LENZ I, LEE H, SAXENA A. Deep learning for detecting robotic grasps[J]. The International Journal of Robotics Research, 2015, 34(4/5): 705-724.
7 张云洲, 李奇, 曹赫, 等. 基于多层级特征的机械臂单阶段抓取位姿检测[J]. 控制与决策, 2021, 36(8): 1815-1824.
ZHANG Yunzhou, LI Qi, CAO He, et al. Single-stage grasp pose detection of manipulator based on multi-level features[J]. Control and Decision, 2021, 36(8): 1815-1824.
8 夏晶, 钱堃, 马旭东, 等. 基于级联卷积神经网络的机器人平面抓取位姿快速检测[J]. 机器人, 2018, 40(6): 794-802.
XIA Jing, QIAN Kun, MA Xudong, et al. Fast planar grasp pose detection for robot based on cascaded deep convolutional neural networks[J]. Robot, 2018, 40(6): 794-802.
9 喻群超, 尚伟伟, 张驰. 基于三级卷积神经网络的物体抓取检测[J]. 机器人, 2018, 40(5): 762-768.
YU Qunchao, SHANG Weiwei, ZHANG Chi. Object grasp detecting based on three-level convolution neural network[J]. Robot, 2018, 40(5): 762-768.
10 REDMON J, ANGELOVA A. Real-time grasp detection using convolutional neural networks[C]// IEEE International Conference on Robotics and Automation. Seattle: IEEE, 2015: 1316-1322.
11 KUMRA S, KANAN C. Robotic grasp detection using deep convolutional neural networks[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Vancouver: IEEE, 2017: 769-776.
12 HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas: IEEE, 2016: 770-778.
13 GUO D, SUN F C, LIU H P, et al. A hybrid deep architecture for robotic grasp detection[C]// IEEE International Conference on Robotics and Automation. Singapore: IEEE, 2017: 1609-1614.
14 CHU F J, XU R N, VELA P A. Real-world multiobject, multigrasp detection[J]. IEEE Robotics and Automation Letters, 2018, 3(4): 3355-3362.
doi: 10.1109/LRA.2018.2852777
15 ZHOU X W, LAN X G, ZHANG H B, et al. Fully convolutional grasp detection network with oriented anchor box[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid: IEEE, 2018: 7223-7230.
16 MORRISON D, CORKE P, LEITNER J. Closing the loop for robotic grasping: a real-time, generative grasp synthesis approach[EB/OL]. (2018-05-15) [2023-02-06]. https://arxiv.org/abs/1804.05172v2.
17 KUMRA S, JOSHI S, SAHIN F. Antipodal robotic grasping using generative residual convolutional neural network[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Las Vegas: IEEE, 2020: 9626-9633.
18 CHENG H, WANG Y Y, MENG M Q H. Grasp pose detection from a single RGB image[C]// IEEE/RSJ International Conference on Intelligent Robots and Systems. Prague: IEEE, 2021: 4686-4691.
19 WANG S C, ZHOU Z L, KAN Z. When transformer meets robotic grasping: exploits context for efficient grasp detection[J]. IEEE Robotics and Automation Letters, 2022, 7(3): 8170-8177.
doi: 10.1109/LRA.2022.3187261
20 ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu: IEEE, 2017: 6230-6239.
21 WANG P Q, CHEN P F, YUAN Y, et al. Understanding convolution for semantic segmentation[C]// IEEE Winter Conference on Applications of Computer Vision (WACV). Lake Tahoe: IEEE, 2018: 1451-1460.
22 SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. The Journal of Machine Learning Research, 2014, 15(1): 1929-1958.
23 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 3-19.
24 HOU Q B, ZHOU D Q, FENG J S. Coordinate attention for efficient mobile network design[C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville: IEEE, 2021: 13708-13717.