Light-weight algorithm for real-time robotic grasp detection |
Mingjun SONG, Wen YAN, Yizhao DENG, Junran ZHANG, Haiyan TU*
1. College of Electrical Engineering, Sichuan University, Chengdu 610065, China |
Abstract A light-weight, real-time approach named RTGN (real-time grasp net) was proposed to improve the accuracy and speed of robotic grasp detection for novel objects of diverse shapes, types and sizes. Firstly, a multi-scale dilated convolution module was designed to construct a light-weight feature extraction backbone. Secondly, a mixed attention module was designed to help the network focus on meaningful grasp features. Finally, a pyramid pooling module was deployed to fuse the multi-level features extracted by the network, thereby improving its grasp perception of the object. On the Cornell grasping dataset, RTGN generated grasps at 142 frames per second and attained accuracy rates of 98.26% and 97.65% on the image-wise and object-wise splits, respectively. In real-world robotic grasping experiments, RTGN achieved a success rate of 96.0% over 400 grasping attempts on 20 novel objects. Experimental results demonstrate that RTGN outperforms existing methods in both detection accuracy and detection speed, adapts well to variations in the position and pose of grasped objects, and generalizes effectively to novel objects of diverse shapes, types and sizes.
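The three modules named in the abstract are well-established building blocks in the CNN literature. The sketch below shows one plausible PyTorch rendering of each; the dilation rates, branch widths, CBAM-style channel-then-spatial attention, and PSPNet-style pooling grid sizes are illustrative assumptions for exposition, not the published RTGN configuration.

```python
# Illustrative PyTorch sketches of the three modules named in the abstract.
# All layer widths, dilation rates and pooling sizes are assumptions; they
# are not taken from the RTGN paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleDilatedConv(nn.Module):
    """Parallel dilated 3x3 branches enlarge the receptive field at several
    scales while keeping the parameter count of a light-weight backbone."""

    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        branch_ch = out_ch // len(dilations)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        )

    def forward(self, x):
        # Concatenate the responses of all scales along the channel axis.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


class MixedAttention(nn.Module):
    """Channel attention followed by spatial attention (CBAM-style); one
    plausible reading of a 'mixed attention module'."""

    def __init__(self, ch, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Re-weight channels using statistics from global average pooling.
        w = torch.sigmoid(self.mlp(x.mean(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * w
        # Re-weight locations using pooled cross-channel statistics.
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))


class PyramidPooling(nn.Module):
    """PSPNet-style pyramid pooling: pool to several grid sizes, project,
    upsample back, and concatenate with the input feature map."""

    def __init__(self, in_ch, sizes=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),
                nn.Conv2d(in_ch, in_ch // len(sizes), 1, bias=False),
            )
            for s in sizes
        )

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(stage(x), size=(h, w), mode="bilinear",
                          align_corners=False)
            for stage in self.stages
        ]
        # Output carries both the original and the multi-level context.
        return torch.cat([x] + pooled, dim=1)
```

Chaining MultiScaleDilatedConv blocks as the backbone, applying MixedAttention to the extracted features and fusing them with PyramidPooling would mirror the pipeline the abstract describes, though the actual RTGN wiring may differ.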
Received: 22 August 2023
Published: 05 March 2024
Fund: National Natural Science Foundation of China (12126606); Science and Technology Program of Sichuan Province (23ZDYF2913); Deyang Science and Technology (Open Competition) Program (2021JBJZ007); Emergency Key Project of the Sichuan Provincial Key Laboratory of Intelligent Electric Power Grid (020IEPG-KL-20YJ01).
Corresponding Authors:
Haiyan TU
E-mail: mingjun_s@foxmail.com; haiyantu@163.com
Keywords:
robotic grasping,
grasp detection,
attention mechanism,
convolutional neural network,
deep learning,
unstructured environment