Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (5): 875-882    DOI: 10.3785/j.issn.1008-973X.2023.05.003
Computer Technology and Control Engineering
Point cloud instance segmentation based on attention mechanism KNN and ASIS module
Xue-yong XIANG1,2, Li WANG1,2, Wen-peng ZONG1,2, Guang-yun LI1,*
1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
2. State Key Laboratory of Geo-Information Engineering, Xi’an 710054, China
Abstract:

A point cloud instance segmentation model with an attention-based k-nearest neighbors (KNN) module and an improved associatively segmenting instances and semantics (ASIS) module was proposed to address the discretized segmentation results and insufficient feature utilization of 3D convolution-based point cloud instance segmentation algorithms. The model took voxels as input and extracted point features through 3D submanifold sparse convolution. The KNN algorithm with attention mechanism was used to regroup the features in the semantic and instance feature spaces, alleviating the discretization of the extracted features. The regrouped semantic and instance features were then associated with each other through the improved ASIS module to enhance the discrimination between point features. The softmax module and the mean shift algorithm were applied to the semantic features and the instance embeddings to obtain the semantic and instance segmentation results, respectively. The public S3DIS dataset was employed to validate the proposed model. The experimental results showed that the instance segmentation achieved 53.1%, 57.1%, 65.2% and 52.8% in terms of mean coverage (mCov), mean weighted coverage (mWCov), mean precision (mPrec) and mean recall (mRec), respectively. The semantic segmentation achieved 61.7% and 88.1% in terms of mean intersection over union (mIoU) and overall accuracy (OAcc), respectively. The ablation experiments verified the effectiveness of the proposed modules.

Key words: point cloud    voxel    instance segmentation    attention mechanism    submanifold
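
The feature regrouping step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the attention weights, so the version below assumes a softmax over negative squared feature distance; the function name `attention_knn_regroup` and the parameter `k` are hypothetical and are not taken from the paper.

```python
import math


def attention_knn_regroup(features, k=3):
    """Return, for every point, an attention-weighted mean of its k nearest
    neighbours in feature space (the point itself counts as a neighbour)."""
    regrouped = []
    for f in features:
        # Squared Euclidean distance from this point to every point.
        dist2 = [sum((a - b) ** 2 for a, b in zip(f, g)) for g in features]
        # Indices of the k closest points; the point itself has distance 0.
        nearest = sorted(range(len(features)), key=dist2.__getitem__)[:k]
        # ASSUMPTION: attention score = softmax over negative squared distance.
        scores = [math.exp(-dist2[j]) for j in nearest]
        total = sum(scores)
        weights = [s / total for s in scores]
        # Weighted sum of the neighbour features, channel by channel.
        regrouped.append([
            sum(w * features[j][c] for w, j in zip(weights, nearest))
            for c in range(len(f))
        ])
    return regrouped


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
    for row in attention_knn_regroup(feats, k=2):
        print(row)
```

Because the weights of each point sum to one, every regrouped feature stays inside the convex hull of its neighbours, which is one plausible reading of how such a step smooths discretized voxel features.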
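Received: 2022-05-05    Published: 2023-05-09
CLC:  P 204
Funding: National Natural Science Foundation of China (42071454); Independent Research Project of the State Key Laboratory of Geo-Information Engineering (SKLGIE2021-ZZ-5)
Corresponding author: Guang-yun LI     E-mail: ahhsxxy@163.com;guangyunli_chxy@163.com
About the author: Xue-yong XIANG (born 1994), male, doctoral candidate, working on 3D scene recognition. orcid.org/0000-0001-9314-5732. E-mail: ahhsxxy@163.com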

Cite this article:

Xue-yong XIANG, Li WANG, Wen-peng ZONG, Guang-yun LI. Point cloud instance segmentation based on attention mechanism KNN and ASIS module[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(5): 875-882.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.05.003
https://www.zjujournals.com/eng/CN/Y2023/V57/I5/875

Fig. 1  Overall structure of the proposed model
Fig. 2  KNN model with attention mechanism
Fig. 3  ASIS module and improved ASIS module
Model                           mCov    mWCov   mPrec   mRec
Baseline                        42.1    39.3    53.1    41.3
KNN with attention mechanism    43.6    40.5    54.5    42.6
Tab. 1  Effect of the KNN with attention mechanism on instance segmentation results (%)
Fig. 4  Effect of the number of neighboring points on instance segmentation results
Model            mCov    mWCov   mPrec   mRec
Baseline         42.1    39.3    53.1    41.3
ASIS             43.1    40.5    54.0    42.0
Improved ASIS    44.7    41.5    55.3    43.2
Tab. 2  Effect of the improved ASIS module on instance segmentation results (%)
Fig. 5  Experimental results of the proposed model on the S3DIS dataset
Model          mCov    mWCov   mPrec   mRec
SGPN[8]        37.9    40.8    38.2    31.2
MT-PNet[10]    -       -       24.9    -
MV-CRF[10]     -       -       36.3    -
PartNet[30]    -       -       56.4    43.4
ASIS[9]        51.2    55.1    63.6    47.5
BoNet[31]      -       -       65.6    47.6
Ours           53.1    57.1    65.2    52.8
Tab. 3  Instance segmentation results of the proposed model and existing models on the S3DIS dataset (%)
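
The coverage metrics reported above can be sketched for a single scene as follows. This assumes the usual definition in which each ground-truth instance is matched to the predicted instance with the highest IoU, and the weighted variant weights each ground-truth instance by its share of the points. The function name `coverage` is hypothetical, and the averaging that turns these values into the reported mCov and mWCov is left out.

```python
def coverage(gt_ids, pred_ids):
    """Return (Cov, WCov) for per-point ground-truth and predicted instance ids."""
    gt_sets, pred_sets = {}, {}
    # Group point indices by instance id on both sides.
    for i, (g, p) in enumerate(zip(gt_ids, pred_ids)):
        gt_sets.setdefault(g, set()).add(i)
        pred_sets.setdefault(p, set()).add(i)

    cov = wcov = 0.0
    for gt_points in gt_sets.values():
        # Best IoU between this ground-truth instance and any prediction.
        best_iou = max(
            len(gt_points & pr) / len(gt_points | pr) for pr in pred_sets.values()
        )
        cov += best_iou / len(gt_sets)                   # unweighted mean
        wcov += best_iou * len(gt_points) / len(gt_ids)  # weighted by instance size
    return cov, wcov


if __name__ == "__main__":
    print(coverage([0, 0, 0, 0, 1, 1], [5, 5, 5, 6, 6, 6]))
```

In the toy call, the larger ground-truth instance is matched with IoU 3/4 and the smaller one with IoU 2/3, so the weighted value sits closer to 3/4 than the unweighted one does.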
Model             mIoU    OAcc
PointNet[14]      47.6    78.6
SGPN[8]           50.4    80.8
PointNet++[32]    54.5    81.0
DGCNN[23]         56.1    -
3D-Bevis[33]      58.4    83.7
ASIS[9]           59.3    86.2
Ours              61.7    88.1
Tab. 4  Semantic segmentation results of the proposed model and existing models on the S3DIS dataset (%)
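
For reference, the two semantic segmentation metrics in the table above can be computed from per-point labels as in the sketch below. It assumes the standard definitions (per-class IoU = TP / (TP + FP + FN), overall accuracy = fraction of correctly labelled points) and skips classes that appear in neither the ground truth nor the prediction. The function name `semantic_scores` is hypothetical, and the evaluation protocol used in the paper is not reproduced here.

```python
def semantic_scores(gt, pred, num_classes):
    """Return (mIoU, OAcc) for per-point ground-truth and predicted class labels."""
    tp = [0] * num_classes          # correctly labelled points per class
    gt_count = [0] * num_classes    # ground-truth points per class (TP + FN)
    pred_count = [0] * num_classes  # predicted points per class (TP + FP)
    for g, p in zip(gt, pred):
        gt_count[g] += 1
        pred_count[p] += 1
        if g == p:
            tp[g] += 1

    # IoU per class; the union is TP + FP + FN = gt + pred - TP.
    ious = [
        tp[c] / (gt_count[c] + pred_count[c] - tp[c])
        for c in range(num_classes)
        if gt_count[c] + pred_count[c] > 0  # ASSUMPTION: skip absent classes
    ]
    miou = sum(ious) / len(ious)
    oacc = sum(tp) / len(gt)
    return miou, oacc


if __name__ == "__main__":
    print(semantic_scores([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
```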
1 ZHAO N, CHUA T S, LEE G H. Few-shot 3D point cloud semantic segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 8873-8882.
2 WU K L, XU G D, LIU Z L, et al. PointCSE: context-sensitive encoders for efficient 3D object detection from point cloud [J]. International Journal of Machine Learning and Cybernetics, 2021, 28(7): 1-9.
3 HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// IEEE International Conference on Computer Vision Workshops. Venice: IEEE, 2017: 2961-2969.
4 YAO Pei-jun, YIN Yan-yun. Facade measurement method based on three-dimensional laser scanner and total station technology [J]. Geotechnical Engineering Technique, 2022, 36(2): 156-159.
5 WANG Chao-ying, XING Shuai, DAI Mo-fan, et al. A method of ground object classification based on multi-scale deep feature fusion of remote sensing image and LiDAR point cloud [J]. Journal of Geomatics Science and Technology, 2021, 38(6): 604-610. doi: 10.3969/j.issn.1673-6338.2021.06.009
6 HOU J, DAI A, NIEßNER M. 3D-SIS: 3D semantic instance segmentation of RGB-D scans [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4421-4430.
7 YI L, ZHAO W, WANG H, et al. GSPN: generative shape proposal network for 3D instance segmentation in point cloud [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 3947-3956.
8 WANG W Y, YU R, HUANG Q, et al. SGPN: similarity group proposal network for 3D point cloud instance segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 2569-2578.
9 WANG X L, LIU S, SHEN X Y, et al. Associatively segmenting instances and semantics in point clouds [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4096-4105.
10 PHAM Q H, NGUYEN T, HUA B S, et al. JSIS3D: joint semantic-instance segmentation of 3D point clouds with multi-task pointwise networks and multi-value conditional random fields [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8827-8836.
11 LAHOUD J, GHANEM B, POLLEFEYS M, et al. 3D instance segmentation via multi-task metric learning [C]// IEEE International Conference on Computer Vision Workshops. Seoul: IEEE, 2019: 9256-9266.
12 DU J, CAI G R, WANG Z Y, et al. Convertible sparse convolution for point cloud instance segmentation [C]// IEEE International Geoscience and Remote Sensing Symposium. Brussels: IEEE, 2021: 4111-4114.
13 PAN R Y, HUANG C M. Accuracy improvement of deep learning 3D point cloud instance segmentation [C]// IEEE International Conference on Consumer Electronics Taiwan. Taiwan: IEEE, 2021: 1-12.
14 QI C R, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 652-660.
15 GRAHAM B, ENGELCKE M, VAN DER MAATEN L. 3D semantic segmentation with submanifold sparse convolutional networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9224-9232.
16 CHOY C, GWAK J Y, SAVARESE S. 4D spatio-temporal convnets: minkowski convolutional neural networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 998-1008.
17 LIANG Z, YANG M, LI H, et al. 3D instance embedding learning with a structure-aware loss function for point cloud segmentation [J]. IEEE Robotics and Automation Letters, 2020, 5(3): 4915-4922.
18 HE K M, ZHANG X, REN S Q, et al. Deep residual learning for image recognition [C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
19 WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition [C]// European Conference on Computer Vision. Amsterdam: Springer, 2016: 499-515.
20 DE BRABANDERE B, NEVEN D, VAN GOOL L. Semantic instance segmentation with a discriminative loss function [EB/OL]. [2017-08-08]. https://arxiv.org/abs/1708.02551.
21 COMANICIU D, MEER P. Mean shift: a robust approach toward feature space analysis [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(5): 603-619.
22 WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds [J]. ACM Transactions on Graphics, 2019, 38(5): 1-12.
23 LIU W Y, WEN Y, YU Z, et al. Large-margin softmax loss for convolutional neural networks [C]// International Conference on Machine Learning. New York City: IMLS, 2016: 7-18.
24 LIN M, CHEN Q, YAN S. Network in network [EB/OL]. [2013-12-16]. https://arxiv.org/abs/1312.4400.
25 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// European Conference on Computer Vision. Munich: Springer, 2018: 3-19.
26 ARMENI I, SENER O, ZAMIR A, et al. 3D semantic parsing of large-scale indoor spaces [C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1534-1543.
27 MENGYE R, RICHARD Z. End-to-end instance segmentation with recurrent attention [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
28 LIU S R, JIA J, FIDLER S, et al. SGN: sequential grouping networks for instance segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
29 ZHUO W, SALZMANN M, HE X, et al. Indoor scene parsing with instance segmentation, semantic labeling and support relationship inference [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
30 MO K, ZHU S, CHANG A X, et al. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 998-1008.
31 YANG B, WANG J, CLARK R, et al. Learning object bounding boxes for 3D instance segmentation on point clouds [C]// Proceedings of the Advances in Neural Information Processing Systems. Vancouver: NIPS, 2019: 563-575.
32 QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the Advances in Neural Information Processing Systems. Long Beach: NIPS, 2017: 5099-5108.