Journal of ZheJiang University (Engineering Science)  2023, Vol. 57 Issue (5): 875-882    DOI: 10.3785/j.issn.1008-973X.2023.05.003
    
Point cloud instance segmentation based on attention mechanism KNN and ASIS module
Xue-yong XIANG1,2, Li WANG1,2, Wen-peng ZONG1,2, Guang-yun LI1,*
1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450001, China
2. State Key Laboratory of Geo-Information Engineering, Xi’an 710054, China

Abstract  

A point cloud instance segmentation model was proposed that combines a k-nearest neighbors (KNN) module featuring an attention mechanism with an improved associatively segmenting instances and semantics (ASIS) module, in order to address the discrete segmentation results and insufficient feature utilization of traditional 3D convolution-based algorithms. The model took voxels as input and extracted point features through 3D submanifold sparse convolution. The KNN algorithm with attention mechanism was used to reorganize the features in the semantic and instance feature spaces, alleviating the problems caused by quantization error in the extracted features. The reorganized semantic and instance features were then associated with each other through the improved ASIS module to enhance the discrimination between point features. A softmax module was applied to the semantic features and the mean shift algorithm to the instance embeddings to obtain the semantic and instance segmentation results, respectively. The model was validated on the public S3DIS dataset. For instance segmentation, it achieved 53.1%, 57.1%, 65.2% and 52.8% in mean coverage (mCov), mean weighted coverage (mWCov), mean precision (mPrec) and mean recall (mRec), respectively. For semantic segmentation, it achieved 61.7% and 88.1% in mean intersection over union (mIoU) and overall accuracy (OAcc), respectively. The ablation experiments verified the effectiveness of the proposed modules.
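The final grouping step named in the abstract, mean shift clustering of instance embeddings, can be illustrated with a minimal sketch. It assumes a flat kernel, a hand-picked bandwidth and toy 2D embeddings; the function name `mean_shift` and all parameter values are illustrative and are not taken from the paper.

```python
import math

def mean_shift(points, bandwidth, n_iter=50, tol=1e-6):
    """Flat-kernel mean shift: return one integer cluster label per point."""
    modes = [list(p) for p in points]
    for _ in range(n_iter):
        largest_move = 0.0
        shifted = []
        for m in modes:
            # Average every input point within `bandwidth` of the current mode.
            near = [p for p in points if math.dist(m, p) <= bandwidth]
            centre = [sum(axis) / len(near) for axis in zip(*near)] if near else m
            largest_move = max(largest_move, math.dist(m, centre))
            shifted.append(centre)
        modes = shifted
        if largest_move < tol:
            break
    # Modes that converged to nearly the same location form one instance.
    centres, labels = [], []
    for m in modes:
        for idx, c in enumerate(centres):
            if math.dist(m, c) <= bandwidth / 2:
                labels.append(idx)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels

# Toy instance embeddings (assumed values): two well-separated groups.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = mean_shift(embeddings, bandwidth=1.0)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Points whose embeddings converge to the same mode receive the same instance label, so the number of instances does not have to be fixed in advance; the bandwidth is the main parameter to tune.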



Key words: point cloud; voxel; instance segmentation; attention mechanism; submanifold
Received: 05 May 2022      Published: 09 May 2023
CLC:  P 204  
Fund:  National Natural Science Foundation of China (42071454); Independent Research Project of the State Key Laboratory of Geo-Information Engineering (SKLGIE2021-ZZ-5)
Corresponding Authors: Guang-yun LI     E-mail: ahhsxxy@163.com;guangyunli_chxy@163.com
Cite this article:

Xue-yong XIANG,Li WANG,Wen-peng ZONG,Guang-yun LI. Point cloud instance segmentation based on attention mechanism KNN and ASIS module. Journal of ZheJiang University (Engineering Science), 2023, 57(5): 875-882.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.05.003     OR     https://www.zjujournals.com/eng/Y2023/V57/I5/875


Fig.1 Overall architecture of proposed network
Fig.2 KNN model with attention mechanism
Fig.3 ASIS module and improved ASIS module
Model                          mCov   mWCov   mPrec   mRec
Baseline                       42.1   39.3    53.1    41.3
KNN with attention mechanism   43.6   40.5    54.5    42.6
Tab.1 Effect of KNN with attention mechanism on instance segmentation results %
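As a rough illustration of what a KNN module with attention can do to point features, the sketch below replaces each feature vector by a weighted average of its k nearest neighbours in feature space. The weighting used here (a softmax over negative squared feature distances) is an assumption chosen for simplicity, not the formulation of the paper, and `attention_knn` is a hypothetical name.

```python
import math

def attention_knn(features, k):
    """Replace each feature by a softmax-weighted average of its k nearest neighbours."""
    out = []
    for f in features:
        # k nearest neighbours in feature space (the point itself is included).
        nearest = sorted((math.dist(f, g) ** 2, j) for j, g in enumerate(features))[:k]
        # Assumed attention score: softmax over negative squared distances.
        scores = [math.exp(-d) for d, _ in nearest]
        total = sum(scores)
        weights = [s / total for s in scores]
        out.append([sum(w * features[j][c] for w, (_, j) in zip(weights, nearest))
                    for c in range(len(f))])
    return out

# Toy features (assumed values): two pairs of similar points.
features = [[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]]
reorganized = attention_knn(features, k=2)
print(reorganized[0], reorganized[1])
```

In this toy case the two points of each pair are pulled toward each other while the pairs stay apart, which is the kind of smoothing that can reduce voxel quantization noise. Fig.4 of the paper studies the effect of the neighbourhood size on the segmentation results.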
Fig.4 Influence of number of neighborhood points on segmentation results of proposed model
Model           mCov   mWCov   mPrec   mRec
Baseline        42.1   39.3    53.1    41.3
ASIS            43.1   40.5    54.0    42.0
Improved ASIS   44.7   41.5    55.3    43.2
Tab.2 Impact of improved ASIS module on segmentation results %
Fig.5 Qualitative results of proposed module on S3DIS dataset
Model         mCov   mWCov   mPrec   mRec
SGPN[8]       37.9   40.8    38.2    31.2
MT-PNet[10]   -      -       24.9    -
MV-CRF[10]    -      -       36.3    -
PartNet[30]   -      -       56.4    43.4
ASIS[9]       51.2   55.1    63.6    47.5
BoNet[31]     -      -       65.6    47.6
Ours          53.1   57.1    65.2    52.8
Tab.3 Instance segmentation results of proposed model and some existing models on S3DIS dataset %
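The coverage metrics in the table can be made concrete with a small sketch. It assumes the usual definitions: coverage (Cov) is the average, over ground-truth instances, of the best IoU with any predicted instance, and weighted coverage (WCov) weights each ground-truth instance by its share of the points. The reported mCov and mWCov would then be means of these values over categories. The function name and toy labels are illustrative.

```python
def coverage(gt_labels, pred_labels):
    """Return (Cov, WCov) for per-point ground-truth and predicted instance ids."""
    gt_sets, pred_sets = {}, {}
    for i, (g, p) in enumerate(zip(gt_labels, pred_labels)):
        gt_sets.setdefault(g, set()).add(i)
        pred_sets.setdefault(p, set()).add(i)
    n_points = len(gt_labels)
    cov = wcov = 0.0
    for g in gt_sets.values():
        # Best IoU between this ground-truth instance and any predicted instance.
        best = max(len(g & p) / len(g | p) for p in pred_sets.values())
        cov += best / len(gt_sets)        # unweighted average over instances
        wcov += best * len(g) / n_points  # weighted by instance size
    return cov, wcov

# Toy example (assumed labels): one point of instance 0 is assigned to instance 1.
gt = [0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 1, 1, 1]
cov, wcov = coverage(gt, pred)
print(round(cov, 4), round(wcov, 4))  # 0.7083 0.7222
```

The larger ground-truth instance is matched better here (IoU 0.75 versus 0.667), so the size-weighted score comes out slightly higher than the unweighted one.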
Model            mIoU   OAcc
PointNet[14]     47.6   78.6
SGPN[8]          50.4   80.8
PointNet++[32]   54.5   81.0
DGCNN[22]        56.1   -
3D-Bevis[33]     58.4   83.7
ASIS[9]          59.3   86.2
Ours             61.7   88.1
Tab.4 Semantic segmentation results of proposed model and some existing models on S3DIS dataset %
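For reference, the two semantic metrics in the table can be computed from per-point labels as in the sketch below. It assumes the standard definitions of per-class IoU and overall accuracy. Skipping classes that appear in neither the ground truth nor the prediction is a choice made for this example only, and `semantic_scores` is a hypothetical name.

```python
def semantic_scores(gt, pred, n_classes):
    """Return (mIoU, OAcc) for per-point class labels."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for g, p in zip(gt, pred) if g == c and p == c)
        union = sum(1 for g, p in zip(gt, pred) if g == c or p == c)
        if union:  # assumption: ignore classes absent from both gt and pred
            ious.append(inter / union)
    oacc = sum(1 for g, p in zip(gt, pred) if g == p) / len(gt)
    return sum(ious) / len(ious), oacc

# Toy example (assumed labels) with three classes and two misclassified points.
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
miou, oacc = semantic_scores(gt, pred, n_classes=3)
print(round(miou, 4), round(oacc, 4))  # 0.5 0.6667
```

Overall accuracy counts every point equally, whereas mIoU averages over classes, so a model can score high on OAcc while still doing poorly on rare classes.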
[1]   ZHAO N, CHUA T S, LEE G H. Few-shot 3D point cloud semantic segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 8873-8882.
[2]   WU K L, XU G D, LIU Z L, et al. PointCSE: context-sensitive encoders for efficient 3D object detection from point cloud [J]. International Journal of Machine Learning and Cybernetics, 2021, 28(7): 1-9.
[3]   HE K M, GKIOXARI G, DOLLÁR P, et al. Mask R-CNN [C]// IEEE International Conference on Computer Vision Workshops. Venice: IEEE, 2017: 2961-2969.
[4]   YAO Pei-jun, YIN Yan-yun. Facade measurement method based on three-dimensional laser scanner and total station technology [J]. Geotechnical Engineering Technique, 2022, 36(2): 156-159.
[5]   WANG Chao-ying, XING Shuai, DAI Mo-fan, et al. A method of ground object classification based on multi-scale deep feature fusion of remote sensing image and LiDAR point cloud [J]. Journal of Geomatics Science and Technology, 2021, 38(6): 604-610. doi: 10.3969/j.issn.1673-6338.2021.06.009
[6]   HOU J, DAI A, NIEßNER M. 3D-SIS: 3D semantic instance segmentation of RGB-D scans [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4421-4430.
[7]   YI L, ZHAO W, WANG H, et al. GSPN: generative shape proposal network for 3D instance segmentation in point cloud [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 3947-3956.
[8]   WANG W Y, YU R, HUANG Q, et al. SGPN: similarity group proposal network for 3D point cloud instance segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 2569-2578.
[9]   WANG X L, LIU S, SHEN X Y, et al. Associatively segmenting instances and semantics in point clouds [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 4096-4105.
[10]   PHAM Q H, NGUYEN T, HUA B S, et al. JSIS3D: joint semantic-instance segmentation of 3D point clouds with multi-task pointwise networks and multi-value conditional random fields [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8827-8836.
[11]   LAHOUD J, GHANEM B, POLLEFEYS M, et al. 3D instance segmentation via multi-task metric learning [C]// IEEE International Conference on Computer Vision Workshops. Seoul: IEEE, 2019: 9256-9266.
[12]   DU J, CAI G R, WANG Z Y, et al. Convertible sparse convolution for point cloud instance segmentation [C]// IEEE International Geoscience and Remote Sensing Symposium. Brussels: IEEE, 2021: 4111-4114.
[13]   PAN R Y, HUANG C M. Accuracy improvement of deep learning 3D point cloud instance segmentation [C]// IEEE International Conference on Consumer Electronics Taiwan. Taiwan: IEEE, 2021: 1-12.
[14]   QI C R, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 652-660.
[15]   GRAHAM B, ENGELCKE M, VAN DER MAATEN L. 3D semantic segmentation with submanifold sparse convolutional networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9224-9232.
[16]   CHOY C, GWAK J Y, SAVARESE S. 4D spatio-temporal convnets: minkowski convolutional neural networks [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 998-1008.
[17]   LIANG Z, YANG M, LI H, et al. 3D instance embedding learning with a structure-aware loss function for point cloud segmentation [J]. IEEE Robotics and Automation Letters, 2020, 5(3): 4915-4922.
[18]   HE K M, ZHANG X, REN S Q, et al. Deep residual learning for image recognition [C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[19]   WEN Y D, ZHANG K P, LI Z F, et al. A discriminative feature learning approach for deep face recognition [C]// European Conference on Computer Vision. Amsterdam: Springer, 2016: 499-515.
[20]   DE BRABANDERE B, NEVEN D, VAN GOOL L. Semantic instance segmentation with a discriminative loss function [EB/OL]. [2017-08-08]. https://arxiv.org/abs/1708.02551.
[21]   COMANICIU D, MEER P. Mean shift: a robust approach toward feature space analysis [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(5): 603-619.
[22]   WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds [J]. ACM Transactions on Graphics, 2019, 38(5): 1-12.
[23]   LIU W Y, WEN Y, YU Z, et al. Large-margin softmax loss for convolutional neural networks [C]// International Conference on Machine Learning. New York City: IMLS, 2016: 7-18.
[24]   LIN M, CHEN Q, YAN S. Network in network [EB/OL]. [2013-12-16]. https://arxiv.org/abs/1312.4400.
[25]   WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// European Conference on Computer Vision. Munich: Springer, 2018: 3-19.
[26]   ARMENI I, SENER O, ZAMIR A, et al. 3D semantic parsing of large-scale indoor spaces [C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1534-1543.
[27]   REN M, ZEMEL R S. End-to-end instance segmentation with recurrent attention [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
[28]   LIU S R, JIA J, FIDLER S, et al. SGN: sequential grouping networks for instance segmentation [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
[29]   ZHUO W, SALZMANN M, HE X, et al. Indoor scene parsing with instance segmentation, semantic labeling and support relationship inference [C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 6656-6664.
[30]   MO K, ZHU S, CHANG A X, et al. PartNet: a large-scale benchmark for fine-grained and hierarchical part-level 3D object understanding [C]// IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 998-1008.
[31]   YANG B, WANG J, CLARK R, et al. Learning object bounding boxes for 3D instance segmentation on point clouds [C]// Proceedings of the Advances in Neural Information Processing Systems. Vancouver: NIPS, 2019: 563-575.
[32]   QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the Advances in Neural Information Processing Systems. Long Beach: NIPS, 2017: 5099-5108.