Journal of Zhejiang University (Science Edition)  2023, Vol. 50 Issue (6): 770-780    DOI: 10.3785/j.issn.1008-9497.2023.06.012
Special Topic: The 15th National Conference on Geometric Design and Computing
A point cloud processing network combining global and local information
Yujie LIU, Yafu YUAN, Xiaorui SUN, Zongmin LI
College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, Shandong Province, China
Abstract:

Current mainstream point cloud processing networks rely solely on local neighborhoods for feature aggregation, which limits their feature extraction capability, and they use max-pooling, which discards information. To address both problems, we propose an attention-based point cloud processing network that combines local and global information. First, we introduce channel self-attention for local feature aggregation to reduce information loss. Next, to capture long-range dependencies between points, we design a method that dynamically learns key points to obtain global information. Finally, we build a feature fusion module based on spatial attention so that every point can learn global contextual information. The method was evaluated on several commonly used point cloud datasets. On the ModelNet40 classification task, it achieved an overall classification accuracy of 94.0% and an average classification accuracy of 91.7%. On the ScanObjectNN classification task, it reached an overall classification accuracy of 81.5% and an average classification accuracy of 78.1%. On the ShapeNet segmentation task, it obtained a mean intersection over union of 86.5%. These results show that the proposed network is significantly more accurate than classical networks such as PointNet, PointNet++, and DGCNN in classification and segmentation tasks, and that it also improves, to varying degrees, on other point cloud processing networks.

Key words: point cloud classification    point cloud segmentation    attention mechanism    global information    local information
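The abstract contrasts max-pooling with attention-based aggregation over a local neighborhood. The following NumPy sketch illustrates that general contrast only; it is not the authors' module. The function names, the score matrix `w_score`, and the tensor shapes are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def max_pool_aggregate(neigh_feats):
    """Max pooling over K neighbors: keeps a single value per channel."""
    # neigh_feats has shape (N, K, C): N points, K neighbors, C channels.
    return neigh_feats.max(axis=1)

def attention_aggregate(neigh_feats, w_score):
    """Softmax-weighted sum over K neighbors, with separate weights per channel.

    w_score is a hypothetical (C, C) learnable matrix that maps each
    neighbor feature to one score per channel.
    """
    scores = neigh_feats @ w_score               # (N, K, C)
    weights = softmax(scores, axis=1)            # normalize across the K neighbors
    return (weights * neigh_feats).sum(axis=1)   # (N, C)

rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8, 16))              # toy neighborhood features
w = rng.normal(size=(16, 16)) * 0.1              # stand-in for learned weights

pooled_max = max_pool_aggregate(feats)
pooled_att = attention_aggregate(feats, w)
print(pooled_max.shape, pooled_att.shape)
```

In this sketch every neighbor contributes to the output through its weight, whereas max pooling passes on only the largest value in each channel. With all-zero scores the weighted sum reduces to a plain mean over the neighbors.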
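Received: 2023-06-21    Published: 2023-11-30
CLC:  TP 391.41
Funding: National Key Research and Development Program of China (2019YFF0301800); National Natural Science Foundation of China (61379106); Natural Science Foundation of Shandong Province (ZR2013FM036)
Corresponding author: Yafu YUAN     E-mail: yuanyafu@s.upc.edu.cn
Author biography: Yujie LIU (born 1971), ORCID: https://orcid.org/0000-0003-1001-963X, male, Ph.D., associate professor. His research covers computer graphics and image processing and person re-identification.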

Cite this article:

Yujie LIU, Yafu YUAN, Xiaorui SUN, Zongmin LI. A point cloud processing network combining global and local information. Journal of Zhejiang University (Science Edition), 2023, 50(6): 770-780.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2023.06.012        https://www.zjujournals.com/sci/CN/Y2023/V50/I6/770

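Fig. 1  Three methods of feature aggregation
Fig. 2  Spatial illustration of points A and B and of their neighborhood planes
Fig. 3  Feature aggregation process of mainstream methods
Fig. 4  Local information aggregation module based on channel self-attention
Fig. 5  Feature fusion module based on spatial attention
Fig. 6  LGA module
Fig. 7  LGANet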
Method                  MA/%   OA/%
PointNet[1]             86.0   89.2
PointNet++[2]           -      90.7
PointCNN[10]            88.1   92.3
A-CNN[4]                90.3   92.6
DGCNN[17]               90.2   92.9
PAConv[12]              -      93.6
PointASNL[22]           -      92.9
GDANet[23]              -      93.4
CurveNet[25]            -      93.8
Point Transformer[18]   90.6   93.7
PointMLP[44]            91.3   94.1
PointNeXt[45]           -      94.0
LGANet                  91.7   94.0
Table 1  Accuracy of different methods on the ModelNet40 dataset
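The OA and MA columns report overall accuracy and mean per-class accuracy. As a minimal sketch of how these two metrics are commonly computed (the function names and toy labels are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def overall_accuracy(y_true, y_pred):
    """OA: fraction of all samples that are classified correctly."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true == y_pred).mean())

def mean_class_accuracy(y_true, y_pred):
    """MA: accuracy computed per class, then averaged with equal class weight."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    per_class = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(per_class))

# Toy example with an imbalanced label set: four samples of class 0, two of class 1.
labels = [0, 0, 0, 0, 1, 1]
preds = [0, 0, 0, 0, 1, 0]

oa = overall_accuracy(labels, preds)      # 5 of 6 samples are correct
ma = mean_class_accuracy(labels, preds)   # mean of 1.0 (class 0) and 0.5 (class 1)
print(f"OA = {oa:.3f}, MA = {ma:.3f}")
```

Because MA weights every class equally, it falls below OA when the errors concentrate in classes with few samples, which is why both numbers are usually reported together.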
Method          MA/%   OA/%
PointNet[1]     63.2   68.2
PointNet++[2]   75.4   77.9
DGCNN[17]       73.6   78.1
SpiderCNN[11]   69.8   73.7
PointCNN[10]    75.1   78.5
DRNet[42]       78.0   80.3
GBNet[43]       77.8   80.5
LGANet          78.1   81.5
Table 2  Accuracy of different methods on the ScanObjectNN dataset
Category     mIoU/%
             PointNet[1]  PointNet++[2]  DGCNN[17]  SpiderCNN[11]  PointASNL[22]  GS-Net[21]  PointCNN[10]  LGANet
Airplane     83.4         82.4           84.0       83.5           84.1           82.9        84.1          85.1
Bag          78.7         79.0           83.4       81.0           84.7           84.3        86.5          85.1
Cap          82.5         87.7           86.7       87.2           87.9           88.6        86.0          90.1
Car          74.9         77.3           77.8       77.5           79.7           78.4        80.8          80.0
Chair        89.6         90.3           90.6       90.7           92.2           89.7        90.6          91.6
Earphone     73.0         76.8           74.7       76.8           73.7           78.3        79.7          77.6
Guitar       91.5         91.0           91.2       91.1           91.0           91.7        92.3          92.0
Knife        85.9         85.9           87.5       87.3           87.2           86.7        88.4          87.8
Lamp         80.8         83.7           82.8       83.3           84.2           81.2        85.3          85.3
Laptop       95.3         95.3           95.7       95.8           95.8           95.6        96.1          96.1
Motorbike    65.2         71.6           66.3       70.2           74.4           72.8        77.2          73.2
Mug          93.0         94.1           94.9       93.5           95.2           94.7        95.2          95.5
Pistol       81.2         81.3           81.1       82.7           81.0           93.1        84.2          81.6
Rocket       57.9         58.7           63.5       59.7           63.0           62.3        64.2          59.0
Skateboard   72.8         76.4           74.5       75.8           76.3           81.5        80.0          77.0
Table        80.6         82.6           82.6       82.8           83.2           83.8        83.0          83.9
Overall      83.7         85.1           85.2       85.3           86.1           85.3        86.1          86.5
Table 3  Comparison of per-category and overall mean intersection over union (mIoU) for ShapeNet part segmentation
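For readers unfamiliar with the mIoU metric, the sketch below computes the IoU of a single shape by averaging over its part labels. It follows one common convention in part segmentation benchmarks, where a part that is absent from both prediction and ground truth counts as IoU 1; whether the paper uses exactly this convention is an assumption, and the function name and toy labels are illustrative.

```python
import numpy as np

def shape_iou(pred, target, part_ids):
    """Mean IoU over the part labels of a single shape."""
    pred = np.asarray(pred)
    target = np.asarray(target)
    ious = []
    for p in part_ids:
        inter = np.sum((pred == p) & (target == p))
        union = np.sum((pred == p) | (target == p))
        # Assumed convention: a part missing from both sets counts as a perfect match.
        ious.append(1.0 if union == 0 else inter / union)
    return float(np.mean(ious))

# Toy shape with six points and two parts; one point is mislabeled.
target = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 1, 1]

iou = shape_iou(pred, target, part_ids=[0, 1])
print(f"shape IoU = {iou:.4f}")
```

A per-category score is then typically the mean of these values over all shapes of that category, and an overall score the mean over all test shapes.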
Fig. 8  Visualization of selected segmentation results on the ShapeNet dataset
Model   Modules enabled, out of LIACA-1, LIACA-2, DLK, SAFA   OA/%
A                                                             92.5
B       ✓                                                     93.0
C       ✓ ✓                                                   93.3
D       ✓ ✓ ✓                                                 93.7
E       ✓ ✓                                                   92.9
F       ✓ ✓ ✓ ✓                                               94.0
Table 4  Ablation study
Strategy   OA/%
FPS        93.1
Random     92.7
DLK        94.0
Table 5  Overall classification accuracy of keypoint acquisition strategies
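Two of the strategies compared here, farthest point sampling (FPS) and random sampling, are standard baselines. The sketch below shows a textbook version of both, under the assumption that the paper uses them in their usual form; the learned DLK strategy is not reproduced. The function names, the fixed start index, and the toy points are illustrative.

```python
import numpy as np

def farthest_point_sampling(points, m, start=0):
    """Greedy FPS: repeatedly pick the point farthest from all points chosen so far."""
    points = np.asarray(points, dtype=float)
    chosen = np.empty(m, dtype=int)
    chosen[0] = start
    # Distance from every point to its nearest already-chosen point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for i in range(1, m):
        chosen[i] = int(np.argmax(min_dist))
        d = np.linalg.norm(points - points[chosen[i]], axis=1)
        min_dist = np.minimum(min_dist, d)
    return chosen

def random_sampling(points, m, seed=0):
    """Uniform sampling without replacement."""
    rng = np.random.default_rng(seed)
    return rng.choice(len(points), size=m, replace=False)

pts = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [0.0, 0.8, 0.0],
    [0.9, 0.1, 0.0],
])

fps_idx = farthest_point_sampling(pts, 3)
rand_idx = random_sampling(pts, 3)
print(fps_idx.tolist())  # [0, 1, 3]
```

FPS spreads the selected points over the shape, while random sampling may select points that lie close together. Both are fixed procedures, in contrast to a strategy whose key points are learned during training.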
Number of keypoints   OA/%
0                     93.3
256                   92.9
500                   94.0
1024                  93.4
Table 6  Overall classification accuracy for different numbers of keypoints
Method          Parameters/M   OA/%
PointNet[1]     3.50           89.2
PointNet++[2]   1.48           90.7
GS-Net[21]      1.51           92.9
GBNet[43]       8.39           93.8
LGANet          2.20           94.0
Table 7  Comparison of classification model complexity
1 QI C R, SU H, MO K, et al. PointNet: Deep learning on point sets for 3D classification and segmentation[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 652-660. DOI:10.1109/CVPR.2017.16
2 QI C R, YI L, SU H, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space[J]. Advances in Neural Information Processing Systems, 2017: 5099-5108.
3 HORNIK K. Approximation capabilities of multilayer feedforward networks[J]. Neural Networks, 1991, 4(2): 251-257. DOI:10.1016/0893-6080(91)90009-T
4 KOMARICHEV A, ZHONG Z C, HUA J. A-CNN: Annularly convolutional neural networks on point clouds[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 7421-7430. DOI:10.1109/CVPR.2019.00760
5 ZHAO H S, JIANG L, FU C W, et al. Pointweb: Enhancing local neighborhood features for point cloud processing[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 5565-5573. DOI:10.1109/CVPR.2019.00571
6 SIMONOVSKY M, KOMODAKIS N. Dynamic edge-conditioned filters in convolutional neural networks on graphs[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 3693-3702. DOI:10.1109/CVPR.2017.11
7 LIU Y C, FAN B, XIANG S M, et al. Relation-shape convolutional neural network for point cloud analysis[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 8895-8904. DOI:10.1109/CVPR.2019.00910
8 JIANG L, ZHAO H S, LIU S, et al. Hierarchical point-edge interaction network for point cloud semantic segmentation[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 10433-10441. DOI:10.1109/ICCV.2019.01053
9 LIU Y C, FAN B, MENG G F, et al. Densepoint: Learning densely contextual representation for efficient point cloud processing[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 5239-5248. DOI:10.1109/ICCV.2019.00534
10 LI Y Y, BU R, SUN M C, et al. PointCNN: Convolution on x-transformed points[J]. Advances in Neural Information Processing Systems, 2018, 31.
11 XU Y F, FAN T Q, XU M Y, et al. SpiderCNN: Deep learning on point sets with parameterized convolutional filters[C]// Proceedings of the European Conference on Computer Vision (ECCV). 2018: 87-102. DOI:10.1007/978-3-030-01237-3_6
12 XU M T, DING R Y, ZHAO H S, et al. PAConv: Position adaptive convolution with dynamic kernel assembling on point clouds[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 3173-3182. DOI:10.1109/CVPR46437.2021.00319
13 SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition[C]// Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 945-953. DOI:10.1109/ICCV.2015.114
14 QI C R, SU H, NIESSNER M, et al. Volumetric and multi-view CNNs for object classification on 3D data[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 5648-5656. DOI:10.1109/CVPR.2016.609
15 WANG C, PELILLO M, SIDDIQI K. Dominant set clustering and pooling for multi-view 3D object recognition[Z]. (2019-06-04). https://arxiv.org/abs/1906.01592. DOI:10.5244/c.31.64
16 MA C, GUO Y L, YANG J G, et al. Learning multi-view representation with LSTM for 3D shape recognition and retrieval[J]. IEEE Transactions on Multimedia, 2018, 21(5): 1169-1182. DOI:10.1109/TMM.2018.2875512
17 WANG Y, SUN Y B, LIU Z W, et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 2019, 38(5): 1-12. DOI:10.1145/3326362
18 ZHAO H S, JIANG L, JIA J Y, et al. Point transformer[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 16259-16268. DOI:10.1109/ICCV48922.2021.01595
19 GUO M H, CAI J X, LIU Z N, et al. PCT: Point cloud transformer[J]. Computational Visual Media, 2021, 7: 187-199. DOI:10.1007/s41095-021-0229-5
20 VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in Neural Information Processing Systems, 2017, 30.
21 XU M Y, ZHOU Z P, QIAO Y. Geometry sharing network for 3D point cloud classification and segmentation[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 34(7): 12500-12507. DOI:10.1609/aaai.v34i07.6938
22 YAN X, ZHENG C D, LI Z, et al. PointASNL: Robust point clouds processing using nonlocal neural networks with adaptive sampling[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 5589-5598. DOI:10.1109/CVPR42600.2020.00563
23 XU M T, ZHANG J H, ZHOU Z P, et al. Learning geometry-disentangled representation for complementary understanding of 3D object point cloud[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2021, 35(4): 3056-3064. DOI:10.1609/aaai.v35i4.16414
24 YANG J C, ZHANG Q, NI B B, et al. Modeling point clouds with self-attention and gumbel subset sampling[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 3323-3332. DOI:10.1109/CVPR.2019.00344
25 XIANG T, ZHANG C Y, SONG Y, et al. Walk in the cloud: Learning curves for point clouds shape analysis[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 915-924. DOI:10.1109/ICCV48922.2021.00095
26 FAN S Q, DONG Q L, ZHU F H, et al. SCF-Net: Learning spatial contextual features for large-scale point cloud segmentation[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 14504-14513. DOI:10.1109/CVPR46437.2021.01427
27 HU Q Y, YANG B, XIE L H, et al. RandLA-Net: Efficient semantic segmentation of large-scale point clouds[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 11108-11117. DOI:10.1109/CVPR42600.2020.01112
28 CHEN J T, KAKILLIOGLU B, REN H T, et al. Why discard if you can recycle? A recycling max pooling module for 3D point cloud analysis[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 559-567. DOI:10.1109/CVPR52688.2022.00064
29 GAO H Y, JI S W. Graph U-nets[C]// Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research. California: IEEE, 2019, 97: 2083-2092. DOI:10.1109/TPAMI.2021.3081010
30 SZEGEDY C, LIU W, JIA Y Q, et al. Going deeper with convolutions[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1-9. DOI:10.1109/CVPR.2015.7298594
31 SHI S, WANG X, LI H. PointRCNN: 3D object proposal generation and detection from point cloud[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 770-779. DOI:10.1109/CVPR.2019.00086
32 WANG D Z, POSNER I. Voting for voting in online point cloud object detection[J]. Robotics: Science and Systems, 2015, 1(3): 10-15. DOI:10.15607/RSS.2015.XI.035
33 LIU Z, ZHOU S B, SUO C Z, et al. LPD-Net: 3D point cloud learning for large-scale place recognition and environment analysis[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 2831-2840. DOI:10.1109/ICCV.2019.00292
34 MATURANA D, SCHERER S. VoxNet: A 3D convolutional neural network for real-time object recognition[C]// 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Hamburg: IEEE, 2015: 922-928. DOI:10.1109/IROS.2015.7353481
35 WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: A deep representation for volumetric shapes[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1912-1920. DOI:10.1109/CVPR.2015.7298801
36 RIEGLER G, OSMAN ULUSOY A, GEIGER A. OctNet: Learning deep 3D representations at high resolutions[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 3577-3586. DOI:10.1109/CVPR.2017.701
37 SHEN Y R, FENG C, YANG Y Q, et al. Mining point cloud local structures by kernel correlation and graph pooling[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4548-4557. DOI:10.1109/CVPR.2018.00478
38 LIU J X, NI B B, LI C Y, et al. Dynamic points agglomeration for hierarchical point sets learning[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 7546-7555. DOI:10.1109/ICCV.2019.00764
39 WU Z R, SONG S R, KHOSLA A, et al. 3D ShapeNets: A deep representation for volumetric shapes[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1912-1920. DOI:10.1109/CVPR.2015.7298801
40 UY M A, PHAM Q H, HUA B S, et al. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 1588-1597. DOI:10.1109/ICCV.2019.00167
41 YI L, KIM V G, CEYLAN D, et al. A scalable active framework for region annotation in 3D shape collections[J]. ACM Transactions on Graphics (ToG), 2016, 35(6): 1-12. DOI:10.1145/2980179.2980238
42 QIU S, ANWAR S, BARNES N. Dense-resolution network for point cloud classification and segmentation[C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2021: 3813-3822. DOI:10.1109/WACV48630.2021.00386
43 QIU S, ANWAR S, BARNES N. Geometric back-projection network for point cloud classification[J]. IEEE Transactions on Multimedia, 2021, 24: 1943-1955. DOI:10.1109/TMM.2021.3074240
44 MA X, QIN C, YOU H X, et al. Rethinking network design and local geometry in point cloud: A simple residual MLP framework[Z]. (2022-02-15). .