Journal of Zhejiang University (Engineering Science)  2021, Vol. 55 Issue (12): 2342-2351    DOI: 10.3785/j.issn.1008-973X.2021.12.014
Computer Technology
Deep 3D point cloud classification network based on competitive attention fusion
Han-juan CHEN1,2(),Fei-peng DA1,2,3,*(),Shao-yan GAI1,2
1. School of Automation, Southeast University, Nanjing 210096, China
2. Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing 210096, China
3. Shenzhen Research Institute, Southeast University, Shenzhen 518063, China
Abstract:

A competitive attention fusion (CAF) block that can be transferred to different classification networks was proposed, in order to improve the ability of deep 3D point cloud classification networks to extract and express global features and to enhance their robustness to noise interference. The block learns a global representation of multi-hierarchical features and the internal similarity of intermediate features, and re-allocates the weights of the intermediate feature channels accordingly. The proposed block was embedded in the benchmark networks PointNet++ and PointASNL for experiments. Results show that the block is independent and transferable, and that it focuses on the core backbone features that are most conducive to 3D point cloud shape classification. Compared with the benchmark networks, the block enhances the model's resistance to point cloud disturbance noise, outlier noise and random noise without decreasing classification accuracy: with 0, 10, 50, 100 and 200 random noise points, the proposed method achieves accuracies of 93.2%, 92.9%, 85.7%, 78.2% and 63.5%, respectively. Compared with traditional filtering methods, the end-to-end learning reduces pre-processing steps and manual intervention while providing better anti-noise performance.

Key words: point cloud object classification    3D point cloud    deep learning    neural network    attention mechanism    competitive fusion
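The abstract describes re-allocating the weights of intermediate feature channels from a learned global representation, in the spirit of squeeze-and-excitation (ref. [19]). The paper's exact formulation is in the full text; the following is only a rough pure-Python sketch of that general idea, in which the function name, weight shapes and toy dimensions are invented for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squeeze_excite(features, w1, w2):
    """Scale each feature channel by a gate learned from a global descriptor.

    features: N x C list of per-point feature vectors.
    w1: C x H reduction weights; w2: H x C expansion weights (H < C).
    """
    n, c = len(features), len(features[0])
    h = len(w1[0])
    # Squeeze: global average pooling over all points, one value per channel.
    desc = [sum(f[j] for f in features) / n for j in range(c)]
    # Excitation: bottleneck MLP, ReLU then sigmoid, yields gates in (0, 1).
    hidden = [max(0.0, sum(desc[j] * w1[j][k] for j in range(c)))
              for k in range(h)]
    gates = [sigmoid(sum(hidden[k] * w2[k][j] for k in range(h)))
             for j in range(c)]
    # Re-allocate channel weights: every point's channel j is scaled by gates[j].
    return [[f[j] * gates[j] for j in range(c)] for f in features]
```

Because each gate lies in (0, 1), this kind of reweighting can only attenuate channels; it suppresses channels that the global descriptor deems unhelpful while leaving the dominant ones nearly unchanged.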
Received: 2021-01-04    Published: 2021-12-31
CLC:  TP 181  
Supported by: the National Natural Science Foundation of China (51475092), the Frontier Leading Technology Basic Research Program of Jiangsu Province (BK20192004C), the Natural Science Foundation of Jiangsu Province (BK20181269), and the Shenzhen Science and Technology Innovation Commission (JCYJ20180306174455080)
Corresponding author: Fei-peng DA    E-mail: 220181486@seu.edu.cn; dafp@seu.edu.cn
About the author: Han-juan CHEN (1996—), female, master's degree candidate, engaged in research on computer vision and 3D point cloud processing. orcid.org/0000-0001-7262-8065. E-mail: 220181486@seu.edu.cn
Cite this article:


Han-juan CHEN, Fei-peng DA, Shao-yan GAI. Deep 3D point cloud classification network based on competitive attention fusion. Journal of Zhejiang University (Engineering Science), 2021, 55(12): 2342-2351.

Link this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2021.12.014        https://www.zjujournals.com/eng/CN/Y2021/V55/I12/2342

Fig. 1  Schematic diagram of the competitive attention fusion (CAF) block
Fig. 2  Structures of the 2D and 3D squeeze-and-excitation submodules
Fig. 3  Structure of the feature intrinsic-correlation self-attention submodule
Fig. 4  Framework of the CAF-based 3D point cloud classification network
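Figs. 1–3 name two attention branches (a squeeze-and-excitation submodule and a feature intrinsic-correlation self-attention submodule) whose channel weights are fused competitively before rescaling the features. The paper's actual fusion rule is defined in the full text; as a purely illustrative sketch under that reading (function name and softmax formulation are hypothetical), two candidate channel-gate vectors can compete per channel as follows:

```python
import math

def competitive_fuse(gates_a, gates_b, features):
    """Fuse two candidate channel-gate vectors by per-channel softmax
    competition, then rescale every feature vector with the fused gates.

    gates_a, gates_b: length-C gate vectors from two attention branches.
    features: N x C list of per-point feature vectors.
    """
    fused = []
    for ga, gb in zip(gates_a, gates_b):
        ea, eb = math.exp(ga), math.exp(gb)
        # The softmax decides how much each branch contributes to this channel:
        # the branch with the larger gate wins most of the channel's weight.
        fused.append((ea * ga + eb * gb) / (ea + eb))
    return [[x * g for x, g in zip(f, fused)] for f in features]
```

Letting the branches compete channel by channel, rather than simply averaging them, means each channel is dominated by whichever attention view is more confident about it, which is one way to read "competitive" fusion.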
Method              Input      Nin/10³   Acc/%
PointNet[4]         Pnt        1         89.2
SO-Net[8]           Pnt,Noml   2         90.9
PointNet++[5]       Pnt,Noml   5         91.9
PointCNN[10]        Pnt        1         92.2
Point2Sequence[12]  Pnt        1         92.6
A-CNN[13]           Pnt,Noml   1         92.6
PointASNL[21]       Pnt        1         92.9 (92.85)
PointASNL[21]       Pnt,Noml   1         93.2 (93.15)
Ours                Pnt        1         92.9 (92.88)
Ours                Pnt,Noml   1         93.2 (93.19)
Table 1  Average classification accuracy on the ModelNet40 dataset (Pnt: point coordinates; Noml: normals; Nin: number of input points)
Fig. 5  Anti-interference performance of the CAF block under different noise types
Fig. 6  Effect of the CAF block on model robustness
Fig. 7  Anti-interference performance of the CAF block versus traditional filtering
Fig. 8  Effect of the independent submodules on model robustness
n      Acc/%
       Base    Base+MFSE   Base+FICSA   Base+CAF
0      93.2    93.2        92.6         93.2
1      92.1    91.7        91.8         92.3
10     88.3    86.5        89.8         89.1
50     78.1    74.0        80.1         81.9
100    71.1    60.6        72.5         74.8
Table 2  Effect of the independent submodules on the model's anti-interference performance
Method          mIoU   IoU (aero / bag / cap / car / chair / earphone / guitar / knife / lamp / laptop / motor / mug / pistol / rocket / skateboard / table)
PointNet[4]     83.7   83.4 78.7 82.5 74.9 89.6 73.0 91.5 85.9 80.8 95.3 65.2 93.0 81.2 57.9 72.8 80.6
SO-Net[8]       84.9   82.8 77.8 88.0 77.3 90.6 73.5 90.7 83.9 82.8 94.8 69.1 94.2 80.9 53.1 72.9 83.0
PointNet++[5]   85.1   82.4 79.0 87.7 77.3 90.8 71.8 91.0 85.9 83.7 95.3 71.6 94.1 81.3 58.7 76.4 82.6
P2Sequence[12]  85.2   82.6 81.8 87.5 77.3 90.8 77.1 91.1 86.9 83.9 95.7 70.8 94.6 79.3 58.1 75.2 82.8
PointCNN[10]    86.1   84.1 86.5 86.0 80.8 90.6 79.7 92.3 88.4 85.3 96.1 77.2 95.2 84.2 64.2 80.0 83.0
PointASNL[21]   86.1   84.1 84.7 87.9 79.7 92.2 73.7 91.0 87.2 84.2 95.8 74.4 95.2 81.0 63.0 76.3 83.2
Ours            85.9   84.2 83.2 87.4 79.2 91.9 74.3 91.5 86.4 84.3 95.7 73.7 95.4 82.6 62.4 75.0 82.7
Table 3  Part segmentation performance on the ShapeNetPart dataset
Method         OA     mAcc   mIoU   IoU (ceiling / floor / wall / beam / column / window / door / table / chair / sofa / bookcase / board / clutter)
PointNet[4]    78.5   66.2   47.6   88.0 88.7 69.3 42.4 23.1 47.5 51.6 42.0 54.1 38.2 9.6 29.4 35.2
A-CNN[13]      87.3   ?      62.9   92.4 96.4 79.2 59.5 34.2 56.3 65.0 66.5 78.0 28.5 56.9 48.0 56.8
PointCNN[10]   88.1   75.6   65.4   94.8 97.3 75.8 63.3 51.7 58.4 57.2 71.6 69.1 39.1 61.2 52.2 58.6
PointWeb[14]   87.3   76.2   66.7   93.5 94.2 80.8 52.4 41.3 64.9 68.1 71.4 67.1 50.3 62.7 62.2 58.5
PointASNL[21]  88.8   79.0   68.7   95.3 97.9 81.9 47.0 48.0 67.3 70.5 71.3 77.8 50.7 60.4 63.0 62.8
Ours           88.2   78.7   68.3   95.1 97.3 81.2 47.4 45.8 67.0 69.1 72.1 77.5 50.6 60.8 62.4 61.6
Table 4  Semantic segmentation performance on the S3DIS dataset (6-fold cross-validation)
1 BU S, LIU Z, HAN J, et al. Learning high-level feature by deep belief networks for 3-D model retrieval and recognition [J]. IEEE Transactions on Multimedia, 2014, 16(8): 2154-2167. doi: 10.1109/TMM.2014.2351788
2 SU H, MAJI S, KALOGERAKIS E, et al. Multi-view convolutional neural networks for 3D shape recognition [C]// 2015 IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 945-953.
3 WU Z, SONG S, KHOSLA A, et al. 3D ShapeNets: a deep representation for volumetric shapes [C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1912-1920.
4 QI C R, SU H, MO K, et al. PointNet: deep learning on point sets for 3D classification and segmentation [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 77-85.
5 QI C R, YI L, SU H, et al. PointNet++: deep hierarchical feature learning on point sets in a metric space [C]// Advances in Neural Information Processing Systems. Long Beach: MIT Press, 2017: 5099-5108.
6 GUERRERO P, KLEIMAN Y, OVSJANIKOV M, et al. PCPNET: learning local shape properties from raw point clouds [J]. Computer Graphics Forum, 2018, 37(2): 75-85. doi: 10.1111/cgf.13343
7 SHEN Y, FENG C, YANG Y, et al. Mining point cloud local structures by kernel correlation and graph pooling [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4548-4557.
8 LI J, CHEN B M, LEE G H. SO-Net: self-organizing network for point cloud analysis [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9397-9406.
9 QI C R, LIU W, WU C, et al. Frustum PointNets for 3D object detection from RGB-D data [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 918-927.
10 LI Y, BU R, SUN M, et al. PointCNN: convolution on Χ-transformed points [C]// Advances in Neural Information Processing Systems. Montreal: MIT Press, 2018: 828-838.
11 LIU Y, FAN B, MENG G, et al. DensePoint: learning densely contextual representation for efficient point cloud processing [C]// 2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 5238-5247.
12 LIU X, HAN Z, LIU Y S, et al. Point2Sequence: learning the shape representation of 3D point clouds with an attention-based sequence to sequence network [C]// Proceedings of the AAAI Conference on Artificial Intelligence. Honolulu: AAAI, 2019: 8778-8785.
13 KOMARICHEV A, ZHONG Z, HUA J. A-CNN: annularly convolutional neural networks on point clouds [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 7413-7422.
14 ZHAO H, JIANG L, FU C W, et al. PointWeb: enhancing local neighborhood features for point cloud processing [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 5560-5568.
15 WANG C, SAMARI B, SIDDIQI K. Local spectral graph convolution for point set feature learning [C]// 15th European Conference on Computer Vision. Munich: Springer, 2018: 56-71.
16 TE G, HU W, GUO Z, et al. RGCNN: regularized graph CNN for point cloud segmentation [C]// Proceedings of the 26th ACM international conference on Multimedia. Seoul: ACM, 2018: 746-754.
17 LANDRIEU L, SIMONOVSKY M. Large-scale point cloud semantic segmentation with superpoint graphs [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 4558-4567.
18 VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Advances in Neural Information Processing Systems. Long Beach: MIT Press, 2017: 5998-6008.
19 HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132-7141.
20 YOU H, FENG Y, JI R, et al. PVNet: a joint convolutional network of point cloud and multi-view for 3D shape recognition [C]// Proceedings of the 26th ACM international conference on Multimedia. Seoul: ACM, 2018: 1310-1318.
21 YAN X, ZHENG C, LI Z, et al. PointASNL: robust point clouds processing using nonlocal neural networks with adaptive sampling [C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 5588-5597.
22 HU Y, WEN G, LUO M, et al. Competitive inner-imaging squeeze and excitation for residual network [EB/OL]. (2018-12-23)[2020-12-29]. https://arxiv.org/abs/1807.08920.
23 YI L, KIM V G, CEYLAN D, et al. A scalable active framework for region annotation in 3D shape collections [J]. ACM Transactions on Graphics, 2016, 35(6): 210.