Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (7): 1345-1353    DOI: 10.3785/j.issn.1008-973X.2023.07.009
Automation Technology
VR sickness estimation model based on 3D-ResNet two-stream network
Wei QUAN, Yong-qing CAI, Chao WANG, Jia SONG, Hong-kai SUN, Lin-xuan LI
School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130013, China
Abstract:

A VR sickness estimation model based on a 3D two-stream convolutional neural network is proposed to accurately estimate the degree of discomfort caused by VR video. Two sub-networks, an appearance stream and a motion stream, are built to mimic the two pathways of the human visual system. The 2D-ResNet50 model is converted into a 3D model by adding a depth channel, which lets the network learn the temporal information in videos. A 3D-CBAM attention module is added to improve the spatial correlation between the channels of each frame, enhancing key information and suppressing redundant information. Back-end fusion is used to combine the results of the two sub-networks. Experiments on a public video dataset show that introducing the attention mechanism through the 3D-CBAM module improves the VR sickness estimation accuracy of the appearance stream and motion stream networks by 1.7% and 3.6%, respectively. The fused two-stream network reaches an accuracy of 93.7%, a large improvement over the results reported in the existing literature.

Key words: virtual reality    VR sickness    deep learning    attention mechanism    3D convolutional neural network
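The back-end (late) fusion step mentioned in the abstract can be sketched as follows. The abstract does not state the fusion rule, so this sketch assumes a weighted average of the two streams' softmax outputs; the function names, the 0.5 weight, and the example logits are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(appearance_logits, motion_logits, w_appearance=0.5):
    """Fuse two streams by a weighted average of their class probabilities.

    Assumption: the averaging rule and the 0.5 weight are illustrative only.
    """
    p_app = softmax(appearance_logits)
    p_mot = softmax(motion_logits)
    w_motion = 1.0 - w_appearance
    fused = [w_appearance * a + w_motion * m for a, m in zip(p_app, p_mot)]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Hypothetical logits for the four comfort classes (0 to 3).
fused_probs, label = late_fusion([2.0, 0.5, 0.1, -1.0], [0.2, 1.5, 0.3, -0.5])
print(label, [round(p, 3) for p in fused_probs])
```

Fusing at the probability level keeps the two sub-networks independent, so each stream can be trained and evaluated on its own before the results are combined.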
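Received: 2022-08-20    Published: 2023-07-17
CLC:  TP 391
Funding: Key Research and Development Project of the Jilin Province Science and Technology Development Plan (20210203218SF)
About the author: Wei QUAN (born 1981), female, associate professor, engaged in research on virtual reality. orcid.org/0000-0001-7191-3921. E-mail: quanwei@cust.edu.cn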

Cite this article:

Wei QUAN, Yong-qing CAI, Chao WANG, Jia SONG, Hong-kai SUN, Lin-xuan LI. VR sickness estimation model based on 3D-ResNet two-stream network[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(7): 1345-1353.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.07.009
https://www.zjujournals.com/eng/CN/Y2023/V57/I7/1345

Fig. 1  3D-ResNet VR sickness estimation model based on two-stream network
| Layer | Output size | 3D-ResNet50 |
| --- | --- | --- |
| Conv1 | $ L \times 112 \times 112 $ | $ 7 \times 7 \times 7 $, 64, stride 2 |
| Conv2_x | $ L \times 56 \times 56 $ | $ \left[\begin{array}{c}1\times 1\times 1,64\\ 3\times 3\times 3,64\\ 1\times 1\times 1,256\end{array}\right]\times 3 $ |
| Conv3_x | $ \dfrac{L}{2} \times 28 \times 28 $ | $ \left[\begin{array}{c}1\times 1\times 1,128\\ 3\times 3\times 3,128\\ 1\times 1\times 1,512\end{array}\right]\times 4 $ |
| Conv4_x | $ \dfrac{L}{4} \times 14 \times 14 $ | $ \left[\begin{array}{c}1\times 1\times 1,256\\ 3\times 3\times 3,256\\ 1\times 1\times 1,1\;024\end{array}\right]\times 6 $ |
| Conv5_x | $ \dfrac{L}{8} \times 7 \times 7 $ | $ \left[\begin{array}{c}1\times 1\times 1,512\\ 3\times 3\times 3,512\\ 1\times 1\times 1,2\;048\end{array}\right]\times 3 $ |
|  | $ 1 \times 1 \times 1 $ | 3D-Average Pool, Fc Layer with Softmax |

Table 1  Structure of sub-network
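The stage layout in Table 1 can be written down as plain data to check that it matches the "50" in ResNet50. This is a minimal sketch: the helper names are hypothetical, and the count follows the usual convention of ignoring shortcut projection convolutions.

```python
# Stage layout of 3D-ResNet50 as listed in Table 1:
# (name, bottleneck width, output channels, number of blocks).
STAGES = [
    ("Conv2_x", 64, 256, 3),
    ("Conv3_x", 128, 512, 4),
    ("Conv4_x", 256, 1024, 6),
    ("Conv5_x", 512, 2048, 3),
]


def count_weight_layers(stages):
    """Count Conv1, three convolutions per bottleneck block, and the Fc layer."""
    conv_in_blocks = sum(3 * blocks for _, _, _, blocks in stages)
    return 1 + conv_in_blocks + 1


def bottleneck_kernels(width, out_channels):
    """Return the (kernel size, filters) pairs of one 3D bottleneck block."""
    return [((1, 1, 1), width), ((3, 3, 3), width), ((1, 1, 1), out_channels)]


for name, width, out_channels, blocks in STAGES:
    print(name, bottleneck_kernels(width, out_channels), "x", blocks)
print("weight layers:", count_weight_layers(STAGES))
```

The only change from the 2D layout is that every kernel gains a third (temporal) extent, for example a 3×3 kernel becomes 3×3×3.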
Fig. 2  Structure of 3D-CBAM
Fig. 3  Channel attention module
Fig. 4  Spatial attention module
Fig. 5  Structure of sub-network based on attention mechanism
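| Subjective score | Comfort level | Numeric class |
| --- | --- | --- |
| 0<Score≤10 | Comfortable | 0 |
| 10<Score≤30 | Slight discomfort | 1 |
| 31<Score≤40 | Obvious discomfort | 2 |
| Score>40 | Severe discomfort | 3 |

Table 2  Classification of subjective scores and comfort levels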
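The mapping in Table 2 can be sketched as a small function. Two points are assumptions rather than statements from the table: the class 2 range is listed with a lower bound of 31, which leaves scores between 30 and 31 unassigned, so this sketch places the boundary at 30; and a score of exactly 0 is treated as class 0. The function and constant names are hypothetical.

```python
COMFORT_LEVELS = [
    "comfortable",
    "slight discomfort",
    "obvious discomfort",
    "severe discomfort",
]


def score_to_class(score):
    """Map a subjective score to the numeric class of Table 2.

    Assumption: boundaries sit at 10, 30 and 40, and scores at or
    below 10 (including 0) fall into class 0.
    """
    if score <= 10:
        return 0
    if score <= 30:
        return 1
    if score <= 40:
        return 2
    return 3


for s in (5, 10, 25, 35, 60):
    c = score_to_class(s)
    print(s, c, COMFORT_LEVELS[c])
```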
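| Item | Specification |
| --- | --- |
| CPU | Intel(R) Xeon(R) CPU E5-2620 @ 2.00 GHz |
| GPU | NVIDIA GeForce RTX 2080 SUPER 8 GB |
| Memory | 16 GB |
| Operating system | Windows 10 |
| Parallel computing platform | CUDA 10.0, cuDNN 7.6.1 |
| Deep learning framework | PyTorch 1.2 |
| Development environment | Anaconda3, Python 3.6 |

Table 3  Configuration of test platform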
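| Parameter | Value |
| --- | --- |
| L | 16 |
| Input image dimensions | [3,112,112], [2,112,112] |
| α | $ 10^{-2} $ |
| w | $ 10^{-3} $ |
| β | 0.9 |
| Batchsize | 8 |
| Emax | 120 |
| Optimizer | SGD with momentum |

Table 4  Training parameter settings of sub-networks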
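The optimizer setting in Table 4 can be illustrated with one hand-written update step. Table 4 does not define its symbols, so this sketch assumes α is the learning rate, w the weight decay, β the momentum coefficient, and Emax the epoch limit. The update rule shown is the common momentum-SGD form with L2 weight decay, not necessarily the exact variant the authors used, and all names are hypothetical.

```python
# Values copied from Table 4; the meaning attached to each symbol is an assumption.
CONFIG = {
    "clip_length": 16,       # L
    "learning_rate": 1e-2,   # alpha (assumed to be the learning rate)
    "weight_decay": 1e-3,    # w (assumed to be the weight decay)
    "momentum": 0.9,         # beta (assumed to be the momentum coefficient)
    "batch_size": 8,
    "max_epochs": 120,       # Emax (assumed to be the epoch limit)
}


def sgd_momentum_step(params, grads, velocity, cfg):
    """Apply one momentum-SGD update with L2 weight decay to lists of floats."""
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        g = g + cfg["weight_decay"] * p          # add the weight-decay term
        v = cfg["momentum"] * v + g              # accumulate the velocity
        new_params.append(p - cfg["learning_rate"] * v)
        new_velocity.append(v)
    return new_params, new_velocity


# Toy parameters and gradients, for illustration only.
params, velocity = [1.0, -2.0], [0.0, 0.0]
params, velocity = sgd_momentum_step(params, [0.5, 0.1], velocity, CONFIG)
print(params, velocity)
```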
Fig. 6  Loss and accuracy curves of the model
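| Model | $ P $/% | $ N $ | $ t $/h |
| --- | --- | --- | --- |
| Appearance stream network, no attention mechanism | 87.9 | 46 237 032 | 11.2 |
| Motion stream network, no attention mechanism | 79.3 | 46 237 032 | 11.2 |
| Two-stream network without attention mechanism | 91.5 |  |  |
| Appearance stream network, attention mechanism | 89.6 | 51 250 085 | 12.0 |
| Motion stream network, attention mechanism | 82.9 | 51 250 085 | 12.0 |
| Two-class SVM [24] | 81.8 |  |  |
| Three-class SVM [24] | 58.0 |  |  |
| Four-class ANN [25] | 90.0 |  |  |
| Two-stream network with attention mechanism | 93.7 |  |  |

Table 5  Comparison of model accuracy (Padmanaban)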
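The gains quoted in the abstract can be recomputed from Table 5. This is a minimal sketch with hypothetical names. The 1.7 and 3.6 point gains are stated in the source, while the fused-network gain and the parameter overhead are derived here from the table values.

```python
# Accuracy (%) and parameter counts copied from Table 5.
ACCURACY = {
    "appearance": {"plain": 87.9, "attention": 89.6},
    "motion": {"plain": 79.3, "attention": 82.9},
    "two_stream": {"plain": 91.5, "attention": 93.7},
}
PARAMS = {"plain": 46_237_032, "attention": 51_250_085}


def accuracy_gain(name):
    """Return the accuracy gain from attention, in percentage points."""
    row = ACCURACY[name]
    return round(row["attention"] - row["plain"], 1)


def extra_parameters():
    """Return the added parameter count and its share of the plain model (%)."""
    extra = PARAMS["attention"] - PARAMS["plain"]
    return extra, 100.0 * extra / PARAMS["plain"]


for name in ACCURACY:
    print(name, accuracy_gain(name))
extra, percent = extra_parameters()
print(extra, round(percent, 2))
```

Read this way, the attention module adds roughly a tenth more parameters per sub-network in exchange for the accuracy gains listed above.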
Fig. 7  Visualization of 3D-CBAM attention
1 GUNA J, GERŠAK G, HUMAR I, et al. Influence of video content type on users’ virtual reality sickness perception and physiological response[J]. Future Generation Computer Systems, 2019, 91: 263-276. doi: 10.1016/j.future.2018.08.049
2 MCCAULEY M E, SHARKEY T J. Cybersickness: perception of self-motion in virtual environments[J]. Presence: Teleoperators and Virtual Environments, 1992, 1(3): 311-318. doi: 10.1162/pres.1992.1.3.311
3 GUNA J, GERŠAK G, HUMAR I, et al. Virtual reality sickness and challenges behind different technology and content settings[J]. Mobile Networks and Applications, 2020, 25(4): 1436-1445. doi: 10.1007/s11036-019-01373-w
4 CHEN S, WENG D. The temporal pattern of VR sickness during 7.5-h virtual immersion[J]. Virtual Reality, 2022, 26(3): 817-822. doi: 10.1007/s10055-021-00592-5
5 KIM H G, LEE S, KIM S, et al. Towards a better understanding of VR sickness: physical symptom prediction for VR contents[C]// Proceedings of the AAAI Conference on Artificial Intelligence. Washington: AAAI, 2021: 836-844.
6 LIM K, LEE J, WON K, et al. A novel method for VR sickness reduction based on dynamic field of view processing[J]. Virtual Reality, 2021, 25(2): 331-340. doi: 10.1007/s10055-020-00457-3
7 NG A K T, CHAN L K Y, LAU H Y K. A study of cybersickness and sensory conflict theory using a motion-coupled virtual reality system[C]// 2018 IEEE Conference on Virtual Reality and 3D User Interfaces. Reutlingen: IEEE, 2018: 643-644.
8 KENNEDY R S, LANE N E, BERBAUM K S, et al. Simulator sickness questionnaire: an enhanced method for quantifying simulator sickness[J]. International Journal of Aviation Psychology, 1993, 3(3): 203-220. doi: 10.1207/s15327108ijap0303_3
9 KIM H G, BADDAR W J, LIM H, et al. Measurement of exceptional motion in VR video contents for VR sickness assessment using deep convolutional autoencoder[C]// 23rd ACM Conference on Virtual Reality Software and Technology. Gothenburg: ACM, 2017: 1-7.
10 KIM H G, LIM H T, LEE S, et al. VRSA Net: VR sickness assessment considering exceptional motion for 360 VR video[J]. IEEE Transactions on Image Processing, 2018, 28(4): 1646-1660.
11 LEE T M, YOON J C, LEE I K. Motion sickness prediction in stereoscopic videos using 3D convolutional neural networks[J]. IEEE Transactions on Visualization and Computer Graphics, 2019, 25(5): 1919-1927. doi: 10.1109/TVCG.2019.2899186
12 GOODALE M A, MILNER A D. Separate visual pathways for perception and action[J]. Trends in Neurosciences, 1992, 15(1): 20-25. doi: 10.1016/0166-2236(92)90344-8
13 HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
14 HUANG C, WANG F, ZHANG R. Sign language recognition based on CBAM-ResNet[C]// Proceedings of the 2019 International Conference on Artificial Intelligence and Advanced Manufacturing. New York: ACM, 2019: 1-6.
15 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// Proceedings of the European Conference on Computer Vision. Munich: Springer, 2018: 3-19.
16 QUAN Wei, WANG Chao, GENG Xue-na, et al. Research on VR experience comfort based on motion perception[J]. Journal of System Simulation, 2023, 35(1): 169-177. doi: 10.16182/j.issn1004731x.joss.21-0966
17 KIM J, KIM W, AHN S, et al. Virtual reality sickness predictor: analysis of visual-vestibular conflict and VR contents[C]// Proceedings of 2018 10th International Conference on Quality of Multimedia Experience. Sardinia: IEEE, 2018: 1-6.
18 PADMANABAN N, RUBAN T, SITZMANN V, et al. Towards a machine-learning approach for sickness prediction in 360 stereoscopic videos[J]. IEEE Transactions on Visualization and Computer Graphics, 2018, 24(4): 1594-1603. doi: 10.1109/TVCG.2018.2793560
19 HELL S, ARGYRIOU V. Machine learning architectures to predict motion sickness using a virtual reality rollercoaster simulation tool[C]// IEEE International Conference on Artificial Intelligence and Virtual Reality. New York: IEEE, 2018: 153-156.
20 PORCINO T, RODRIGUES E O, SILVA A, et al. Using the gameplay and user data to predict and identify causes of cybersickness manifestation in virtual reality games[C]// IEEE 8th International Conference on Serious Games and Applications for Health. Vancouver: IEEE, 2020: 1-8.
21 YILDIRIM C. A review of deep learning approaches to EEG-based classification of cybersickness in virtual reality[C]// 2020 IEEE International Conference on Artificial Intelligence and Virtual Reality. Utrecht: IEEE, 2020: 351-357.
22 LI Y, LIU A, DING L. Machine learning assessment of visually induced motion sickness levels based on multiple biosignals[J]. Biomedical Signal Processing and Control, 2019, 49: 202-211. doi: 10.1016/j.bspc.2018.12.007
23 SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-CAM: visual explanations from deep networks via gradient-based localization[J]. International Journal of Computer Vision, 2020, 128(2): 336-359. doi: 10.1007/s11263-019-01228-7
24 GARCIA-AGUNDEZ A, REUTER C, BECKER H, et al. Development of a classifier to determine factors causing cybersickness in virtual reality environments[J]. Games for Health Journal, 2019, 8(6): 439-444. doi: 10.1089/g4h.2019.0045