Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (6): 1110-1118    DOI: 10.3785/j.issn.1008-973X.2025.06.002
    
Visual induced motion sickness estimation model based on attention mechanism
Yongqing CAI, Cheng HAN*, Wei QUAN, Wudi CHEN
School of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China

Abstract  

A visual induced motion sickness (VIMS) estimation model based on an attention mechanism was proposed to accurately assess the degree of VIMS that users experience when interacting with virtual products. The model was built on the Transformer architecture and applied self-attention separately over temporal and spatial sequences to capture the relationships between temporal and spatial features. Using optical flow information and user attention information, two sub-networks, a motion flow sub-network and an attention flow sub-network, were designed to form a dual-flow network. The motion flow sub-network captured the motion features of the visual content, and the attention flow sub-network extracted key information, such as objects and textures, within the user’s attention area. A late fusion strategy combined the outputs of the two sub-networks. Experiments on public video datasets showed that combining the attention flow sub-network with the Transformer architecture improved model accuracy. The VIMS model achieved the best F1 score, accuracy and precision among the compared methods, with values of 0.8468, 89.19% and 92.28%, respectively.
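To make the temporal and spatial self-attention described in the abstract more concrete, the following is a minimal pure-Python sketch. It assumes a divided scheme (attention across frames for each patch position, then across patches within each frame) and identity query/key/value projections. Neither detail is stated on this page, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: Q = K = V = tokens. A real Transformer layer would apply
    learned linear projections and usually several attention heads.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all token vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def divided_space_time_attention(clip):
    """Apply temporal attention, then spatial attention.

    clip[t][s] is the feature vector of spatial patch s in frame t.
    """
    n_frames, n_patches = len(clip), len(clip[0])
    # Temporal step: for each patch position, attend across frames.
    temporal = [[None] * n_patches for _ in range(n_frames)]
    for s in range(n_patches):
        seq = self_attention([clip[t][s] for t in range(n_frames)])
        for t in range(n_frames):
            temporal[t][s] = seq[t]
    # Spatial step: for each frame, attend across its patches.
    return [self_attention(temporal[t]) for t in range(n_frames)]
```

In this sketch every output vector is a convex combination of input vectors, so the output keeps the shape of the input clip (frames × patches × feature dimension).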



Key words: virtual reality; visual induced motion sickness; attention mechanism; deep learning; Transformer architecture
Received: 25 November 2024      Published: 30 May 2025
CLC:  TP 391  
Fund:  Scientific Research Project of the Education Department of Jilin Province (JJKH20250531BS).
Corresponding Authors: Cheng HAN     E-mail: 1364392394@qq.com;hancheng@cust.edu.cn
Cite this article:

Yongqing CAI,Cheng HAN,Wei QUAN,Wudi CHEN. Visual induced motion sickness estimation model based on attention mechanism. Journal of ZheJiang University (Engineering Science), 2025, 59(6): 1110-1118.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.06.002     OR     https://www.zjujournals.com/eng/Y2025/V59/I6/1110


Fig.1 Overall framework of visual induced motion sickness estimation model based on attention mechanism
Fig.2 Extracting region of interest
Fig.3 Flowchart of subnetwork based on Transformer architecture
Fig.4 Composition of mixed dataset and examples of training/testing sets
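Configuration parameter                          Details
CPU                                              Intel Core i9 13900K
GPU                                              NVIDIA GeForce RTX 2080 Ti
Operating system                                 Ubuntu
General-purpose parallel computing architecture  CUDA 10.0, cuDNN 7.6.1
Deep learning framework                          PyTorch 1.2
Development environment                          Anaconda 3, Python 3.7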
Tab.1 Experimental environment configuration information
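Method            Input features                                Model               F1      Accuracy/%  Precision/%
Lee et al. [24]   motion flow + disparity flow + saliency flow  3D CNN              0.6494  74.36       81.87
Du et al. [25]    motion flow + disparity flow + saliency flow  3D CNN + attention  0.6898  78.38       83.34
Quan et al. [26]  motion flow + appearance flow                 3D CNN + attention  0.8167  86.49       88.91
This study        motion flow + attention flow                  Transformer         0.8468  89.19       92.28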
Tab.2 Comparison of experimental results of different VIMS estimation methods
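The F1, accuracy and precision columns can be computed from true and predicted VIMS levels roughly as in the sketch below. It assumes macro-averaging over classes, which this page does not confirm, and the function name is hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, macro precision and macro F1 for class labels.

    Assumption: per-class scores are averaged with equal weight (macro).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, f1s = [], []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)  # correctly predicted c
        fp = sum(t != c and p == c for t, p in pairs)  # predicted c, was not c
        fn = sum(t == c and p != c for t, p in pairs)  # was c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / len(precisions),
        "f1": sum(f1s) / len(f1s),
    }
```

Calling `classification_metrics([0, 0, 1, 1], [0, 1, 1, 1])` on these made-up labels scores one of the four samples as misclassified. Because the averaging scheme is an assumption, numbers produced this way may not match the table exactly.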
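Method                                            F1      Accuracy/%  Precision/%
Appearance flow network, ResNet architecture      0.5166  54.05       70.79
Attention flow network, ResNet architecture       0.6198  62.16       62.10
Motion flow network, ResNet architecture          0.6652  72.97       86.49
Appearance flow network, Transformer architecture 0.7488  78.38       72.36
Attention flow network, Transformer architecture  0.7925  83.78       89.10
Motion flow network, Transformer architecture     0.8240  86.48       90.99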
Tab.3 Comparison of experimental results for different subnetworks in ResNet/Transformer architectures
Fig.5 Accuracy and loss variation curves for different subnetwork models
Field of view                          F1      Accuracy/%  Precision/%
Full field of view (360°×180°)         0.7488  78.38       72.36
Attention field of view (180°×100°)    0.7925  83.78       89.10
HMD field of view (100°×90°)           0.7321  75.67       71.31
Tab.4 Comparison of experimental results of different fields of view
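As a rough illustration of how the three fields of view relate to a 360°×180° frame, the sketch below converts a field of view in degrees into a pixel crop box. It assumes an equirectangular frame and a plain rectangular crop around a viewing direction; the page does not describe the actual region-of-interest extraction, and `fov_crop_box` and its parameters are hypothetical.

```python
def fov_crop_box(width, height, fov_h_deg, fov_v_deg, yaw_deg=0.0, pitch_deg=0.0):
    """Return (left, top, right, bottom) pixel bounds of a field of view.

    Assumptions: the frame is equirectangular (360 degrees wide, 180 degrees
    tall), yaw 0 / pitch 0 is the frame center, and the crop is a simple
    rectangle rather than a perspective reprojection. Horizontal bounds are
    not wrapped, so a caller must handle views that cross the frame edge.
    """
    if not (0 < fov_h_deg <= 360 and 0 < fov_v_deg <= 180):
        raise ValueError("field of view must lie within 360 x 180 degrees")
    # Center of the view in pixel coordinates.
    cx = (yaw_deg / 360.0 + 0.5) * width
    cy = (0.5 - pitch_deg / 180.0) * height
    # Half extents in pixels: degrees scaled by pixels per degree.
    half_w = fov_h_deg * width / 360.0 / 2.0
    half_h = fov_v_deg * height / 180.0 / 2.0
    left, right = round(cx - half_w), round(cx + half_w)
    # Vertical bounds are clamped to the frame.
    top = max(0, round(cy - half_h))
    bottom = min(height, round(cy + half_h))
    return left, top, right, bottom
```

For a hypothetical 3600×1800 frame, the 180°×100° view covers half the width and a little over half the height of the full frame, while the 100°×90° view is narrower in both directions.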
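Method                                                 F1      Accuracy/%  Precision/%
Appearance flow + attention flow, ResNet architecture  0.6198  62.16       62.10
Appearance flow + motion flow, ResNet architecture     0.7429  81.08       84.83
Attention flow + motion flow, ResNet architecture      0.7665  81.08       87.16
Appearance flow + attention flow, Transformer architecture  0.8050  83.78  77.61
Appearance flow + motion flow, Transformer architecture     0.8200  86.48  89.12
Attention flow + motion flow, Transformer architecture      0.8468  89.19  92.27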
Tab.5 Comparison of experimental results of different fusion networks
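The fusion networks compared above combine two sub-network outputs with a late fusion strategy. The sketch below shows one common form of late fusion, a weighted average of the class probabilities of each stream. The averaging rule, the weight and the function names are assumptions, since this page does not specify how the outputs are combined.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(motion_logits, attention_logits, motion_weight=0.5):
    """Fuse two streams after each has produced its own class scores.

    Assumption: fusion is a weighted average of per-stream probabilities.
    Returns the fused probabilities and the index of the predicted class.
    """
    if len(motion_logits) != len(attention_logits):
        raise ValueError("both streams must score the same set of classes")
    p_motion = softmax(motion_logits)
    p_attention = softmax(attention_logits)
    fused = [motion_weight * m + (1.0 - motion_weight) * a
             for m, a in zip(p_motion, p_attention)]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted
```

Because each stream is scored independently before fusion, a sub-network can be swapped (for example appearance flow for attention flow) without retraining the other.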
[1]   SOUCHET A D, LOURDEAUX D, PAGANI A, et al. A narrative review of immersive virtual reality’s ergonomics and risks at the workplace: cybersickness, visual fatigue, muscular fatigue, acute stress, and mental overload [J]. Virtual Reality, 2023, 27(1): 19–50.
[2]   VON MAMMEN S, KNOTE A, EDENHOFER S. Cyber sick but still having fun [C]// Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology. Munich: ACM, 2016: 325–326.
[3]   SHEN Z, SUN F, WANG Y, et al. Research progress in physiological evaluation and treatment of visually induced motion sickness in virtual reality [J]. Acta Academiae Medicinae Sinicae, 2023, 45(6): 980–986.
[4]   NG A K T, CHAN L K Y, LAU H Y K. A study of cyber-sickness and sensory conflict theory using a motion-coupled virtual reality system [J]. Displays, 2020, 61: 101922.
[5]   KIM H G, LIM H T, LEE S, et al. VRSA net: VR sickness assessment considering exceptional motion for 360° VR video [J]. IEEE Transactions on Image Processing, 2019, 28(4): 1646–1660.
[6]   KENNEDY R S, LANE N E, BERBAUM K S, et al. Simulator sickness questionnaire: an enhanced method for quantifying simulator sickness [J]. The International Journal of Aviation Psychology, 1993, 3(3): 203–220.
[7]   BRUCK S, WATTERS P A. Estimating cybersickness of simulated motion using the simulator sickness questionnaire (SSQ): a controlled study [C]// Proceedings of the International Conference on Computer Graphics. Tianjin: IEEE, 2009: 486–488.
[8]   ARAFAT I M, FERDOUS S M S, QUARLES J. The effects of cybersickness on persons with multiple sclerosis [C]// Proceedings of the 22nd ACM Conference on Virtual Reality Software and Technology. Munich: ACM, 2016: 51–59.
[9]   SEVINC V, BERKMAN M I. Psychometric evaluation of simulator sickness questionnaire and its variants as a measure of cybersickness in consumer virtual environments [J]. Applied Ergonomics, 2020, 82: 102958.
[10]   ISLAM R, DESAI K, QUARLES J. Towards forecasting the onset of cybersickness by fusing physiological, head-tracking and eye-tracking with multimodal deep fusion network [C]// Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. Singapore: IEEE, 2022: 121–130.
[11]   SEONG S, PARK J. Tracking motion sickness in dynamic VR environments with EDA signals [J]. International Journal of Industrial Ergonomics, 2024, 99: 103543.
[12]   YANG A H X, KASABOV N K, CAKMAK Y O. Prediction and detection of virtual reality induced cybersickness: a spiking neural network approach using spatiotemporal EEG brain data and heart rate variability [J]. Brain Informatics, 2023, 10(1): 15.
[13]   LI R, WANG Y, YIN H, et al. A deep cybersickness predictor through kinematic data with encoded physiological representation [C]// Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. Sydney: IEEE, 2023: 1132–1141.
[14]   ISLAM R, DESAI K, QUARLES J. VR sickness prediction from integrated HMD’s sensors using multimodal deep fusion network [EB/OL]. (2021-08-14) [2024-10-22]. https://arxiv.org/abs/2108.06437.
[15]   SHIMADA S, PANNATTEE P, IKEI Y, et al. High-frequency cybersickness prediction using deep learning techniques with eye-related indices [J]. IEEE Access, 2023, 11: 95825–95839.
[16]   YAO S H, FAN C L, HSU C H. Towards quality-of-experience models for watching 360° videos in head-mounted virtual reality [C]// Proceedings of the Eleventh International Conference on Quality of Multimedia Experience. Berlin: IEEE, 2019: 1–3.
[17]   QUAN W, LI L, HAN C, et al. Objective evaluation of VR sickness and analysis of its relationship with VR presence [C]// Proceedings of the International Conference on Intelligent Computing. Singapore: Springer, 2024: 416–427.
[18]   CAO Z, KOPPER R. Real-time viewport-aware optical flow estimation in 360-degree videos for visually-induced motion sickness mitigation [C]// Proceedings of the 25th Symposium on Virtual and Augmented Reality. Rio Grande: ACM, 2024: 210–218.
[19]   BALA P, DIONÍSIO D, NISI V, et al. Visually induced motion sickness in 360° videos: comparing and combining visual optimization techniques [C]// Proceedings of the IEEE International Symposium on Mixed and Augmented Reality Adjunct. Munich: IEEE, 2018: 244–249.
[20]   KIM J, KIM W, AHN S, et al. Virtual reality sickness predictor: analysis of visual-vestibular conflict and VR contents [C]// Proceedings of the Tenth International Conference on Quality of Multimedia Experience. Cagliari: IEEE, 2018: 1–6.
[21]   LEE J, KIM W, KIM J, et al. A study on virtual reality sickness and visual attention [C]// Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. Tokyo: IEEE, 2021: 1465–1469.
[22]   ZHAO J, TRAN K T P, CHALMERS A, et al. Deep learning-based simulator sickness estimation from 3D motion [C]// Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. Sydney: IEEE, 2023: 39–48.
[23]   LU Z, YU M, JIANG G, et al. Prediction of motion sickness degree of stereoscopic panoramic videos based on content perception and binocular characteristics [J]. Digital Signal Processing, 2023, 132: 103787.
[24]   LEE T M, YOON J C, LEE I K. Motion sickness prediction in stereoscopic videos using 3D convolutional neural networks [J]. IEEE Transactions on Visualization and Computer Graphics, 2019, 25(5): 1919–1927.
[25]   DU M, CUI H, WANG Y, et al. Learning from deep stereoscopic attention for simulator sickness prediction [J]. IEEE Transactions on Visualization and Computer Graphics, 2023, 29(2): 1415–1423.
[26]   QUAN Wei, CAI Yongqing, WANG Chao, et al. VR sickness estimation model based on 3D-ResNet two-stream network [J]. Journal of Zhejiang University: Engineering Science, 2023, 57(7): 1345–1353.
[27]   DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: Transformers for image recognition at scale [EB/OL]. (2021-06-03) [2024-10-22]. https://arxiv.org/abs/2010.11929.
[28]   FREMEREY S, SINGLA A, MESEBERG K, et al. AVtrack360: an open dataset and software recording people’s head rotations watching 360° videos on an HMD [C]// Proceedings of the 9th ACM Multimedia Systems Conference. Amsterdam: ACM, 2018: 403–408.
[29]   VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. California: ACM, 2017: 6000–6010.
[30]   QUAN Wei, WANG Chao, GENG Xuena, et al. Research on VR experience comfort based on motion perception [J]. Journal of System Simulation, 2023, 35(1): 169–177.
[31]   KUO P C, CHUANG L C, LIN D Y, et al. VR sickness assessment with perception prior and hybrid temporal features [C]// Proceedings of the 25th International Conference on Pattern Recognition. Milan: IEEE, 2021: 5558–5564.
[32]   PORCINO T, RODRIGUES E O, SILVA A, et al. Using the gameplay and user data to predict and identify causes of cybersickness manifestation in virtual reality games [C]// Proceedings of the IEEE 8th International Conference on Serious Games and Applications for Health. Vancouver: IEEE, 2020: 1–8.
[33]   YILDIRIM C. A review of deep learning approaches to EEG-based classification of cybersickness in virtual reality [C]// Proceedings of the IEEE International Conference on Artificial Intelligence and Virtual Reality. Utrecht: IEEE, 2020: 351–357.
[34]   LI Y, LIU A, DING L. Machine learning assessment of visually induced motion sickness levels based on multiple biosignals [J]. Biomedical Signal Processing and Control, 2019, 49: 202–211.