Journal of Zhejiang University (Science Edition)  2021, Vol. 48 Issue (4): 391-401    DOI: 10.3785/j.issn.1008-9497.2021.04.001
Data Visual Analysis and Virtual Reality
Visual analysis of cohorts and treatments of breast cancer based on electronic health records
XU Min¹, WANG Ke², DAI Haoran³, LUO Xiaobo³, YU Weilun³, TAO Yubo³, LIN Hai³
1.Department of Medical Information, The First Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou 310003, China
2.Department of Breast Surgery, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou 310003, China
3.State Key Lab of CAD&CG, Zhejiang University, Hangzhou 310058, China

Abstract  Breast cancer is one of the most common malignant tumors. Analyzing electronic health records (EHRs) to discover patterns in the treatment and prognosis of breast cancer is therefore important. In cooperation with breast cancer physicians, this paper integrates suitable visualization methods and a prediction model into a visual analytics system for cohort analysis and treatment planning based on EHRs. For cohort analysis, dimension reduction and clustering algorithms are first applied to patients with high-dimensional attributes to group them into cohorts; Nightingale charts, word clouds, and timeline visualizations are then used to explore the patterns in different cohorts and the correlations between features. For treatment planning, a support vector machine (SVM) model is trained to predict the treatment plan for a patient, and parallel coordinates, a matrix heat map, and a classification map are designed to explore the relations between features, the feature weights, and the prediction results, respectively. Finally, case studies and expert interviews are presented to evaluate how effectively the system reveals relationships between treatment plans and patient features and supports physicians in decision making.
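The two analysis steps named in the abstract (grouping patients into cohorts, then predicting a treatment plan with an SVM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy two-feature patient vectors, the binary plan labels, and the helper names `kmeans`, `train_linear_svm`, and `predict` are hypothetical; plain k-means stands in for the clustering step, dimension reduction is omitted, and a linear SVM trained by subgradient descent stands in for the SVM model, whose kernel and training procedure are not stated in the abstract.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iters=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each patient to the nearest cohort center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def train_linear_svm(X, y, lam=0.01, epochs=500, lr=0.05):
    """Linear SVM: subgradient descent on the L2-regularized hinge loss.

    Labels in y must be -1 or +1. Returns (weights, bias).
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Margin violated: hinge term contributes to the subgradient.
                w = [wi - lr * (lam * wi - t * xi) for wi, xi in zip(w, x)]
                b += lr * t
            else:
                # Margin satisfied: only the regularizer shrinks the weights.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Predicted plan label (-1 or +1) for one feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical toy data: two normalized features per patient.
patients = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
            [0.80, 0.90], [0.90, 0.80], [0.85, 0.95]]
plans = [-1, -1, -1, 1, 1, 1]  # hypothetical binary treatment-plan labels

cohorts = kmeans(patients, k=2)        # cohort index per patient
w, b = train_linear_svm(patients, plans)  # w holds one weight per feature
print(cohorts, w, b, predict(w, b, [0.88, 0.90]))
```

In this sketch the learned weight vector `w` plays the role of the feature weights that the abstract says are shown in the matrix heat map, and `predict` corresponds to the prediction result shown in the classification map.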

Key words: electronic health record, breast cancer, visual analytics, cohort, prediction model
Received: 30 November 2020      Published: 25 July 2021
CLC:  TP 391.41  
Cite this article:

XU Min, WANG Ke, DAI Haoran, LUO Xiaobo, YU Weilun, TAO Yubo, LIN Hai. Visual analysis of cohorts and treatments of breast cancer based on electronic health records. Journal of Zhejiang University (Science Edition), 2021, 48(4): 391-401.

URL:

https://www.zjujournals.com/sci/EN/Y2021/V48/I4/391


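Visual analysis of breast cancer cohorts and treatment plans based on electronic medical records

Breast cancer is currently one of the most common malignant tumors. Its electronic medical record data can be mined for hidden patterns, which is of great significance for treatment and prognosis analysis. In cooperation with breast physicians, suitable prediction models and visualization methods were selected to build a visual analysis system for breast cancer cohorts and treatment plans based on electronic medical records. First, patients with high-dimensional attributes are processed by dimension reduction and clustering to form patient cohorts, and Nightingale charts, word clouds, and timeline visualizations are used to show intuitively how features differ between patient cohorts. Then, a support vector machine (SVM) model is used to predict treatment plans, and parallel coordinates, a matrix heat map, and a classification map show the attribute correlations, the trained feature weights, and the prediction results, respectively. Finally, real cases verify the effectiveness of the system in cohort analysis and in the association analysis of treatment plans and patient attributes, so that it helps physicians choose suitable treatment plans.

Key words: breast cancer, electronic medical record, visual analytics, patient cohort, prediction model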