Journal of ZheJiang University (Engineering Science)  2021, Vol. 55 Issue (7): 1253-1260    DOI: 10.3785/j.issn.1008-973X.2021.07.004
    
ABC-SVM disease prediction method based on data fusion in community health care
Wei-qing PANG1, Ning HE1,*, Yan-hua LUO2, Xi YU3
1. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
2. The First People's Hospital of Nanning, Nanning 530000, China
3. Qingpu District Center for Disease Control and Prevention, Shanghai 201799, China

Abstract  

Existing disease prediction methods in community health services still have defects such as low data utilization, analysis limited to a single disease type, poor automation, and unsatisfactory prediction performance. To address these problems, a health data fusion and disease prediction approach was proposed for community healthcare in a big data and Internet of Things (IoT) environment. Principal component analysis (PCA) and cluster analysis were used to extract features from the physiological data of community residents. The artificial bee colony (ABC) algorithm was used to construct a support vector machine (SVM) non-linear classifier that fuses the feature data and predicts multiple potential diseases. Experimental results show that the disease diagnostic accuracy of the proposed method is 93.10%, which is 17.24 percentage points higher than that of the traditional SVM method and 72.41 percentage points higher than that of the BP neural network. The method can effectively identify potential diseases while improving data utilization and reducing computing resource consumption, which makes early detection, prevention, and treatment of diseases possible. It can be widely applied in community healthcare, elderly monitoring, and clinical medicine in hospitals.



Key words: community health care; disease prediction; support vector machine (SVM); artificial bee colony (ABC); cluster analysis
Received: 18 May 2020      Published: 05 July 2021
CLC:  TP 391  
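Fund: National Natural Science Foundation of China (61661016); Director Fund of the Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing (GXKL06180101); Innovation Project of Guilin University of Electronic Technology Graduate Education (2018YJCX23)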
Corresponding Authors: Ning HE     E-mail: williampong@126.com;eicnhe@guet.edu.cn
Cite this article:

Wei-qing PANG,Ning HE,Yan-hua LUO,Xi YU. ABC-SVM disease prediction method based on data fusion in community health care. Journal of ZheJiang University (Engineering Science), 2021, 55(7): 1253-1260.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2021.07.004     OR     https://www.zjujournals.com/eng/Y2021/V55/I7/1253


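ABC-SVM community disease prediction method based on data fusion

To address the shortcomings of existing disease prediction methods in community medical services, namely low data utilization, analysis of only a single disease type, poor automation, and unsatisfactory prediction performance, a health data fusion and disease prediction method for community healthcare in an Internet of Things and big data environment is proposed. Features are extracted from the physiological index data of community residents by principal component analysis (PCA) and cluster analysis. A support vector machine (SVM) non-linear classifier constructed with the artificial bee colony (ABC) algorithm then performs feature-level fusion analysis of the data and predicts potential diseases. Experimental results show that the disease recognition accuracy of the proposed method reaches 93.10%, which is 17.24 and 72.41 percentage points higher than the traditional SVM method and the BP neural network method, respectively. The method can effectively identify multiple potential diseases while improving data utilization and reducing computing resource consumption, enabling early detection, early prevention, and early treatment of diseases. It can be widely applied in community health management, elderly community monitoring, and even clinical medicine.


Keywords: community health care, disease prediction, support vector machine (SVM), artificial bee colony (ABC), cluster analysis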
Fig.1 Disease prediction model in community
Fig.2 Flow chart for parameter optimization of SVM disease prediction model
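| No. | Sex | Age/years | CREA/(μmol·L⁻¹) | UA/(μmol·L⁻¹) | APOA/(g·L⁻¹) | APOB/(g·L⁻¹) | GLU/(mmol·L⁻¹) | LDL/(mmol·L⁻¹) | UREA/(mmol·L⁻¹) | CH/(mmol·L⁻¹) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 |  | 65 | 66 | 261 | 1.34 | 0.79 | 4.9 | 2.92 | 5.7 | 4.81 |
| 2 |  | 61 | 45 | 282 | 1.25 | 0.76 | 4.7 | 3.18 | 5.4 | 4.77 |
| 3 |  | 59 | 63 | 419 | 1.37 | 0.97 | 4.6 | 3.04 | 4.9 | 4.82 |
| 4 |  | 53 | 60 | 274 | 1.46 | 1.04 | 10.7 | 2.21 | 4.6 | 3.93 |
| 5 |  | 63 | 89 | 294 | 1.38 | 0.90 | 4.7 | 1.48 | 6.2 | 4.79 |
| 6 |  | 76 | 69 | 401 | 1.57 | 0.66 | 5.0 | 2.47 | 5.8 | 4.38 |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |

Note: creatinine (CREA), uric acid (UA), urea (UREA), blood glucose (GLU), low-density lipoprotein cholesterol (LDL), apolipoprotein A1 (APOA), apolipoprotein B (APOB), total cholesterol (CH).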
Tab.1 Physiological data set used in experiment (partial samples)
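As a rough illustration of the PCA step named in the abstract, the sketch below estimates the contribution rate of the first principal component from the eight blood indices of the six rows shown in Tab.1. It is only a sketch under stated assumptions: z-score standardization, the use of the correlation matrix, and power iteration are choices made here for a dependency-free example, not details confirmed by the page, and six partial rows cannot reproduce the contribution rates of Fig.4.

```python
import math
import statistics

# Eight blood indices (CREA, UA, APOA, APOB, GLU, LDL, UREA, CH) for the six
# rows shown in Tab.1; the full data set is not reproduced on this page.
ROWS = [
    [66, 261, 1.34, 0.79, 4.9, 2.92, 5.7, 4.81],
    [45, 282, 1.25, 0.76, 4.7, 3.18, 5.4, 4.77],
    [63, 419, 1.37, 0.97, 4.6, 3.04, 4.9, 4.82],
    [60, 274, 1.46, 1.04, 10.7, 2.21, 4.6, 3.93],
    [89, 294, 1.38, 0.90, 4.7, 1.48, 6.2, 4.79],
    [69, 401, 1.57, 0.66, 5.0, 2.47, 5.8, 4.38],
]


def standardize(rows):
    """Z-score each column (assumed preprocessing, not stated in the source)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def correlation_matrix(z):
    """Sample correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def leading_eigenvalue(m, iters=1000):
    """Power iteration: approximates the largest eigenvalue of a symmetric PSD matrix."""
    v = [1.0 / math.sqrt(len(m))] * len(m)
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient of the unit vector v estimates the eigenvalue.
    return sum(a * b for a, b in zip(v, mat_vec(m, v)))


corr = correlation_matrix(standardize(ROWS))
total_variance = sum(corr[i][i] for i in range(len(corr)))  # one unit per index
contribution_rate = leading_eigenvalue(corr) / total_variance
print(f"first-component contribution rate: {contribution_rate:.2%}")
```

Further components could be obtained by deflating the matrix and repeating the iteration; a numerical library would normally be used instead of this hand-rolled version.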
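| Sample code | B4 | B3 | B2 | B1 |
| --- | --- | --- | --- | --- |
| 0x0 | 0 | 0 | 0 | 0 |
| 0x1 | 0 | 0 | 0 | 1 |
| 0x2 | 0 | 0 | 1 | 0 |
| 0x4 | 0 | 1 | 0 | 0 |
| 0x8 | 1 | 0 | 0 | 0 |

Note: columns B4 to B1 are disease types; 1 indicates that the sample has the disease, and 0 indicates that it does not.
Tab.2 Encoding of five representative disease combinations in samples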
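The encoding in Tab.2 reads as a 4-bit mask with B1 as the least significant bit, so a sample with several diseases maps to a single hexadecimal code (for example, 0xB in Tab.4 would mean B4, B2, and B1). A minimal sketch of that reading follows; the function names `encode` and `decode` are hypothetical and do not come from the paper.

```python
# Bit positions 0..3 correspond to disease types B1..B4 (as laid out in Tab.2).
DISEASE_BITS = ("B1", "B2", "B3", "B4")


def encode(flags):
    """Turn a {disease: 0/1} mapping into a sample code such as '0x4'."""
    value = 0
    for bit, name in enumerate(DISEASE_BITS):
        if flags.get(name, 0):
            value |= 1 << bit
    return f"0x{value:X}"


def decode(code):
    """Turn a sample code such as '0xB' back into a {disease: 0/1} mapping."""
    value = int(code, 16)
    return {name: (value >> bit) & 1 for bit, name in enumerate(DISEASE_BITS)}


print(encode({"B3": 1}))  # 0x4
print(decode("0xB"))      # {'B1': 1, 'B2': 1, 'B3': 0, 'B4': 1}
```

Treating each disease combination as one class label in this way turns the multi-disease problem into ordinary multi-class classification with at most 16 classes.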
Fig.3 Visualization of illness in sample data
Fig.4 Principal components and their contribution rate of physiological index
Fig.5 Cluster dendrogram generated from samples
Fig.6 Visualization of cluster analysis result from samples
Fig.7 Comparison results of predictive effects among three diagnostic models
Fig.8 RMSE curves of different diagnostic models
Fig.9 Influence of parameters $c$ and $g$ on ABC-SVM prediction accuracy
Fig.10 Comparison of optimization effects among three optimization algorithms
| Optimization algorithm | $c$ | $g$ | $t$/s | $\eta$ |
| --- | --- | --- | --- | --- |
| GA-OPT | 89.72 | 12.36 | 555.70 | 79.31% (23/29) |
| PSO-OPT | 40.69 | 16.77 | 872.06 | 13.79% (4/29) |
| ABC-OPT | 89.76 | 91.77 | 138.10 | 93.10% (27/29) |
Tab.3 Optimization effects of different parameter optimization algorithms
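To make the ABC search over $(c, g)$ in Tab.3 more concrete, here is a minimal sketch of a basic artificial bee colony loop with employed, onlooker, and scout phases. Everything specific in it is an assumption: the search bounds, colony size, abandonment limit, and especially `toy_error`, a hypothetical smooth stand-in for the SVM validation error that a real run would presumably evaluate. It is not the authors' implementation.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, limit=20, n_iters=200, seed=0):
    """Basic ABC minimization; objective must return non-negative costs."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def try_improve(i):
        # Perturb one coordinate of source i relative to a random other source.
        k = rng.choice([s for s in range(n_sources) if s != i])
        j = rng.randrange(dim)
        cand = sources[i][:]
        cand[j] += rng.uniform(-1.0, 1.0) * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        cand[j] = min(max(cand[j], lo), hi)
        cost = objective(cand)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = cand, cost, 0
        else:
            trials[i] += 1

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources
    i_best = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[i_best][:], costs[i_best]

    for _ in range(n_iters):
        for i in range(n_sources):  # employed bees
            try_improve(i)
        fitness = [1.0 / (1.0 + c) for c in costs]
        for i in rng.choices(range(n_sources), weights=fitness, k=n_sources):
            try_improve(i)  # onlooker bees favour better sources
        i_best = min(range(n_sources), key=costs.__getitem__)
        if costs[i_best] < best_cost:  # memorize the best source so far
            best, best_cost = sources[i_best][:], costs[i_best]
        for i in range(n_sources):  # scouts replace exhausted sources
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0
    return best, best_cost


def toy_error(params):
    """Hypothetical stand-in for validation error; minimum placed at (60, 30)."""
    c, g = params
    return ((c - 60.0) / 100.0) ** 2 + ((g - 30.0) / 100.0) ** 2


BOUNDS = [(0.01, 100.0), (0.01, 100.0)]  # assumed search ranges for c and g
best_params, best_cost = abc_minimize(toy_error, BOUNDS)
print(best_params, best_cost)
```

In an actual experiment, `toy_error` would be replaced by a function that trains an SVM with the candidate $(c, g)$ and returns its error on held-out samples, which is far more expensive per evaluation than this stand-in.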
Fig.11 RMSE comparison among different disease prediction methods
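| Sample code | $P$ (ABC-SVM) | $P$ (PSO-SVM) | $P$ (GA-SVM) | $R$ (ABC-SVM) | $R$ (PSO-SVM) | $R$ (GA-SVM) | $F$ (ABC-SVM) | $F$ (PSO-SVM) | $F$ (GA-SVM) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0x0 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| 0x4 | 0.91 | 0 | 0.77 | 1 | 0 | 0.83 | 0.96 | 0 | 0.83 |
| 0x6 | 1 | 0 | 0.33 | 0.5 | 0 | 0.4 | 0.67 | 0 | 0.4 |
| 0xB | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 |
| 0xC | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 0xE | 0.89 | 0 | 0.86 | 1 | 0 | 0.8 | 0.94 | 0 | 0.8 |
| 0xF | 1 | 0.14 | 1 | 1 | 1 | 0.86 | 1 | 0.24 | 0.86 |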
Tab.4 Disease prediction results with different optimization algorithms
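The page does not define $P$, $R$, and $F$. Assuming they are precision, recall, and the usual F-measure (the harmonic mean of the two), the relation can be sketched as below. The example uses the ABC-SVM entries for sample code 0x6; other entries in Tab.4 need not reproduce exactly under this assumption, since the paper's definition and rounding are not shown here.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# ABC-SVM, sample code 0x6 in Tab.4: P = 1, R = 0.5
print(round(f_measure(1.0, 0.5), 2))  # 0.67
```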