Journal of Zhejiang University (Engineering Science)  2021, Vol. 55 Issue (7): 1253-1260    DOI: 10.3785/j.issn.1008-973X.2021.07.004
Computer and Control Engineering
ABC-SVM disease prediction method based on data fusion in community health care
Wei-qing PANG1, Ning HE1,*, Yan-hua LUO2, Xi YU3
1. School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
2. The First People's Hospital of Nanning, Nanning 530000, China
3. Qingpu District Center for Disease Control and Prevention, Shanghai 201799, China
Abstract:

Existing disease prediction methods in community health services suffer from low data utilization, a narrow range of analyzed disease types, poor automation, and unsatisfactory prediction performance. To address these shortcomings, a health data fusion and disease prediction method for community healthcare in an Internet of Things (IoT) and big data environment was proposed. Principal component analysis (PCA) and cluster analysis were used to extract features from the physiological indicator data of community residents. The artificial bee colony (ABC) algorithm was combined with a support vector machine (SVM) to construct a nonlinear classifier that performs feature-level fusion analysis of the data and predicts potential diseases. Experimental results show that the disease recognition accuracy of the proposed method reaches 93.10%, which is 17.24 and 72.41 percentage points higher than that of the traditional SVM method and the BP neural network method, respectively. The method can effectively identify multiple potential diseases while improving data utilization and reducing computing resource consumption, which enables early detection, prevention, and treatment of diseases. It can be widely applied to community health management, elderly community monitoring, and even clinical care.

Key words: community health care    disease prediction    support vector machine (SVM)    artificial bee colony (ABC)    cluster analysis
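Received: 2020-05-18    Published: 2021-07-05
CLC:  TP 391
Funding: National Natural Science Foundation of China (61661016); Director Fund of the Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing (GXKL06180101); Graduate Education Innovation Project of Guilin University of Electronic Technology (2018YJCX23)
Corresponding author: Ning HE     E-mail: williampong@126.com; eicnhe@guet.edu.cn
About the author: Wei-qing PANG (born 1994), male, master's student, working on IoT data fusion and disease prediction. orcid.org/0000-0002-6632-0336. E-mail: williampong@126.com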

Cite this article:

Wei-qing PANG, Ning HE, Yan-hua LUO, Xi YU. ABC-SVM disease prediction method based on data fusion in community health care[J]. Journal of Zhejiang University (Engineering Science), 2021, 55(7): 1253-1260.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2021.07.004
https://www.zjujournals.com/eng/CN/Y2021/V55/I7/1253

Fig. 1  Community disease prediction model
Fig. 2  Flowchart of parameter optimization for the SVM disease prediction model
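As a rough illustration of the kind of parameter search that the Fig. 2 flowchart refers to, the sketch below implements a basic artificial bee colony loop in Python. It is not the authors' implementation: the function `abc_minimize`, the colony settings, the search bounds, and the `toy_cv_error` objective are all assumptions made for illustration. In a real setting the objective would presumably be the cross-validated error of an SVM trained with the candidate parameters $c$ and $g$.

```python
import random


def abc_minimize(objective, bounds, n_sources=10, limit=20, n_iter=100, seed=0):
    """Minimise a non-negative `objective` over a box with a basic ABC loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def random_source():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources          # consecutive failed improvements per source
    i0 = min(range(n_sources), key=costs.__getitem__)
    best = {"x": sources[i0][:], "cost": costs[i0]}

    def replace(i, x, cost):
        # Install a new position for source i and remember the global best.
        sources[i], costs[i], trials[i] = x, cost, 0
        if cost < best["cost"]:
            best["x"], best["cost"] = x[:], cost

    def local_search(i):
        # Move one coordinate of source i relative to a random other source k.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = sources[i][:]
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i][d] - sources[k][d])
        cand[d] = min(max(cand[d], bounds[d][0]), bounds[d][1])
        cost = objective(cand)
        if cost < costs[i]:           # greedy selection
            replace(i, cand, cost)
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):    # employed bees: one search per source
            local_search(i)
        weights = [1.0 / (1.0 + c) for c in costs]
        for _ in range(n_sources):    # onlooker bees prefer low-cost sources
            local_search(rng.choices(range(n_sources), weights=weights)[0])
        for i in range(n_sources):    # scout bees replace exhausted sources
            if trials[i] > limit:
                x = random_source()
                replace(i, x, objective(x))
    return best["x"], best["cost"]


def toy_cv_error(params):
    """Hypothetical stand-in for a cross-validated SVM error at (c, g)."""
    c, g = params
    return ((c - 50.0) / 100.0) ** 2 + ((g - 20.0) / 100.0) ** 2


best_params, best_error = abc_minimize(toy_cv_error, [(0.01, 100.0), (0.01, 100.0)])
print(best_params, best_error)
```

The three inner loops correspond to the employed, onlooker, and scout phases of the standard ABC scheme. Swapping `toy_cv_error` for a function that trains and validates a classifier would turn the sketch into an actual hyperparameter search.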
Table 1  Physiological dataset used in the experiment (partial samples)

No.  Sex  Age/   CREA/       UA/         APOA/    APOB/    GLU/        LDL/        UREA/       CH/
          years  (μmol·L⁻¹)  (μmol·L⁻¹)  (g·L⁻¹)  (g·L⁻¹)  (mmol·L⁻¹)  (mmol·L⁻¹)  (mmol·L⁻¹)  (mmol·L⁻¹)
1    -    65     66          261         1.34     0.79     4.9         2.92        5.7         4.81
2    -    61     45          282         1.25     0.76     4.7         3.18        5.4         4.77
3    -    59     63          419         1.37     0.97     4.6         3.04        4.9         4.82
4    -    53     60          274         1.46     1.04     10.7        2.21        4.6         3.93
5    -    63     89          294         1.38     0.90     4.7         1.48        6.2         4.79
6    -    76     69          401         1.57     0.66     5.0         2.47        5.8         4.38
⋮    ⋮    ⋮      ⋮           ⋮           ⋮        ⋮        ⋮           ⋮           ⋮           ⋮

Note: creatinine (CREA), uric acid (UA), urea (UREA), blood glucose (GLU), low-density lipoprotein cholesterol (LDL), apolipoprotein A1 (APOA), apolipoprotein B (APOB), total cholesterol (CH).
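Table 2  Sample codes for 5 representative disease combinations

Sample code   Disease type
              B4   B3   B2   B1
0x0           0    0    0    0
0x1           0    0    0    1
0x2           0    0    1    0
0x4           0    1    0    0
0x8           1    0    0    0

Note: 1 indicates that the sample has the disease; 0 indicates that it does not.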
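The sample codes can be read as 4-bit values in which each bit flags one disease, with B4 as the most significant bit. A minimal sketch of that encoding follows; the helper names `encode_sample` and `decode_sample` are hypothetical and not taken from the paper.

```python
def encode_sample(flags):
    """Pack disease flags ordered (B4, B3, B2, B1) into a code such as '0x4'."""
    if len(flags) != 4 or any(f not in (0, 1) for f in flags):
        raise ValueError("expected four 0/1 flags ordered B4, B3, B2, B1")
    value = 0
    for f in flags:                  # B4 ends up as the most significant bit
        value = (value << 1) | f
    return f"0x{value:X}"


def decode_sample(code):
    """Unpack a code such as '0xB' into flags ordered (B4, B3, B2, B1)."""
    value = int(code, 16)
    if not 0 <= value <= 0xF:
        raise ValueError("sample code must fit in 4 bits")
    return tuple((value >> shift) & 1 for shift in (3, 2, 1, 0))


print(encode_sample((0, 1, 0, 0)))   # only B3 set
print(decode_sample("0xB"))          # B4, B2 and B1 set
```

Under this reading, a code with several bits set, such as 0xB, denotes a sample that has more than one of the four diseases at once.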
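Fig. 3  Visualization of disease status in the sample data
Fig. 4  Principal components of the physiological indicators and their contribution rates
Fig. 5  Dendrogram of the cluster analysis of the samples
Fig. 6  Visualization of the sample clustering results
Fig. 7  Comparison of the prediction performance of 3 diagnostic models
Fig. 8  RMSE curves of different diagnostic models
Fig. 9  Effect of parameters $c, g$ on the prediction accuracy of ABC-SVM
Fig. 10  Comparison of the optimization performance of 3 optimization algorithms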
Table 3  Optimization results of different parameter optimization algorithms

Algorithm   $c$     $g$     t/s      η
GA-OPT      89.72   12.36   555.70   79.31% (23/29)
PSO-OPT     40.69   16.77   872.06   13.79% (4/29)
ABC-OPT     89.76   91.77   138.10   93.10% (27/29)
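The η values in Table 3 are given together with fractions out of 29, presumably the number of correctly predicted test samples over the test set size. Under that assumption, the percentages can be reproduced with a few lines of arithmetic; the variable names below are illustrative only.

```python
# Fractions as printed in Table 3; reading them as correct/total is an assumption.
correct = {"GA-OPT": 23, "PSO-OPT": 4, "ABC-OPT": 27}
total = 29

accuracy = {name: round(100 * n / total, 2) for name, n in correct.items()}
for name, value in accuracy.items():
    print(f"{name}: {value:.2f}%")
```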
Fig. 11  Comparison of RMSE for different disease prediction methods
Table 4  Disease prediction results of different optimization algorithms

Sample code   $P$                        $R$                        $F$
              ABC-SVM  PSO-SVM  GA-SVM   ABC-SVM  PSO-SVM  GA-SVM   ABC-SVM  PSO-SVM  GA-SVM
0x0           1        0        1        1        0        1        1        0        1
0x4           0.91     0        0.77     1        0        0.83     0.96     0        0.83
0x6           1        0        0.33     0.5      0        0.4      0.67     0        0.4
0xB           1        0        1        1        0        1        1        0        1
0xC           0        0        0        0        0        0        0        0        0
0xE           0.89     0        0.86     1        0        0.8      0.94     0        0.8
0xF           1        0.14     1        1        1        0.86     1        0.24     0.86
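Assuming $P$, $R$, and $F$ in Table 4 denote per-class precision, recall, and the F-measure (their harmonic mean), the following sketch shows how such values are computed for one sample code. The function name `precision_recall_f`, the toy labels, and the convention of returning 0 when a denominator is zero are assumptions for illustration, not details confirmed by the source.

```python
def precision_recall_f(y_true, y_pred, label):
    """Return (precision, recall, F-measure) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)   # true positives
    fp = sum(t != label and p == label for t, p in pairs)   # false positives
    fn = sum(t == label and p != label for t, p in pairs)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0          # assumed: 0 if undefined
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up labels, not data from the paper.
y_true = ["0x6", "0x6", "0x4", "0x0"]
y_pred = ["0x6", "0x4", "0x4", "0x0"]
print(precision_recall_f(y_true, y_pred, "0x6"))
```

In this toy example one of the two 0x6 samples is missed and no other sample is mislabeled as 0x6, so precision is 1 and recall is 0.5, the same pattern as the ABC-SVM entry for 0x6 in the table.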