Application of Logistic regression and decision tree analysis in prediction of acute myocardial infarction events

ZHANG Sheng1, HU Zhenjie2, YE Lu3, ZHENG Yaru4,*

1. Department of Neurology, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou 310014, China
2. Department of Respiratory and Critical Medicine, No. 906 Hospital of Chinese PLA, Ningbo 315040, China
3. Clinical Laboratory, Mental Health Center of Zhejiang University School of Medicine, Hangzhou Seventh People's Hospital, Hangzhou 310013, China
4. Department of Cardiology, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou 310014, China

Abstract

Objective: To evaluate the application of the decision tree method and Logistic regression in the prediction of acute myocardial infarction (AMI) events.

Methods: The clinical data of 295 patients who underwent coronary angiography for angina or chest pain of unidentified cause at Zhejiang Provincial People's Hospital between October 2018 and April 2019 were retrospectively analyzed. Fifty-five patients were diagnosed with AMI. Logistic regression and decision tree methods were each used to establish predictive models for the occurrence of AMI. Two decision tree models were built: a Logistic regression-independent model (Tree 1) and a Logistic regression-dependent model (Tree 2). The performance of the Logistic regression and decision tree models was compared using the area under the receiver operating characteristic (ROC) curve.

Results: Logistic regression analysis showed that history of coronary artery disease, multi-vessel coronary artery disease, statin use and apolipoprotein A1 (ApoA1) level were independent influencing factors of AMI events (all P < 0.05). In the Logistic regression-independent decision tree model (Tree 1), multi-vessel coronary artery disease was the root node, and history of coronary artery disease, ApoA1 level (cutoff value: 1.314 g/L) and anti-platelet drug use were descendant nodes. In the Logistic regression-dependent decision tree model (Tree 2), multi-vessel coronary artery disease was still the root node, but it was followed by only two descendant nodes: history of coronary artery disease and ApoA1 level. The area under the ROC curve (AUC) of the Logistic regression model was 0.826, and the AUCs of Tree 1 and Tree 2 were 0.765 and 0.726, respectively. The AUC of the Logistic regression model was significantly higher than that of Tree 2 (95% CI: 0.041 to 0.145, Z=3.534, P < 0.001), but did not differ significantly from that of Tree 1 (95% CI: -0.014 to 0.121, Z=-1.173, P > 0.05).

Conclusion: The predictive value for AMI events was comparable between the Logistic regression-independent decision tree model and the Logistic regression model, suggesting that data mining methods are feasible and effective for AMI prevention and control.
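
The AUC comparison in the abstract may be easier to follow with a small worked example. The sketch below is illustrative only and is not the authors' analysis code. The function `roc_auc` and all data values are hypothetical. It computes AUC as the probability that a randomly chosen AMI case receives a higher predicted score than a randomly chosen non-AMI case, counting ties as one half. A decision tree assigns the same score to every patient in a leaf, so it typically produces many ties, whereas a Logistic regression model yields continuous scores.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case scores higher; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = AMI, 0 = no AMI.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Continuous scores, as a Logistic regression model would produce.
logit_scores = [0.81, 0.64, 0.35, 0.52, 0.30, 0.22, 0.15, 0.08]
# Leaf-level scores, as a small decision tree would produce (many ties).
tree_scores = [0.60, 0.60, 0.20, 0.60, 0.20, 0.20, 0.05, 0.05]

auc_logit = roc_auc(labels, logit_scores)
auc_tree = roc_auc(labels, tree_scores)
print(f"Logistic AUC = {auc_logit:.3f}, tree AUC = {auc_tree:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred. Testing whether two AUCs differ, as the Z statistics above do, additionally requires a standard error for the difference between correlated AUCs, which this sketch does not attempt.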

Received: 05 June 2019
Published: 19 January 2020

Corresponding Author: ZHENG Yaru
E-mail: xiaoxiaoqing_23@hotmail.com; zhengyaru@zjheart.com

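Application of decision tree analysis in the prediction of acute myocardial infarction events

Objective: To evaluate and compare the feasibility and effectiveness of Logistic regression and decision tree analysis for predicting acute myocardial infarction (AMI) events. Methods: The clinical data of 295 patients who underwent elective coronary angiography for angina or chest pain of unknown cause at Zhejiang Provincial People's Hospital between October 2018 and April 2019 were retrospectively analyzed; 55 of them were diagnosed with AMI. Logistic regression analysis and decision tree analysis were each used to build a prediction model for AMI events, and decision tree models were built both without and with reference to the Logistic regression results (Tree 1 and Tree 2, respectively). ROC curves were then used to assess the value of the three models in predicting AMI. Results: Binary Logistic regression analysis showed that history of coronary heart disease, multi-vessel coronary artery disease, statin use and apolipoprotein A1 were independent influencing factors of AMI (all P < 0.05). In the decision tree model built without reference to the Logistic regression results (Tree 1), multi-vessel coronary artery disease was the root node, followed by history of coronary heart disease, apolipoprotein A1 level (cutoff: 1.314 g/L) and anti-platelet drug use as child nodes. In the decision tree model built from the Logistic regression results (Tree 2), multi-vessel coronary artery disease was the root node, followed by history of coronary heart disease and apolipoprotein A1 as child nodes. For the prediction of AMI events, the AUC of the Logistic regression model was 0.826, and the AUCs of the decision tree models were 0.765 (Tree 1) and 0.726 (Tree 2). Comparison of the three models showed that the AUC of the Logistic regression model was higher than that of Tree 2 (95% CI: 0.041 to 0.145, Z=3.534, P < 0.01), but did not differ significantly from that of Tree 1 (95% CI: -0.014 to 0.121, Z=-1.173, P > 0.05). Conclusion: In predicting AMI events, the decision tree model built without reference to the Logistic regression results performed comparably to the Logistic regression model and may in future be applied to the prevention and treatment of AMI.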
Keywords: Myocardial infarction, Acute disease, Logistic models, Regression analysis, Decision trees, Prediction