Journal of Zhejiang University (Science Edition)  2019, Vol. 46 Issue (3): 370-379    DOI: 10.3785/j.issn.1008-9497.2019.03.016
Earth Sciences
Hourly concentration prediction of PM2.5 based on RNN-CNN ensemble deep learning model
Jie HUANG1,2, Feng ZHANG1,2, Zhenhong DU1,2, Renyi LIU1,2, Xiaopei CAO1,2
1. Zhejiang Provincial Key Lab of GIS, Zhejiang University, Hangzhou 310028, China
2. Department of Geographic Information Science, Zhejiang University, Hangzhou 310027, China
Abstract: Most current PM2.5 prediction models give unstable predictions and generalize poorly. To address this, we propose RNN-CNN, an ensemble deep learning model for hourly PM2.5 concentration prediction. It takes a recurrent neural network (RNN), which has strong memory capacity, and a convolutional neural network (CNN), which has strong feature representation ability, as individual learners and fuses them with the Stacking ensemble strategy so that the forecast benefits from both. RNN-CNN not only uses contextual information along the time axis to predict future concentrations, but also uses general features extracted automatically at different levels from the high-dimensional time series data, which helps keep the predictions stable. We use 2016 air quality data from 1 466 monitoring stations in mainland China to compare RNN-CNN with the individual RNN and CNN learners. The results show that RNN-CNN clearly outperforms both individual learners in PM2.5 time series prediction and has lower generalization error; its index of agreement exceeds 0.97 at 34% of the stations. The model can therefore be used for hourly PM2.5 concentration prediction over large regions.
Key words: PM2.5 hourly concentration prediction    RNN    CNN    deep learning    ensemble learning
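The abstract reports model quality as an index of agreement. As a rough illustration only, the sketch below assumes this is Willmott's (1981) original index, d = 1 - Σ(P_i - O_i)² / Σ(|P_i - Ō| + |O_i - Ō|)², where O are observed values, P are predicted values, and Ō is the observed mean. The function name and the sample concentrations are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def index_of_agreement(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Willmott's index of agreement d; 1.0 means predictions match observations."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Numerator: sum of squared prediction errors.
    squared_error = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    # Denominator: "potential error", measured around the observed mean.
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    if potential_error == 0:
        # Every value equals the observed mean, so predictions equal observations.
        return 1.0
    return 1.0 - squared_error / potential_error


# Hypothetical hourly PM2.5 values in µg/m³ (illustrative, not from the paper).
observed_pm25 = [35.0, 42.0, 50.0, 61.0]
predicted_pm25 = [33.0, 45.0, 48.0, 65.0]
d = index_of_agreement(observed_pm25, predicted_pm25)
print(f"index of agreement: {d:.4f}")
```

Under this assumption, a station counts toward the reported 34% when its d value on the test data exceeds 0.97.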
Received: 2018-01-23    Published: 2019-05-25
CLC:  TP391  
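Funding: National Natural Science Foundation of China (41671391, 41471313); National Marine Public Welfare Industry Research Special Fund (201505003).
Corresponding author: ORCID: http://orcid.org/0000-0003-1475-8480, E-mail: zfcarnation@zju.edu.cn.
About the author: Jie HUANG (born 1993), ORCID: http://orcid.org/0000-0003-4257-9666, female, master's student, mainly engaged in research on spatio-temporal big data mining.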
Cite this article:

Jie HUANG, Feng ZHANG, Zhenhong DU, Renyi LIU, Xiaopei CAO. Hourly concentration prediction of PM2.5 based on RNN-CNN ensemble deep learning model. Journal of Zhejiang University (Science Edition), 2019, 46(3): 370-379.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2019.03.016
https://www.zjujournals.com/sci/CN/Y2019/V46/I3/370
