Traffic signal control method based on deep reinforcement learning

doi:10.3785/j.issn.1008-973X.2022.06.024

Journal of Zhejiang University (Engineering Science)

2022, Vol. 56, Issue (6): 1249-1256

Architecture and Traffic Engineering

Zhi-min LIU 1,2, Bao-Lin YE 2,*, Yao-dong ZHU 2, Qing YAO 1, Wei-min WU 3

1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. College of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
3. State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027, China


Abstract:

Existing traffic signal control methods based on deep reinforcement learning have difficulty updating the signal control strategy of an intersection in a timely manner. To address this problem, a traffic signal control method for an isolated intersection based on improved deep reinforcement learning was proposed. A new reward function was constructed from the change in the real-time number of vehicles at the intersection between two adjacent sampling time steps, so that the dynamic evolution of the traffic state at the intersection could be tracked and exploited promptly. A dual-network structure was adopted to improve the learning efficiency of the algorithm, and experience replay was used to improve its convergence. Simulation tests based on SUMO show that, compared with traditional control methods and deep reinforcement learning methods, the proposed method significantly shortens the average waiting time and the average queue length of vehicles at the intersection and improves the traffic efficiency of the intersection.

Key words: traffic signal control; deep reinforcement learning; reward function; experience replay
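
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: this page does not give the exact reward formula, so the sign convention (a positive reward when the vehicle count drops), the names `count_change_reward`, `ReplayBuffer` and `td_target`, the discount factor, and the vehicle counts are all assumptions made for illustration.

```python
import random
from collections import deque


def count_change_reward(prev_count: int, curr_count: int) -> float:
    """Reward from the change in vehicle count between adjacent sampling steps.

    Assumption: the reward is positive when the number of vehicles at the
    intersection decreases; the paper's exact formula and scaling may differ.
    """
    return float(prev_count - curr_count)


class ReplayBuffer:
    """Fixed-size experience replay memory with uniform random sampling."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque evicts the oldest transition once capacity is reached.
        self._memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state) -> None:
        self._memory.append((state, action, reward, next_state))

    def sample(self, batch_size: int):
        # Uniform sampling reduces the correlation between consecutive steps.
        return random.sample(list(self._memory), batch_size)

    def __len__(self) -> int:
        return len(self._memory)


def td_target(reward: float, next_q_target: list, gamma: float = 0.9) -> float:
    """TD target built from the target network's Q-values for the next state.

    In a dual-network setup the online network is trained toward this value,
    while the target network is refreshed only periodically.
    """
    return reward + gamma * max(next_q_target)


if __name__ == "__main__":
    counts = [12, 9, 11, 7, 7]  # hypothetical vehicle counts per sampling step
    buffer = ReplayBuffer(capacity=3)
    for step in range(1, len(counts)):
        reward = count_change_reward(counts[step - 1], counts[step])
        # The raw count stands in for the state and 0 for the chosen phase.
        buffer.push(counts[step - 1], 0, reward, counts[step])
    print(len(buffer))                            # 3: the oldest transition was evicted
    print(td_target(3.0, [1.0, 2.0], gamma=0.5))  # 4.0
```

Because the reward depends only on two consecutive counts, it responds to a change in traffic within a single sampling step, which is the property the abstract emphasizes.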

Received: 2022-03-23; Published: 2022-06-30

CLC: TP 181

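Funding: National Natural Science Foundation of China (61603154); Natural Science Foundation of Zhejiang Province (LY19F030014); Open Research Project of the State Key Laboratory of Industrial Control Technology (ICT2022B52)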

Corresponding author: Bao-Lin YE. E-mail: liuzhimin0223@163.com; yebaolin@zjxu.edu.cn

Author biography: Zhi-min LIU (born 1998), male, master's student, engaged in research on intelligent transportation. orcid.org/0000-0002-2937-5549. E-mail: liuzhimin0223@163.com


Cite this article:

Zhi-min LIU, Bao-Lin YE, Yao-dong ZHU, Qing YAO, Wei-min WU. Traffic signal control method based on deep reinforcement learning[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(6): 1249-1256.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.06.024 or https://www.zjujournals.com/eng/CN/Y2022/V56/I6/1249

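Figure 1 Basic framework of reinforcement learning

Figure 2 Schematic diagram of the intersection structure

Figure 3 Definition of the state elements in reinforcement learning

Figure 4 Schematic diagram of the intersection phase configuration

Figure 5 Construction of the action space required by reinforcement learning

Figure 6 Convolutional neural network used to fit Q values

Figure 7 Traffic signal control framework based on Nature DQN

Figure 8 Cumulative reward of different types of deep reinforcement learning algorithms

Figure 9 Average waiting time of different traffic signal control methods

Figure 10 Average queue length of different traffic signal control methods

Table 1 Test results of different traffic signal control methods

Table 2 Model training time and real-time response time of the algorithm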

