Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (6): 1249-1256    DOI: 10.3785/j.issn.1008-973X.2022.06.024
Architecture and Traffic Engineering
Traffic signal control method based on deep reinforcement learning
Zhi-min LIU1,2, Bao-lin YE2,*, Yao-dong ZHU2, Qing YAO1, Wei-min WU3
1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. College of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
3. State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027, China
Abstract:

Existing traffic signal control methods based on deep reinforcement learning have difficulty updating the signal control strategy of an intersection in time. To address this problem, a traffic signal control method based on improved deep reinforcement learning was proposed for an isolated intersection. A new reward function was built on the change in the real-time number of vehicles between two adjacent sampling time steps, so that the dynamic evolution of the traffic state at the intersection could be tracked and exploited in time. A double-network structure was adopted to improve learning efficiency, and experience replay was used to improve convergence. SUMO-based simulation results show that, compared with traditional control methods and existing deep reinforcement learning methods, the proposed method markedly shortens the average waiting time and the average queue length of vehicles at the intersection and improves its traffic efficiency.

Key words: traffic signal control    deep reinforcement learning    reward function    experience replay
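
As a minimal sketch of the reward idea described in the abstract, the snippet below derives a reward from the change in vehicle count between two adjacent sampling time steps. The function name, the sign convention (fewer vehicles yields a positive reward), and the `scale` factor are assumptions made for illustration; the abstract does not state the paper's exact formula.

```python
def count_change_reward(prev_count: int, curr_count: int, scale: float = 1.0) -> float:
    """Hypothetical reward: positive when the number of vehicles at the
    intersection falls between two adjacent sampling time steps."""
    return scale * (prev_count - curr_count)


# Vehicle counts sampled at consecutive time steps (made-up numbers).
counts = [12, 15, 11, 11]

# One reward per pair of adjacent samples.
rewards = [count_change_reward(a, b) for a, b in zip(counts, counts[1:])]
```

Because the reward depends only on the two most recent samples, it reacts to a change in traffic state at the very next sampling step, which is the timeliness property the abstract emphasizes.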
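Received: 2022-03-23    Published: 2022-06-30
CLC:  TP 181
Funding: National Natural Science Foundation of China (61603154); Zhejiang Provincial Natural Science Foundation of China (LY19F030014); Open Research Project of the State Key Laboratory of Industrial Control Technology (ICT2022B52)
Corresponding author: Bao-lin YE     E-mail: liuzhimin0223@163.com;yebaolin@zjxu.edu.cn
About the author: Zhi-min LIU (born 1998), male, master's student, working on intelligent transportation. orcid.org/0000-0002-2937-5549. E-mail: liuzhimin0223@163.com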

Cite this article:

Zhi-min LIU, Bao-lin YE, Yao-dong ZHU, Qing YAO, Wei-min WU. Traffic signal control method based on deep reinforcement learning. Journal of Zhejiang University (Engineering Science), 2022, 56(6): 1249-1256.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.06.024
https://www.zjujournals.com/eng/CN/Y2022/V56/I6/1249

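Fig. 1  Basic framework of reinforcement learning
Fig. 2  Schematic of the intersection structure
Fig. 3  Definition of the state elements in reinforcement learning
Fig. 4  Schematic of the intersection phase configuration
Fig. 5  Construction of the action space required for reinforcement learning
Fig. 6  Convolutional neural network used to fit the Q-values
Fig. 7  Traffic signal control framework based on Nature DQN
Fig. 8  Cumulative reward of different types of deep reinforcement learning algorithms
Fig. 9  Average waiting time of different traffic signal control methods
Fig. 10  Average queue length of different traffic signal control methods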
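
The double-network structure and experience replay named in the abstract, and the Nature DQN framework named in the caption of Fig. 7, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' implementation: a toy Q-table stands in for the convolutional network of Fig. 6, and the class and function names, the buffer capacity, the learning rate, the discount factor, and the synchronization interval are all hypothetical.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=0.9):
    """Bootstrapped target computed from the target network's Q-values."""
    return reward if done else reward + gamma * max(next_q_values)


# Toy Q-tables standing in for the two networks (hypothetical: 2 states, 2 actions).
online_q = {0: [0.0, 0.0], 1: [0.0, 0.0]}
target_q = copy.deepcopy(online_q)

buffer = ReplayBuffer(capacity=100)
buffer.push(0, 1, -3.0, 1, False)
buffer.push(1, 0, 4.0, 0, True)

lr, sync_every = 0.5, 2
for step in range(1, 5):
    for s, a, r, s2, done in buffer.sample(2):
        y = td_target(r, target_q[s2], done)
        online_q[s][a] += lr * (y - online_q[s][a])  # move the online estimate toward the target
    if step % sync_every == 0:
        target_q = copy.deepcopy(online_q)  # periodic hard update of the target network
```

Keeping `target_q` frozen between synchronizations means the regression target does not shift with every update of `online_q`, which is the usual motivation for the second network in Nature DQN.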
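Control method          R       W/s      L/m
Fixed-time control      n/a     937.85   9.64
Adaptive control        n/a     694.08   8.48
Conventional FC-DQN     -5.40   607.05   7.82
Improved FC-DQN         -3.66   431.85   6.58
Conventional CNN-DQN    -2.54   168.59   4.14
Improved CNN-DQN        -1.63   106.47   3.39
Table 1  Test results of different traffic signal control methods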
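
To make the size of the improvements in Table 1 easier to read, the short calculation below converts the W and L columns into relative reductions of each improved method over its conventional counterpart. The numbers are copied from the table; the helper name `reduction_pct` and the pairing of rows are choices made here for illustration.

```python
# Values copied from Table 1: (W in s, L in m).
results = {
    "Conventional FC-DQN": (607.05, 7.82),
    "Improved FC-DQN": (431.85, 6.58),
    "Conventional CNN-DQN": (168.59, 4.14),
    "Improved CNN-DQN": (106.47, 3.39),
}


def reduction_pct(before: float, after: float) -> float:
    """Relative reduction of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


for net in ("FC-DQN", "CNN-DQN"):
    w0, l0 = results[f"Conventional {net}"]
    w1, l1 = results[f"Improved {net}"]
    print(f"{net}: W reduced by {reduction_pct(w0, w1):.1f}%, "
          f"L reduced by {reduction_pct(l0, l1):.1f}%")
```

With these inputs, the improved variants cut W by roughly 29% (FC-DQN) and 37% (CNN-DQN), and L by roughly 16% and 18% respectively.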
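Control method          S/min   V/s
Conventional FC-DQN     116     < 2.0
Improved FC-DQN         105     < 2.0
Conventional CNN-DQN    88      < 2.0
Improved CNN-DQN        75      < 2.0
Table 2  Model training time and real-time response time of the algorithms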