Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (6): 1249-1256    DOI: 10.3785/j.issn.1008-973X.2022.06.024
    
Traffic signal control method based on deep reinforcement learning
Zhi-min LIU(1,2), Bao-Lin YE(2,*), Yao-dong ZHU(2), Qing YAO(1), Wei-min WU(3)
1. School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
2. College of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
3. State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027, China

Abstract  

Traffic signal control methods based on deep reinforcement learning have difficulty updating the signal control strategy in a timely manner. To address this problem, a traffic signal control method based on improved deep reinforcement learning was proposed for an isolated intersection. A new reward function was built from the real-time change in the number of vehicles at the intersection between two adjacent sampling time steps, so that the dynamic evolution of the traffic state at the intersection could be tracked and exploited promptly. In addition, a double network structure was used to improve the learning efficiency of the proposed method, and experience replay was used to improve its convergence. SUMO simulation results show that the proposed method can significantly shorten the average waiting time and the average queue length of vehicles at the intersection, and improve the traffic efficiency of the intersection.
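The three ingredients named in the abstract (a reward built from the change in vehicle numbers between adjacent sampling steps, a double network structure, and experience replay) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption for illustration: the sign convention of the reward, the uniform sampling of the buffer, the hard copy of weights into the target network, and all names (`reward_from_vehicle_change`, `ReplayBuffer`, `td_target`, `sync_target`) are hypothetical and not taken from the paper.

```python
import random
from collections import deque


def reward_from_vehicle_change(prev_count: int, curr_count: int) -> int:
    """Reward as the drop in vehicle count between two adjacent sampling steps.

    Assumed sign convention: fewer vehicles than at the previous step is good,
    so the reward is positive when the count decreases.
    """
    return prev_count - curr_count


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque silently discards the oldest transition when full.
        self._buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state) -> None:
        self._buffer.append((state, action, reward, next_state))

    def sample(self, batch_size: int, rng: random.Random) -> list:
        # Uniform sampling weakens the correlation between consecutive steps.
        return rng.sample(list(self._buffer), batch_size)

    def __len__(self) -> int:
        return len(self._buffer)


def td_target(reward: float, next_q_values, gamma: float) -> float:
    """Bootstrapped target; next_q_values are meant to come from the target network."""
    return reward + gamma * max(next_q_values)


def sync_target(online_weights: dict, target_weights: dict) -> None:
    """Hard update: overwrite the target network's weights with the online ones."""
    target_weights.clear()
    target_weights.update(online_weights)


if __name__ == "__main__":
    rng = random.Random(0)
    buffer = ReplayBuffer(capacity=3)
    counts = [12, 15, 11, 9, 9]  # made-up vehicle counts at successive samples
    for step in range(1, len(counts)):
        r = reward_from_vehicle_change(counts[step - 1], counts[step])
        buffer.push(counts[step - 1], 0, r, counts[step])
    print(len(buffer), buffer.sample(2, rng))
```

Keeping the target network's weights frozen between calls to `sync_target` is what makes the bootstrapped target in `td_target` change slowly, which is the usual motivation for a two-network design; how often the paper synchronises the two networks is not stated in the abstract.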



Key words: traffic signal control, deep reinforcement learning, reward function, experience replay
Received: 23 March 2022      Published: 30 June 2022
CLC:  TP 181  
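Fund:  National Natural Science Foundation of China (61603154); Natural Science Foundation of Zhejiang Province (LY19F030014); Open Research Project of the State Key Laboratory of Industrial Control Technology (ICT2022B52)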
Corresponding Authors: Bao-Lin YE     E-mail: liuzhimin0223@163.com;yebaolin@zjxu.edu.cn
Cite this article:

Zhi-min LIU,Bao-Lin YE,Yao-dong ZHU,Qing YAO,Wei-min WU. Traffic signal control method based on deep reinforcement learning. Journal of ZheJiang University (Engineering Science), 2022, 56(6): 1249-1256.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.06.024     OR     https://www.zjujournals.com/eng/Y2022/V56/I6/1249


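Traffic signal control method based on deep reinforcement learning

Traffic signal control methods based on deep reinforcement learning have difficulty updating the signal control strategy of an intersection in a timely manner. To address this problem, a traffic signal control method for an isolated intersection based on improved deep reinforcement learning is proposed. A new reward function is constructed from the real-time change in the number of vehicles between adjacent sampling time steps, so that the dynamic evolution of the traffic state at the intersection is tracked and exploited promptly. A double network structure is adopted to improve the learning efficiency of the algorithm, and experience replay is used to improve its convergence. SUMO-based simulation results show that, compared with traditional control methods and deep reinforcement learning methods, the proposed method significantly shortens the average waiting time and the average queue length of vehicles at the intersection and improves the traffic efficiency of the intersection.


Key words: traffic signal control, deep reinforcement learning, reward function, experience replay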
Fig.1 Framework of reinforcement learning
Fig.2 Structure diagram of intersection
Fig.3 State elements in reinforcement learning
Fig.4 Diagram of phase setting of intersection
Fig.5 Construction of action space needed for reinforcement learning
Fig.6 Convolutional neural network fitting Q-value
Fig.7 Traffic signal control framework based on Nature DQN
Fig.8 Cumulative rewards of different deep reinforcement learning algorithms
Fig.9 Average waiting time of different traffic signal control methods
Fig.10 Average length of queue of different traffic signal control methods
Control method          R        W/s       L/m
Fixed-time control      n/a      937.85    9.64
Adaptive control        n/a      694.08    8.48
Traditional FC-DQN      -5.40    607.05    7.82
Improved FC-DQN         -3.66    431.85    6.58
Traditional CNN-DQN     -2.54    168.59    4.14
Improved CNN-DQN        -1.63    106.47    3.39
Tab.1 Test results of control effects of different traffic signal control methods
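To make the gains in Tab.1 easier to read, the relative reduction achieved by each improved method over its traditional counterpart can be computed directly from the table values. The short Python sketch below does this arithmetic; the helper name `relative_reduction`, the dictionary layout, and the reading of W as a waiting time in seconds and L as a queue length in metres are assumptions made for illustration, while the numbers themselves are copied from Tab.1. For the CNN-DQN pair this gives a reduction of about 36.8% in W and about 18.1% in L.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# (W/s, L/m) pairs copied from Tab.1; keys are illustrative labels.
TAB1 = {
    "FC-DQN": {"traditional": (607.05, 7.82), "improved": (431.85, 6.58)},
    "CNN-DQN": {"traditional": (168.59, 4.14), "improved": (106.47, 3.39)},
}


def summarize(table: dict) -> dict:
    """Return {network: (reduction in W in %, reduction in L in %)}."""
    summary = {}
    for name, rows in table.items():
        w_old, l_old = rows["traditional"]
        w_new, l_new = rows["improved"]
        summary[name] = (
            relative_reduction(w_old, w_new),
            relative_reduction(l_old, l_new),
        )
    return summary


if __name__ == "__main__":
    for name, (dw, dl) in summarize(TAB1).items():
        print(f"{name}: W reduced by {dw:.1f}%, L reduced by {dl:.1f}%")
```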
Control method          S/min    V/s
Traditional FC-DQN      116      < 2.0
Improved FC-DQN         105      < 2.0
Traditional CNN-DQN     88       < 2.0
Improved CNN-DQN        75       < 2.0
Tab.2 Model training time and algorithm response time