Cooperative control algorithm of multi-intersection variable-direction lanes based on reinforcement learning
Xiao-gao XU1, Ying-jie XIA1,*, Si-yu ZHU1, Li KUANG2
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
2. School of Computer Science and Engineering, Central South University, Changsha 410012, China

Abstract Traditional control methods for variable-direction lanes cannot adapt to the complex traffic flows that arise across multiple intersections. To address this problem and relieve multi-intersection congestion, a cooperative control algorithm for multi-intersection variable-direction lanes based on multi-agent reinforcement learning was proposed. The method improves the deep multi-agent reinforcement learning algorithm QMIX. To handle global reward allocation in variable-direction lane scenarios, the global reward was decomposed into a basic reward and a performance reward, which improved the accuracy of lane-direction switching decisions under congestion. Prioritized experience replay was introduced to make more efficient use of the transition sequences in the replay buffer and to accelerate convergence. Experimental results show that the algorithm outperforms other control methods in terms of queue length, delay time and waiting time, and that it can effectively coordinate policy switching of variable-direction lanes and improve road network capacity in multi-intersection scenarios.
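The abstract names prioritized experience replay but does not give its formulation. As an illustration only, the following minimal sketch shows the generic proportional variant, in which a stored item with priority p_i is sampled with probability p_i^alpha / sum_k p_k^alpha and its update is scaled by an importance weight. The class name `PrioritizedReplayBuffer`, the values of `alpha`, `beta` and `eps`, and the use of single transitions rather than whole episodes are assumptions made for this example, not details taken from the paper.

```python
import random


class PrioritizedReplayBuffer:
    """Hypothetical proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha      # assumed value: 0 gives uniform sampling
        self.eps = eps          # assumed value: keeps every priority positive
        self.items = []
        self.priorities = []
        self.next_slot = 0

    def add(self, transition):
        # New items receive the current maximum priority so that they are
        # likely to be replayed before their TD error is known.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            # Buffer is full: overwrite the oldest slot.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        n = len(self.items)
        indices = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance weights w_i = (N * P(i))^(-beta), normalised by the
        # largest weight in the batch, offset the non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return indices, [self.items[i] for i in indices], weights

    def update(self, indices, td_errors):
        # After a learning step, priority follows the absolute TD error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, the learner would call `sample`, scale each loss term by the returned weight, and then pass the new TD errors to `update`. Items with large errors are then replayed more often, which is the usual argument for why prioritization can speed up convergence.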
Received: 23 May 2021
Published: 31 May 2022
Fund: National Natural Science Foundation of China (61873232)
Corresponding Authors:
Ying-jie XIA
E-mail: 21821323@zju.edu.cn;xiayingjie@zju.edu.cn
Keywords: variable-direction lane, reinforcement learning, multi-agent, adaptive control, intelligent transportation