Traffic signal control method based on asynchronous advantage actor-critic
Baolin YE, Ruitao SUN, Weimin WU, Bin CHEN, Qing YAO
Tab.1 Reward setting rules (piecewise reward segments)
R_w, R_p    W_change_rate, P_change_rate
 4          (0.35, +∞)
 3          (0.25, 0.35]
 2          (0.15, 0.25]
 1          [0, 0.15]
−1          [−0.15, 0)
−2          [−0.25, −0.15)
−3          [−0.35, −0.25)
−4          (−∞, −0.35)
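
The segment lookup in Tab.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name segmented_reward is hypothetical, and the excerpt does not say how W_change_rate and P_change_rate are computed or which sign counts as an improvement, so the code only maps an already computed change rate to its integer reward. It relies on the intervals being symmetric about zero, with zero itself in the +1 segment.

```python
import math


def segmented_reward(change_rate: float) -> int:
    """Map a change rate to the integer reward listed in Tab.1.

    Hypothetical helper: only the interval boundaries come from the table;
    the name and signature are assumptions made for illustration.
    """
    if math.isnan(change_rate):
        raise ValueError("change_rate must not be NaN")

    # The table is symmetric about zero, so the reward level is chosen
    # from the magnitude and the sign is restored afterwards.
    magnitude = abs(change_rate)
    if magnitude <= 0.15:
        level = 1
    elif magnitude <= 0.25:
        level = 2
    elif magnitude <= 0.35:
        level = 3
    else:
        level = 4

    # Zero belongs to the [0, 0.15] segment, so it receives +1.
    return level if change_rate >= 0 else -level
```

Under the reading that the same rule yields R_w from W_change_rate and R_p from P_change_rate, the function would be called once per quantity, for example segmented_reward(0.2) for a change rate of 0.2 and segmented_reward(-0.4) for a change rate of −0.4.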