Deep reinforcement learning models and algorithms for single-machine scheduling considering deteriorated maintenance
Yong CHEN, Xizhi DU, Yiwei JIANG, Wenchao YI, Zhi PEI, Zuzhen JI

Tab. 6 Training performance comparison of DRL algorithms
Problem size   Algorithm   Time steps (10^6)   Average FPS   Training time (h)
10             A2C         1                   1498.4        0.18
10             DQN         1                   1660.8        0.27
10             PPO         1                   1141.6        0.24
20             A2C         2                   1473.7        0.37
20             DQN         2                   1440.6        0.54
20             PPO         2                   1109.6        0.50
30             A2C         2                   1447.3        0.38
30             DQN         2                   1393.9        0.54
30             PPO         2                   1093.3        0.51
50             A2C         4                   1400.9        0.79
50             DQN         4                   1283.4        1.11
50             PPO         4                   1065.4        1.05
80             A2C         8                   1360.6        1.63
80             DQN         8                   1154.3        2.27
80             PPO         8                   1038.6        2.15
100            A2C         8                   1350.4        1.66
100            DQN         8                   1121.4        2.31
100            PPO         8                   1018.6        2.19
150            A2C         16                  1276.1        3.51
150            DQN         16                  1051.9        4.68
150            PPO         16                  1007.2        4.33
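
The columns of Tab. 6 can be related by simple arithmetic. The following is a minimal sketch, assuming that training time is roughly the number of time steps divided by the average FPS. That relation is an assumption made here for illustration; the source does not state how FPS or training time were measured, so a gap between the implied and reported values should not be read as an error in the table. The names `ROWS`, `implied_hours`, and `compare` are hypothetical.

```python
# Rows copied from Tab. 6:
# (problem size, algorithm, time steps in 10^6, average FPS, reported hours)
ROWS = [
    (10, "A2C", 1, 1498.4, 0.18),
    (10, "DQN", 1, 1660.8, 0.27),
    (10, "PPO", 1, 1141.6, 0.24),
    (20, "A2C", 2, 1473.7, 0.37),
    (20, "DQN", 2, 1440.6, 0.54),
    (20, "PPO", 2, 1109.6, 0.50),
    (30, "A2C", 2, 1447.3, 0.38),
    (30, "DQN", 2, 1393.9, 0.54),
    (30, "PPO", 2, 1093.3, 0.51),
    (50, "A2C", 4, 1400.9, 0.79),
    (50, "DQN", 4, 1283.4, 1.11),
    (50, "PPO", 4, 1065.4, 1.05),
    (80, "A2C", 8, 1360.6, 1.63),
    (80, "DQN", 8, 1154.3, 2.27),
    (80, "PPO", 8, 1038.6, 2.15),
    (100, "A2C", 8, 1350.4, 1.66),
    (100, "DQN", 8, 1121.4, 2.31),
    (100, "PPO", 8, 1018.6, 2.19),
    (150, "A2C", 16, 1276.1, 3.51),
    (150, "DQN", 16, 1051.9, 4.68),
    (150, "PPO", 16, 1007.2, 4.33),
]


def implied_hours(steps_millions, fps):
    """Hours needed to run the given number of steps at a constant FPS.

    Assumption (not from the source): duration = steps / FPS.
    """
    return steps_millions * 1e6 / fps / 3600.0


def compare(rows):
    """Return (size, algorithm, implied hours, reported hours) per row."""
    result = []
    for size, algo, steps, fps, reported in rows:
        result.append((size, algo, implied_hours(steps, fps), reported))
    return result


if __name__ == "__main__":
    for size, algo, implied, reported in compare(ROWS):
        print(f"{size:>4} {algo:<4} implied={implied:.2f} h  reported={reported:.2f} h")
```

Running the script prints the implied and reported durations side by side for each problem size and algorithm, which makes it easy to see where the two columns agree under the stated assumption.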