Deep reinforcement learning models and algorithms for single-machine scheduling considering deteriorated maintenance

Yong CHEN, Xizhi DU, Yiwei JIANG, Wenchao YI, Zhi PEI, Zuzhen JI
Tab.7 Mean and standard deviation of the optimized cost for the DRL algorithms

| Size | Baseline Mean | Baseline Std | A2C Mean | A2C Std | DQN Mean | DQN Std | PPO Mean | PPO Std |
|---|---|---|---|---|---|---|---|---|
| 10 | 191.0 | 0.0 | 191.4 | 1.0 | 217.8 | 61.7 | 192.5 | 14.8 |
| 20 | 973.8 | 22.5 | 908.0 | 9.5 | 925.6 | 47.6 | 901.5 | 6.1 |
| 30 | 2103.9 | 102.9 | 1920.8 | 20.1 | 1948.3 | 109.0 | 1880.5 | 6.6 |
| 50 | 3310.6 | 172.1 | 3009.6 | 41.1 | 6177.7 | 2280.4 | 2936.7 | 19.0 |
| 80 | 10642.0 | 494.0 | 9712.7 | 65.1 | 10015.8 | 480.4 | 9469.7 | 40.3 |
| 100 | 15763.6 | 766.8 | 14551.0 | 172.2 | 14737.7 | 708.4 | 14020.2 | 119.5 |
| 150 | 30546.1 | 1166.5 | 27272.6 | 270.0 | 28197.4 | 1209.3 | 27073.3 | 275.1 |