Deep reinforcement learning models and algorithms for single-machine scheduling considering deteriorated maintenance
Yong CHEN, Xizhi DU, Yiwei JIANG, Wenchao YI, Zhi PEI, Zuzhen JI

Table 1. Description of the action space

| Symbol | Action | Description | Mathematical form |
| --- | --- | --- | --- |
| $a_1$ | SPT | Shortest processing time first | $\min_{j\in B} p_j$ |
| $a_2$ | LPT | Longest processing time first | $\max_{j\in B} p_j$ |
| $a_3$ | EDD | Earliest due date first | $\min_{j\in B} d_j$ |
| $a_4$ | FCFS | Earliest arrival time first | $\min_{j\in B} \mathrm{arrive}_j$ |
| $a_5$ | MST | Minimum slack time | $\min_{j\in B} \left(d_j - \left(t + p_j\right)\right)$ |
| $a_6$ | CR | Minimum critical ratio | $\min_{j\in B} \left(d_j - t\right)/p_j$ |
| $a_7$ | MDD | Modified due date first | $\min_{j\in B} \max\left(d_j,\, t + p_j\right)$ |
| $a_8$ | PM | Perform imperfect maintenance | $M_{i-1} + R_{\mathrm{PM}}$ |
| $a_9$ | CM | Perform perfect maintenance | $M_{i-1} + R_{\mathrm{CM}}$ |
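
The seven dispatching rules in Table 1 (SPT through MDD) each select one job from the buffer $B$ at the current time $t$ by optimising a simple priority key. The following Python sketch illustrates that selection step only. The `Job` class, the `RULES` dictionary, and the `select_job` function are hypothetical names introduced for illustration and are not taken from the paper. The sketch assumes $p_j > 0$ (needed for CR) and breaks ties by buffer order, which the table does not specify. The two maintenance actions (PM, CM) are left out because the quantities $M_{i-1}$, $R_{\mathrm{PM}}$ and $R_{\mathrm{CM}}$ are not defined in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A waiting job; field names mirror the symbols in Table 1."""
    name: str
    p: float       # processing time p_j (assumed > 0)
    d: float       # due date d_j
    arrive: float  # arrival time arrive_j


# One priority key per dispatching rule. Every rule is written as a
# minimisation; LPT negates p_j so that "max p_j" becomes "min -p_j".
RULES = {
    "SPT": lambda j, t: j.p,
    "LPT": lambda j, t: -j.p,
    "EDD": lambda j, t: j.d,
    "FCFS": lambda j, t: j.arrive,
    "MST": lambda j, t: j.d - (t + j.p),
    "CR": lambda j, t: (j.d - t) / j.p,
    "MDD": lambda j, t: max(j.d, t + j.p),
}


def select_job(rule, buffer, t):
    """Return the job in `buffer` chosen by `rule` at time `t`.

    Ties go to the job that appears first in `buffer` (an assumption).
    """
    if not buffer:
        raise ValueError("buffer B is empty")
    key = RULES[rule]
    return min(buffer, key=lambda j: key(j, t))
```

With a buffer of three jobs, `J1` (p=4, d=8, arrive=0), `J2` (p=2, d=12, arrive=1) and `J3` (p=6, d=9, arrive=2), at $t = 3$, SPT selects `J2`, EDD and MDD select `J1`, and MST and CR select `J3`. The same state therefore maps to different jobs depending on the rule chosen, which is what makes the choice of rule a meaningful action.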