Decision-making for multi-USV adversarial encirclement based on adaptive curriculum reinforcement learning
Lang CHEN, Zengli LIU, Xuanzhi ZHAO
Tab. 5 Comparison of the performance metrics of each algorithm under the two target strategies
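| Target strategy | Algorithm | $C_{\text{u}}$ | $C_{\text{T}}$ | $P_{\text{u}}$ | $P_{\text{T}}$ | $D_{\text{avg}}$/m | $R_{\text{avg}}$ |
|---|---|---|---|---|---|---|---|
| RL | ACL-MAPPO | 194 | 40 | 46 | 2 | 302.324 | 55.253 |
| RL | CL-MAPPO | 364 | 58 | 71 | 22 | 367.474 | 42.437 |
| RL | NOCL-MAPPO | 1 401 | 134 | 432 | 38 | 421.161 | 17.367 |
| RL | ACL-MADDPG | 389 | 67 | 83 | 31 | 381.193 | 40.152 |
| Random | ACL-MAPPO | 942 | 301 | 154 | 54 | 285.391 | 37.113 |
| Random | CL-MAPPO | 1 224 | 304 | 205 | 52 | 321.482 | 30.172 |
| Random | NOCL-MAPPO | 1 558 | 150 | 367 | 56 | 343.514 | 13.627 |
| Random | ACL-MADDPG | 1 287 | 317 | 221 | 61 | 337.643 | 25.241 |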