Safe hierarchical reinforcement learning framework for dynamic UAV navigation
Yiming SHANG, Changping DU, Rui YANG, Tianrui FANG, Ze’an DU, Yao ZHENG
Tab. 2 Weight distribution under adaptive reward shaping
Reward component    Global    Safety    Approach
Goal                1.0       1.0       1.0
Direction           1.8       0.8       2.2
Collision           1.0       1.0       1.0
Boundary            1.0       1.0       1.0
Static              0.5       1.8       1.2
Dynamic             0.6       2.2       1.0
Deviation           0.7       1.5       0.8
Speed               2.5       0.6       0.9
Energy              0.4       1.4       1.6
Time                0.6       1.2       1.6
DWA                 0.6       2.8       1.8
CBF                 0.8       2.0       1.6
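
To illustrate how the weights in Tab. 2 might be applied, the following Python sketch stores the table as a lookup and combines per-component reward terms under one of the three weight sets (Global, Safety, Approach). It assumes the shaped reward is a plain weighted sum of the component terms, which this excerpt does not state; the names `WEIGHTS` and `shaped_reward` are hypothetical and introduced only for illustration.

```python
# Weights transcribed from Tab. 2: row label -> column label -> weight.
WEIGHTS = {
    "Goal":      {"Global": 1.0, "Safety": 1.0, "Approach": 1.0},
    "Direction": {"Global": 1.8, "Safety": 0.8, "Approach": 2.2},
    "Collision": {"Global": 1.0, "Safety": 1.0, "Approach": 1.0},
    "Boundary":  {"Global": 1.0, "Safety": 1.0, "Approach": 1.0},
    "Static":    {"Global": 0.5, "Safety": 1.8, "Approach": 1.2},
    "Dynamic":   {"Global": 0.6, "Safety": 2.2, "Approach": 1.0},
    "Deviation": {"Global": 0.7, "Safety": 1.5, "Approach": 0.8},
    "Speed":     {"Global": 2.5, "Safety": 0.6, "Approach": 0.9},
    "Energy":    {"Global": 0.4, "Safety": 1.4, "Approach": 1.6},
    "Time":      {"Global": 0.6, "Safety": 1.2, "Approach": 1.6},
    "DWA":       {"Global": 0.6, "Safety": 2.8, "Approach": 1.8},
    "CBF":       {"Global": 0.8, "Safety": 2.0, "Approach": 1.6},
}

MODES = ("Global", "Safety", "Approach")


def shaped_reward(components, mode):
    """Combine raw reward terms using the weight column for `mode`.

    ASSUMPTION: the total is a weighted sum of the component terms;
    the actual combination rule is not given in this excerpt.
    `components` maps a row label of Tab. 2 to its raw reward value.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return sum(WEIGHTS[name][mode] * value for name, value in components.items())


if __name__ == "__main__":
    # Hypothetical raw terms for a single step, chosen only for illustration.
    step_terms = {"Goal": 0.2, "Dynamic": -0.5, "Speed": 0.3, "CBF": -0.1}
    for mode in MODES:
        print(mode, shaped_reward(step_terms, mode))
```

Keeping the table as data rather than hard-coding it into the reward function makes it straightforward to compare the three weight sets on the same raw terms: under this sketch, the same penalty on the Dynamic term counts 2.2 times under Safety but only 0.6 times under Global.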