Journal of ZheJiang University (Engineering Science)  2021, Vol. 55 Issue (6): 1215-1224    DOI: 10.3785/j.issn.1008-973X.2021.06.023
    
Agile imaging satellite task planning method for intensive observation
Yi-fan MA1,2, Fan-yu ZHAO1,2,*, Xin WANG1,2, Zhong-he JIN1,2
1. Micro-satellite Research Center, Zhejiang University, Hangzhou 310027, China
2. Zhejiang Key Laboratory of Micro-nano Satellite Research, Zhejiang University, Hangzhou 310027, China

Abstract  

Task planning for agile imaging satellites in intensive observation scenarios is characterized by a large solution space and long input task sequences. The problem was modeled with time window constraints, the attitude adjustment time required when transferring between tasks, and satellite memory and power constraints taken into account. An algorithm model (Ind-PN) combining IndRNN and Pointer Networks was proposed to solve the problem, with a multi-layer IndRNN structure serving as the decoder. Tasks were selected from the input sequence through the Pointer Networks mechanism, and a Mask vector was used to account for the constraints of the problem. The model was trained with the Actor-Critic reinforcement learning algorithm to maximize the observation reward rate. The experimental results show that, for task planning in intensive observation scenarios, the Ind-PN algorithm converges faster and achieves a higher observation reward rate.
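The selection step described above, where the Pointer Networks mechanism picks a task from the input sequence and a Mask vector rules out tasks that violate constraints, can be illustrated with a minimal Python sketch. It assumes that a mask entry of 0 marks a task that is currently infeasible and that selection is greedy; the scores, the mask values, and the function names are hypothetical and are not taken from the paper.

```python
import math


def masked_pointer_probs(scores, mask):
    """Softmax over pointer scores; tasks with mask == 0 get probability 0."""
    if len(scores) != len(mask):
        raise ValueError("scores and mask must have the same length")
    feasible = [s for s, m in zip(scores, mask) if m]
    if not feasible:
        raise ValueError("no feasible task left to select")
    top = max(feasible)  # subtract the max for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_select(scores, mask):
    """Return the index of the most probable feasible task."""
    probs = masked_pointer_probs(scores, mask)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical attention scores for four candidate tasks.
scores = [2.0, 0.5, 3.0, 1.0]
# Assumption: task 2 is infeasible (for example, its time window has closed).
mask = [1, 1, 0, 1]

probs = masked_pointer_probs(scores, mask)
chosen = greedy_select(scores, mask)
```

Although task 2 has the highest raw score, the mask removes it, so the sketch selects task 0. During training a policy would typically sample from `probs` instead of taking the maximum; that choice is also an assumption here.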



Key words: agile imaging satellite; task planning problem; intensive observation scenario; Ind-PN; reinforcement learning
Received: 01 July 2020      Published: 30 July 2021
CLC:  V 474  
Fund: National Natural Science Foundation of China (52075293); Fundamental Research Funds for the Central Universities (2021QN81002)
Corresponding Authors: Fan-yu ZHAO     E-mail: 21860251@zju.edu.cn;zfybit@zju.edu.cn
Cite this article:

Yi-fan MA,Fan-yu ZHAO,Xin WANG,Zhong-he JIN. Agile imaging satellite task planning method for intensive observation. Journal of ZheJiang University (Engineering Science), 2021, 55(6): 1215-1224.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2021.06.023     OR     https://www.zjujournals.com/eng/Y2021/V55/I6/1215


Fig.1 Schematic diagram of time window constraints
Fig.2 Model structure of Ind-PN algorithm
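The decoder of the model is described as a multi-layer IndRNN structure. As a rough illustration, the sketch below implements one step of the standard IndRNN recurrence, h_t = relu(W x_t + u * h_{t-1} + b), in which each hidden unit is connected only to its own previous state through the vector u. The weights and inputs are hypothetical, and the sketch omits the batch normalization and residual connections listed in the decoder comparison tables.

```python
def indrnn_step(x, h_prev, W, u, b):
    """One IndRNN step: h_t = relu(W x_t + u * h_{t-1} + b).

    W is a list of rows (hidden_size x input_size); u and b are vectors of
    length hidden_size. The recurrent term is element-wise, so unit i only
    sees its own previous value h_prev[i].
    """
    h = []
    for i in range(len(u)):
        pre = sum(W[i][j] * x[j] for j in range(len(x)))
        pre += u[i] * h_prev[i] + b[i]
        h.append(max(0.0, pre))  # ReLU activation
    return h


# Hypothetical 2-unit layer with a 2-dimensional input.
W = [[1.0, 0.0], [0.0, -1.0]]
u = [0.5, 2.0]
b = [0.0, 0.0]
x = [1.0, 1.0]
h_prev = [2.0, 1.0]

h = indrnn_step(x, h_prev, W, u, b)
```

Stacking several such layers, each feeding its output sequence to the next, gives the multi-layer arrangement; how the layers are wired in Ind-PN is shown only in the figure and is not reproduced here.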

| Element | Setting | Data type |
| --- | --- | --- |
| $\mathrm{ws}_i$ | $[0,\;4.0]$ | float variable |
| $\mathrm{ang}_i$ | $[-0.25,\;0.25]$ | float variable |
| $\mathrm{we}_i$ | $[\mathrm{ws}_i + 0.04,\;\mathrm{ws}_i + 0.15]$ | float variable |
| $\mathrm{con}_i$ | $[0.02,\;0.04]$ | float variable |
| $r_i$ | $[0.1,\;0.9]$ | float variable |
| $m_i$ | $[0,\;0.01]$ | float variable |
| $e_i$ | $[0,\;0.01]$ | float variable |
| $\mathrm{win}_i^0$ | initialized to 1 | integer variable, 0 or 1 |
| $\mathrm{acc}_i^0$ | initialized to 1 | integer variable, 0 or 1 |
| $\mathrm{mem}_i^0$ | initialized to 0.5 | float variable |
| $\mathrm{pow}_i^0$ | initialized to 0.5 | float variable |
| $\mathrm{task}_i^0$ | initialized to 0 | integer variable |
| $\mathrm{exe}_i^0$ | initialized to 0 | float variable |
| $t_{\mathrm{s}}$ | set to 0.2 | float constant |
| $e_{\mathrm{s}}$ | set to 0.01 | float constant |

Tab.1 Parameter settings of each task element and the scene
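The settings in Tab.1 can be turned into a small random scene generator. The Python sketch below is an illustration only: it assumes that each ranged element is drawn uniformly from its interval (the table gives ranges, not distributions), and the dictionary keys simply mirror the symbols in the table. The function names are hypothetical.

```python
import random

# Scene constants from Tab.1 (their physical meaning is not restated here).
T_S = 0.2
E_S = 0.01


def sample_task(rng):
    """Draw one task; uniform sampling within each range is an assumption."""
    ws = rng.uniform(0.0, 4.0)
    return {
        "ws": ws,
        "ang": rng.uniform(-0.25, 0.25),
        "we": ws + rng.uniform(0.04, 0.15),  # window end depends on ws
        "con": rng.uniform(0.02, 0.04),
        "r": rng.uniform(0.1, 0.9),
        "m": rng.uniform(0.0, 0.01),
        "e": rng.uniform(0.0, 0.01),
        # Elements with fixed initial values in Tab.1.
        "win": 1,
        "acc": 1,
        "mem": 0.5,
        "pow": 0.5,
        "task": 0,
        "exe": 0.0,
    }


def sample_scene(n_tasks, seed=0):
    """Generate a reproducible list of n_tasks task dictionaries."""
    rng = random.Random(seed)
    return [sample_task(rng) for _ in range(n_tasks)]


scene = sample_scene(100)
```

A fixed seed keeps the generated scene reproducible, which is convenient when comparing planners on the same input sequence.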
Fig.3 Convergence curve of Ind-PN algorithm model training
Fig.4 Inference result when sample length is 100
Fig.5 Inference result when sample length is 200
Fig.6 Comparison of convergence curves of Reward and Loss
Fig.7 Comparison of convergence curve of model reward rate
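
| Sequence length | Decoder | Layers | Epochs | R /% |
| --- | --- | --- | --- | --- |
| 200 | GRU | 1 | 10 | 45.7 |
| 200 | GRU | 2 | 10 | 45.5 |
| 200 | IndRNN+BN+RES | 2 | 10 | 45.4 |
| 200 | IndRNN+BN+RES | 4 | 10 | 46.1 |

Tab.2 Comparison of reward rate of algorithm models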
Fig.8 Comparison of convergence curve of model reward rate
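
| Sequence length | Decoder | Layers | Epochs | R /% |
| --- | --- | --- | --- | --- |
| 400 | GRU | 1 | 10 | 2.3 |
| 400 | GRU | 2 | 10 | 2.8 |
| 400 | IndRNN | 2 | 10 | 3.0 |
| 400 | IndRNN+BN+RES | 2 | 10 | 1.8 |
| 400 | IndRNN+BN+RES | 4 | 10 | 20.6 |

Tab.3 Comparison of reward rate of algorithm models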
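
| Sequence length | Algorithm | R /% | $t_{\mathrm{sol}}$ /s |
| --- | --- | --- | --- |
| 100 | ACO | 56.30 | 9.001 |
| 100 | Ind-PN | 64.50 | 0.328 |
| 200 | ACO | 33.19 | 19.140 |
| 200 | Ind-PN | 41.20 | 0.453 |
| 300 | ACO | 22.32 | 30.342 |
| 300 | Ind-PN | 33.04 | 0.499 |
| 400 | ACO | 15.98 | 38.766 |
| 400 | Ind-PN | 22.63 | 0.578 |

Tab.4 Comparison of Ind-PN algorithm and ACO algorithm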
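The gap between the two algorithms in Tab.4 is easier to read as differences and ratios. The short Python sketch below computes, for each sequence length, the reward-rate gain of Ind-PN over ACO in percentage points and the ratio of the two t_sol values. The numbers are copied from the table; reading t_sol as solving time in seconds is an interpretation of the column header, and the helper name is hypothetical.

```python
# (R in %, t_sol in s) per algorithm, copied from Tab.4.
results = {
    100: {"ACO": (56.30, 9.001), "Ind-PN": (64.50, 0.328)},
    200: {"ACO": (33.19, 19.140), "Ind-PN": (41.20, 0.453)},
    300: {"ACO": (22.32, 30.342), "Ind-PN": (33.04, 0.499)},
    400: {"ACO": (15.98, 38.766), "Ind-PN": (22.63, 0.578)},
}


def compare(table):
    """Return the reward-rate gain (points) and time ratio per sequence length."""
    rows = {}
    for length, by_alg in table.items():
        r_aco, t_aco = by_alg["ACO"]
        r_pn, t_pn = by_alg["Ind-PN"]
        rows[length] = {
            "gain_points": r_pn - r_aco,  # Ind-PN minus ACO, in percentage points
            "speedup": t_aco / t_pn,      # how many times faster Ind-PN is
        }
    return rows


summary = compare(results)
for length, row in summary.items():
    print(f"{length}: +{row['gain_points']:.2f} points, {row['speedup']:.1f}x faster")
```

On these figures Ind-PN is ahead on reward rate at every sequence length, and the time ratio grows as the sequence gets longer.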