Landing Guidance of Reusable Launch Vehicle Based on Reinforcement Learning
HE Linkun¹, ZHANG Ran¹, GONG Qinghai²
1. School of Astronautics, Beihang University, Beijing 100191, China;
2. Beijing Aerospace Automatic Control Institute, Beijing 100070, China
Abstract Landing guidance for a reusable launch vehicle must ensure accurate landing position and velocity while minimizing fuel consumption. Landing guidance methods based on optimal control rely on an accurate rocket dynamics model, which limits their scalability. To address this problem, a neural network landing guidance policy is developed using a model-free iterative reinforcement learning approach. First, a Markov decision process model of the rocket landing guidance problem is established, and a staged reward function is designed according to the terminal constraints and the fuel consumption index. Next, a multilayer perceptron guidance policy network is constructed, and a model-free proximal policy optimization algorithm iteratively optimizes the network through interaction with the rocket landing guidance Markov decision process. Finally, the guidance policy is validated in simulations of a reusable launch vehicle landing scenario. The results show that the proposed reinforcement learning landing guidance policy achieves high landing accuracy, near-optimal fuel consumption, and adaptability to parameter uncertainty in the rocket model.
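The abstract names two ingredients without giving their form: a staged reward function and proximal policy optimization. The Python sketch below illustrates what these might look like. The reward structure, the `LandingState` fields, and every weight and tolerance are hypothetical choices made for illustration, not the authors' design. Only the clipped surrogate in `ppo_clipped_objective` follows the standard published form of the PPO objective.

```python
from dataclasses import dataclass


@dataclass
class LandingState:
    """Hypothetical per-step quantities; the paper's actual state is not given here."""
    position_error: float  # m, distance from the landing target
    speed: float           # m/s, magnitude of the velocity
    fuel_used: float       # kg, propellant burned during the last step


def staged_reward(state: LandingState, landed: bool,
                  pos_tol: float = 5.0, vel_tol: float = 2.0,
                  fuel_weight: float = 0.01,
                  terminal_bonus: float = 100.0) -> float:
    """Illustrative two-stage reward (all weights are assumed values).

    In-flight stage: a small penalty proportional to fuel burned each step.
    Terminal stage: penalties on landing position and velocity error, plus a
    bonus when both errors fall inside the assumed tolerances.
    """
    reward = -fuel_weight * state.fuel_used
    if landed:
        reward -= state.position_error + state.speed
        if state.position_error <= pos_tol and state.speed <= vel_tol:
            reward += terminal_bonus
    return reward


def ppo_clipped_objective(ratio: float, advantage: float,
                          clip_eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); the objective is
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A).
    """
    clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped * advantage)
```

In a training loop, the per-sample objective would be averaged over a batch of transitions collected from the landing simulation and maximized with respect to the policy network parameters. Clipping keeps each policy update close to the policy that generated the data.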
Received: 13 July 2021
Published: 06 September 2021