Abstract: To address the cooperative encirclement of a manoeuvring target by multiple missiles in three-dimensional space, this study proposes an impact-time-control cooperative guidance law based on proximal policy optimisation (PPO). First, an impact-time-control cooperative guidance model is constructed on the basis of extended proportional guidance, and the time error term of the cooperative guidance is improved. Next, the state and action spaces of the Markov decision process are designed, and the reward function is constructed as a variable-step model that combines dense and sparse rewards. The cooperative guidance model is then trained with PPO, which maps the guidance state information to the cooperative guidance law. Finally, a multiple-missile cooperative encirclement scenario is established to demonstrate that the guidance law achieves model-free, end-to-end coordination of attack timing. Monte Carlo experiments further verify the robustness of the guidance law in disturbed environments.
张婉滢, 司马珂, 张育禾, 孟健, 杨振, 周德云. 基于近端策略优化的多弹协同围捕机动目标制导控制方法[J]. 空天防御, 2025, 8(4): 94-103.
ZHANG Wanying, SIMA Ke, ZHANG Yuhe, MENG Jian, YANG Zhen, ZHOU Deyun. The Guidance and Control Method of Multi-Missile Cooperative Encirclement of Maneuvering Targets Based on Proximal Policy Optimization. Air & Space Defense, 2025, 8(4): 94-103.