The Guidance and Control Method of Multi-Missile Cooperative Encirclement of Maneuvering Targets Based on Proximal Policy Optimization
ZHANG Wanying1, SIMA Ke2, ZHANG Yuhe3, MENG Jian3, YANG Zhen3, ZHOU Deyun3

1. College of Microelectronics, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China;
2. Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China;
3. College of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China
Abstract: To address the problem of cooperative encirclement of a maneuvering target by multiple missiles in three-dimensional space, this study proposes an impact-time-control cooperative guidance law based on proximal policy optimization (PPO). First, the impact-time-control cooperative guidance model is constructed on the basis of extended proportional navigation guidance, and the cooperative guidance time-error term is improved. Then, the state-space and action-space models of the Markov decision process are designed, and the reward function is constructed as a variable-step model combining dense and sparse rewards. The cooperative guidance model is trained with PPO, which maps the guidance state information to the cooperative guidance law. Finally, a multi-missile cooperative encirclement scenario is established, demonstrating that the proposed guidance law achieves model-free, end-to-end coordination of attack timing. Monte Carlo experiments further verify the robustness of the guidance law in disturbed environments.
Received: 27 April 2025
Published: 10 September 2025
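The variable-step reward structure described in the abstract can be illustrated with a minimal sketch. All function and parameter names here are hypothetical and the paper's actual shaping terms are not reproduced: the idea is that a dense term penalizes each missile's deviation from the coordinated impact time at every step, while a sparse terminal term rewards a successful intercept and penalizes a miss.

```python
def cooperative_reward(miss_distance, time_error, done, hit_radius=5.0):
    """Hypothetical dense-plus-sparse reward for one missile agent.

    miss_distance : current missile-target distance (m)
    time_error    : deviation of this missile's time-to-go from the
                    group's coordinated impact time (s)
    done          : True at the terminal step of the episode
    hit_radius    : distance (m) counted as a successful intercept
    """
    # Dense shaping term: applied every step, drives the impact-time
    # error of the group toward zero.
    dense = -abs(time_error)
    if not done:
        return dense
    # Sparse terminal term: large bonus for an intercept, penalty for
    # a miss; only issued once, at the end of the episode.
    sparse = 100.0 if miss_distance <= hit_radius else -50.0
    return dense + sparse
```

In a PPO training loop this reward would be accumulated per step for each missile, so the dense term dominates early learning while the sparse term anchors the final intercept behavior.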