Design and Verification of a Reinforcement-Learning-Based Cooperative Defence Strategy for Drones


Abstract This paper models drone swarm confrontation on the OODA (observe, orient, decide, act) decision loop and uses multi-agent deep reinforcement learning to find the optimal cooperative defence strategy for a drone swarm. First, a QMIX-based single-layer decision algorithm is developed to address contribution allocation (credit assignment among agents) and high-dimensional spaces in drone cooperation. Second, a hierarchical decision-making model that integrates rule-based methods with reinforcement learning is proposed: a decision layer uses either rules or hidden Markov model (HMM) intention recognition to analyse the combat scenario and schedule drones, and an action layer then uses the QMIX algorithm to output actions. To verify the proposed algorithms, a controllable and observable simulation platform is built with Python and Unity, and a challenging defensive game problem is constructed on it. Experiments quantitatively evaluate the defence strategies in terms of cooperation effectiveness, resource efficiency, and generalisation. The results show that hierarchical decision-making is significantly better than single-layer decision-making on every index, with a substantially higher win rate. The HMM-based hierarchical strategy performs best, offering a promising new approach to drone swarm defence.
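The two-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-intruder scheduling rule, the fixed mixing weights, and all names (`schedule`, `mix`, `greedy_joint_action`) are assumptions made for illustration, and QMIX itself generates its non-negative mixing weights with hypernetworks conditioned on the global state rather than using constants. The sketch shows only the monotonicity property that lets each drone act greedily on its own Q-values while still maximising the mixed value.

```python
import math
from itertools import product


def schedule(drones, intruders):
    """Hypothetical rule-based decision layer: assign each defender
    to its nearest intruder (Euclidean distance)."""
    return {
        d: min(intruders, key=lambda t: math.dist(drones[d], intruders[t]))
        for d in drones
    }


def mix(agent_qs, weights, bias):
    """Simplified monotonic mixer: Q_tot = sum_i |w_i| * Q_i + b.
    Taking |w_i| keeps dQ_tot/dQ_i >= 0, the constraint QMIX imposes."""
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias


def greedy_joint_action(q_tables):
    """Action layer at execution time: every drone picks its own argmax."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_tables)


# Toy scenario (all numbers are made up for illustration).
drones = {"d0": (0.0, 0.0), "d1": (10.0, 0.0)}
intruders = {"t0": (1.0, 1.0), "t1": (9.0, 2.0)}
assignment = schedule(drones, intruders)

# Per-agent Q-values for 3 drones with 3 actions each.
q_tables = [[0.1, 0.7, 0.3], [0.5, 0.2, 0.9], [0.4, 0.4, 0.8]]
weights = [0.6, -1.2, 0.3]  # signs are discarded by abs() inside mix()
bias = 0.05

greedy = greedy_joint_action(q_tables)
q_tot = mix([q[a] for q, a in zip(q_tables, greedy)], weights, bias)

# Brute-force search over all joint actions to confirm that the
# decentralised greedy choice also maximises the mixed value.
best = max(
    product(range(3), repeat=3),
    key=lambda joint: mix([q[a] for q, a in zip(q_tables, joint)], weights, bias),
)

print(assignment, greedy, best, q_tot)
```

Because the mixer is monotonic in every per-agent value, the brute-force joint maximiser coincides with the per-drone greedy choice, which is what makes decentralised execution consistent with centralised training in value-mixing methods of this kind.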

Key words: drone swarm; cooperative defence; multi-agent reinforcement learning; simulation platform; hierarchical decision

Received: 05 March 2025 Published: 15 July 2025

CLC number (ZTFLH): V 279


URL:

https://www.qk.sjtu.edu.cn/ktfy/EN/ OR https://www.qk.sjtu.edu.cn/ktfy/EN/Y2025/V8/I3/73