Design and Verification of UAV Cooperative Defense Strategy Based on Reinforcement Learning
LI Yijia, LI Jianuo, KE Liangjun
School of Automation Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
Abstract A drone swarm confrontation scenario is built on the OODA (observe, orient, decide, act) decision loop, and multi-agent deep reinforcement learning is used to design algorithms that seek the optimal cooperative defence strategy for the swarm. First, a QMIX-based single-layer decision algorithm is developed to tackle the contribution allocation and high-dimensional space challenges in drone cooperation. Second, a hierarchical decision-making model integrating rule-based methods and reinforcement learning is proposed: a decision layer uses rule-based or hidden Markov model (HMM) intention recognition to analyze the combat scenario and schedule drones, and an action layer then uses the QMIX algorithm to output actions. To verify the performance of the proposed algorithms, a controllable and observable simulation platform is built with Python and Unity, and a challenging defensive game problem is constructed. Experiments quantitatively evaluate the defence strategies in terms of cooperation effectiveness, resource efficiency, and generalisation. The results show that hierarchical decision-making significantly outperforms single-layer decision-making on every index, with a dramatically improved winning rate. The HMM-based hierarchical strategy performs best, offering a promising new approach to drone swarm defence.
Received: 05 March 2025
Published: 15 July 2025