Adaptive Jamming Decision Method for Navigation Signals Based on Deep Reinforcement Learning


Abstract: In complex electronic countermeasure environments, selecting jamming signal parameters efficiently remains a key challenge. This paper proposes an adaptive decision-making method for navigation signal jamming based on deep reinforcement learning (DRL), which models the optimization of jamming signal parameters as a Markov decision process (MDP). With a suitably designed state space, action space, and reward function, the DRL agent optimizes the jamming parameters dynamically and adjusts its jamming strategy autonomously as the environment changes, balancing jamming effectiveness against resource utilization. Full-link simulations on a seeker digital simulation platform show that the method converges well and adapts well in scenarios with action spaces of 50 and 100, reaching a final jamming success rate of 99% and a frequency band matching success rate of 98%, close to the performance upper bound given by exhaustive search. Reward function ablation experiments further show that the designed reward function guides the agent toward a reasonable trade-off among jamming effectiveness, frequency band selection, and power consumption, yielding a jamming decision strategy that is efficient, stable, and feasible to engineer. The work offers a new approach for seeker navigation jamming technology and can be used to evaluate the limits of the anti-jamming capability of missile navigation signals.

Key words: navigation jamming, deep reinforcement learning (DRL), adaptive decision-making
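The abstract describes a reward that trades off jamming effectiveness, frequency band selection, and power consumption over a discrete action space. The toy sketch below illustrates that kind of trade-off only; it is not the authors' model. Everything in it is an assumption made for illustration: the split of 50 actions into 10 bands by 5 power levels, the weights `W_SUCCESS`, `W_BAND`, and `W_POWER`, the success rule, and the learner itself, which is a tabular one-step (bandit-style) value update substituted for the paper's deep reinforcement learning algorithm.

```python
import random

# All constants below are hypothetical; the source gives no such values.
N_BANDS = 10   # assumed number of candidate frequency bands
N_POWER = 5    # assumed number of power levels (10 * 5 = 50 actions)
W_SUCCESS, W_BAND, W_POWER = 1.0, 0.5, 0.3   # assumed reward weights


def decode(action):
    """Map a flat action index to a (band, power_level) pair."""
    return divmod(action, N_POWER)


def reward(action, target_band, min_power):
    """Toy reward: success and band-match bonuses minus a power cost."""
    band, power = decode(action)
    band_match = band == target_band
    # Assumed success rule: correct band and at least min_power.
    success = band_match and power >= min_power
    power_cost = power / (N_POWER - 1)
    return W_SUCCESS * success + W_BAND * band_match - W_POWER * power_cost


def train(episodes=20000, alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular learning; the state is the observed target band."""
    rng = random.Random(seed)
    n_actions = N_BANDS * N_POWER
    # Optimistic initial values (above the largest possible reward)
    # make the greedy choice try every action early on.
    q = [[2.0] * n_actions for _ in range(N_BANDS)]
    for _ in range(episodes):
        state = rng.randrange(N_BANDS)
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=q[state].__getitem__)
        r = reward(action, target_band=state, min_power=2)
        # One-step episodes: move the estimate toward the observed reward.
        q[state][action] += alpha * (r - q[state][action])
    return q


def greedy_policy(q):
    """Best (band, power_level) for each state under the learned values."""
    return [decode(max(range(len(row)), key=row.__getitem__)) for row in q]
```

Calling `greedy_policy(train())` returns one `(band, power_level)` pair per state. Under these assumed weights, the highest-reward action in each state is the matching band at the lowest power level that still succeeds, which shows how the power term discourages spending more power than necessary.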

Received: 2025-10-27 Published: 2026-05-06

CLC number: V 249

Corresponding author: SUN Zhuoran (born 1996), male, Ph.D., engineer.

About the author: YUAN Jingmei (born 1984), female, M.S., senior engineer.

Cite this article:

YUAN Jingmei, ZHAO Liang, SUN Zhuoran, XU Zhizhao, NIU Yalei. Adaptive Jamming Decision Method for Navigation Signals Based on Deep Reinforcement Learning[J]. Air & Space Defense, 2026, 9(2): 41-52.

Link to this article:

https://www.qk.sjtu.edu.cn/ktfy/CN/ or https://www.qk.sjtu.edu.cn/ktfy/CN/Y2026/V9/I2/41