Cite this article: PAN Yaozong, ZHANG Jian, YANG Haitao, YUAN Chunhui, ZHAO Hongli. Dual network intelligent decision method for fighter autonomous combat maneuver[J]. Journal of Harbin Institute of Technology, 2019, 51(11): 144. DOI: 10.11918/j.issn.0367-6234.201811083
Abstract:
In research on fighter autonomous combat maneuver decision-making based on Deep Reinforcement Learning (DRL), the fighter's autonomous maneuver to the attack area is the precondition for striking the target effectively. However, the fighter's active airspace is large and its exploration capability is uneven across directions, so directly applying DRL to obtain a maneuvering policy faces a large training interaction space and difficulty in setting the sample distribution of the attack area, which makes the training process hard to converge. To address this problem, a dual-network intelligent decision method based on the Deep Q-Network (DQN) is proposed. A conical space is set directly in front of the fighter to make full use of its forward exploration capability. An angle capture network is built, in which DRL fits the policy for adjusting the fighter's deviation angle, so that the deviation angle is adjusted autonomously and the attack area falls within the conical space directly ahead of the fighter. A distance capture network is then built, in which DRL fits, within the conical space, the policy for maneuvering toward the attack area, enabling the fighter to maneuver to it effectively. Experimental results show that directly applying DRL with the fighter's whole active airspace as the interaction space cannot effectively solve the decision problem of maneuvering to the attack area, whereas the DRL-based dual-network decision method achieved a success rate of 83.2% in 1 000 tests of autonomous maneuvering to the attack area, effectively solving the decision problem of the fighter's autonomous maneuver to its own attack area.
Key words: fighter maneuver decision; deep reinforcement learning; neural network; deep Q-network; intelligent decision
DOI: 10.11918/j.issn.0367-6234.201811083
CLC number: V323
Document code: A
Foundation item:
Dual network intelligent decision method for fighter autonomous combat maneuver
PAN Yaozong1,2, ZHANG Jian1, YANG Haitao1, YUAN Chunhui1, ZHAO Hongli1
(1. Space Engineering University, Beijing 101400, China; 2. Naval Aeronautical University, Yantai 264001, Shandong, China)
Abstract:
In the research of autonomous combat maneuvering of fighters based on Deep Reinforcement Learning (DRL), the fighter's autonomous maneuver to the attack area is the precondition for attacking the target effectively. Because the active airspace is large and the exploration ability is uneven across directions, the direct use of DRL to acquire a maneuvering strategy is confronted with a large training interaction space, difficulty in setting the sample distribution in the attack area, and difficulty in the convergence of the training process. To solve this problem, a dual network intelligent decision method based on the deep Q-network (DQN) was proposed. In this method, a conical space was set up in front of the fighter to make full use of its forward exploration performance. An angle capture network was established, in which DRL was used to fit the strategy of adjusting the deviation angle so as to keep the attack area within the conical space, and a distance capture network was established to fit, based on DRL, the fighter's maneuvering strategy toward the attack area within that space. Simulation results show that directly applying DRL with the fighter's active airspace as the interaction space cannot effectively solve the decision-making problem of the fighter's maneuvering to the attack area, whereas the dual network decision method achieved a success rate of 83.2% in 1 000 tests of the fighter's autonomous maneuvering to the attack area. Therefore, the proposed method can effectively solve the decision problem of autonomous maneuvering of fighter aircraft to the attack area.
Key words: fighter maneuver decision; deep reinforcement learning; neural network; deep Q-network; intelligent decision
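The two-stage decision logic described in the abstract can be pictured with a short sketch. The Python (PyTorch) code below is not the authors' implementation: the QNet class, the function names, the 30° cone half-angle, the state layout (deviation angle to the attack area as the first state component), and the gym-style environment interface are all illustrative assumptions. It only shows how an angle capture Q-network and a distance capture Q-network might be combined at decision time, with the angle network acting until the attack area lies inside the forward cone and the distance network taking over afterwards.

```python
# Minimal sketch of the dual-network (angle capture + distance capture) decision
# loop described in the abstract. Names, dimensions, the cone half-angle, and the
# environment interface are illustrative assumptions, not the authors' code.
import math

import torch
import torch.nn as nn

CONE_HALF_ANGLE = math.radians(30.0)  # assumed half-angle of the forward conical space


class QNet(nn.Module):
    """Fully connected Q-network mapping a state vector to one Q-value per discrete action."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def greedy_action(qnet: QNet, state: torch.Tensor) -> int:
    """Greedy (test-time) action selection: pick the action with the largest Q-value."""
    with torch.no_grad():
        return int(qnet(state.unsqueeze(0)).argmax(dim=1).item())


def dual_network_decision(env, angle_net: QNet, dist_net: QNet, max_steps: int = 500) -> bool:
    """Run one test episode with the two-stage decision logic.

    `env` is assumed to be a gym-style environment whose state vector carries the
    deviation angle to the attack area in component 0. Returns True if the episode
    terminates (attack area reached) within max_steps.
    """
    state = torch.as_tensor(env.reset(), dtype=torch.float32)
    for _ in range(max_steps):
        deviation_angle = float(state[0])
        if abs(deviation_angle) > CONE_HALF_ANGLE:
            # Stage 1: the angle capture network adjusts the heading until the
            # attack area falls inside the forward conical space.
            action = greedy_action(angle_net, state)
        else:
            # Stage 2: the distance capture network maneuvers the fighter toward
            # the attack area while the cone condition holds.
            action = greedy_action(dist_net, state)
        next_state, _reward, done, _info = env.step(action)
        state = torch.as_tensor(next_state, dtype=torch.float32)
        if done:
            return True
    return False
```

At training time, each network would presumably be trained with the standard DQN recipe (experience replay, a target network, and ε-greedy exploration) inside its own stage's interaction space; only the test-time switching logic is sketched here.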