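Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology; Editor-in-Chief: LI Longqiu; ISSN 0367-6234; CN 23-1235/T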

Cite this article: LI Yongfeng, SHI Jingping, ZHANG Weiguo, JIANG Wei. Maneuver decision of UCAV in air combat based on deep reinforcement learning[J]. Journal of Harbin Institute of Technology, 2021, 53(12): 33. DOI:10.11918/202005108
DOI:10.11918/202005108
CLC number: V279
Document code: A
Funding: National Natural Science Foundation of China (7,6,61573286); Natural Science Foundation of Shaanxi Province (2019JM-3,0JQ-218)
Maneuver decision of UCAV in air combat based on deep reinforcement learning
LI Yongfeng¹, SHI Jingping¹,², ZHANG Weiguo¹,², JIANG Wei¹
(1. School of Automation, Northwestern Polytechnical University, Xi'an 710029, China; 2. Shaanxi Provincial Key Laboratory of Flight Control and Simulation Technology (Northwestern Polytechnical University), Xi'an 710029, China)
Abstract:
When an unmanned combat aerial vehicle (UCAV) makes autonomous maneuver decisions in air combat, it faces a heavy computational load and is susceptible to the enemy's unpredictable maneuvers. To address these problems, this study proposes an autonomous air-combat maneuver decision model for UCAVs based on a deep reinforcement learning algorithm. With this algorithm, the UCAV can autonomously make maneuver decisions during air combat to gain a dominant position. First, based on the aircraft control system, a six-degree-of-freedom UCAV model was built on the MATLAB/Simulink simulation platform, and appropriate air combat actions were selected as the maneuver outputs. On this basis, the decision model for autonomous UCAV maneuvering in air combat was designed: a combat evaluation model was constructed from the relative motion of the two sides, the range of the missile attack zone was analyzed, and the corresponding advantage function was used as the evaluation criterion for deep reinforcement learning. The UCAV was then trained in stages, from easy to difficult, and the optimal maneuver control commands were analyzed through a study of the deep Q network. As a result, the UCAV can select maneuvers appropriate to different situations, assess the battlefield situation independently, and make tactical decisions that improve combat effectiveness. Simulation results show that the proposed method enables the UCAV to select tactical actions autonomously in air combat and to reach a dominant position quickly, greatly improving its combat efficiency.
Keywords: unmanned combat aerial vehicle (UCAV); deep reinforcement learning; autonomous maneuver decision in air combat; six-degree-of-freedom; advantage function; deep Q network
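
The abstract names three ingredients without giving their formulas: a discrete set of maneuvers, an advantage function used as the learning signal, and a deep Q network. The sketch below illustrates how such pieces commonly fit together in DQN-style methods. It is not the paper's model. The maneuver names, the angle-based advantage formula, the discount factor, and every function name are assumptions introduced only for illustration.

```python
import random

# Hypothetical discrete maneuver library (the abstract does not list the paper's actions).
MANEUVERS = ["level_flight", "climb", "dive", "left_turn", "right_turn"]


def angle_advantage(aspect_deg, antenna_deg):
    """Assumed angle advantage in [0, 1], for angles given in [0, 180] degrees.

    aspect_deg:  angle between the enemy's velocity and the line of sight
                 (0 means we sit on its tail).
    antenna_deg: angle between our velocity and the line of sight
                 (0 means the enemy is dead ahead).
    """
    return 1.0 - (abs(aspect_deg) + abs(antenna_deg)) / 360.0


def td_target(reward, next_q_values, done, gamma=0.9):
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a').

    gamma=0.9 is an assumed discount factor.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


if __name__ == "__main__":
    # Made-up Q-values standing in for the output of a trained network.
    q = [0.1, 0.4, -0.2, 0.3, 0.0]
    reward = angle_advantage(aspect_deg=30.0, antenna_deg=15.0)
    print("chosen maneuver:", MANEUVERS[epsilon_greedy(q, epsilon=0.0)])
    print("TD target:", td_target(reward, q, done=False))
```

In a full implementation the Q-values would come from a neural network evaluated on the relative-motion state, and the staged easy-to-difficult training described in the abstract would change the opponent's behavior between stages. Neither part is shown here.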
