Smooth reinforcement learning-based trajectory tracking for articulated vehicles

CHEN Liangfa; SONG Xujie; XIAO Liming; GAO Lulu; ZHANG Fawang; LI Shengbo; MA Fei; DUAN Jingliang

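Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology
Editor-in-Chief: LI Longqiu
ISSN 0367-6234; CN 23-1235/T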


Cite this article: CHEN Liangfa, SONG Xujie, XIAO Liming, GAO Lulu, ZHANG Fawang, LI Shengbo, MA Fei, DUAN Jingliang. Smooth reinforcement learning-based trajectory tracking for articulated vehicles[J]. Journal of Harbin Institute of Technology, 2024, 56(12): 116. DOI: 10.11918/202310026

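CHEN Liangfa¹, SONG Xujie², XIAO Liming¹, GAO Lulu¹, ZHANG Fawang³, LI Shengbo², MA Fei¹, DUAN Jingliang¹
(1. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China; 2. School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China; 3. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China)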

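Abstract:

To address the action fluctuation that affects existing trajectory tracking controllers for articulated vehicles, and to improve both tracking accuracy and smoothness, this paper proposes a smooth reinforcement learning (RL) tracking control method that accounts for trajectory preview. First, to ensure control accuracy, reference trajectory information is fed as preview information into the RL policy and value networks, yielding a preview-based RL iteration framework. Next, to ensure control smoothness, the LipsNet network structure is used to approximate the policy function, which adaptively restricts the Lipschitz constant of the policy network. Finally, combining these elements with distributional RL theory gives the final smooth RL trajectory tracking control method, which jointly optimizes control accuracy and control smoothness for articulated vehicle trajectory tracking. Simulation results show that the proposed method, smooth distributional soft actor-critic (SDSAC), maintains good action smoothness under six different noise levels while retaining high tracking accuracy. Compared with the conventional distributional RL method, distributional soft actor-critic (DSAC), SDSAC improves action smoothness by more than 5.8 times under high-noise conditions. In addition, compared with model predictive control, the average single-step solution speed of SDSAC is about 60 times faster, giving it high online computational efficiency.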

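Keywords: autonomous driving; articulated vehicle; trajectory tracking; reinforcement learning; action smoothing

DOI: 10.11918/202310026

CLC number: TP273+.1

Funding: National Natural Science Foundation of China (52202487); Open Fund of the State Key Laboratory of Automotive Safety and Energy (KF2212)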
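
The abstract says that reference trajectory information is fed to the policy and value networks as preview information, but it does not give the observation layout. The following is a minimal sketch of one way such a preview observation could be assembled, assuming a 2-D reference path stored as an array of points. The function name `build_preview_observation` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np

def build_preview_observation(vehicle_xy, heading, ref_path, nearest_idx, n_preview):
    """Hypothetical helper: stack the nearest reference point and the next
    n_preview reference points, all expressed in the vehicle frame."""
    # Indices of the current and upcoming reference points; past the end of
    # the path the last point is repeated.
    idx = np.clip(nearest_idx + np.arange(n_preview + 1), 0, len(ref_path) - 1)
    # Offsets from the vehicle position in world coordinates.
    offsets = ref_path[idx] - vehicle_xy
    # Rotate world-frame offsets into the vehicle frame.
    c, s = np.cos(heading), np.sin(heading)
    world_to_vehicle = np.array([[c, s], [-s, c]])
    local = offsets @ world_to_vehicle.T
    # Flatten to a 1-D vector suitable as network input.
    return local.reshape(-1)

# Usage: a straight path along the x-axis, vehicle at the origin facing +x.
path = np.stack([np.arange(10.0), np.zeros(10)], axis=1)
obs = build_preview_observation(np.array([0.0, 0.0]), 0.0, path, 0, 3)
print(obs.shape)
```

In this sketch the same vector would be passed to both the policy and the value network, so both can anticipate upcoming curvature rather than react only to the current tracking error.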

Smooth reinforcement learning-based trajectory tracking for articulated vehicles

CHEN Liangfa¹,SONG Xujie²,XIAO Liming¹,GAO Lulu¹,ZHANG Fawang³,LI Shengbo²,MA Fei¹,DUAN Jingliang¹

(1.School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China; 2.School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China; 3.School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China)

Abstract:

This research addresses action fluctuation in trajectory tracking control for articulated vehicles, aiming to improve both accuracy and smoothness. It introduces a smooth tracking control method based on reinforcement learning (RL). First, to improve control accuracy, we feed trajectory preview information into both the policy and value networks and establish a preview-based policy iteration framework. Then, to ensure control smoothness, we use the LipsNet network to approximate the policy function, which adaptively restricts the Lipschitz constant of the policy network. Finally, combining these elements with distributional RL theory, we formulate an articulated vehicle trajectory tracking control method, named smooth distributional soft actor-critic (SDSAC), that jointly optimizes control precision and action smoothness. Simulation results show that the proposed method maintains good action smoothness under six different noise levels, with strong noise robustness and high tracking accuracy. Compared with the conventional distributional RL method, distributional soft actor-critic (DSAC), SDSAC improves action smoothness by more than 5.8 times under high-noise conditions. In addition, compared with model predictive control, the average single-step solution speed of SDSAC is about 60 times faster, giving it higher online computing efficiency.

Key words: autonomous driving; articulated vehicle; trajectory tracking; reinforcement learning; action smoothing
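
To illustrate why restricting the Lipschitz constant of a policy network limits action fluctuation under observation noise, here is a minimal sketch. It does not implement LipsNet, which according to the abstract restricts the constant adaptively. As a simpler stand-in, it enforces a fixed global bound by rescaling each weight matrix to a chosen spectral norm. The names `make_lipschitz_mlp` and `policy`, and the layer sizes, are assumptions made for illustration.

```python
import numpy as np

def make_lipschitz_mlp(sizes, lipschitz_bound, rng):
    """Random MLP weights rescaled so the product of the layers' spectral
    norms equals lipschitz_bound (a stand-in technique, not LipsNet)."""
    n_layers = len(sizes) - 1
    per_layer = lipschitz_bound ** (1.0 / n_layers)
    weights = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = rng.standard_normal((n_out, n_in))
        # np.linalg.norm(w, 2) is the largest singular value of w.
        w *= per_layer / np.linalg.norm(w, 2)
        weights.append(w)
    return weights

def policy(weights, obs):
    """Bias-free MLP with tanh hidden layers. Because tanh is 1-Lipschitz,
    the network's Lipschitz constant is at most the product of the layers'
    spectral norms."""
    h = obs
    for w in weights[:-1]:
        h = np.tanh(w @ h)
    return weights[-1] @ h

# Usage: a noisy observation changes the action by at most K times the
# size of the noise.
rng = np.random.default_rng(0)
K = 2.0
weights = make_lipschitz_mlp([6, 16, 16, 1], K, rng)
obs = rng.standard_normal(6)
noise = 0.1 * rng.standard_normal(6)
action_change = np.linalg.norm(policy(weights, obs + noise) - policy(weights, obs))
print(action_change <= K * np.linalg.norm(noise))
```

A fixed global bound like this trades expressiveness for smoothness everywhere in the state space. An adaptive restriction, as described in the abstract, would presumably relax the bound where sharper control responses are needed.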
