Improved RT-MDNet for panoramic video target tracking

WANG Dianwei; FANG Haoyu; LIU Ying; WU Shiqian; XIE Yongjun; SONG Haijun


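Supervised by: Ministry of Industry and Information Technology of the People's Republic of China. Sponsored by: Harbin Institute of Technology. Editor-in-chief: LI Longqiu. ISSN 0367-6234. CN 23-1235/T.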


Cite this article: WANG Dianwei, FANG Haoyu, LIU Ying, WU Shiqian, XIE Yongjun, SONG Haijun. Improved RT-MDNet for panoramic video target tracking[J]. Journal of Harbin Institute of Technology, 2020, 52(10): 152. DOI: 10.11918/201910175

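WANG Dianwei¹, FANG Haoyu¹, LIU Ying¹, WU Shiqian², XIE Yongjun³, SONG Haijun³
(1. School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China; 2. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China; 3. Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China)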

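Abstract:

In panoramic video target tracking, illumination changes, interference from similar backgrounds, and the deformation and scale changes that arise as the target moves cause target drift and target loss, which lower the success rate and robustness of tracking algorithms. To address these problems, a panoramic video target tracking method based on a long short-term memory (LSTM) network and an improved Real-Time MDNet network is proposed. The algorithm first extracts features with a shallow convolutional neural network and uses adaptive RoIAlign to reduce pixel loss during feature extraction. It then uses the target features to update the weights of the last fully connected layer online, separating foreground from background in the fully connected layers and extracting the target region. Finally, an LSTM network adaptively selects the scale of the target box, and the target position is output. Experimental results show that monocular algorithms, when applied to panoramic datasets, have difficulty adapting to the scale and background changes in panoramic video. The improved algorithm, with a scale prediction module built from a 3-layer LSTM network, copes effectively with the scale changes and target deformation present in panoramic data. While maintaining good tracking accuracy, it handles small targets, target occlusion, and crossing motion of multiple targets, and achieves better visual results and higher overlap rate scores.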
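
The overlap rate score mentioned in the abstract is not defined there. A minimal sketch, assuming it is the usual intersection-over-union between the predicted and ground-truth boxes, with boxes given as (x, y, width, height); the function names are illustrative and not taken from the paper:

```python
def overlap_ratio(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes (assumed metric)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def mean_overlap(predicted, ground_truth):
    """Average overlap ratio over the frames of one sequence."""
    scores = [overlap_ratio(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(scores) / len(scores) if scores else 0.0
```

A tracker whose box drifts off the target scores 0 on that frame, so drift and target loss pull the sequence average down directly.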

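Keywords: target tracking; deep learning; panoramic video; long short-term memory network; RT-MDNet

DOI: 10.11918/201910175

CLC number: TP391.41; TP183

Document code: A

Funding: Basic Research Special Project of Science and Technology for Strengthening Police, Ministry of Public Security (2019GABJC42); Natural Science Basic Research Program of Shaanxi Province (Innovation and Entrepreneurship "Double Tutor" research project) (2018JM6118); Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (CXJJLY2018033)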

Improved RT-MDNet for panoramic video target tracking

WANG Dianwei¹,FANG Haoyu¹,LIU Ying¹,WU Shiqian²,XIE Yongjun³,SONG Haijun³

(1.School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, China; 2.School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China; 3.Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China)

Abstract:

In panoramic video target tracking, illumination changes, interference from similar backgrounds, and the deformation and scale changes caused by target motion may lead to target drift or target loss, resulting in a low success rate and poor robustness. To address these issues, a target tracking method based on a long short-term memory (LSTM) network and an improved Real-Time MDNet (RT-MDNet) network is proposed. First, a shallow convolutional neural network extracts features, and adaptive RoIAlign is adopted to reduce pixel loss during feature extraction. Then, the weights of the last fully connected layer are updated online using the target features, so that foreground and background are separated and the target area is extracted. Lastly, the scale of the target box is selected adaptively by the LSTM network, and the target position is output. Experimental results show that monocular vision algorithms can hardly adapt to the scale and background changes of panoramic datasets, whereas the proposed method, which builds its scale prediction module from a 3-layer LSTM network, copes effectively with the scale changes and target deformation in panoramic data. While maintaining good tracking accuracy, the algorithm handles small targets, target occlusion, and crossing motion of multiple targets, achieving better visual results and higher overlap rate scores.
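
To illustrate why RoIAlign loses less pixel information than pooling with rounded coordinates, the sketch below samples a feature map at fractional positions by bilinear interpolation. It assumes the standard RoIAlign idea with one sample per bin and feature values located at integer grid points; it does not reproduce the adaptive variant used in the paper, whose details the abstract does not give. All names are hypothetical.

```python
def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    # Clamp to the valid range so border samples stay defined.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, roi, out_size):
    """Pool roi = (y_min, x_min, y_max, x_max) into out_size x out_size bins,
    sampling once at each bin centre without rounding any coordinate."""
    y_min, x_min, y_max, x_max = roi
    bin_h = (y_max - y_min) / out_size
    bin_w = (x_max - x_min) / out_size
    return [
        [
            bilinear_sample(
                feature,
                y_min + (i + 0.5) * bin_h,
                x_min + (j + 0.5) * bin_w,
            )
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]
```

Because the region boundaries and bin centres stay fractional, a small target that covers only a few feature-map cells still contributes distinct values to every output bin.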

Key words: target tracking; deep learning; panoramic video; LSTM; RT-MDNet
