Cite this article: ZHOU Dake, SONG Rong, YANG Xin. Occlusion-aware pedestrian detection combined with dual attention mechanism[J]. Journal of Harbin Institute of Technology, 2021, 53(9): 156. DOI: 10.11918/201904144
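Abstract:
To address the occlusion problem that pedestrian detection algorithms face in traffic scenes, an occlusion-aware pedestrian detection algorithm combining a dual attention mechanism is proposed. Using RetinaNet as the base framework, spatial attention and channel attention sub-networks are added to the regression and classification branches respectively, strengthening the network's focus on the visible regions of pedestrians. In addition, visible bounding box information is introduced to optimize the conventional regression loss function, so that it adaptively adjusts the weight contributed by each predicted box according to the degree of occlusion. Experiments on the Caltech and CityPerson datasets show that, compared with eight advanced algorithms including RetinaNet, the method offers better robustness and detection accuracy. Under heavy occlusion in particular, its log-average miss rate is only 45.69%, more than 12% lower than that of the other algorithms. The algorithm also achieves quasi-real-time detection, running at 11.8 frames/s on Caltech and 10.0 frames/s on CityPerson. The proposed dual attention mechanism and occlusion-aware regression loss function are feasible and effective, and offer a clear advantage in handling occluded pedestrians.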
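Keywords: pedestrian detection; convolutional neural network; attention mechanism; occlusion; real-time
DOI: 10.11918/201904144
CLC number: TP399
Document code: A
Funding: National Natural Science Foundation of China (61573182); Open Fund of the Graduate Innovation Base (Laboratory) of Nanjing University of Aeronautics and Astronautics (kfjj20180319)
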
Occlusion-aware pedestrian detection combined with dual attention mechanism
ZHOU Dake1,2, SONG Rong1, YANG Xin1
(1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211100, China; 2. Jiangsu Key Laboratory of Internet of Things and Control Technologies (Nanjing University of Aeronautics and Astronautics), Nanjing 211100, China)

Abstract:
To address the occlusion problem that pedestrian detection algorithms encounter in traffic scenarios, this paper presents an occlusion-aware pedestrian detection algorithm combined with a dual attention mechanism. Based on the RetinaNet framework, spatial-wise and channel-wise attention sub-networks were added to the regression and classification branches respectively, guiding the detector to pay more attention to the visible parts of pedestrians. Moreover, visible bounding box information was introduced to optimize the traditional regression loss function, so that it adaptively adjusts the weights of predicted boxes according to the degree of occlusion. Experiments on the Caltech and CityPerson datasets show that the proposed algorithm was more robust and more accurate than eight other advanced algorithms, including RetinaNet. In the case of heavy occlusion in particular, its log-average miss rate was only 45.69%, more than 12% lower than those of the other algorithms. Furthermore, the proposed algorithm detected pedestrians in quasi-real time, processing 11.8 frames per second on the Caltech dataset and 10.0 frames per second on the CityPerson dataset. The proposed dual attention mechanism and occlusion-aware regression loss function are feasible and effective, and offer significant advantages in handling occluded pedestrians.
Key words: pedestrian detection; convolutional neural networks; attention mechanism; occlusion; real-time
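
The abstract reports its headline result as a log-average miss rate. The sketch below illustrates how that metric is commonly computed in Caltech-style evaluation: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points spaced evenly in log space over [10^-2, 10^0], and the geometric mean of the samples is taken. It is not taken from the paper. The function name `log_average_miss_rate`, its parameters, and the convention of using a miss rate of 1.0 when the curve has no point at or below a reference FPPI are assumptions made for illustration.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi and miss_rate describe one detector's miss-rate curve; fppi is
    assumed to be sorted in non-decreasing order (hypothetical interface).
    """
    if len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must have the same length")

    # Reference points evenly spaced in log10 space over [1e-2, 1e0].
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]

    sampled = []
    for ref in refs:
        # Take the miss rate at the last curve point whose FPPI does not
        # exceed the reference. Assumption: if the curve starts to the
        # right of the reference, count every pedestrian as missed (1.0).
        mr = 1.0
        for x, y in zip(fppi, miss_rate):
            if x <= ref:
                mr = y
            else:
                break
        # Clamp to avoid log(0) for a perfect detector.
        sampled.append(max(mr, 1e-10))

    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))


# Made-up curve values, for illustration only.
example = log_average_miss_rate([0.001, 0.05, 0.5], [0.8, 0.5, 0.3])
```

Because the average is taken in log space, a detector is rewarded for low miss rates across the whole FPPI range rather than at a single operating point, which is why the metric is usually quoted as a single percentage.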