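Realization of high-precision heterogeneous anchor-free detection model based on PYNQ framework
Authors: ZHANG Ruiyan, JIANG Xiujie, AN Junshe, CUI Tianshu
Affiliation:

(1. Key Laboratory of Electronics and Information Technology for Space Systems (National Space Science Center, Chinese Academy of Sciences), Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China)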

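Author biographies:

ZHANG Ruiyan (born 1995), female, Ph.D. candidate; JIANG Xiujie (born 1965), female, research professor, doctoral supervisor

Corresponding author:

JIANG Xiujie, jiangxj@nssc.ac.cn

CLC number:

TP391.4

Funding:

Independent Deployment Fund of the Key Laboratory of Electronics and Information Technology for Space Systems, Chinese Academy of Sciences (Y42613A32S)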


Abstract:

Because deep convolutional networks have large parameter counts and heavy computational loads, multi-scale object detection networks are difficult to deploy with both high speed and high accuracy on platforms with limited resources and power budgets. To address this problem, this paper implements the IP core design and heterogeneous system deployment of CTiny, an anchor-free detection model, based on the Python productivity for ZYNQ (PYNQ) framework. First, a quantization scheme is proposed that quantizes the overall scaling factors in segments within the convolution kernels, so that a pre-trained high-precision model can be deployed on a field programmable gate array (FPGA) with low accuracy loss. Second, the CTiny system is built on the PYNQ framework, comprising a ResNet backbone network, a deconvolution network, and a branch detection network. Finally, time-consuming computations such as image preprocessing and post-processing are moved from the serial ARM processor to the parallel FPGA fabric, further reducing the total processing time. Experimental results show that, with CTiny deployed on the PYNQ-Z2 development board, the proposed quantization method achieves a mean average precision of 81.60% on the public optical remote sensing dataset NWPU VHR-10, 14.27% higher than truncation quantization, which meets the requirement of deploying a compact anchor-free detection network with low accuracy loss. In addition, post-processing time is reduced from 9.228 s on the ARM side to 0.008 s on the FPGA side, which improves the speed of the detection model.
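The abstract contrasts truncation quantization with a scheme that applies scaling factors per segment of the convolution kernel, but it does not give the details. The following is only a minimal sketch of the general idea, not the authors' method: it assumes 8-bit weights, segments taken along the output-channel axis, and a hypothetical randomly generated kernel. The function names `truncate_quantize` and `scaled_quantize`, the segment count, and the bit widths are all illustrative assumptions.

```python
import numpy as np

def truncate_quantize(w, frac_bits=7):
    """Fixed-point truncation: keep frac_bits fractional bits and drop the rest."""
    scale = 2 ** frac_bits
    q = np.clip(np.trunc(w * scale), -128, 127)  # 8-bit signed range
    return q / scale

def scaled_quantize(w, n_segments=4):
    """8-bit quantization with one scaling factor per segment of output channels.

    Assumption: segments are contiguous groups of output channels; the paper
    may define its segments differently.
    """
    out = np.empty_like(w)
    for idx in np.array_split(np.arange(w.shape[0]), n_segments):
        seg = w[idx]
        max_abs = np.max(np.abs(seg))
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q = np.clip(np.round(seg / scale), -127, 127)
        out[idx] = q * scale  # dequantize so the error can be measured
    return out

rng = np.random.default_rng(0)
# Hypothetical kernel: 16 output channels, 8 input channels, 3x3, small weights.
w = rng.normal(0.0, 0.02, size=(16, 8, 3, 3))

err_trunc = float(np.mean((w - truncate_quantize(w)) ** 2))
err_scaled = float(np.mean((w - scaled_quantize(w)) ** 2))
print(f"truncation MSE: {err_trunc:.3e}")
print(f"segment-scaled MSE: {err_scaled:.3e}")
```

Because trained weights are usually much smaller than 1, a fixed binary point wastes most of the 8-bit range, while a per-segment scale stretches each group of weights across the full range. That is one plausible reason a scaled scheme would lose less accuracy than truncation, as the abstract reports.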

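Cite this article

ZHANG Ruiyan, JIANG Xiujie, AN Junshe, CUI Tianshu. Realization of high-precision heterogeneous anchor-free detection model based on PYNQ framework[J]. Journal of Harbin Institute of Technology, 2022, 54(5): 24. DOI:10.11918/202111015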

History
  • Received: 2021-11-04
  • Published online: 2022-04-25