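Supervised by: Ministry of Industry and Information Technology of the People's Republic of China
Sponsored by: Harbin Institute of Technology
Editor-in-Chief: LI Longqiu
ISSN 0367-6234
CN 23-1235/T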

Cite this article: YOU Yangbiao, SHI Fanhuai. Short edge vertices regression network: A new natural scene text detector[J]. Journal of Harbin Institute of Technology, 2021, 53(12): 89. DOI:10.11918/201908104
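DOI: 10.11918/201908104
CLC number: TP391.4
Document code: A
Funding: Shanghai Municipal Key Research Project for Promoting Agriculture through Science and Technology (Grant No. 沪农科创字(2018)第3-6号)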
Short edge vertices regression network: A new natural scene text detector
YOU Yangbiao,SHI Fanhuai
(College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China)
Abstract:
In recent years, many scene text detection methods based on generic object detection frameworks have been proposed. These methods usually predict the entire bounding box of a text instance directly, so the limited receptive field of the network makes it difficult for them to detect long text effectively. To address this problem, a scene text detection method based on a short edge vertices regression network is proposed. The method divides each text region into three kinds of regions, namely the regions near the two short edges and the middle region, and detects text by predicting these regions separately and then combining them, rather than predicting the entire bounding box directly. First, the three kinds of regions are segmented by a residual network that fuses multi-layer features, and at each pixel of a short-edge region the two vertices of the adjacent short edge are predicted. Then, in post-processing, the two short-edge regions are paired using their adjacency to the middle region, and the short-edge vertices they predict are combined to produce a complete and accurate detection result. Experiments were performed on a long text detection dataset and on the public scene text detection datasets MSRA-TD 500, ICDAR 2015, and ICDAR 2013. The proposed method outperformed most existing methods in accuracy and speed. The experimental results indicate that the method has advantages in text detection, especially for long text.
Key words:  natural scene  text detection  convolutional neural network  receptive field  long text
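
The post-processing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the network output has already been converted into a per-pixel class map and a per-pixel vertex map, and every name in it (`HEAD`, `MIDDLE`, `TAIL`, `assemble_boxes`, the absolute-coordinate vertex encoding, the use of 4-connectivity and simple averaging) is hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical class ids for the segmentation map (assumption, not from the paper).
BACKGROUND, HEAD, MIDDLE, TAIL = 0, 1, 2, 3
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity


def connected_components(labels, cls):
    """Group pixels of class `cls` into 4-connected components (lists of (row, col))."""
    h, w = len(labels), len(labels[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if labels[r][c] != cls or seen[r][c]:
                continue
            seen[r][c] = True
            queue, comp = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and labels[ny][nx] == cls):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            comps.append(comp)
    return comps


def touches(comp_a, comp_b):
    """True if some pixel of comp_a is 4-adjacent to a pixel of comp_b."""
    b = set(comp_b)
    return any((y + dy, x + dx) in b for y, x in comp_a for dy, dx in NEIGHBOURS)


def mean_vertices(comp, vertex_map):
    """Average the per-pixel predictions (x_top, y_top, x_bottom, y_bottom) of a region."""
    sums = [0.0] * 4
    for y, x in comp:
        for i, v in enumerate(vertex_map[y][x]):
            sums[i] += v
    x1, y1, x2, y2 = (s / len(comp) for s in sums)
    return (x1, y1), (x2, y2)


def assemble_boxes(labels, vertex_map):
    """Pair the two short-edge regions that touch the same middle region
    and join their averaged vertices into one quadrilateral per text instance."""
    heads = connected_components(labels, HEAD)
    tails = connected_components(labels, TAIL)
    boxes = []
    for mid in connected_components(labels, MIDDLE):
        head = next((h for h in heads if touches(h, mid)), None)
        tail = next((t for t in tails if touches(t, mid)), None)
        if head is None or tail is None:
            continue  # incomplete instance: one short-edge region is missing
        h_top, h_bottom = mean_vertices(head, vertex_map)
        t_top, t_bottom = mean_vertices(tail, vertex_map)
        boxes.append([h_top, t_top, t_bottom, h_bottom])  # clockwise from the head's top vertex
    return boxes


# Toy input: one horizontal text line, 3 pixels high and 8 pixels wide.
row = [HEAD, HEAD, MIDDLE, MIDDLE, MIDDLE, MIDDLE, TAIL, TAIL]
labels = [list(row) for _ in range(3)]
vertex_of = {HEAD: (0.0, 0.0, 0.0, 3.0), TAIL: (8.0, 0.0, 8.0, 3.0)}
vertex_map = [[vertex_of.get(cls) for cls in r] for r in labels]
print(assemble_boxes(labels, vertex_map))
```

Because each short-edge region only has to regress the two vertices of the edge next to it, no single pixel needs a receptive field covering the whole text line, which is the motivation the abstract gives for handling long text.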
