Journal of Zhejiang University (Engineering Science)
Pervasive Computing and Human-Computer Interaction
Natural scene text detection based on multi-level MSER
TANG You-bao, BU Wei, WU Xiang-qian
1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China;
2. Department of New Media Technologies and Arts, Harbin Institute of Technology, Harbin 150001, China
Abstract:

A natural scene text detection method based on multi-level maximally stable extremal regions (MSER) is proposed. It consists of two stages: candidate region extraction and text detection. In the candidate extraction stage, a multi-level MSER extraction technique is used: the original image is converted into multiple color spaces and rescaled to multiple scales, MSER detection is run on each transformed image with multiple thresholds, and all detected regions are kept as candidate character regions. In the detection stage, hand-designed low-level features and convolutional neural network (CNN) based deep features are extracted from each candidate character region, and a random forest regressor trained on the training data classifies the candidates to obtain character regions. The character regions are then merged into candidate word regions, which go through a similar feature extraction and classification process to produce the final text detection results. The method was evaluated on two standard benchmark datasets, ICDAR2011 and ICDAR2013, and achieved an F-measure of 0.79 on both, which demonstrates its effectiveness for natural scene text detection.
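The multi-level candidate extraction described in the abstract can be pictured as a loop over every combination of color space, scale, and MSER threshold, with all detected regions pooled into one candidate set. The sketch below is only an illustration under stated assumptions: the abstract does not list the color spaces, scales, or thresholds actually used, so the values in `COLOR_SPACES`, `SCALES`, and `THRESHOLDS` are hypothetical, and `detect` is a placeholder for a real MSER detector rather than any particular library's API. Removing exact duplicates is also an assumption, not a step stated in the abstract.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

# A box is (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]

# Hypothetical settings; the abstract does not give the actual values.
COLOR_SPACES = ("gray", "hsv", "lab")
SCALES = (0.5, 1.0, 2.0)
THRESHOLDS = (2, 5, 8)


def multilevel_candidates(
    image,
    detect: Callable[[object, str, float, int], Iterable[Box]],
    color_spaces=COLOR_SPACES,
    scales=SCALES,
    thresholds=THRESHOLDS,
) -> List[Box]:
    """Pool detections over all (color space, scale, threshold) combinations.

    `detect(image, color_space, scale, threshold)` stands in for a real MSER
    detector. It is assumed to return boxes in the coordinates of the
    rescaled image.
    """
    seen = set()
    pooled: List[Box] = []
    for space, scale, threshold in product(color_spaces, scales, thresholds):
        for x, y, w, h in detect(image, space, scale, threshold):
            # Map the box back to the coordinates of the original image.
            box = (round(x / scale), round(y / scale),
                   round(w / scale), round(h / scale))
            # Keep each distinct box once (an assumption of this sketch).
            if box not in seen:
                seen.add(box)
                pooled.append(box)
    return pooled
```

In a real pipeline, `detect` would convert the image to the requested color space, resize it, and run an MSER implementation with the given threshold. The pooled boxes would then be passed to the feature extraction and random forest stage that the abstract describes.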

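Published: 2016-06-01

Funding:

Supported by the National Natural Science Foundation of China (61073125, 61350004) and the Fundamental Research Funds for the Central Universities (HIT.NSRIF.2013091, HIT.HSS.201407).

Corresponding author: BU Wei, female, associate professor. ORCID: 0000-0002-9996-3733. E-mail: buwei@hit.edu.cn
About the author: TANG You-bao (born 1987), male, PhD candidate, working on image processing, pattern recognition, computer vision, and biometric recognition. ORCID: 0000-0001-8719-3375. E-mail: tangyoubao@hit.edu.cn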
Cite this article:

TANG You-bao, BU Wei, WU Xiang-qian. Natural scene text detection based on multi-level MSER[J]. Journal of Zhejiang University (Engineering Science), DOI: 10.3785/j.issn.1008-973X.2016.06.017.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008973X.2016.06.017
http://www.zjujournals.com/eng/CN/Y2016/V50/I6/1134

