
Fractals based multi-oriented text detection system for recognition in mobile video images



Bibliographic Details
Published in: Pattern Recognition, 2017-08, Vol. 68, pp. 158-174
Main Authors: Shivakumara, Palaiahnakote, Wu, Liang, Lu, Tong, Tan, Chew Lim, Blumenstein, Michael, Anami, Basavaraj S.
Format: Article
Language:English
Description
Summary:
•Fractal properties, such as self-similarity, have been explored for text detection.
•Fractal expansion is explored further for detecting text candidates.
•Optical flow is proposed for false positive elimination.
•Experiments are conducted on benchmark databases for evaluating the method.

Text detection in mobile video is challenging due to poor quality, complex backgrounds, arbitrary orientations, and text movement. In this work, we introduce fractals for text detection in video captured by mobile cameras. We first use fractal properties such as self-similarity in a novel way in the gradient domain to enhance low-resolution mobile video. We then propose k-means clustering to separate text components from non-text ones. To make the method font-size independent, fractal expansion is further explored in the wavelet domain in a pyramid structure on the components of the text cluster to identify text candidates. Next, potential text candidates are obtained by studying the optical flow property of text candidates. Direction-guided boundary growing is finally proposed to extract multi-oriented text. The method is tested on several datasets, including low-resolution video captured by mobile cameras, the benchmark ICDAR 2013 video dataset, the YouTube Video Text (YVT) dataset, and the ICDAR 2013, Microsoft, and MSRA arbitrary-orientation natural scene datasets, to evaluate its performance in terms of recall, precision, F-measure, and misdetection rate. To show the effectiveness of the proposed method, the results are compared with state-of-the-art methods.
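
As a rough illustration of the clustering step mentioned in the summary, the sketch below separates connected components of a video frame into text and non-text clusters with k-means (k = 2). The per-component features (mean and standard deviation of gradient magnitude), the OpenCV and scikit-learn calls, and the function name text_nontext_clusters are assumptions made for illustration only; they are not the authors' fractal-enhanced pipeline.

```python
# Illustrative sketch (not the authors' implementation): cluster connected
# components of a frame into "text" and "non-text" groups with k-means,
# using assumed gradient statistics as per-component features.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def text_nontext_clusters(gray):
    # gray: single-channel uint8 frame from the mobile video.
    # Sobel gradient magnitude stands in for the enhanced gradient map.
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)

    # Binarize and label connected components as candidate regions.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    n, labels = cv2.connectedComponents(binary)

    feats, comps = [], []
    for i in range(1, n):                 # skip background label 0
        mask = labels == i
        if mask.sum() < 20:               # drop tiny noise components
            continue
        vals = mag[mask]
        feats.append([vals.mean(), vals.std()])   # assumed features
        comps.append(i)

    if len(feats) < 2:
        return comps, []
    feats = np.float32(feats)
    assign = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)

    # Heuristic: text components tend to carry higher mean gradient energy.
    means = [feats[assign == k][:, 0].mean() for k in (0, 1)]
    text_k = int(np.argmax(means))
    text = [c for c, a in zip(comps, assign) if a == text_k]
    non_text = [c for c, a in zip(comps, assign) if a != text_k]
    return text, non_text
```

In this sketch, the returned label lists index into the connectedComponents label image, so downstream steps (e.g., candidate verification or boundary growing) can recover the pixel masks of the components assigned to the text cluster.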
ISSN: 0031-3203; 1873-5142
DOI: 10.1016/j.patcog.2017.03.018