Loading…

Script-Independent Text Line Segmentation in Freestyle Handwritten Documents

Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel appro...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on pattern analysis and machine intelligence 2008-08, Vol.30 (8), p.1313-1329
Main Authors: Yi Li, Yefeng Zheng, Doermann, D., Jaeger, S.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component based methods ( [1], [2] for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise.
ISSN:0162-8828
1939-3539
2160-9292
DOI:10.1109/TPAMI.2007.70792