
Combining spatial and transform features for the recognition of middle zone components of Telugu


Bibliographic Details
Main Authors: Sastry, A.S.C., Lanka, S., Clee, P.P., Reddy, L.P.
Format: Conference Proceeding
Language: English
Description
Summary: The transition from a traditional paper-based society to a truly paperless information society requires a large body of knowledge and suitable algorithmic approaches in Document Image Processing. Progress in Indic script analysis has gained momentum in recent years. Individual characters in these scripts undergo a large number of shape variations because of the complex canonical structure, which resembles the phonetic sequence. The main approach found in the literature is to separate the individual components and establish the relationships between them during recognition. In this paper, an attempt is made to extract Middle Zone Components from Telugu document images by combining a Component model and a Zone Separation model. Middle zone components are recognized with a novel technique that combines spatial features, which capture topological characteristics, with a transform feature for effective classification. A tree classifier is adopted with Euler Number, Compact Ratio and Zernike moment as features. An unsupervised training strategy is adopted to identify the Middle Zone components. The optimum size of the training set is evaluated for various font sizes.
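The two spatial features named in the summary can be illustrated with a minimal sketch on a binary glyph image. This is not the authors' implementation. It assumes the Euler number is the count of 8-connected foreground components minus the count of enclosed holes, and it assumes (the record does not define it) that the compact ratio is the fraction of the glyph's bounding box covered by foreground pixels. The function names `euler_number` and `compact_ratio` are hypothetical, and the Zernike moment feature is left out.

```python
from collections import deque

# Neighbour offsets: 4-connectivity for background, 8-connectivity for foreground.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def _count_regions(grid, value, neighbours):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbours:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def euler_number(glyph):
    """Assumed definition: foreground components minus enclosed holes."""
    cols = len(glyph[0])
    # Pad with background so the outer background forms exactly one region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in glyph]
              + [[0] * (cols + 2)])
    components = _count_regions(padded, 1, N8)
    holes = _count_regions(padded, 0, N4) - 1  # subtract the outer background
    return components - holes


def compact_ratio(glyph):
    """Assumed definition: foreground pixel count divided by bounding-box area."""
    points = [(r, c) for r, row in enumerate(glyph)
              for c, v in enumerate(row) if v]
    if not points:
        return 0.0
    rows = [p[0] for p in points]
    cols = [p[1] for p in points]
    box_area = (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
    return len(points) / box_area


# A ring-shaped glyph: one component enclosing one hole.
ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
print(euler_number(ring), compact_ratio(ring))
```

A tree classifier of the kind the summary mentions could branch first on a cheap topological feature such as the Euler number and only then on finer features, but the actual tree structure used in the paper is not given in this record.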
ISSN: 2159-3442, 2159-3450
DOI: 10.1109/TENCON.2008.4766721