A convolutional recursive deep architecture for unconstrained Urdu handwriting recognition
Published in: | Neural computing & applications 2022, Vol.34 (2), p.1635-1648 |
---|---|
Format: | Article |
Language: | English |
Summary: | An offline handwriting recognition system for Urdu, a language with a user base of 200 million and written in Nastaleeq script, has long been a challenge for the research community. The key problems include the recognition of complex ligature shapes and the lack of publicly available datasets. This paper addresses both problems by (i) proposing an end-to-end handwriting recognition system based on a new CNN-RNN architecture with n-gram language modeling, and (ii) presenting a new unconstrained dataset called NUST-UHWR. We compiled the first unconstrained Urdu handwritten dataset from around 1000 writers of diverse backgrounds, ages, and genders. The text in this dataset is carefully selected from seven different fields to ensure the presence of words commonly used in different domains. The model architecture is capable of incorporating the fine-grained features necessary for handwritten text recognition in languages with complex ligatures. Our method addresses the limitations of existing architectures and achieves a minimum character error rate of 5.28% on Urdu handwriting recognition (UHWR), establishing a new state of the art. The paper further demonstrates the generalization ability of the proposed model by training it on English and on bilingual (Urdu and English) handwritten data. |
---|---|
ISSN: | 0941-0643, 1433-3058 |
DOI: | 10.1007/s00521-021-06498-2 |
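
The character error rate (CER) quoted in the summary is conventionally defined as the Levenshtein edit distance between the recognized text and the reference text, divided by the number of reference characters. The sketch below illustrates that conventional definition only; this record does not state how the paper computes its 5.28% figure, and the function names `edit_distance` and `character_error_rate` are hypothetical. It treats each Unicode code point as one character, which is an assumption that matters for Urdu text containing combining marks.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and
    substitutions needed to turn `hyp` into `ref`."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


if __name__ == "__main__":
    # One substitution in a four-character reference.
    print(character_error_rate(["abcd"], ["abxd"]))
```

Summing edits and characters over the whole corpus before dividing, rather than averaging per-line rates, keeps short lines from dominating the score; whether the paper aggregates this way is not stated in this record.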