Arabic Scene Text Recognition in the Deep Learning Era: Analysis on a Novel Dataset

Bibliographic Details
Published in: IEEE Access, 2021, Vol. 9, p. 107046-107058
Main Authors: Hassan, Heba; El-Mahdy, Ahmed; Hussein, Mohamed E.
Format: Article
Language: English
Description
Summary: The problem of scene text recognition has recently gained extra attention, being an essential part of scene understanding systems. The broad scope of applications and the unresolved challenges have given this problem its popularity. However, the research focus has long been on languages with Latin characters, leaving behind other languages with different characteristics, such as Arabic. In this paper, we focus on Arabic scene text recognition and attempt to fill two main gaps in this research task. First, the Arabic language lacks a publicly available benchmark dataset for comparing proposed methods on the same grounds. Therefore, we introduce a novel Arabic/English dataset, the Everyday Arabic-English Scene Text dataset (EvArEST), to fill that need. Second, while deep learning methods have continuously evolved and pushed the state of the art in languages with Latin characters, their use for the Arabic language has been very limited. Therefore, we use our new dataset to evaluate the problem of Arabic scene text recognition from three perspectives: (1) using deep learning techniques and studying their suitability for Arabic scene text recognition, where we identify essential components required for the model to obtain good performance; (2) identifying Arabic text challenges that differ from Latin text and require special attention; (3) investigating a bilingual model that concurrently deals with Arabic and English words, since Arabic text is usually found along with other languages. We determine the best model to handle bidirectional text, its challenges, and possible ways to overcome them. We offer both Arabic and bilingual text recognition results on the EvArEST dataset for upcoming research to build upon and improve. We also point to directions for future research based on the analysis performed on the dataset. The dataset is publicly available at https://github.com/HGamal11/EvArEST-dataset
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3100717