ICDAR 2019 Historical Document Reading Challenge on Large Structured Chinese Family Records

Bibliographic Details
Main Authors: Saini, Rajkumar; Dobson, Derek; Morrey, Jon; Liwicki, Marcus; Simistira Liwicki, Foteini
Format: Conference Proceeding
Language: English
Description
Summary: In this paper, we present a large historical database of Chinese family records with the aim of developing robust systems for historical document analysis. To this end, we propose a Historical Document Reading Challenge on Large Chinese Structured Family Records (ICDAR 2019 HDRC-CHINESE). The objective of the competition is to recognize and analyze the layout, and finally to detect and recognize the textlines and characters, of a large historical document image dataset containing more than 100,000 pages. Cascade R-CNN, CRNN, and U-Net based architectures were trained to evaluate performance on these tasks. An error rate of 0.01 was recorded for textline recognition (Task1), whereas a Jaccard Index of 99.54% was recorded for layout analysis (Task2). A graph edit distance based total error ratio of 1.5% was recorded for complete integrated textline detection and recognition (Task3).
ISSN: 2379-2140
DOI: 10.1109/ICDAR.2019.00241
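
Note on the Jaccard Index: the summary reports a Jaccard Index for layout analysis (Task2) but this record does not describe how it was computed. As a rough illustration only, the index is commonly defined as the size of the intersection of a predicted region and a ground-truth region divided by the size of their union. The sketch below assumes binary masks given as nested lists of 0/1; the function name `jaccard_index`, the mask representation, and the handling of two empty masks are assumptions made for this example, not the competition's evaluation code.

```python
def jaccard_index(pred, truth):
    """Intersection over union of two equally shaped binary masks.

    Hypothetical helper for illustration; masks are nested lists of 0/1.
    """
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                intersection += 1  # pixel marked in both masks
            if p or t:
                union += 1  # pixel marked in at least one mask

    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 2x3 masks: 2 pixels overlap, 4 pixels are marked in at least one mask.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = jaccard_index(pred, truth)
print(f"Jaccard Index: {score:.2%}")
```

A multi-class layout evaluation would typically compute this per class and then average, but whether the challenge did so is not stated in this record.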