Detecting Mathematical Expressions in Scientific Document Images Using a U-Net Trained on a Diverse Dataset

Bibliographic Details
Published in:	IEEE access 2019-01, Vol.7, p.1-1
Main Authors:	Ohyama, Wataru, Suzuki, Masakazu, Uchida, Seiichi
Format:	Article
Language:	English
Subjects:	Character recognition; Computer architecture; convolutional Neural networks; Datasets; document image analysis; Image segmentation; Mathematical analysis; mathematical expression detection; Medical imaging; neural networks; object detection; Optical character recognition; Performance enhancement; Retraining; Training

Description
Summary:	A detection method for mathematical expressions in scientific document images is proposed. Inspired by the promising performance of U-Net, a convolutional network architecture originally proposed for the semantic segmentation of biomedical images, the proposed method uses image conversion by a U-Net framework. The proposed method does not use any information from mathematical or linguistic grammar, so it can serve as a supplemental bypass in the conventional mathematical optical character recognition (OCR) pipeline. The evaluation experiments confirmed that (1) the proposed method detects mathematical symbols and expressions better than InftyReader, which is state-of-the-art software for mathematical OCR; (2) it is important that the training dataset cover the variation in document styles; and (3) retraining with a small number of additional training samples is effective in improving performance. An additional contribution is the release of a dataset for benchmarking OCR for scientific documents.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2019.2945825