Improving Formula Analysis with Line and Mathematics Identification
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Summary: | The explosive growth of the internet and electronic publishing has made a huge number of scientific documents available to users; however, these documents are usually inaccessible to those with visual impairments and often only partially compatible with software and modern hardware such as tablets and e-readers. In this paper we revisit Maxtract, a tool for analysing and converting documents into accessible formats, and combine it with two advanced segmentation techniques: statistical line identification and machine learning formula identification. We show how these techniques improve the quality of both Maxtract's underlying document analysis and its output. We re-run and compare experimental results over a number of datasets, presenting a qualitative review of the improved output and drawing conclusions. |
---|---|
ISSN: | 1520-5363 2379-2140 |
DOI: | 10.1109/ICDAR.2013.74 |