Tree-based data augmentation and mutual learning for offline handwritten mathematical expression recognition
Published in: | Pattern recognition 2022-12, Vol.132, p.108910, Article 108910 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | • We propose a tree-based multi-level data augmentation strategy that effectively alleviates the shortage of annotated training data; it is one of the critical technologies behind our champion system for the OffRaSHME20 competition.
• We introduce a novel tree-based mutual learning method that deeply integrates the string decoder and the tree decoder in both the training and inference stages, fully exploiting the complementary advantages of the two decoder types.
• Our system significantly outperforms previous state-of-the-art results on both the OffRaSHME20 dataset and the CROHME14/16/19 datasets.
Recently, thanks to the successful application of the attention-based encoder-decoder framework, handwritten mathematical expression recognition (HMER) has improved significantly. However, HMER remains a challenging task in handwriting recognition because of the ambiguity of handwritten symbols, the two-dimensional structure of mathematical expressions, and the scarcity of labeled data. In this paper, we attempt to improve the recognition performance and generalization ability of the existing state-of-the-art method from two perspectives: data augmentation and model design. We first propose a tree-based multi-level data augmentation strategy (covering the symbol level, the sub-expression level, and the image level), which can generate many synthetic images. Then, we present a novel encoder-decoder hybrid model that uses tree-based mutual learning to fully exploit the complementarity between the tree decoder and the string decoder. Benefitting from our data augmentation strategy, we achieve expression recognition accuracies of 58.47%/57.82%/62.67% on the CROHME14/16/19 competition datasets and 74.45% on the OffRaSHME20 competition dataset. Moreover, tree-based data augmentation is a key technology in our champion system for the OffRaSHME20 competition. Our tree-based mutual learning method further improves the recognition accuracy to 61.63%/59.81%/64.38% and 75.68% on these datasets, respectively. Further quantitative and qualitative analyses also demonstrate the effectiveness and robustness of our proposed methods. |
---|---|
ISSN: | 0031-3203 1873-5142 |
DOI: | 10.1016/j.patcog.2022.108910 |
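
The summary above describes mutual learning between a string decoder and a tree decoder but does not state the loss. As a rough illustration only, the sketch below shows the generic mutual learning formulation, in which each model adds to its cross-entropy loss a KL-divergence term that pulls its prediction toward the other model's prediction. This is an assumption, not the paper's confirmed objective: the function names, the `weight` parameter, and the single-step, single-symbol setup are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(p, target_index, eps=1e-12):
    """Negative log-likelihood of the ground-truth class."""
    return -math.log(p[target_index] + eps)


def mutual_learning_losses(string_logits, tree_logits, target_index, weight=1.0):
    """Generic mutual learning for one prediction step (hypothetical setup).

    Each decoder is trained on the ground truth and is additionally
    encouraged to match the other decoder's predicted distribution.
    """
    p_string = softmax(string_logits)
    p_tree = softmax(tree_logits)
    # String decoder: supervised loss + mimic the tree decoder.
    loss_string = cross_entropy(p_string, target_index) + weight * kl_divergence(p_tree, p_string)
    # Tree decoder: supervised loss + mimic the string decoder.
    loss_tree = cross_entropy(p_tree, target_index) + weight * kl_divergence(p_string, p_tree)
    return loss_string, loss_tree


if __name__ == "__main__":
    # Two decoders scoring the same three candidate symbols; class 0 is correct.
    ls, lt = mutual_learning_losses([2.0, 0.5, 0.1], [1.5, 1.0, 0.2], target_index=0)
    print(f"string decoder loss: {ls:.4f}, tree decoder loss: {lt:.4f}")
```

When both decoders produce identical distributions, the KL terms vanish and each loss reduces to plain cross-entropy; the mimicry term only contributes where the two decoders disagree.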