HAM: Hidden Anchor Mechanism for Scene Text Detection
Published in: | IEEE Transactions on Image Processing, 2020, Vol. 29, pp. 7904-7916 |
---|---|
Format: | Article |
Language: | English |
Summary: | Direct regression and anchors are the two most effective and prevalent mechanisms in scene text detection. However, direct regression-based methods can be hard to optimize without anchors as references, while anchor-based methods depend on careful anchor design, which degrades their robustness in complex scenes. To address these problems, we propose a novel hidden anchor mechanism (HAM) designed specifically for scene text detection. The anchor predictions are treated as hidden layers, and their weighted sum is integrated into a direct regression-based network. Hence, the HAM architecture remains as simple as that of direct regression-based methods. Moreover, because anchors serve as references, the network is easier to optimize than direct regression-based methods. In this way, our network takes advantage of both the direct regression and anchor mechanisms. In addition, we decouple three kinds of one-dimensional anchors from three-dimensional anchors, greatly reducing the number of anchors in text bounding box matching without performance degradation. We also propose a post-processing technique for long text detection, named iterative regression box (IRB), which adds little computational cost and can be easily generalized to other methods. Experiments on several public datasets demonstrate that the proposed method achieves state-of-the-art performance. Code is available at https://github.com/hjbplayer/HAM . |
---|---|
ISSN: | 1057-7149, 1941-0042 |
DOI: | 10.1109/TIP.2020.3008863 |
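
The weighted-sum idea in the summary can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function names, the use of a softmax to obtain the weights, and the "anchor value plus offset" form of each per-anchor prediction are all assumptions made for illustration. The second helper only shows the arithmetic behind decoupling one joint three-dimensional anchor set into three one-dimensional sets; the sizes `n1`, `n2`, `n3` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hidden_anchor_regress(anchors, offsets, scores):
    """Blend per-anchor predictions into a single regressed value.

    Assumption: each anchor k predicts (anchors[k] + offsets[k]) and the
    network also emits a score per anchor; the output is the
    softmax-weighted sum of those predictions, so no anchor has to be
    selected or matched explicitly at the output.
    """
    if not (len(anchors) == len(offsets) == len(scores)):
        raise ValueError("anchors, offsets and scores must have equal length")
    weights = softmax(scores)
    return sum(w * (a + d) for w, a, d in zip(weights, anchors, offsets))


def anchor_counts(n1, n2, n3):
    """Return (joint 3-D anchor count, decoupled 1-D anchor count).

    A joint grid needs n1 * n2 * n3 anchors, while three independent
    one-dimensional sets need only n1 + n2 + n3.
    """
    return n1 * n2 * n3, n1 + n2 + n3


if __name__ == "__main__":
    # Two hypothetical 1-D anchors (e.g. candidate box sizes in pixels).
    value = hidden_anchor_regress([10.0, 20.0], [1.5, -2.0], [0.3, 1.2])
    print("regressed value:", value)
    print("anchor counts (joint, decoupled):", anchor_counts(4, 4, 6))
```

Because the output is a smooth blend rather than a hard choice among anchors, a sketch like this stays a single regression head from the outside, which matches the summary's point that the architecture keeps the simplicity of direct regression.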