Artistic-style text detector and a new Movie-Poster dataset
Published in: | Expert systems with applications 2025-02, Vol.261, p.125544, Article 125544 |
---|---|
Format: | Article |
Language: | English |
Summary: | Although current text detection algorithms are effective in general scenarios, their performance declines on artistic-style text with complex structures. This paper proposes a method that uses Criss-Cross Attention and the residual dense block to address the incomplete and erroneous detection of artistic-style text by current algorithms. Specifically, the method consists mainly of a feature extraction backbone, a Recycle Criss-Cross Attention module, a Residual Feature Pyramid Network, and a Boundary Discrimination Module. The Recycle Criss-Cross Attention module enhances the model’s perceptual capabilities in complex environments by fusing horizontal and vertical contextual information, allowing it to capture detailed features of artistic-style text that would otherwise be overlooked. We incorporate the residual dense block into the feature pyramid network to suppress the effect of background noise during feature fusion. To avoid complex post-processing, we introduce a Boundary Discrimination Module that guides the generation of correct boundary proposals. Furthermore, given that movie poster titles often use stylized art fonts, we collected a Movie-Poster dataset to address the scarcity of artistic-style text data. Extensive experiments demonstrate that the proposed method achieves superior performance on the Movie-Poster dataset and strong results on multiple benchmark datasets. The code and the Movie-Poster dataset will be available at: https://github.com/AXNing/Artistic-style-text-detection.
• The RCCA module enhances the model’s ability to perceive artistic-style text. • The R-FPN suppresses the effect of background noise that resembles text regions. • The BDM guides correct boundary modeling. • We contribute an artistic-style text dataset, Movie-Poster. |
---|---|
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2024.125544 |
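
The summary says the Recycle Criss-Cross Attention module fuses horizontal and vertical contextual information. The following is a minimal NumPy sketch of the general criss-cross attention idea, in which each position attends only to its own row and column and the operation is applied more than once so that context can spread across the whole feature map. It is not the authors' implementation: the function names, the projection matrices `wq`, `wk`, `wv`, the `gamma` scale, and the default of two passes are all assumptions made for illustration.

```python
import numpy as np


def criss_cross_attention(x, wq, wk, wv, gamma=1.0):
    """One criss-cross attention pass over a feature map (illustrative only).

    x  : (C, H, W) feature map
    wq : (D, C) query projection (hypothetical 1x1-convolution weights)
    wk : (D, C) key projection
    wv : (C, C) value projection
    """
    C, H, W = x.shape
    q = np.einsum("dc,chw->dhw", wq, x)
    k = np.einsum("dc,chw->dhw", wk, x)
    v = np.einsum("dc,chw->dhw", wv, x)

    # Row energies: e_row[h, w, j] = q[:, h, w] . k[:, h, j]
    e_row = np.einsum("dhw,dhj->hwj", q, k)
    # Column energies: e_col[h, w, i] = q[:, h, w] . k[:, i, w]
    e_col = np.einsum("dhw,diw->hwi", q, k)

    # A position lies on both its row and its column; mask the column copy
    # so that it is counted once among the H + W - 1 attended positions.
    idx = np.arange(H)
    e_col[idx, :, idx] = -np.inf

    # Joint softmax over the row and column positions.
    e = np.concatenate([e_row, e_col], axis=-1)  # (H, W, W + H)
    e = e - e.max(axis=-1, keepdims=True)
    a = np.exp(e)
    a = a / a.sum(axis=-1, keepdims=True)
    a_row, a_col = a[..., :W], a[..., W:]

    # Aggregate values along the row and the column, then add a residual.
    out = np.einsum("hwj,chj->chw", a_row, v)
    out = out + np.einsum("hwi,ciw->chw", a_col, v)
    return gamma * out + x


def repeated_criss_cross_attention(x, wq, wk, wv, steps=2, gamma=1.0):
    """Apply the same pass `steps` times with shared weights (assumed design)."""
    for _ in range(steps):
        x = criss_cross_attention(x, wq, wk, wv, gamma)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 6, 5))
    wq = 0.1 * rng.standard_normal((4, 8))
    wk = 0.1 * rng.standard_normal((4, 8))
    wv = 0.1 * rng.standard_normal((8, 8))
    y = repeated_criss_cross_attention(feat, wq, wk, wv)
    print(y.shape)
```

After a single pass, a position has only seen its own row and column. A second pass lets information travel from any position to any other through an intermediate position on a shared row or column, which is the usual motivation for repeating the operation.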