Layer similarity guiding few-shot Chinese style transfer
Published in: The Visual Computer, 2024-04, Vol. 40 (4), pp. 2265-2278
Main Authors: , , , ,
Format: Article
Language: English
Summary: Few-shot text style transfer faces two main challenges: the limited availability of reference style text, and the varying degrees of difference between the style reference image and the source image. Existing methods focus mainly on how local and global style feature extraction influences text style, but they ignore the crucial role that the difference between the style reference image and the source image plays in style characteristics, especially in Chinese, which has its own unique ideographic structure. To address this issue, this paper proposes Layer Similarity Guiding Few-shot Chinese Style Transfer (LSG-FCST). LSG-FCST not only builds a transfer network by encoding content and style characteristics from low-level to high-level semantics, but also discovers similarity characteristics between the style reference image and the source image through an attention mechanism. Furthermore, LSG-FCST fuses the style features produced by the similarity features of different layers and generates the target image through asymmetric decoding. On a self-built text image dataset, we consider three visibility situations for test images relative to the training set: seen fonts with unseen characters, unseen fonts with seen characters, and unseen fonts with unseen characters. The experiments show that LSG-FCST outperforms the state-of-the-art methods. The code and dataset can be accessed at https://github.com/LYM1111/LSG-FCST.
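The abstract outlines the key mechanism: a shared encoder extracts multi-level features from both the source (content) image and the style reference, attention measures their similarity at each layer, and an asymmetric decoder fuses the resulting per-layer style features. Below is a minimal PyTorch sketch of that pipeline; all module names, layer widths, and the specific attention and fusion details are illustrative assumptions, not the authors' implementation (see the linked repository for the official code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayerSimilarityAttention(nn.Module):
    """Cross-attention at one encoder layer (illustrative assumption).

    The source (content) features provide the queries and the style-reference
    features provide the keys/values, so the attention map acts as a
    per-location similarity measure that re-weights the reference's style
    features for this layer.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels, kernel_size=1)
        self.key = nn.Conv2d(channels, channels, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, content_feat, style_feat):
        b, c, h, w = content_feat.shape
        q = self.query(content_feat).flatten(2).transpose(1, 2)  # (B, HW, C)
        k = self.key(style_feat).flatten(2)                      # (B, C, HW)
        v = self.value(style_feat).flatten(2).transpose(1, 2)    # (B, HW, C)
        sim = torch.softmax(q @ k / c**0.5, dim=-1)              # (B, HW, HW)
        return (sim @ v).transpose(1, 2).reshape(b, c, h, w)


class LSGFCSTSketch(nn.Module):
    """Hypothetical end-to-end sketch: a shared three-level encoder, per-layer
    similarity attention, and an asymmetric decoder that fuses the per-layer
    style features while upsampling back to the input resolution."""

    def __init__(self, widths=(32, 64, 128)):
        super().__init__()
        blocks, in_c = [], 1  # grayscale glyph images assumed
        for w in widths:
            blocks.append(nn.Sequential(
                nn.Conv2d(in_c, w, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True)))
            in_c = w
        self.encoder = nn.ModuleList(blocks)
        self.attn = nn.ModuleList(LayerSimilarityAttention(w) for w in widths)
        # "Asymmetric" decoder: it consumes attention-fused features rather
        # than mirroring the encoder with plain skip connections.
        self.dec2 = nn.Sequential(
            nn.Conv2d(widths[2] + widths[1], widths[1], 3, padding=1),
            nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(
            nn.Conv2d(widths[1] + widths[0], widths[0], 3, padding=1),
            nn.ReLU(inplace=True))
        self.out = nn.Conv2d(widths[0], 1, 3, padding=1)

    def encode(self, x):
        feats = []
        for block in self.encoder:
            x = block(x)
            feats.append(x)
        return feats  # low-level -> high-level semantics

    def forward(self, source, reference):
        src_feats = self.encode(source)     # content path
        ref_feats = self.encode(reference)  # style path (shared weights)
        fused = [attn(s, r)
                 for attn, s, r in zip(self.attn, src_feats, ref_feats)]
        x = F.interpolate(fused[2], scale_factor=2, mode="nearest")
        x = self.dec2(torch.cat([x, fused[1]], dim=1))
        x = F.interpolate(x, scale_factor=2, mode="nearest")
        x = self.dec1(torch.cat([x, fused[0]], dim=1))
        x = F.interpolate(x, scale_factor=2, mode="nearest")
        return torch.sigmoid(self.out(x))


if __name__ == "__main__":
    model = LSGFCSTSketch()
    source = torch.rand(1, 1, 64, 64)      # character to restyle
    reference = torch.rand(1, 1, 64, 64)   # few-shot style exemplar
    print(model(source, reference).shape)  # torch.Size([1, 1, 64, 64])
```

In the paper's few-shot setting one would aggregate multiple reference exemplars and train with content and style losses; this sketch only illustrates how per-layer similarity can gate which reference features flow into the decoder.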
ISSN: 0178-2789, 1432-2315
DOI: 10.1007/s00371-023-02915-w