Face sketch-to-photo transformation with multi-scale self-attention GAN
Published in: | Neurocomputing (Amsterdam) 2020-07, Vol.396, p.13-23 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | • Visually realistic photos can be generated with a “divide and conquer” strategy. • The multi-scale structure can extract detailed features from the current layer and affect the quality of features in the next layers. • The self-attention mechanism redistributes more attention to the facial region. • The first and last layers of the feature extractor contribute significantly to photo generation.
In this study, we investigate the sketch-to-photo problem, which currently poses a significant challenge in the field of computer vision. A large number of GAN-based encoder-decoder methods have been proposed for image transformation, inspired by the pix2pix model; however, these methods do not produce satisfactory results for photo generation, due to the fact that (1) they miss detailed information of input images because of a single-scale convolution operator in the shallow encoder layers, and (2) they fail to learn long-range dependencies in the deep encoder layers. To better handle these challenges, we present an approach that follows a “divide and conquer” strategy. Our method combines the advantages of a multi-scale convolutional neural network and an attention mechanism and applies these two modules to different encoder layers. Additionally, by optimizing a well-designed loss function, the complex correlations between the sketch and the photo can be calculated. Experimental results show that our method is able to generate high-quality photos from sketch images, and qualitative and quantitative analysis demonstrates its effectiveness and superiority over state-of-the-art models. This work paves a path to replace the traditional encoder structure with the “divide and conquer” strategy to handle image transformation tasks. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2020.02.024 |
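
The two modules named in the summary can be illustrated with a small, dependency-free sketch. It is not the authors' architecture: fixed box filters stand in for learned multi-scale convolution kernels, the attention step uses identity query/key/value projections, and all function names (`conv2d_same`, `multi_scale_features`, `self_attention`) and the kernel sizes are assumptions made for illustration. It only shows the general idea of gathering detail at several scales first and then relating every position to every other position.

```python
import math

def conv2d_same(img, k):
    """k x k box filter with zero padding (a stand-in for a learned kernel)."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        s += img[y][x]
            out[i][j] = s / (k * k)
    return out

def multi_scale_features(img, kernel_sizes=(1, 3, 5)):
    """Stack responses at several kernel sizes into one vector per pixel."""
    maps = [conv2d_same(img, k) for k in kernel_sizes]
    h, w = len(img), len(img[0])
    return [[m[i][j] for m in maps] for i in range(h) for j in range(w)]

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def self_attention(feats):
    """Scaled dot-product self-attention with identity projections (assumed)."""
    d = len(feats[0])
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in feats]
        weights = softmax(scores)
        # Each output is a weighted average over all positions (long-range mixing).
        out.append([sum(wt * v[c] for wt, v in zip(weights, feats)) for c in range(d)])
    return out

# Toy 4 x 4 "sketch": a bright square on a dark background.
sketch = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
feats = multi_scale_features(sketch)   # 16 positions, 3 scales each
attended = self_attention(feats)       # same shape, globally mixed
```

In this toy setup the multi-scale step plays the role the summary assigns to shallow encoder layers, and the attention step plays the role assigned to deep layers; a real model would learn the kernels and projections and operate on many channels.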