Face sketch-to-photo transformation with multi-scale self-attention GAN

Bibliographic Details
Published in: Neurocomputing (Amsterdam), 2020-07, Vol. 396, pp. 13-23
Main Authors: Lei, Yingtao; Du, Weiwei; Hu, Qinghua
Format: Article
Language: English
Description
Summary:
• Visually realistic photos can be generated with a “divide and conquer” strategy.
• A multi-scale structure can extract detailed features from the current layer and affect the quality of features in the following layers.
• A self-attention mechanism redistributes more attention to the facial region.
• The first and last layers of the feature extractor contribute significantly to photo generation.

In this study, we investigate the sketch-to-photo problem, which currently poses a significant challenge in the field of computer vision. A large number of GAN-based encoder-decoder methods have been proposed for image transformation, inspired by the pix2pix model; however, these methods do not produce satisfactory results for photo generation, because (1) they miss detailed information in the input images owing to a single-scale convolution operator in the shallow encoder layers, and (2) they fail to learn long-range dependencies in the deep encoder layers. To better handle these challenges, we present an approach that follows a “divide and conquer” strategy. Our method combines the advantages of a multi-scale convolutional neural network and an attention mechanism, and applies these two modules to different encoder layers. Additionally, by optimizing a well-designed loss function, the complex correlations between the sketch and the photo can be calculated. Experimental results show that our method is able to generate high-quality photos from sketch images, and qualitative and quantitative analyses demonstrate its effectiveness and superiority over state-of-the-art models. This work paves the way for replacing the traditional encoder structure with the “divide and conquer” strategy in image transformation tasks.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2020.02.024
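
Illustrative sketch (not from the article): the summary describes applying a multi-scale convolution module to shallow encoder layers and an attention module to deep encoder layers. The NumPy code below is a minimal sketch of that division of labour, not the authors' implementation. The kernel sizes (1, 3, 5), the averaging kernels, the random projection weights, the residual weight `gamma`, and all function names are assumptions chosen for illustration; the attention step uses the common scaled dot-product formulation, which the record does not confirm as the paper's exact variant.

```python
import numpy as np


def conv2d_same(x, kernel):
    """Naive zero-padded ('same') 2D cross-correlation of one channel with one square kernel."""
    k = kernel.shape[0]
    padded = np.pad(x, k // 2)
    h, w = x.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out


def multi_scale_block(x, kernels):
    """Apply kernels of several sizes in parallel and stack the results as channels."""
    return np.stack([conv2d_same(x, k) for k in kernels], axis=0)  # (S, H, W)


def self_attention(feat, w_q, w_k, w_v, gamma=1.0):
    """Let every spatial position of feat (C, H, W) attend to every other position."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T                 # (N, C), N = H * W
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[1])            # (N, N) pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)       # numerical stability for softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)           # each row sums to 1
    out = attn @ v                                    # (N, C)
    return feat + gamma * out.T.reshape(c, h, w), attn


rng = np.random.default_rng(0)
sketch = rng.random((8, 8))                           # stand-in for a tiny grayscale sketch

# "Shallow" stage: parallel 1x1, 3x3 and 5x5 averaging kernels (hypothetical choice).
kernels = [np.ones((k, k)) / (k * k) for k in (1, 3, 5)]
shallow = multi_scale_block(sketch, kernels)

# "Deep" stage: self-attention over all positions, with random (untrained) projections.
c = shallow.shape[0]
w_q = rng.normal(size=(c, 2))
w_k = rng.normal(size=(c, 2))
w_v = rng.normal(size=(c, c))
deep, attn = self_attention(shallow, w_q, w_k, w_v)

print(shallow.shape, deep.shape, attn.shape)
```

In this toy setup, the multi-scale stage gives each position a view of its neighbourhood at three sizes, while the attention stage relates all 64 positions to one another at once, which is the kind of long-range dependency the summary says single-scale encoders fail to learn.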