
Multi-level feature disentanglement network for cross-dataset face forgery detection

Bibliographic Details
Published in:Image and Vision Computing, 2023-07, Vol.135, Article 104686
Main Authors: Fu, Zhixiao, Chen, Xinyuan, Liu, Daizong, Qu, Xiaoye, Dong, Jianfeng, Zhang, Xuhong, Ji, Shouling
Format: Article
Language:English
Description
Summary:Synthesizing videos with forged faces has caused severe security issues in recent years, making their detection a fundamental and safety-critical task. Although many existing face forgery detection methods have achieved superior performance on such synthetic videos, they are severely limited by domain-specific training data and generally perform unsatisfactorily when transferred to cross-dataset scenarios due to domain gaps. Based on this observation, in this paper we propose a multi-level feature disentanglement network that is robust to the domain bias induced by the different types of fake artifacts in different datasets. Specifically, we first detect the face image and transform it into both color-aware and frequency-aware inputs for multi-modal contextual representation learning. Then, we introduce a novel feature disentangling module that uses a pair of complementary attention maps to disentangle the synthetic features into separate realistic features and features of fake artifacts. Since the artifact features are obtained indirectly from the latent features rather than from a dataset-specific distribution, our forgery detection model is robust to dataset-specific domain gaps. By applying the disentangling module at multiple levels of the feature extraction network with multi-modal inputs, we obtain more robust feature representations. In addition, a realistic-aware adversary loss and a domain-aware adversary loss are adopted to facilitate better feature disentanglement and extraction. Extensive experiments on four datasets verify the generalization ability of our method and demonstrate state-of-the-art performance.
•Proposes disentangling synthetic face features into realistic and artifact features.
•A novel multi-level feature disentanglement network performs the disentanglement.
•Realistic-aware and domain-aware discrimination losses strengthen the disentanglement.
•Achieves state-of-the-art performance on cross-dataset forgery detection.
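The abstract describes transforming each detected face into a color-aware and a frequency-aware input for multi-modal representation learning. The sketch below shows one plausible way to build such a pair in Python; the choice of an FFT log-magnitude spectrum and the helper name build_multimodal_inputs are assumptions for illustration, since the record does not specify the exact transform used by the authors.

import numpy as np
import torch

def build_multimodal_inputs(face_bgr: np.ndarray):
    """face_bgr: HxWx3 uint8 face crop, already detected and aligned (assumed)."""
    # Color-aware input: normalized RGB tensor of shape (3, H, W).
    rgb = np.ascontiguousarray(face_bgr[..., ::-1]).astype(np.float32) / 255.0
    color_input = torch.from_numpy(rgb).permute(2, 0, 1)

    # Frequency-aware input: per-channel 2D FFT log-magnitude spectrum.
    spectra = []
    for c in range(3):
        f = np.fft.fftshift(np.fft.fft2(rgb[..., c]))
        spectra.append(np.log1p(np.abs(f)).astype(np.float32))
    freq_input = torch.from_numpy(np.stack(spectra, axis=0))
    return color_input, freq_input

Both tensors can then be fed to separate (or shared) backbones for the multi-modal contextual representation learning mentioned in the abstract.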
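The core module pairs complementary attention maps to split synthetic features into realistic and artifact components. The PyTorch sketch below illustrates that idea under simple assumptions (a single-channel sigmoid attention map and 1x1 convolutions); it is not the authors' exact architecture, which this record does not detail.

import torch
import torch.nn as nn

class FeatureDisentangler(nn.Module):
    """Illustrative sketch: split a feature map with complementary attention."""
    def __init__(self, channels: int):
        super().__init__()
        # Predict a single-channel spatial attention map in [0, 1].
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feats: torch.Tensor):
        a = self.attn(feats)              # (B, 1, H, W) attention map
        realistic = feats * a             # features routed to the "realistic" branch
        artifact = feats * (1.0 - a)      # complementary "fake artifact" features
        return realistic, artifact

# Applied at multiple backbone stages (multi-level), the artifact features could feed
# the forgery classifier while the realistic features feed the adversarial heads.
real_feats, fake_feats = FeatureDisentangler(256)(torch.randn(2, 256, 14, 14))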
ISSN:0262-8856
DOI:10.1016/j.imavis.2023.104686