
Not All Samples Are Born Equal: Towards Effective Clean-Label Backdoor Attacks


Bibliographic Details
Published in: Pattern Recognition, 2023-07, Vol. 139, Article 109512
Main Authors: Gao, Yinghua; Li, Yiming; Zhu, Linghui; Wu, Dongxian; Jiang, Yong; Xia, Shu-Tao
Format: Article
Language: English
Description
Summary:
• We reveal that the difficulty of clean-label backdoor attacks is mostly due to the antagonistic effects of ‘robust features’ and verify that DNNs have different learning abilities for different samples.
• We revisit the paradigm of existing clean-label backdoor attacks and propose a new, complementary paradigm that accounts for the different learning difficulties of samples.
• We empirically verify the effectiveness and the poisoning transferability of our method on benchmark datasets and discuss its intrinsic mechanism.
Recent studies demonstrated that deep neural networks (DNNs) are vulnerable to backdoor attacks: the attacked model behaves normally on benign samples, while its predictions are misled whenever adversary-specified trigger patterns appear. Clean-label backdoor attacks, in which adversaries may only poison samples from the target class without modifying their labels, are currently regarded as the most stealthy methods; however, they can hardly succeed in practice. In this paper, we reveal that the difficulty of clean-label attacks lies mainly in the antagonistic effects of ‘robust features’ related to the target class contained in the poisoned samples. Specifically, robust features tend to be learned easily by victim models and thus undermine the learning of trigger patterns. Based on these understandings, we propose a simple yet effective plug-in method that enhances clean-label backdoor attacks by poisoning ‘hard’ rather than random samples, and we adopt three classical difficulty metrics as example implementations. Extensive experiments on benchmark datasets demonstrate that our method consistently improves vanilla attacks.
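To make the described pipeline concrete, the sketch below selects the ‘hardest’ target-class samples with a loss-based difficulty metric (one plausible instance of the classical metrics mentioned in the summary) and stamps a patch trigger on them without touching their labels. The surrogate model, trigger shape, poisoning budget, and all function names here are illustrative assumptions, not the authors' exact implementation.

# Minimal sketch of hard-sample clean-label poisoning, assuming a
# loss-based difficulty metric and a simple corner-patch trigger.
# Surrogate model, budget, and patch parameters are illustrative only.
import torch
import torch.nn as nn


def select_hard_samples(model, images, labels, target_class, budget):
    """Rank target-class samples by per-sample loss under a surrogate
    model (higher loss = harder) and return indices of the hardest."""
    model.eval()
    criterion = nn.CrossEntropyLoss(reduction="none")
    with torch.no_grad():
        losses = criterion(model(images), labels)
    # Clean-label attacks may only poison samples of the target class.
    target_idx = (labels == target_class).nonzero(as_tuple=True)[0]
    k = min(budget, target_idx.numel())
    hardest = torch.topk(losses[target_idx], k=k).indices
    return target_idx[hardest]


def apply_patch_trigger(images, indices, patch_value=1.0, patch_size=3):
    """Stamp a small square trigger in the bottom-right corner of the
    selected images; labels are left unchanged (clean-label)."""
    poisoned = images.clone()
    poisoned[indices, :, -patch_size:, -patch_size:] = patch_value
    return poisoned


if __name__ == "__main__":
    # Toy example with random data and an untrained surrogate classifier.
    images = torch.rand(128, 3, 32, 32)
    labels = torch.randint(0, 10, (128,))
    surrogate = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

    idx = select_hard_samples(surrogate, images, labels, target_class=0, budget=16)
    poisoned_images = apply_patch_trigger(images, idx)
    print(f"Poisoned {idx.numel()} hard samples from the target class.")

In this sketch the only difference from a vanilla clean-label attack is the sample-selection step, which is why the method can act as a plug-in: swapping the loss ranking for another difficulty metric (e.g., forgetting counts or gradient norms) changes only select_hard_samples.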
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2023.109512