Boosting adversarial attacks with transformed gradient

Bibliographic Details
Published in: Computers & Security, 2022-07, Vol. 118, p. 102720, Article 102720
Main Authors: He, Zhengyun, Duan, Yexin, Zhang, Wu, Zou, Junhua, He, Zhengfang, Wang, Yunyun, Pan, Zhisong
Format: Article
Language:English
Description
Summary: Deep neural networks (DNNs) are vulnerable to adversarial examples, which are crafted by adding imperceptible perturbations to benign examples. Increasing the attack success rate usually requires a larger noise magnitude, which leads to noticeable noise. To this end, we propose a Transformed Gradient method (TG), which achieves a higher attack success rate with smaller perturbations against the target model, i.e., an ensemble of black-box defense models. It consists of three steps: original gradient accumulation, gradient amplification, and gradient truncation. In addition, we introduce the Fréchet Inception Distance (FID) and the Learned Perceptual Image Patch Similarity (LPIPS) to evaluate fidelity and perceptual distance from the original example, respectively, which is more comprehensive than using the L∞ norm alone as an evaluation metric. Furthermore, we propose optimizing the coefficients of the source-model ensemble to improve adversarial attacks. Extensive experimental results demonstrate that the perturbations of adversarial examples generated by our method are smaller than those of the state-of-the-art baselines, namely MI, DI, TI, and RF-DE based on vanilla iterative FGSM, as well as their combinations. Compared with the baseline method, the average black-box attack success rate and the total score are improved by 6.6% and 13.8, respectively. Our code is publicly available at https://github.com/Hezhengyun/Transformed-Gradient.
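
Note on the method: the abstract names the three TG steps (original gradient accumulation, gradient amplification, gradient truncation) but not their update rules. The following is a minimal PyTorch sketch of one plausible reading, in which amplified steps overshoot the nominal step size and truncation clips the perturbation back into the L∞ budget; the normalization and the amplification factor "amp" are illustrative assumptions, not the authors' exact formulation (see the linked repository for their code).

    import torch
    import torch.nn.functional as F

    def tg_attack(model, x, y, eps=16 / 255, steps=10, amp=2.5):
        # Sketch of an accumulate-amplify-truncate iteration (assumed rules,
        # not the paper's exact method).
        alpha = eps / steps                      # nominal per-step size
        x_adv = x.clone().detach()
        g_acc = torch.zeros_like(x)              # step 1 state: accumulated gradient
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Step 1: accumulate the L1-normalized original gradient across iterations.
            g_acc = g_acc + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
            # Step 2: amplify -- take a signed step larger than the nominal alpha.
            x_adv = x_adv.detach() + amp * alpha * g_acc.sign()
            # Step 3: truncate -- clip the perturbation back into the eps-ball
            # and the valid pixel range, discarding the over-amplified excess.
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        return x_adv

The FID and LPIPS evaluation the abstract describes can be reproduced with common off-the-shelf packages; the library choices below (lpips, torchmetrics) are assumptions, not necessarily the implementations the authors used:

    import torch
    import lpips
    from torchmetrics.image.fid import FrechetInceptionDistance

    x = torch.rand(8, 3, 224, 224)                         # toy benign batch in [0, 1]
    x_adv = (x + 0.03 * torch.randn_like(x)).clamp(0, 1)   # toy adversarial batch

    # LPIPS: learned perceptual distance per image pair (inputs scaled to [-1, 1]).
    lpips_fn = lpips.LPIPS(net='alex')
    perceptual_dist = lpips_fn(x * 2 - 1, x_adv * 2 - 1).mean()

    # FID: distribution-level fidelity between the benign and adversarial sets
    # (normalize=True accepts float images in [0, 1]).
    fid = FrechetInceptionDistance(feature=2048, normalize=True)
    fid.update(x, real=True)
    fid.update(x_adv, real=False)
    fid_score = fid.compute()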
ISSN: 0167-4048
eISSN: 1872-6208
DOI: 10.1016/j.cose.2022.102720