Benford’s law: What does it say on adversarial images?
Published in: | Journal of visual communication and image representation 2023-05, Vol.93, p.103818, Article 103818 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Summary: | Convolutional neural networks (CNNs) are fragile to small perturbations of their input images, which leaves them prone to malicious attacks that perturb the inputs to force a misclassification. Such slightly manipulated images, crafted to deceive the classifier, are known as adversarial images. In this work, we investigate statistical differences between natural images and adversarial ones. More precisely, we show that, under a suitable image transformation and for a class of adversarial attacks, the distribution of the leading digit of the pixels in adversarial images deviates from Benford’s law. The stronger the attack, the farther the resulting distribution is from Benford’s law. Our analysis provides a detailed investigation of this new approach, which can serve as a basis for alternative adversarial example detection methods that neither need to modify the original CNN classifier nor work in the high-dimensional pixel space to obtain features for defending against attacks.
• We show that adversarial images, unlike natural ones, tend to deviate from Benford’s law (BL).
• This deviation is larger for attack algorithms based on infinity-norm perturbations.
• Deviations from Benford’s law increase with the magnitude of the designed perturbation.
• In some cases, the proposed feature can indicate an ongoing attack.
• The resulting low-dimensional feature could be used as input for adversarial detection. |
---|---|
ISSN: | 1047-3203 1095-9076 |
DOI: | 10.1016/j.jvcir.2023.103818 |
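
The summary's central quantity, the distance between an image's leading-digit distribution and Benford’s law, can be illustrated with a short sketch. Benford’s law assigns probability log10(1 + 1/d) to the leading digit d = 1, ..., 9. The record does not say which image transformation or distance measure the authors use, so the gradient magnitude and the Kullback-Leibler divergence below are assumptions chosen for illustration, and all function names are hypothetical.

```python
import numpy as np


def benford_pmf():
    """Benford's law: P(d) = log10(1 + 1/d) for leading digits d = 1..9."""
    d = np.arange(1, 10)
    return np.log10(1.0 + 1.0 / d)


def leading_digits(values):
    """First significant digit (1..9) of every nonzero value."""
    v = np.abs(np.asarray(values, dtype=np.float64)).ravel()
    v = v[v > 0]  # zero has no leading digit
    # Scientific notation puts the first significant digit at position 0.
    # Slow but avoids floor/log10 round-off (e.g. 0.3 / 0.1 < 3 in floats).
    return np.array([int(f"{x:.15e}"[0]) for x in v], dtype=int)


def digit_frequencies(values):
    """Empirical frequencies of the leading digits 1..9."""
    counts = np.bincount(leading_digits(values), minlength=10)[1:10]
    total = counts.sum()
    if total == 0:
        raise ValueError("no nonzero values to analyse")
    return counts / total


def benford_divergence(values):
    """KL divergence from Benford's law (an assumed choice of distance)."""
    freq = digit_frequencies(values)
    ref = benford_pmf()
    mask = freq > 0  # terms with zero frequency contribute nothing
    return float(np.sum(freq[mask] * np.log(freq[mask] / ref[mask])))


def gradient_magnitude(image):
    """Assumed transformation: gradient magnitude of a 2-D grayscale image."""
    gy, gx = np.gradient(np.asarray(image, dtype=np.float64))
    return np.hypot(gx, gy)
```

Under these assumptions, `benford_divergence(gradient_magnitude(img))` yields a single scalar per grayscale image `img`; comparing that scalar for a natural image and its perturbed counterpart is one way to probe the low-dimensional feature the highlights describe.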