Discretization Based Solutions for Secure Machine Learning Against Adversarial Attacks

Bibliographic Details
Published in:	IEEE Access, 2019, Vol. 7, pp. 70157-70168
Main Authors:	Panda, Priyadarshini; Chakraborty, Indranil; Roy, Kaushik
Format:	Article
Language:	English
Subjects:	Accuracy; Adversarial robustness; binarized neural networks; Data models; Datasets; deep learning; Discretization; discretization techniques; Machine learning; Manifolds; Neural networks; Parameters; Perturbation; Perturbation methods; Predictive models; Robustness; Training

Description
Summary:	Adversarial examples are perturbed inputs designed (from a deep learning network's (DLN) parameter gradients) to mislead the DLN at test time. Intuitively, constraining the dimensionality of a network's inputs or parameters reduces the "space" in which adversarial examples exist. Guided by this intuition, we demonstrate that discretization greatly improves the robustness of DLNs against adversarial attacks. Specifically, discretizing the input space (reducing the allowed pixel levels from 256 values, or 8 bits, to 4 values, or 2 bits) markedly improves the adversarial robustness of DLNs over a substantial range of perturbations, with minimal loss in test accuracy. Furthermore, we find that binary neural networks (BNNs) and related variants are intrinsically more robust than their full-precision counterparts in adversarial scenarios. Combining input discretization with BNNs further improves robustness, even waiving the need for adversarial training for certain perturbation magnitudes. We evaluate the effect of discretization on the MNIST, CIFAR10, CIFAR100, and ImageNet datasets. Across all datasets, we observe maximal adversarial resistance with 2-bit input discretization, which incurs an adversarial accuracy loss of just ~1%-2% relative to clean test accuracy against single-step attacks. We also show that standalone discretization remains vulnerable to stronger multi-step attacks, necessitating adversarial training combined with discretization as an improved defense strategy.
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2019.2919463
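
The input discretization mentioned in the summary (256 pixel levels, or 8 bits, reduced to 4 levels, or 2 bits) can be illustrated with a minimal sketch. The record does not describe the authors' exact quantization scheme, so the uniform binning below, the function name `discretize_input`, and the choice to map each pixel to the lower edge of its bin are all assumptions made for illustration only.

```python
import numpy as np


def discretize_input(image_u8: np.ndarray, bits: int = 2) -> np.ndarray:
    """Quantize 8-bit pixels to 2**bits uniformly spaced levels.

    Hypothetical helper: uniform binning is an assumption, not
    necessarily the scheme used in the paper.
    """
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits
    # Keep only the top `bits` bits: bin index in [0, 2**bits - 1].
    level_index = image_u8.astype(np.uint8) >> shift
    # Map each bin index back to the lower edge of its bin in [0, 255].
    return (level_index << shift).astype(np.uint8)


pixels = np.array([0, 63, 64, 127, 128, 200, 255], dtype=np.uint8)
quantized = discretize_input(pixels, bits=2)

# A small perturbation that stays inside one bin is erased by quantization.
clean = np.array([100], dtype=np.uint8)
perturbed = np.array([108], dtype=np.uint8)
same_after_quantization = np.array_equal(
    discretize_input(clean), discretize_input(perturbed)
)
```

With 2 bits, every pixel collapses to one of four values, so a perturbation that does not push a pixel across a bin boundary leaves the quantized input unchanged. This only illustrates the intuition stated in the summary; it says nothing about the accuracy figures reported there.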