DISCO: Adversarial Defense with Local Implicit Functions


UC San Diego

Overview


The problem of adversarial defenses for image classification, where the goal is to robustify a classifier against adversarial examples, is considered. Inspired by the hypothesis that these examples lie beyond the natural image manifold, a novel aDversarIal defenSe with local impliCit functiOns (DISCO) is proposed to remove adversarial perturbations by localized manifold projections. DISCO consumes an adversarial image and a query pixel location and outputs a clean RGB value at that location. It is implemented with an encoder and a local implicit module, where the former produces per-pixel deep features and the latter uses the features in the neighborhood of the query pixel to predict the clean RGB value. Extensive experiments demonstrate that both DISCO and its cascade version outperform prior defenses, regardless of whether the defense is known to the attacker. DISCO is also shown to be data and parameter efficient and to mount defenses that transfer across datasets, classifiers and attacks.
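To make the encoder plus local implicit module design concrete, below is a minimal PyTorch sketch of a DISCO-style purifier, not the authors' implementation: layer sizes, the nearest-neighbor feature sampling, and the coordinate encoding are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class LocalImplicitDefense(nn.Module):
    """Sketch of a DISCO-style defense (hypothetical architecture choices).

    An encoder maps the adversarial image to per-pixel deep features; a local
    implicit MLP then predicts a clean RGB value for each query pixel from the
    feature nearest to that location plus the query coordinate.
    """

    def __init__(self, feat_dim=64):
        super().__init__()
        # Toy convolutional encoder standing in for the real feature extractor.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        # Local implicit module: per-pixel feature + 2-D query coordinate -> RGB.
        self.implicit = nn.Sequential(
            nn.Linear(feat_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 3),
        )

    def forward(self, adv_img, coords):
        # adv_img: (B, 3, H, W); coords: (B, Q, 2) query locations in [-1, 1].
        feats = self.encoder(adv_img)                        # (B, C, H, W)
        grid = coords.unsqueeze(1)                           # (B, 1, Q, 2)
        sampled = nn.functional.grid_sample(                 # nearest local feature
            feats, grid, mode='nearest', align_corners=False
        ).squeeze(2).permute(0, 2, 1)                        # (B, Q, C)
        rgb = self.implicit(torch.cat([sampled, coords], dim=-1))
        return rgb                                           # (B, Q, 3) clean RGB
```

Querying the module at every pixel location reconstructs a full purified image, which can then be passed to any downstream classifier.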

Paper

Published in Conference on Neural Information Processing Systems (NeurIPS), 2022.

Supplement

arXiv

Repository

Bibtex

Models



Architecture: (a) data preparation, (b) training, and (c) testing phases of DISCO. DISCO supports different configurations of attack and classifier for training and testing. For cascade DISCO, K > 1.
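Cascade DISCO simply applies the trained purifier K times in sequence before classification. A minimal sketch is below; it assumes a hypothetical `disco` callable that maps an image to a purified image (e.g., the model above queried at all pixel locations) and a standard classifier.

```python
import torch

@torch.no_grad()
def cascade_disco_defend(disco, classifier, adv_img, K=3):
    """Run the purifier K times in cascade (K > 1 gives cascade DISCO),
    then classify the purified image. Names here are illustrative."""
    x = adv_img
    for _ in range(K):
        x = disco(x)          # image in, purified image out
    return classifier(x)      # logits for the defended input
```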

Code

Training, evaluation, and deployment code is available on GitHub.

Video


Acknowledgements

This work was partially funded by NSF awards IIS-1637941 and IIS-1924937, and by NVIDIA GPU donations.