Obfuscated Gradients Give a False Sense of Security Circumventing Defenses to Adversarial Examples
Introduction
Over the past few years, neural network models have been the source of major breakthroughs in a variety of problems in computer vision. However, these networks have been shown to be susceptible to adversarial attacks. In these attacks, small humanly-imperceptible changes are made to images (that are correctly classified) which causes these models to misclassify with high confidence. These attacks pose a major threat that needs to be addressed before these systems can be deployed on a large scale, especially in safety-critical scenarios.
The seriousness of this threat has generated major interest in both the design and defense against them. In this paper, the authors identify a common technique employed by several recently proposed defenses and design a set of attacks that can be used to overcome them. The use of this technique, masking gradients, is so prevalent, that 7 out of the 8 defenses proposed in the ICLR 2018 conference employed them. The authors were able to circumvent the proposed defenses and successfully brought down the accuracy of their models to below 10%.