Understanding Adversarial Examples from the Mutual Influence of Images and Perturbations

Chaoning Zhang, Philipp Benz, Tooba Imtiaz, In So Kweon

June 2020


Abstract

A wide variety of works have explored the reason for the existence of adversarial examples, but there is no consensus on the explanation. We propose to treat the DNN logits as a vector for feature representation and use them to analyze the mutual influence of two independent inputs based on the Pearson correlation coefficient (PCC). We use this vector representation to understand adversarial examples by disentangling clean images from adversarial perturbations and analyzing their influence on each other. Our results suggest a new perspective on the relationship between images and universal perturbations: universal perturbations contain dominant features, and images behave like noise to them. This feature perspective leads to a new method for generating targeted universal adversarial perturbations using random source images. We are the first to achieve the challenging task of a targeted universal attack without using the original training data. Our approach, using a proxy dataset, achieves performance comparable to state-of-the-art baselines that use the original training dataset.
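The PCC-based analysis described in the abstract can be illustrated with a minimal sketch. It assumes three logit vectors are already available: one for a clean image, one for a perturbation, and one for their sum. The function names `pcc` and `mutual_influence`, and the synthetic logits used here, are hypothetical and not taken from the paper or its code release. In particular, the linear mix that stands in for the combined logits is only a placeholder for a real forward pass through a DNN.

```python
import numpy as np


def pcc(a, b):
    """Pearson correlation coefficient between two 1-D logit vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a_centered = a - a.mean()
    b_centered = b - b.mean()
    denom = np.linalg.norm(a_centered) * np.linalg.norm(b_centered)
    return float(a_centered @ b_centered / denom)


def mutual_influence(logits_a, logits_b, logits_ab):
    """Correlate each input's logits with the logits of the combined input.

    A higher PCC suggests that input contributes more to the combined
    feature representation.
    """
    return pcc(logits_a, logits_ab), pcc(logits_b, logits_ab)


# Synthetic stand-ins (assumption): in practice these would be the logits
# a trained network produces for x, v, and x + v respectively.
rng = np.random.default_rng(0)
logits_image = rng.normal(size=1000)
logits_perturbation = rng.normal(size=1000)
# Hypothetical combined logits dominated by the perturbation.
logits_combined = 0.1 * logits_image + 0.9 * logits_perturbation

pcc_image, pcc_perturbation = mutual_influence(
    logits_image, logits_perturbation, logits_combined
)
print(f"PCC(image, combined)        = {pcc_image:.3f}")
print(f"PCC(perturbation, combined) = {pcc_perturbation:.3f}")
```

In this toy setup the perturbation's logits correlate far more strongly with the combined logits than the image's do, which mirrors the situation the abstract describes for universal perturbations. With a real model, the three vectors would come from separate forward passes rather than a linear mix.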

Type

Conference paper

Publication

In Conference on Computer Vision and Pattern Recognition (CVPR 2020)

Universal Adversarial Perturbations, Adversarial Machine Learning

Philipp Benz

Research Team Manager @ Deeping Source (Ph.D. @ KAIST)

My research interest is in Deep Learning with a focus on robustness and security.
