This work sets out to investigate the adversarial vulnerability of the recently introduced ViT and MLP-Mixer architectures and compare their performance with CNNs.
Deep classifiers trained on balanced datasets exhibit a class-wise imbalance, which is even more severe for adversarially trained models. We propose a class-wise loss re-weighting to obtain fairer standard and robust classifiers. The final results suggest that there exists a triangular trade-off between accuracy, robustness, and fairness.
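As a rough illustration of class-wise loss re-weighting, the sketch below scales each sample's cross-entropy term by a weight attached to its ground-truth class. The rule used to derive the weights (proportional to per-class error, normalized to mean 1) and the names `class_weights_from_accuracy` and `weighted_cross_entropy` are assumptions made for this example; the abstract does not state the actual weighting scheme.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_weights_from_accuracy(per_class_acc, eps=1e-8):
    # Assumed rule (not from the source): weight each class by its
    # error rate, then normalize so the weights average to 1.
    errors = [1.0 - a + eps for a in per_class_acc]
    mean_err = sum(errors) / len(errors)
    return [e / mean_err for e in errors]

def weighted_cross_entropy(batch_logits, labels, weights):
    # Mean over the batch of w[y] * (-log p(y | x)).
    total = 0.0
    for logits, y in zip(batch_logits, labels):
        p = softmax(logits)
        total += -weights[y] * math.log(p[y])
    return total / len(labels)
```

With all weights set to 1 this reduces to the ordinary cross-entropy; classes with lower accuracy receive a larger share of the loss under the assumed rule.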
We conjecture that backpropagating linearly (LinBP) improves transferability mainly because it provides a continuous approximation of the ReLU in the backward pass. We propose backpropagating continuously (ConBP), which adopts a smooth yet non-linear gradient approximation. ConBP consistently achieves performance equivalent or superior to that of the recently proposed LinBP.
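To make the contrast concrete, the sketch below compares three backward rules for a ReLU whose forward pass is left unchanged: the exact step-function derivative, a linear rule that ignores the ReLU gating, and a smooth non-linear surrogate. The sigmoid (the derivative of softplus) and the sharpness parameter `beta` are assumptions chosen for illustration; the abstract does not say which smooth function ConBP uses.

```python
import math

def relu(x):
    # Forward pass: identical in all three variants.
    return x if x > 0.0 else 0.0

def relu_grad_exact(x):
    # True derivative: a discontinuous step at zero.
    return 1.0 if x > 0.0 else 0.0

def relu_grad_linear(x):
    # Linear backward rule: treat the ReLU as identity when backpropagating.
    return 1.0

def relu_grad_smooth(x, beta=1.0):
    # Assumed smooth surrogate: sigmoid(beta * x), the derivative of
    # softplus. Written in a numerically stable form.
    z = beta * x
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

The smooth rule is continuous like the linear one, yet still depends on the pre-activation value, which is the property the abstract highlights.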
We identify two drawbacks of MI-FGSM: it induces a higher average pixel discrepancy in the image, and it makes the current iteration's update overly dependent on the historical gradients. We propose a new momentum-free iterative method that processes the gradient with a generalizable Cut & Norm operation instead of a sign operation.
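For reference, a single MI-FGSM update can be sketched as follows. It shows where both drawbacks come from: the accumulated `momentum` carries all past gradients into the current step, and the sign operation moves every pixel by the full step size `alpha`. This is a simplified sketch of the baseline only (projection onto the perturbation budget and pixel-range clipping are omitted), and it does not attempt to reproduce the Cut & Norm operation, whose definition is not given in the abstract.

```python
def sign(v):
    # Returns -1, 0, or 1.
    return (v > 0) - (v < 0)

def mi_fgsm_step(x, grad, momentum, alpha, mu):
    # Normalize the current gradient by its L1 norm, then accumulate it
    # into the momentum term with decay factor mu.
    l1 = sum(abs(g) for g in grad) or 1.0
    new_momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
    # Sign update: every coordinate with non-zero momentum moves by alpha.
    new_x = [xi + alpha * sign(m) for xi, m in zip(x, new_momentum)]
    return new_x, new_momentum
```

In the example call `mi_fgsm_step([0.0, 0.0], [3.0, -1.0], [0.0, 0.0], alpha=0.1, mu=1.0)`, both coordinates move by the same magnitude even though their gradient magnitudes differ by a factor of three.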
We revisit transferable adversarial attacks and improve them from two perspectives. First, we identify over-fitting as one major factor that hinders transferability, for which we propose to augment the network input and/or feature layers with noise. Second, we propose a new cross-entropy loss with two ends: one for pushing the sample far from the source class, i.e., the ground-truth class, and the other for pulling it close to the target class.
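One plausible reading of the two-ended loss is a targeted cross-entropy term (pull) combined with a negated source-class cross-entropy term (push). The sketch below implements that reading as a quantity to be minimized by the attacker; the exact formulation, the function name `two_ended_loss`, and the `push_weight` parameter are assumptions, not details taken from the abstract.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of logits.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def two_ended_loss(logits, source, target, push_weight=1.0):
    log_p = log_softmax(logits)
    # Pull end: cross-entropy toward the target class.
    pull = -log_p[target]
    # Push end: lower the log-probability of the source class.
    push = push_weight * log_p[source]
    return pull + push
```

With `push_weight=1.0` the normalizer cancels and the loss equals the source logit minus the target logit, so minimizing it raises the target logit and lowers the source logit at the same time.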
We revisit adversarial attacks by viewing them as shifting the sample semantically close to or far from a certain class, i.e., the interest class. With this perspective, we introduce a new metric called interest class rank (ICR), i.e., the rank of the interest class in the adversarial example, to evaluate adversarial strength.
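A minimal sketch of how ICR could be computed from a model's output on an adversarial example is given below. It assumes a 1-based rank in descending order of logit (rank 1 means the interest class is the top prediction) and resolves ties in favor of the interest class; both conventions are assumptions, as the abstract does not fix them.

```python
def interest_class_rank(logits, interest_class):
    # Rank = 1 + number of classes scoring strictly higher than the
    # interest class on this (adversarial) example.
    score = logits[interest_class]
    return 1 + sum(1 for z in logits if z > score)
```

Under this convention, a targeted attack would aim for a low ICR of the target class, while an untargeted attack would aim for a high ICR of the ground-truth class.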
In contrast to most existing works that manipulate the image input to boost transferability, our work manipulates the model architecture. Specifically, we boost transferability with stochastic depth by randomly removing a subset of layers in networks with skip connections. Technically, our proposed approach is mainly inspired by previous work that improves network generalization with stochastic depth. In terms of motivation, our approach of removing the residual module instead of the skip connection is inspired by the known finding that the transferability of adversarial examples is positively related to the local linearity of DNNs.
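The idea of dropping residual modules while keeping the skip connections can be sketched on a toy residual network as follows. Each block computes `x + f(x)`; when a block is dropped, only the identity path remains. The function name, the per-block drop probability `drop_prob`, and the list-based representation of activations are assumptions made for this example.

```python
import random

def forward_with_stochastic_depth(x, residual_fns, drop_prob, rng):
    # x: list of floats; residual_fns: one callable per residual block,
    # each mapping a list of floats to a list of the same length.
    for f in residual_fns:
        if rng.random() < drop_prob:
            # Residual module removed: the skip connection passes x through.
            continue
        x = [xi + ri for xi, ri in zip(x, f(x))]
    return x
```

Calling this with a fresh random draw at every attack iteration would expose the attack to a different sub-network each time; a dropped block behaves as an identity map, which is locally linear.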
How to generate universal adversarial perturbations (UAPs) without access to the training data remains an open problem. In this work, we attempt to address this issue progressively. First, we propose a self-supervision loss to alleviate the need for ground-truth labels, under the assumption that it is easier to get access to a training dataset without labels. Second, we reduce the data requirement further by utilizing only a very small number of images. Our results show that our simple approach outperforms previous work by a large margin. Third, we attempt to generate a data-free UAP, i.e., without access to the training dataset at all. To this end, we propose to utilize artificial jigsaw images as the proxy dataset, and our approach outperforms existing methods by a large margin.
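The abstract does not describe how the artificial jigsaw images are constructed. Purely as an illustration of what a synthetic proxy image with no natural content could look like, the sketch below fills a grid of square tiles with random gray levels; the tiling scheme, the function name `make_jigsaw_image`, and all parameters are assumptions rather than the method used in the work.

```python
import random

def make_jigsaw_image(height, width, tile, rng):
    # Build a grayscale image (list of rows) from constant-valued tiles.
    image = [[0] * width for _ in range(height)]
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            value = rng.randrange(256)  # one random gray level per tile
            for r in range(top, min(top + tile, height)):
                for c in range(left, min(left + tile, width)):
                    image[r][c] = value
    return image
```

A proxy dataset could then be assembled by calling this function repeatedly with different seeds, without touching any real training image.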