Bosch Center for Artificial Intelligence

Universal Adversarial Perturbations Against Semantic Image Segmentation


While deep learning is remarkably successful on perceptual tasks, it is also vulnerable to adversarial perturbations of the input. These perturbations are noise added to the input to fool the system while remaining quasi-imperceptible to humans. More severely, there exist universal perturbations that are input-agnostic but fool the network on the majority of inputs. While recent work has focused on image classification, this work highlights attacks against semantic image segmentation: we present an approach for generating (universal) adversarial perturbations that make the network yield a desired target segmentation as output. We show empirically that there exist barely perceptible universal noise patterns which result in nearly the same predicted segmentation for arbitrary inputs. Furthermore, we show the existence of universal noise which removes a target class (e.g., all pedestrians) from the segmentation while leaving the segmentation mostly unchanged otherwise.


Introduction
Image Source: Cityscapes Dataset [17]

Deep learning has led to significant performance increases for numerous visual perceptual tasks [5, 6, 10, 11] and is relatively robust to random noise [2]. However, several studies have found it to be vulnerable to adversarial perturbations [12, 4, 8, 13, 1, 16]. Adversarial attacks involve generating slightly perturbed versions of the input data that fool the classifier (i.e., change its output) but remain almost imperceptible to the human eye. Adversarial perturbations transfer between different network architectures and between networks trained on disjoint subsets of the data [12].

Prior work on adversarial examples focuses on the task of image classification. We investigate the effect of adversarial attacks on tasks involving a localization component. More specifically, semantic image segmentation is an important methodology for scene understanding that can be used, for example, in automated driving, video surveillance, or robotics. With this widespread applicability comes the risk of being confronted with an adversary trying to fool the system. In this work, we derive approaches to create adversarial perturbations for semantic segmentation. In particular, we create an almost invisible pattern: when this pattern is added to an arbitrary input image, the system always outputs the same static segmentation. The main motivation for this experiment is to show how fragile current approaches to semantic segmentation are when confronted with an adversary.
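The idea of a single pattern that drives every input to the same static segmentation can be sketched on a toy problem. The following is a minimal illustration, not the method used in this work: it assumes a hypothetical per-pixel linear classifier (`W`, `B`) in place of a real segmentation network, one-dimensional "images", and a signed-gradient update with clipping to an L-infinity budget `eps`, which is one common way to optimize such perturbations. All names and parameter values (`universal_perturbation`, `match_rate`, `eps`, `alpha`, `steps`) are chosen for illustration only.

```python
import math
import random

# Toy stand-in for a segmentation network (an assumption for illustration):
# each pixel is classified independently by a linear model,
# logit_c = W[c] * pixel + B[c], so pixels below 0.5 get class 0, above 0.5 class 1.
W = [-40.0, 40.0]
B = [20.0, -20.0]
NUM_CLASSES = len(W)


def logits(pixel):
    return [W[c] * pixel + B[c] for c in range(NUM_CLASSES)]


def predict(image):
    """Per-pixel argmax over the class logits."""
    out = []
    for x in image:
        z = logits(x)
        out.append(z.index(max(z)))
    return out


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def loss_gradient(image, target):
    """Gradient of the per-pixel cross-entropy towards `target` w.r.t. the pixels."""
    grad = []
    for x, t in zip(image, target):
        p = softmax(logits(x))
        grad.append(sum((p[c] - (1.0 if c == t else 0.0)) * W[c]
                        for c in range(NUM_CLASSES)))
    return grad


def sign(v):
    return (v > 0) - (v < 0)


def universal_perturbation(images, target, eps, alpha, steps):
    """Signed-gradient descent on one shared perturbation, clipped to [-eps, eps]."""
    xi = [0.0] * len(target)
    for _ in range(steps):
        # Average the loss gradient over all images so the update is input-agnostic.
        avg = [0.0] * len(target)
        for image in images:
            perturbed = [x + d for x, d in zip(image, xi)]
            for i, g in enumerate(loss_gradient(perturbed, target)):
                avg[i] += g / len(images)
        # Step against the gradient, then project back into the eps-ball.
        xi = [max(-eps, min(eps, d - alpha * sign(g))) for d, g in zip(xi, avg)]
    return xi


def match_rate(images, xi, target):
    """Fraction of pixels, over all images, predicted as the target label."""
    hits = 0
    for image in images:
        pred = predict([x + d for x, d in zip(image, xi)])
        hits += sum(p == t for p, t in zip(pred, target))
    return hits / (len(images) * len(target))


random.seed(0)
target = [0, 0, 0, 0, 1, 1, 1, 1]  # the fixed segmentation the attacker wants
images = [[random.uniform(0.42, 0.58) for _ in target] for _ in range(20)]

xi = universal_perturbation(images, target, eps=0.1, alpha=0.01, steps=30)
clean_rate = match_rate(images, [0.0] * len(target), target)
attacked_rate = match_rate(images, xi, target)
print(f"target match without perturbation: {clean_rate:.2f}")
print(f"target match with perturbation:    {attacked_rate:.2f}")
```

In this toy setup the same `xi` is added to every image, and the perturbed predictions match the target on all of them. The budget of 0.1 on a unit intensity range is far from imperceptible; it is only large enough to make the effect visible with such a simple classifier.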


Experiments
Image source: Cityscapes Dataset [17]

We applied the approach to generate adversarial perturbations for images from the Cityscapes Dataset. The accompanying figure shows an example result. The first row shows the original image and the segmentation predicted by the network. The second row shows the image with the adversarial perturbation added and the network's prediction on this perturbed image. The prediction corresponds to our adversarial target segmentation and not to the input image.
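The qualitative observation above can also be expressed as a number. The sketch below compares label maps pixel by pixel; the function name `pixel_agreement` and the tiny label maps are hypothetical and invented for illustration, not taken from Cityscapes or from the reported experiments.

```python
def pixel_agreement(seg_a, seg_b):
    """Fraction of pixels on which two equally sized label maps agree."""
    if len(seg_a) != len(seg_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(seg_a, seg_b)
    ):
        raise ValueError("label maps must have the same shape")
    total = sum(len(row) for row in seg_a)
    same = sum(
        a == b
        for row_a, row_b in zip(seg_a, seg_b)
        for a, b in zip(row_a, row_b)
    )
    return same / total


# Made-up 2x4 label maps (class ids per pixel), for illustration only.
clean_pred = [[0, 0, 1, 1],
              [0, 2, 2, 1]]   # prediction on the unperturbed image
adv_target = [[1, 1, 1, 1],
              [0, 0, 0, 0]]   # static segmentation chosen by the adversary
adv_pred = [[1, 1, 1, 1],
            [0, 0, 0, 1]]     # prediction on the perturbed image

to_target = pixel_agreement(adv_pred, adv_target)
to_clean = pixel_agreement(adv_pred, clean_pred)
print(f"agreement with adversarial target: {to_target:.3f}")
print(f"agreement with clean prediction:   {to_clean:.3f}")
```

A successful attack in this sense shows high agreement with the adversarial target and low agreement with the prediction on the unperturbed image.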


In this work, we have proposed a method for generating universal adversarial perturbations that change the semantic segmentation of arbitrary, unrelated input images to the same desired static target segmentation. Note that the presented method does not directly allow an adversarial attack in the physical world, since it requires the adversary to precisely control the digital representation of the scene. However, these results emphasize the need for future work on how machine learning can become more robust against (adversarial) perturbations [14, 9, 7] and how adversarial attacks can be detected [15, 3], especially in safety- or security-critical applications. We will continue our research on adversarial examples. In particular, we aim to understand the root cause of adversarial examples in order to derive networks that are robust against adversarial attacks.


[1] N. Carlini and D. Wagner. Towards Evaluating the Robustness of Neural Networks. In arXiv:1608.04644, Aug. 2016.

[2] A. Fawzi, S.-M. Moosavi-Dezfooli, and P. Frossard. Robustness of classifiers: from adversarial to random noise. In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems 29, pages 1632–1640. Curran Associates, Inc., 2016.

[3] R. Feinman, R. R. Curtin, S. Shintre, and A. B. Gardner. Detecting Adversarial Samples from Artifacts. In arXiv:1703.00410 [cs, stat], Mar. 2017.

[4] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and Harnessing Adversarial Examples. In International Conference on Learning Representations (ICLR), 2015.

[5] K. He, X. Zhang, S. Ren, and J. Sun. Deep Residual Learning for Image Recognition. In Computer Vision and Pattern Recognition (CVPR), 2016.

[6] J. Long, E. Shelhamer, and T. Darrell. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of Computer Vision and Pattern Recognition (CVPR), Boston, 2015.

[7] A. Kurakin, I. Goodfellow, and S. Bengio. Adversarial Machine Learning at Scale. In International Conference on Learning Representations (ICLR), 2017.

[8] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard. DeepFool: A simple and accurate method to fool deep neural networks. In Computer Vision and Pattern Recognition (CVPR), Las Vegas, Nevada, USA, 2016.

[9] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks. In Symposium on Security & Privacy, pages 582–597, San Jose, CA, 2016.

[10] S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. In Advances in Neural Information Processing Systems, 2015.

[11] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf. DeepFace: Closing the Gap to Human-Level Performance in Face Verification. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 1701–1708, 2014.

[12] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations (ICLR), 2014.

[13] M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter. Accessorize to a Crime: Real and Stealthy Attacks on State-of-the-Art Face Recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS ’16, pages 1528–1540, New York, NY, USA, 2016. ACM.

[14] S. Zheng, Y. Song, T. Leung, and I. Goodfellow. Improving the Robustness of Deep Neural Networks via Stability Training. In Computer Vision and Pattern Recognition (CVPR), 2016.

[15] J. H. Metzen, T. Genewein, V. Fischer, and B. Bischoff. On Detecting Adversarial Perturbations. In International Conference on Learning Representations (ICLR), 2017.

[16] V. Fischer, M. C. Kumar, J. H. Metzen, and T. Brox. Adversarial Examples for Semantic Image Segmentation. In International Conference on Learning Representations (ICLR), Workshop, Mar. 2017.

[17] M. Cordts, M. Omran, S. Ramos, T. Rehfeld, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele. The Cityscapes Dataset for Semantic Urban Scene Understanding. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

Read more

  • Bosch BCAI - Publication - ICCV

    Metzen et al.

    "Universal Adversarial Perturbations Against Semantic Image Segmentation"
    • Authors: Jan Hendrik Metzen, Mummadi Chaithanya Kumar, Thomas Brox, and Volker Fischer
    • Published in ICCV in 2017