Defenses against Adversarial Attacks
By LI Haoyang 2020.10.20 | 2020.11.15
Content
Regularization
A number of methods try to increase model robustness through regularization. The idea of regularization germinated in the very first paper to pose the problem of adversarial examples, i.e. Intriguing properties of neural networks.
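As a minimal sketch of the idea (not taken from any specific paper below; it assumes a PyTorch classifier, and the penalty form and `reg_weight` are illustrative choices), a typical robustness regularizer penalizes the sensitivity of the loss to the input:

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, reg_weight=0.1):
    """Cross-entropy loss plus a penalty on the input-gradient norm.

    Penalizing d(loss)/d(input) encourages the loss surface to be locally
    flat around clean inputs, one common regularization route to robustness.
    reg_weight is an illustrative hyperparameter, not from the cited papers.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # Gradient of the loss w.r.t. the input; create_graph=True keeps the
    # penalty differentiable w.r.t. the model parameters.
    grad_x, = torch.autograd.grad(loss, x, create_graph=True)
    penalty = grad_x.flatten(start_dim=1).norm(p=2, dim=1).mean()
    return loss + reg_weight * penalty
```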
Adversarial Defense by Regularization
- Moustapha Cisse, Piotr Bojanowski, Edouard Grave, Yann Dauphin, Nicolas Usunier. Parseval Networks: Improving Robustness to Adversarial Examples. ICML 2017. arXiv:1704.08847
- Chongli Qin, James Martens, Sven Gowal, Dilip Krishnan, Krishnamurthy Dvijotham, Alhussein Fawzi, Soham De, Robert Stanforth, and Pushmeet Kohli. Adversarial robustness through local linearization. In NeurIPS, 2019. arXiv:1907.02610
- Alvin Chan, Yi Tay, Yew Soon Ong, Jie Fu. Jacobian Adversarially Regularized Networks for Robustness. ICLR 2020. arXiv:1912.10185
Adversarial Training
The prevailing method to defend against adversarial attacks is adversarial training: adversarial examples generated online during training are used to train the model into a more robust version. It was proposed along with the Fast Gradient Sign Method (FGSM).
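A minimal sketch of one training step, assuming a PyTorch model with inputs in [0, 1]; the single-step FGSM attack, the 50/50 clean/adversarial mix, and the budget `eps` are illustrative assumptions, and the papers below differ in attack strength and loss:

```python
import torch
import torch.nn.functional as F

def fgsm_adv_training_step(model, optimizer, x, y, eps=8/255):
    """One adversarial-training step: craft FGSM examples online,
    then update the model on a mix of clean and adversarial inputs."""
    # 1. Single-step FGSM attack against the current model.
    x_adv = x.clone().detach().requires_grad_(True)
    attack_loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(attack_loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).clamp(0.0, 1.0).detach()

    # 2. Train on clean and adversarial examples (50/50 mix here).
    optimizer.zero_grad()
    loss = 0.5 * F.cross_entropy(model(x), y) + 0.5 * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```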
Adversarial Defense by Adversarial Training
- Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR 2018. arXiv:1706.06083
- Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein. Adversarial Training for Free! arXiv:1904.12843v2
- Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong. You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle. NeurIPS 2019. Paper: http://papers.nips.cc/paper/8316-you-only-propagate-once-accelerating-adversarial-training-via-maximal-principle
- Jonathan Uesato, Jean-Baptiste Alayrac, Po-Sen Huang, Robert Stanforth, Alhussein Fawzi, Pushmeet Kohli. Are Labels Required for Improving Adversarial Robustness? NeurIPS 2019. arXiv:1905.13725
- Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, Mohan Kankanhalli. Attacks Which Do Not Kill Training Make Adversarial Learning Stronger. ICML 2020. arXiv:2002.11242
- Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le. Smooth Adversarial Training. arXiv:2006.14536
- Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, Michael I. Jordan. Theoretically Principled Trade-off between Robustness and Accuracy. ICML 2019. arXiv:1901.08573
- Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan Yuille, Kaiming He. Feature Denoising for Improving Adversarial Robustness. CVPR 2019. arXiv:1812.03411
- Eric Wong, Leslie Rice, J. Zico Kolter. Fast is Better than Free: Revisiting Adversarial Training. ICLR 2020. Paper: https://openreview.net/forum?id=BJx040EFvH&noteId=BJx040EFvH
- Maksym Andriushchenko, Nicolas Flammarion. Understanding and Improving Fast Adversarial Training. NeurIPS 2020. Paper: https://infoscience.epfl.ch/record/278914
- Ludwig Schmidt, Shibani Santurkar, Dimitris Tsipras, Kunal Talwar, Aleksander Madry. Adversarially Robust Generalization Requires More Data. NeurIPS 2018. arXiv:1804.11285
Robust Structure
Adversarial Defense with Robust Structure
- Minghao Guo, Yuzhe Yang, Rui Xu, Ziwei Liu, Dahua Lin. When NAS Meets Robustness: In Search of Robust Architectures against Adversarial Attacks. CVPR 2020. arXiv:1911.10695
Defense at Inference
Adversarial Defense at Inference
- Yang Song, Taesup Kim, Sebastian Nowozin, Stefano Ermon, Nate Kushman. PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples. ICLR 2018. arXiv:1710.10766
- Tianyu Pang, Kun Xu, Jun Zhu. Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks. ICLR 2020. Paper: https://openreview.net/forum?id=ByxtC2VtPB
Ensemble
Adversarial Defense with Ensemble
- Huanrui Yang, Jingyang Zhang, Hongliang Dong, Nathan Inkawhich, Andrew Gardner, Andrew Touchet, Wesley Wilkes, Heath Berry, Hai Li. DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles. NeurIPS 2020. arXiv:2009.14720
Breach
Breaching Adversarial Defenses
Nicholas Carlini, David Wagner. Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods. AISec 2017. arXiv:1705.07263
Breached:
Warren He, James Wei, Xinyun Chen, Nicholas Carlini, Dawn Song. Adversarial Example Defenses: Ensembles of Weak Defenses are not Strong. 2017. arXiv:1706.04701
Breached:
Anish Athalye, Nicholas Carlini, David Wagner. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. arXiv:1802.00420
Un-breached:
Adversarial Training
- Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu. Towards Deep Learning Models Resistant to Adversarial Attacks. ICLR 2018. URL: https://openreview.net/forum?id=rJzIBfZAb (Adversarial Training with PGD)
- Taesik Na, Jong Hwan Ko, Saibal Mukhopadhyay. Cascade Adversarial Machine Learning Regularized with a Unified Embedding. ICLR 2018. URL: https://openreview.net/forum?id=HyRVBzap- (Cascade Adversarial Training)
Breached: