The Landscape of Adversarial Example

By LI Haoyang 2020.10

Content

Problems
Map
Attacks
Defense
Explanation
Interpretation
Benchmark
False positive noise
Adversarial augmentation

Problems

Adversarial examples are inputs that 1) appear indistinguishable from the original to human eyes and 2) cause the targeted system to malfunction. Many deep learning methods have been found vulnerable to adversarial examples, and many methods for generating adversarial examples have been proposed.

For a target system $f$, estimated as an approximation of an original mapping, and a target input $x$, the goal is to generate an adversarial example $x' = x + \delta$ with a minimal perturbation $\delta$, such that:

$$f(x + \delta) \neq f(x), \quad \|\delta\|_p \leq \epsilon$$

Based on the attacker's knowledge of the system, these methods are divided into white-box and black-box attacks. The former assumes the attacker has full knowledge of the system, while the latter makes no such assumption. As a result, white-box attacks are easier and more popular in the literature but less meaningful in practice, whereas black-box attacks are more difficult and rarer in the literature but more meaningful in practice. Between the two, the scenario in which the attacker has only limited knowledge is sometimes called gray-box.

The problem of generating adversarial examples is a reversed optimization problem, in which the parameters of the targeted model are fixed while the input is tuned to mislead the model under certain constraints. White-box methods generally utilize the gradients of the model, while black-box methods either exploit the transferability of adversarial examples or solve the problem with zeroth-order optimization algorithms (e.g. evolutionary algorithms, reinforcement learning, etc.).
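For the zeroth-order route, a minimal NES-style sketch of estimating a gradient from queries alone; the query function `score_fn` and the hyperparameters are hypothetical names introduced here for illustration, not part of the original text:

```python
import numpy as np

def estimate_gradient(score_fn, x, sigma=1e-3, n_samples=50):
    """Estimate the gradient of a black-box score (e.g. the model's loss on x)
    using antithetic Gaussian finite differences, from queries only."""
    grad = np.zeros_like(x)
    for _ in range(n_samples):
        u = np.random.randn(*x.shape)
        grad += (score_fn(x + sigma * u) - score_fn(x - sigma * u)) * u
    return grad / (2 * sigma * n_samples)

# The estimated gradient can then drive the same loss-ascent step a white-box
# attack would take, without any access to the model's internals.
```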

Most white-box attacks (i.e. the most prevailing methods) reformulate the problem as reversed training, where the loss with respect to the input is maximized rather than minimized.
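A minimal sketch of this loss-maximization view, assuming a PyTorch classifier `model` with inputs in [0, 1]; the function name and the perturbation budget are illustrative, not from the original text:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """One-step loss maximization (FGSM-style): perturb the input along the sign
    of the input gradient to increase the classification loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # standard training loss
    loss.backward()                       # gradient w.r.t. the input, not the weights
    x_adv = x + epsilon * x.grad.sign()   # ascend the loss instead of descending it
    return x_adv.clamp(0, 1).detach()     # keep the result a valid image
```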

Defending against adversarial attacks is more difficult, and no method so far defends against all of them. Theoretically, adversarial training formulates a min-max optimization to train a robust model.
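A standard way to write this objective, with data distribution $\mathcal{D}$, model parameters $\theta$, loss $L$, and allowed perturbation set $S$ (e.g. an $\ell_\infty$ ball of radius $\epsilon$):

$$\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}} \Big[ \max_{\delta \in S} L(\theta, x + \delta, y) \Big]$$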

The inner maximization searches for an optimal perturbation within the set of allowed perturbations, and the outer minimization searches for parameters that minimize the loss function with respect to the perturbed examples.

Based on the same min-max problem, adversarial training aims to find a lower bound for the inner maximization, while efforts in provable defense attempt to find an upper bound for it. The latter is also known as robustness verification.
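Schematically, the two approaches sandwich the same inner quantity; here $\delta_{\text{attack}}$ denotes the perturbation found by a concrete attack and $\overline{L}$ a certified bound, both symbols introduced only for illustration:

$$L(\theta, x + \delta_{\text{attack}}, y) \;\leq\; \max_{\delta \in S} L(\theta, x + \delta, y) \;\leq\; \overline{L}(\theta, x, y)$$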

Map

A Map of Adversarial Example Research

Attacks

Adversarial Attacks

Adversarial Example in Object Detection

Defense

Defenses against Adversarial Attacks

Regularization

Adversarial Training

Provable Defenses/Verification

NAS + defense

Defense at Inference

Ensemble

Breach Defense

Evaluation

Explanation

This direction grew out of the robustness analysis of machine learning algorithms, a domain with a long history.

Explanation of Robustness and Adversarial Example

Robustness Analysis

What is Adversarial Example?

Interpretation

These are some empirical discoveries and some distinctive interpretations of adversarial examples and the robustness of models.

Interpretation of Robustness and Adversarial Example

Benchmark

These are some benchmark datasets and some methods proposed to benchmark the performance of defenses.

Benchmark Adversarial Defenses

False positive noise

Images that are meaningless to human eyes but classified with high confidence by classifiers.
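A minimal sketch of one way such images can be produced, assuming a PyTorch image classifier `model` with inputs in [0, 1]; the function name and hyperparameters are illustrative. The idea is gradient ascent from random noise toward high confidence for a chosen class:

```python
import torch
import torch.nn.functional as F

def fooling_image(model, target_class, steps=200, lr=0.1, size=(1, 3, 224, 224)):
    """Ascend the target-class confidence starting from random noise.
    The result typically still looks like noise to humans."""
    x = torch.rand(size, requires_grad=True)
    for _ in range(steps):
        confidence = F.softmax(model(x), dim=1)[0, target_class]
        confidence.backward()
        with torch.no_grad():
            x += lr * x.grad   # increase the target-class confidence
            x.clamp_(0, 1)     # stay in the valid image range
            x.grad.zero_()
    return x.detach()
```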

Adversarial augmentation

This is a very recent direction with potential to grow.

Adversarial Augmentation

