PAC-learning with adversary

By LI Haoyang 2020.11.4 - 2020.11.14 (intermittent reading)

Content

- PAC-Learning
  - Definition
  - VC-dimension
- PAC-learning in the presence of evasion adversaries - NIPS 2018
  - Adversarial agnostic PAC-learning
    - Adversarial Expected Risk (Definition 1)
    - Adversarial Empirical Risk Minimization (ERM) (Definition 2)
    - Lemma 1
    - Learnability (Definition 3)
  - Adversarial VC-dimension and sample complexity
    - Corrupted hypothesis
    - Lemma 2
    - Loss classes
    - Lemma 3
  - Adversarial VC-dimension
    - Equivalent shattering coefficient definitions
    - Adversarial VC-dimension
    - Theorem 1 (Sample complexity upper bound with an evasion adversary)
  - The adversarial VC-dimension of halfspace classifiers
    - Convex constraint on binary adversarial relation
    - Definition 7
    - Theorem 2
  - Adversarial VC dimension can be larger
    - Theorem 3
  - Concluding remarks
- Inspirations

PAC-Learning

Definition

A concept class $\mathcal{C}$ is said to be PAC-learnable if there exists an algorithm $\mathcal{A}$ and a polynomial function $\mathrm{poly}(\cdot,\cdot,\cdot,\cdot)$ such that for any $\epsilon > 0$ and $\delta > 0$, for all distributions $\mathcal{D}$ on $\mathcal{X}$ and for any target concept $c \in \mathcal{C}$, the following holds for any sample size $m \ge \mathrm{poly}(1/\epsilon, 1/\delta, n, \mathrm{size}(c))$:

$$\Pr_{S \sim \mathcal{D}^m}\left[R(h_S) \le \epsilon\right] \ge 1 - \delta$$

If $\mathcal{A}$ further runs in $\mathrm{poly}(1/\epsilon, 1/\delta, n, \mathrm{size}(c))$, then $\mathcal{C}$ is said to be efficiently PAC-learnable. When such an algorithm $\mathcal{A}$ exists, it is called a PAC-learning algorithm for $\mathcal{C}$.

This definition states that as long as the number of samples is large enough, a PAC-learning algorithm can learn a hypothesis whose true risk is smaller than $\epsilon$ with probability at least $1 - \delta$.

VC-dimension

From VC dimension - Wikipedia:

The VC dimension of a model $f$ is the maximum number of points that can be arranged so that $f$ shatters them. More formally, it is the maximum cardinal $D$ such that some data point set of cardinality $D$ can be shattered by $f$.

If a set of points can be shattered by some model, it means that the model can always classify these points correctly under any assignment of labels from $\{0, 1\}$. For a set with $n$ data points, the number of possible assignments of labels is $2^n$.

The VC dimension of a hypothesis (model) is the largest number (cardinality, i.e. $|S|$) of elements in a set $S$ that can be shattered by this hypothesis.
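As a concrete illustration, here is a brute-force shattering check on a toy family (a sketch; the 1-D threshold classifiers and the point sets are my own illustrative choices, not an example from the paper):

```python
def shatters(hypotheses, points):
    """Check whether the hypothesis family achieves all 2^n labelings of `points`."""
    achieved = {tuple(h(x) for x in points) for h in hypotheses}
    return len(achieved) == 2 ** len(points)

# Illustrative family: 1-D threshold classifiers h_t(x) = 1[x >= t].
thresholds = [t / 2 for t in range(-10, 11)]
H = [lambda x, t=t: int(x >= t) for t in thresholds]

print(shatters(H, [0.0]))        # True: a single point can be shattered
print(shatters(H, [0.0, 1.0]))   # False: the labeling (1, 0) is unachievable
# The VC-dimension of 1-D thresholds is therefore 1.
```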

PAC-learning in the presence of evasion adversaries - NIPS 2018

Paper: https://dl.acm.org/doi/10.5555/3326943.3326965

Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal. PAC-learning in the presence of evasion adversaries. NIPS 2018. arXiv:1806.01471

In this paper, we step away from the attack-defense arms race and seek to understand the limits of what can be learned in the presence of an evasion adversary.

They are interested in the effect of an adversary on sample complexity, i.e.

If we add an adversary to the learning setting defined in Definition 3, what happens to the gap in performance between the optimal classifier and the learned classifier?

We explicitly compute the adversarial VC-dimension for halfspace classifiers with adversaries with standard $\ell_p$-distance constraints on adversarial perturbations, and show that it matches the standard VC-dimension.

This implies that the sample complexity of PAC-learning does not increase in the presence of an adversary.

We also show that this is not always the case by constructing hypothesis classes where the adversarial VC-dimension is arbitrarily larger or smaller than the standard one.

Adversarial agnostic PAC-learning

They extend PAC-learning to the setting where an evasion adversary is present.

Symbol definitions:

The learning problem is as follows.

There is an unknown $P \in \mathcal{P}(X \times C)$. The learner receives $n$ labeled training examples drawn i.i.d. from $P$ and must select a hypothesis $h \in \mathcal{H}$.

The set $C^X$ (equivalently, the power set of $X$, since $C = \{0, 1\}$) denotes the set of functions relating $X$ to $C$. A labeled training dataset is then a sampled relation.

The adversary receives a labeled natural example $(x, c)$ and selects $\tilde{x} \in N(x)$, i.e. from the set of adversarial examples in the neighborhood of $x$.

The adversary gives $\tilde{x}$ to the learner and the learner must estimate $c$.

$\mathcal{P}(X \times C)$ is the set of probability measures on $X \times C$; the underlying $\sigma$-algebra is the set of events.

$R \subseteq X \times X$ is the binary nearness relation that generates the neighborhood $N(x) = \{\tilde{x} \in X : (x, \tilde{x}) \in R\}$ of possible adversarial examples. $N(x)$ is required to be nonempty so that some choice of $\tilde{x}$ is always available.

Adversarial Expected Risk (Definition 1)

The learner's risk under the true distribution $P$ in the presence of an adversary constrained by the relation $R$ is

$$L_R(P, h) = \mathbb{E}_{(x, c) \sim P}\left[\sup_{\tilde{x} \in N(x)} \mathbf{1}[h(\tilde{x}) \ne c]\right]$$

The risk is minimized when even the worst-case adversarial example can still be handled by the hypothesis $h$.

Let $L_R^*(P, \mathcal{H}) = \inf_{h \in \mathcal{H}} L_R(P, h)$. Then, learning is possible if there is an algorithm that, with high probability, gives us $\hat{h}$ such that $L_R(P, \hat{h}) \le L_R^*(P, \mathcal{H}) + \epsilon$.

The inaccessible true distribution is approximated, for the learner, by the empirical distribution of the training samples.

Adversarial Empirical Risk Minimization (ERM) (Definition 2)

The adversarial empirical risk minimizer is defined as

$$\hat{h} = \operatorname*{arg\,min}_{h \in \mathcal{H}} L_R(\hat{P}_n, h)$$

where $L_R(\hat{P}_n, h)$ is the expected loss under the empirical distribution.

$\hat{P}_n$ is the empirical distribution of the $n$ training samples drawn from $P$.

AERM selects the best hypothesis under the empirical distribution.
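A minimal sketch of Definitions 1 and 2 together, assuming a toy budget-1 neighborhood on the integers (the function names and the example data are mine, not the paper's notation):

```python
def adversarial_risk(h, samples, N):
    """Empirical L_R: for each (x, c), the adversary picks the worst point in N(x)."""
    return sum(max(1 if h(y) != c else 0 for y in N(x))
               for x, c in samples) / len(samples)

def aerm(H, samples, N):
    """Adversarial Empirical Risk Minimization: brute-force argmin over a finite H."""
    return min(H, key=lambda h: adversarial_risk(h, samples, N))

# Toy setting: integer inputs, adversary may shift x by at most 1 (one choice of R).
N = lambda x: [x - 1, x, x + 1]
H = [lambda x, t=t: int(x >= t) for t in range(-5, 6)]  # threshold classifiers
samples = [(-5, 0), (-2, 0), (0, 1), (3, 1)]
h_hat = aerm(H, samples, N)
print(adversarial_risk(h_hat, samples, N))
# 0.25: x = -2 and x = 0 are too close for a budget-1 adversary to allow zero risk.
```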

Lemma 1

Let $\mathcal{A}$ be a learning algorithm for a hypothesis class $\mathcal{H}$. Suppose $R$ and $R'$ are nearness relations with $R \subseteq R'$.

For all $P \in \mathcal{P}(X \times C)$ and all $h \in \mathcal{H}$,

$$L_R(P, h) \le L_{R'}(P, h)$$

In other words, if we design a learning algorithm for an adversary constrained by $R'$, its performance against a weaker adversary constrained by $R \subseteq R'$ is at least as good.

This is natural by definition.

Learnability (Definition 3)

A hypothesis class $\mathcal{H}$ is learnable by empirical risk minimization in the presence of an evasion adversary constrained by $R$ if there is a function $n_{\mathcal{H}} : (0, 1)^2 \to \mathbb{N}$ (the sample complexity) with the following property.

For all $\epsilon, \delta \in (0, 1)$, all $P \in \mathcal{P}(X \times C)$, and all $n \ge n_{\mathcal{H}}(\epsilon, \delta)$,

$$\Pr\left[L_R(P, \hat{h}) - \inf_{h \in \mathcal{H}} L_R(P, h) \le \epsilon\right] \ge 1 - \delta$$

When the number of samples exceeds $n_{\mathcal{H}}(\epsilon, \delta)$, the probability that the excess risk of the AERM hypothesis over the best hypothesis in the class is at most $\epsilon$ is at least $1 - \delta$.

Adversarial VC-dimension and sample complexity

Corrupted hypothesis

The presence of the adversary forces us to learn using a corrupted set of hypotheses.

The so-called corrupted hypothesis introduces a new output $\bot$ that means "always wrong".

With the new label set $\tilde{C} = C \cup \{\bot\} = \{0, 1, \bot\}$, the information in $h$ and $R$ can be combined into a single function by defining a mapping $\kappa_R : C^X \to \tilde{C}^X$ as follows:

$$(\kappa_R(h))(x) = \begin{cases} c & \text{if } h(\tilde{x}) = c \text{ for all } \tilde{x} \in N(x) \\ \bot & \text{otherwise} \end{cases}$$

This mapping maps the original hypothesis to a hypothesis over the corrupted label set. If two points in the neighborhood $N(x)$ of $x$ are classified into different labels, then $x$ is labeled as corrupted.

The corrupted set of hypotheses is then $\kappa_R(\mathcal{H}) = \{\kappa_R(h) : h \in \mathcal{H}\}$.

Learning an ordinary hypothesis in the presence of an adversary is equivalent to learning a corrupted hypothesis without an adversary.

Lemma 2

For any nearness relation $R$ and distribution $P$,

$$L_R(P, h) = L_I(P, \kappa_R(h))$$

where $I = \{(x, x) : x \in X\}$ is the identity relation.

Proof

Let $\tilde{h} = \kappa_R(h)$. For all $(x, c) \in X \times C$,

$$\sup_{\tilde{x} \in N(x)} \mathbf{1}[h(\tilde{x}) \ne c] = \mathbf{1}[\tilde{h}(x) \ne c]$$

Taking the expectation over $(x, c) \sim P$ yields Lemma 2, i.e. the two risks coincide.

Lemma 2 states that the risk of a hypothesis in the presence of an adversary bounded by the relation $R$ is equal to the risk of the corresponding corrupted hypothesis (constructed via $R$) against an "adversary" that is not capable of changing the data point: under the identity relation there is only one point in each neighborhood.
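A small self-contained check of Lemma 2, assuming the same toy budget-1 adversary as above (`corrupt` implements $\kappa_R$; the string `"bot"` stands in for $\bot$):

```python
def corrupt(h, N):
    """kappa_R: map h to its corrupted version over the label set {0, 1, "bot"}."""
    def h_tilde(x):
        values = {h(y) for y in N(x)}          # labels h assigns inside the neighborhood
        return values.pop() if len(values) == 1 else "bot"
    return h_tilde

def risk(h, samples):
    """Adversary-free 0-1 risk; the label "bot" never matches a true label."""
    return sum(1 if h(x) != c else 0 for x, c in samples) / len(samples)

def adversarial_risk(h, samples, N):
    """L_R under the empirical distribution: worst-case loss over each neighborhood."""
    return sum(max(1 if h(y) != c else 0 for y in N(x))
               for x, c in samples) / len(samples)

# Budget-1 adversary on the integers; check Lemma 2 for every threshold classifier.
N = lambda x: [x - 1, x, x + 1]
samples = [(-4, 0), (-1, 0), (0, 1), (3, 1)]
for t in range(-6, 7):
    h = lambda x, t=t: int(x >= t)
    assert adversarial_risk(h, samples, N) == risk(corrupt(h, N), samples)
print("Lemma 2 holds on this toy example")
```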

Loss classes

The loss classes (compositions of the loss with the classifier functions) derived from $\mathcal{H}$ and $\kappa_R(\mathcal{H})$ respectively are defined as:

$$\mathcal{F} = \mu(\mathcal{H}), \qquad \tilde{\mathcal{F}} = \mu(\kappa_R(\mathcal{H}))$$

In which, $\mu$ maps a hypothesis to its loss function, i.e.

$$(\mu(h))(x, c) = \ell(h(x), c)$$

And the 0-1 loss $\ell$ is defined as:

$$\ell(\hat{c}, c) = \mathbf{1}[\hat{c} \ne c]$$

If the predicted label of the hypothesis on a data point is not correct, then the loss is 1, otherwise 0. Note that the corrupted label $\bot$ never matches a true label, so it always incurs loss 1.

An element of a loss class is a function that measures the error of one hypothesis on each labeled example.

Lemma 3

Let $\hat{h}$ be the output of adversarial empirical risk minimization on $n$ samples. With probability $1 - \delta$,

$$L_R(P, \hat{h}) - \inf_{h \in \mathcal{H}} L_R(P, h) \le 2 \mathfrak{R}_n(\tilde{\mathcal{F}}) + \sqrt{\frac{2 \log(2/\delta)}{n}}$$

where $\mathfrak{R}_n(\tilde{\mathcal{F}})$ is the Rademacher complexity of the corrupted loss class (see the paper for the exact constants):

$$\mathfrak{R}_n(\tilde{\mathcal{F}}) = \mathbb{E}\left[\sup_{f \in \tilde{\mathcal{F}}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i, c_i)\right]$$

This lemma states that the error rate of the corrupted hypothesis corresponding to the Adversarial Empirical Risk Minimizer, minus the infimum of the error rates over the corrupted hypothesis class, is upper bounded.
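Since the bound is stated in terms of the Rademacher complexity, here is a Monte-Carlo sketch of the empirical Rademacher complexity of a finite loss class (the loss vectors below are made-up illustrative values, not data from the paper):

```python
import random

def empirical_rademacher(loss_vectors, trials=2000, seed=0):
    """Monte-Carlo estimate of R_n = E_sigma[ sup_f (1/n) sum_i sigma_i f(z_i) ].
    Each row of loss_vectors is (f(z_1), ..., f(z_n)) for one function f in the class."""
    rng = random.Random(seed)
    n = len(loss_vectors[0])
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(sum(s * v for s, v in zip(sigma, row)) / n for row in loss_vectors)
    return total / trials

# Hypothetical 0-1 loss vectors of three corrupted hypotheses on n = 4 samples.
print(empirical_rademacher([(0, 0, 1, 1), (1, 0, 0, 1), (1, 1, 0, 0)]))
```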

Adversarial VC-dimension

Equivalent shattering coefficient definitions

The shattering coefficient of a family of binary classifiers $\mathcal{H} \subseteq C^X$ is

$$\sigma(\mathcal{H}, n) = \max_{(x_1, \ldots, x_n) \in X^n} \left|\{(h(x_1), \ldots, h(x_n)) : h \in \mathcal{H}\}\right|$$

The alternative definition of shattering in terms of the loss class is

$$\sigma(\mu(\mathcal{H}), n) = \max_{((x_1, c_1), \ldots, (x_n, c_n))} \left|\{((\mu(h))(x_1, c_1), \ldots, (\mu(h))(x_n, c_n)) : h \in \mathcal{H}\}\right|$$

These two definitions are equivalent, as shown in the paper.

The ordinary VC-dimension is then

$$\mathrm{VC}(\mathcal{H}) = \sup\{n : \sigma(\mathcal{H}, n) = 2^n\}$$

Adversarial VC-dimension

The adversarial VC-dimension is

$$\mathrm{AVC}(\mathcal{H}, R) = \mathrm{VC}(\mu(\kappa_R(\mathcal{H})))$$

i.e. the ordinary VC-dimension of the loss class of the corrupted hypotheses.
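A brute-force sketch of this definition: build the corrupted hypotheses, then check whether their loss class achieves all error patterns (again using my toy threshold family and budget-1 adversary; a brute-force search can only certify shattering for the specific points and hypotheses it tries):

```python
def corrupt(h, N):
    """kappa_R: h_tilde(x) is h's common value on N(x), or "bot" if h is not constant there."""
    def h_tilde(x):
        values = {h(y) for y in N(x)}
        return values.pop() if len(values) == 1 else "bot"
    return h_tilde

def loss_class_shatters(H, N, labeled_points):
    """Do the corrupted hypotheses achieve all 2^n error patterns on these labeled points?"""
    patterns = {tuple(1 if corrupt(h, N)(x) != c else 0 for x, c in labeled_points)
                for h in H}
    return len(patterns) == 2 ** len(labeled_points)

# 1-D thresholds against a budget-1 adversary, over a finite grid of thresholds:
N = lambda x: [x - 1, x, x + 1]
H = [lambda x, t=t: int(x >= t) for t in range(-10, 11)]
print(loss_class_shatters(H, N, [(0, 1)]))           # True: one labeled point is shattered
print(loss_class_shatters(H, N, [(-5, 0), (5, 1)]))  # False: the pattern (1, 1) is unachievable
```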

These definitions and lemmas can now be combined to obtain a sample complexity upper bound for PAC-learning in the presence of an evasion adversary.

Theorem 1 (Sample complexity upper bound with an evasion adversary)

For a space $X$, a classifier family $\mathcal{H}$, and an adversarial constraint $R$, there is a universal constant $K$ such that

$$n_{\mathcal{H}}(\epsilon, \delta) \le K \cdot \frac{d + \log(1/\delta)}{\epsilon^2}$$

where $d = \mathrm{AVC}(\mathcal{H}, R)$.

This states that whenever the adversarial VC-dimension is finite, the number of samples required for learning in the presence of an adversary is finite; it scales with $\mathrm{AVC}(\mathcal{H}, R)$ exactly as the standard sample complexity scales with the ordinary VC-dimension.

The adversarial VC-dimension of halfspace classifiers

A halfspace classifier is a binary classifier whose decision boundary is a hyperplane.

Convex constraint on binary adversarial relation

Let $B \subseteq \mathbb{R}^d$ be a nonempty, closed, convex, origin-symmetric set.

The seminorm derived from $B$ is $\|x\|_B = \inf\{t \ge 0 : x \in tB\}$, and the associated distance is $d_B(x, y) = \|y - x\|_B$.

Let $V$ be the largest linear subspace contained in $B$.

The adversarial constraint derived from $B$ is $R = \{(x, \tilde{x}) \in X \times X : d_B(x, \tilde{x}) \le 1\}$, or equivalently $N(x) = x + B$.

This definition of $R$ encompasses all bounded adversaries (e.g. the usual $\ell_p$-ball constraints), as long as $B$ is taken to be the adversary's scaled perturbation set; when $B$ is bounded, $V = \{0\}$.
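A sketch of the seminorm $\|x\|_B$ as a Minkowski functional, computed by bisection from a membership oracle for $B$ (the oracle interface `in_B` is my own assumption, not notation from the paper):

```python
def seminorm_B(x, in_B, t_max=1e6, iters=60):
    """||x||_B = inf{t >= 0 : x in t*B}, by bisection.
    Requires B convex and origin-symmetric, so membership in t*B is monotone in t."""
    if not in_B([xi / t_max for xi in x]):
        raise ValueError("x is not in t*B for any t <= t_max")
    lo, hi = 0.0, t_max
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid > 0 and in_B([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi

# For B the unit l_inf ball, the derived seminorm is just the l_inf norm:
in_B = lambda v: max(abs(vi) for vi in v) <= 1
print(seminorm_B([3.0, -0.5], in_B))  # ~3.0
```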

Definition 7

Let $\mathcal{H}$ be a family of classifiers on $X$.

For an example $x \in X$ and a classifier $h \in \mathcal{H}$, define the signed distance to the decision boundary to be

$$s(x, h) = \begin{cases} +\inf\{d_B(x, y) : h(y) \ne h(x)\} & \text{if } h(x) = 1 \\ -\inf\{d_B(x, y) : h(y) \ne h(x)\} & \text{if } h(x) = 0 \end{cases}$$

For a list of examples $(x_1, \ldots, x_n)$, define the signed distance set to be

$$S(x_1, \ldots, x_n) = \{(s(x_1, h), \ldots, s(x_n, h)) : h \in \mathcal{H}\}$$

Let $\mathcal{H}$ be the family of halfspace classifiers, i.e.

$$\mathcal{H} = \{h_{w,b} : w \in \mathbb{R}^d, b \in \mathbb{R}\}, \qquad h_{w,b}(x) = \mathbf{1}[\langle w, x \rangle + b \ge 0]$$

It is well known that the VC-dimension of this family is $d + 1$.
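For intuition, under the ordinary $\ell_2$ distance (a special case of $d_B$) the signed distance for a halfspace classifier has the closed form $s(x, h_{w,b}) = (\langle w, x \rangle + b)/\|w\|_2$; a small sketch:

```python
import math

def signed_distance(x, w, b):
    """Signed l_2 distance from x to the hyperplane {y : <w, y> + b = 0}.
    Positive on the side the halfspace classifier labels 1, negative on the 0 side."""
    dot = sum(wi * xi for wi, xi in zip(w, x)) + b
    return dot / math.sqrt(sum(wi * wi for wi in w))

# Two parameterizations of the same decision boundary give the same signed distance:
print(signed_distance([2.0, 0.0], [1.0, 0.0], -1.0))  # 1.0
print(signed_distance([2.0, 0.0], [3.0, 0.0], -3.0))  # 1.0 (scaled w and b)
```

Note the two calls agree even though $(w, b)$ differ by a scale factor: the signed distance set depends only on the decision boundary, which is what makes it useful for counting the achievable behaviors of degraded halfspace classifiers.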

Theorem 2

Let $\mathcal{H}$ be the family of halfspace classifiers on $\mathbb{R}^d$.

Let $B \subseteq \mathbb{R}^d$ be a nonempty, closed, convex, origin-symmetric set.

Let $R = \{(x, \tilde{x}) : d_B(x, \tilde{x}) \le 1\}$.

Then $\mathrm{AVC}(\mathcal{H}, R) = d + 1 - \dim V$.

In particular, when $B$ is a bounded ball, $V = \{0\}$, giving $\mathrm{AVC}(\mathcal{H}, R) = d + 1 = \mathrm{VC}(\mathcal{H})$.

The proof is in the original paper.

This theorem states that the VC-dimension of a halfspace classifier in the presence of a bounded adversary is the same as that of a normal halfspace classifier.

Adversarial VC dimension can be larger

We have shown in the previous section that the adversarial VC-dimension can be smaller than or equal to the standard VC-dimension.

Theorem 3

For any $n \in \mathbb{N}$, there is a space $X$, an adversarial constraint $R$, and a hypothesis class $\mathcal{H}$ such that $\mathrm{VC}(\mathcal{H}) = 1$ and $\mathrm{AVC}(\mathcal{H}, R) \ge n$.

Proof

Let $X = \mathbb{R}^n$. Let $\mathcal{H} = \{h_u : u \in X\}$, where

$$h_u(x) = \mathbf{1}[x = u]$$

The VC dimension of this family is 1 because no classifier outputs the labeling $(1, 1)$ for any pair of distinct examples.

This family of hypotheses checks whether the input equals some designated point, hence it can only shatter a set of one point.

Consider the adversary with an $\ell_\infty$ budget of 1. No degraded classifier will ever output 1, only 0 and $\bot$.

This is because for a degraded classifier to output 1 at $x$, the whole neighborhood of $x$ would have to be predicted as 1, but only the single designated point can be predicted as 1 while each neighborhood contains more than one point.

Take $x_i = 2e_i$ for $i = 1, \ldots, n$.

Now consider the classifiers that are the indicators for the points $u_a = a$ with $a \in \{0, 1\}^n$.

Observe that

$$\|x_i - u_a\|_\infty = \begin{cases} 1 & \text{if } a_i = 1 \\ 2 & \text{if } a_i = 0 \end{cases}$$

because if $a_i = 1$, then $\|x_i - u_a\|_\infty = \max(|2 - 1|, \max_{j \ne i} |a_j|) = 1$, but if $a_i = 0$ then $\|x_i - u_a\|_\infty \ge |2 - 0| = 2$.

Thus $(\kappa_R(h_{u_a})(x_1), \ldots, \kappa_R(h_{u_a})(x_n))$ contains $\bot$ at exactly the indices $i$ at which $a$ contains a 1. If the examples are all labeled with 0, this subset of the degraded classifier family achieves all $2^n$ possible error patterns.

The adversarial VC-dimension is therefore at least $n$.
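A brute-force check of this construction (a sketch: `degraded_output` hard-codes the fact that an indicator hypothesis can never be constant 1 on an infinite $\ell_\infty$ ball, and the names are mine):

```python
from itertools import product

def linf(u, x):
    return max(abs(ui - xi) for ui, xi in zip(u, x))

def degraded_output(u, x):
    """kappa_R(h_u)(x) for h_u = 1[. == u] under an l_inf adversary with budget 1.
    h_u is non-constant on the ball N(x) iff u lies inside it (N(x) is infinite, so
    it always contains points other than u), which yields "bot"; otherwise h_u is
    identically 0 on N(x)."""
    return "bot" if linf(u, x) <= 1 else 0

n = 4
examples = [tuple(2 * (j == i) for j in range(n)) for i in range(n)]  # x_i = 2 e_i

# For every pattern a in {0,1}^n, the indicator of u_a = a gives "bot" exactly at
# the indices with a_i = 1 (distance 1 there, distance 2 elsewhere).
patterns = set()
for a in product((0, 1), repeat=n):
    outputs = [degraded_output(a, x) for x in examples]
    patterns.add(tuple(1 if out != 0 else 0 for out in outputs))  # errors when labels are all 0
assert len(patterns) == 2 ** n
print("all", 2 ** n, "error patterns achieved: AVC >=", n)
```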

This example shows that an adversary can not only affect the optimal loss, as shown in Lemma 1, but can also slow the convergence rate to the optimal hypothesis.

Concluding remarks

While our results provide a useful theoretical understanding of the problem of learning with adversaries, the nature of the 0-1 loss prevents the efficient implementation of Adversarial ERM to obtain robust classifiers.

In practice, recent work on adversarial training [33, 52, 75] has sought to improve the robustness of classifiers by directly trying to find a classifier that minimizes the Adversarial Expected Risk, which leads to a saddle point problem [52].

A number of heuristics are used to enable the efficient solution of this problem, such as replacing the 0-1 loss with smooth surrogates like the logistic loss and approximating the inner maximum by a Projected Gradient Descent (PGD)-based adversary [52] or by an upper bound [63].

Our framework now allows for an analysis of the underlying PAC learning problem for these approaches.

Inspirations

This paper is purely theoretical, with no supporting experiments.

They propose a framework that incorporates an adversary into PAC-learning, which is meaningful but so far not especially useful in practice.

They prove that for halfspace classifiers, the VC-dimension in the presence of an adversary is less than or equal to that without an adversary, but this relation varies across different families of classifiers.

This means that for a halfspace classifier, the sample complexity required in the presence of an adversary is less than or equal to that required without an adversary. In other words, an adversary may improve the performance of a halfspace classifier when only a limited number of data points is available. (Is this understanding correct?)

If this understanding is correct, then a proper adversary (unbounded, since with a bounded one the VC-dimension stays the same) could improve the performance of few-shot learning.