Sampling from Near-identical Distributions

In this section we prove a technical lemma that is needed for the proof of Lemma 3.4.

Proof. Let us define the sets

$\displaystyle X^+$	$\displaystyle :=$	$\displaystyle \{ x\in X \mid P(x) > Q(x) \}$ and
$\displaystyle X^-$	$\displaystyle :=$	$\displaystyle \{ x\in X \mid P(x) \le Q(x) \}.$

Furthermore, define the distributions¹¹

$\displaystyle R(x)$	$\displaystyle :=$	$\begin{displaymath}\begin{cases} \frac{Q(x)-P(x)}{ Q(X^-) - P(X^-)} & \text{if $x\in X^-$}, \\ 0 & \text{otherwise} \end{cases}\text{ and}\end{displaymath}$
$\displaystyle S(x)$	$\displaystyle :=$	$\begin{displaymath}\begin{cases} 0 & \text{if $x\in X^+$, with probability $Q(p)... ...(P(p)-Q(p))/P(X^+)$}, \\ 0 & \text{if $x\in X^-$}. \end{cases}\end{displaymath}$

The denominator in the definition of

is positive, because for all $x \in X^-$ ,

is non-negative, therefore

is zero only if

over

. In this case

over

by the same reasoning, which means $P \equiv Q$ , but

and

are different.

It is easy to check that is indeed a distribution over (with support set ).

Let and be independent random variables from distributions and , respectively. Now define as follows:

$\displaystyle q := \begin{cases}p & \text{if $p \in X^-$}, \\ p & \text{if $p \in X^+$\ and $s=0$}, \\ r & \text{if $p \in X^+$\ and $s=1$}, \\ \end{cases}$

(13)

We claim that the such defined is suitable. Indeed, by Equation 13

$\displaystyle \Pr(p\!\ne\!q)$	$\displaystyle =$	$\displaystyle \Pr(p \in X^+, s=1) = \sum_{x \in X^+}\Pr(p=x, s=1)$
	$\displaystyle =$	$\displaystyle \sum_{x \in X^+} \Pr(p = x) \Pr(s=1 \vert p=x)$
	$\displaystyle =$	$\displaystyle \sum_{x \in X^+} P(x) \cdot \frac{P(x)-Q(x)}{P(x)} \le \Vert P - Q \Vert _{L_1} \le \varepsilon .$

Furthermore,

$\displaystyle \Pr(q\!=\!x)$	$\displaystyle =$	$\displaystyle \Pr(q=x \vert p\in X^-) \Pr(p\in X^-)$
		$\displaystyle + \Pr(q=x \vert p\in X^+, s=0) \Pr(p\in X^+, s=0)$
		$\displaystyle + \Pr(q=x \vert p\in X^+, s=1) \Pr(p\in X^+, s=1)$
	$\displaystyle =$	$\displaystyle \Pr(p=x \vert p\in X^-) \Pr(p\in X^-)$
		$\displaystyle + \Pr(p=x \vert p\in X^+, s=0) \Pr(p\in X^+, s=0)$
		$\displaystyle + \Pr(r=x \vert p\in X^+, s=1) \Pr(p\in X^+, s=1).$

By applying Bayes' Theorem on each term, we get

$\displaystyle \Pr(q\!=\!x)$	$\displaystyle =$	$\displaystyle \Pr(p\in X^- \vert p=x ) \Pr(p=x)$	(14)
		$\displaystyle + \Pr(p\in X^+, s=0 \vert p=x) \Pr(p=x)$
		$\displaystyle + \Pr(p\in X^+, s=1 \vert r=x) \Pr(r=x).$

We calculate each term of Equation 14 separately, both for $x\in X^+$ and $x \in X^-$ . First assume that $x \in X^-$ . Then

$\displaystyle \Pr(p\in X^- \vert p=x ) \Pr(p=x)$	$\displaystyle =$	$\displaystyle 1 \cdot P(x),$
$\displaystyle \Pr(p\in X^+, s=0 \vert p=x) \Pr(p=x)$	$\displaystyle =$	$\displaystyle \Pr(p\in X^+ \vert p=x) \Pr(s=0 \vert p\in X^+, p=x) P(x)$
	$\displaystyle =$	$\displaystyle 0 \cdot \Pr(s=0 \vert p\in X^+, p=x) P(x) = 0,$ and
$\displaystyle \Pr(p\in X^+, s=1 \vert r=x) \Pr(r=x)$	$\displaystyle =$	$\displaystyle \Pr(s=1 \vert p\in X^+, r=x) \Pr(p\in X^+ \vert r=x) \Pr(r=x)$
	$\displaystyle =$	$\displaystyle \Pr(s=1 \vert p\in X^+) \Pr(p\in X^+) \Pr(r=x)$
	$\displaystyle =$	$\displaystyle \frac{P(X^+) - Q(X^+)}{P(X^+)} \cdot P(X^+) \frac{Q(x)-P(x)}{Q(X^-)-P(X^-)}$
	$\displaystyle =$	$\displaystyle Q(x)-P(x).$

Before the last equation we used the definition of

, and in the last equation we have used the equality

, which is true because

Now let us consider the case $x\in X^+$ .

$\displaystyle \Pr(p\in X^- \vert p=x ) \Pr(p=x)$	$\displaystyle =$	$\displaystyle 0 \cdot P(x) = 0,$
$\displaystyle \Pr(p\in X^+, s=0 \vert p=x) \Pr(p=x)$	$\displaystyle =$	$\displaystyle \Pr(p\in X^+ \vert p=x) \Pr(s=0 \vert p\in X^+, p=x) P(x)$
	$\displaystyle =$	$\displaystyle 1 \cdot \frac{Q(x)}{P(x)} P(x) = Q(x),$ and
$\displaystyle \Pr(p\in X^+, s=1 \vert r=x) \Pr(r=x)$	$\displaystyle =$	$\displaystyle \Pr(p\in X^+, s=1 \vert r=x) \cdot 0 = 0.$

In both cases we get $\Pr(q=x) = Q(x)$ , which was to be proven. $\qedsymbol$