Getting The Public Invested In Science

FundScience Blog

Welcome to the FundScience Blog. This page was created to bring you the news of our venture by the FundScience team (Category: FundScience News) as well as interesting subjects that are related to education and science. We welcome and encourage comments and discussions on the posted topics. If you are a writer and are interested in posting please contact us. If you are a reader we hope that you sign-up for a feed of our blog and/or a quarterly collection of the published articles in an easy to read and pass to friends PDF format.

We hope you stay with us as we develop this exciting project!
The FundScience Team

Blackwell’s proof of Wald’s identity

04.29.11 by mraginsky

Every once in a while you come across a mathematical argument of such incredible beauty that you feel compelled to tell the whole world about it. This post is about one such gem: David Blackwell’s 1946 proof of Wald’s identity on the expected value of a randomly stopped random walk. In fact, even forty years after the publication of that paper, in a conversation with Morris DeGroot, Blackwell said: “That’s a paper I’m still very proud of. It just gives me pleasant feelings every time I think about it.”

Let {X^\infty = (X_1,X_2,\ldots)} be a sequence of independent real-valued random variables with common mean {\mu}. Let {\tau} be a stopping time, i.e., a random variable taking values in {{\mathbb N}}, such that the occurrence or nonoccurrence of the event {\{\tau=n\}} is determined only by {X^n := (X_1,\ldots,X_n)}. We say that Wald’s identity holds for the pair {(X^\infty,\tau)} if

\displaystyle  \mathop{\mathbb E}\left[ X_1 + \ldots + X_\tau\right] = \mu \cdot \mathop{\mathbb E} \tau.

The first result of this type has appeared in the seminal paper of Abraham Wald on sequential hypothesis testing (hence the name), and it can be stated as follows:

Theorem 1 Assume that {X_1,X_2,\ldots} are i.i.d., {\mathop{\mathbb E} |X_i| = \mathop{\mathbb E} |X_1| < \infty}, and {\mathop{\mathbb E} \tau < \infty}. Then Wald’s identity holds for the pair {(X^\infty,\tau)}.

The standard proof of this theorem uses martingale arguments. In 1946, David Blackwell found a beautiful proof that uses only Kolmogorov‘s Strong Law of Large Numbers (SLLN) and its converse. Since the latter is, perhaps, less well known than the former, I will just give both for convenience (see, e.g., Chapter 12 in Probability With Martingales by David Williams):

  1. SLLN: Let {U_1,U_2,\ldots} be i.i.d. with {\mathop{\mathbb E} |U_i| = \mathop{\mathbb E} |U_1| < \infty} for all {i}. Then

    \displaystyle  	\lim_{n \rightarrow \infty} \frac{U_1 + \ldots + U_n}{n} = \mathop{\mathbb E} U_1 \qquad \text{a.s.}

  2. Converse to SLLN: If {U_1,U_2,\ldots} are i.i.d. with {\mathop{\mathbb E} |U_i| = \mathop{\mathbb E} |U_1| = \infty}, then

    \displaystyle  	\lim_{n \rightarrow \infty} \frac{U_1 + \ldots + U_n}{n} \text{ does not exist a.s.}

Blackwell’s proof

Let us define recursively a sequence of random variables {\tau_1,\tau_2,\ldots} as follows. Let {\tau_1 = \tau}. Then, assuming {\tau_1,\ldots,\tau_k} have been defined, let

\displaystyle  \tau_{k+1} = \tau(X_{\tau_1+\ldots+\tau_k+1}, X_{\tau_1 + \ldots + \tau_k + 2},\ldots).

In other words, {\tau_1} is the stopping time determined by the original full sequence {X^\infty}, {\tau_2} is the stopping time determined by the remaining data after the first stopping time, i.e., by {X^\infty_{\tau_1+1}}, {\tau_3} is the stopping time determined by the data remaining after the second stopping time, i.e., by {X^\infty_{\tau_1+\tau_2 + 1}}, etc. We now claim that {\tau^\infty = (\tau_1,\tau_2,\ldots)} is an i.i.d. sequence. To see this, let us compute the conditional probability that {\tau_{k+1} = n} given {\tau^k = (\tau_1,\ldots,\tau_k)}. Given positive integers {j_1,\ldots,j_k}, define the event

\displaystyle  S = \left\{ \tau_1 = j_1,\ldots,\tau_k = j_k \right\}

as well as the event

\displaystyle  R = \left\{ \tau(X^\infty_{j_1+\ldots +j_k +1}) = n \right\}.

By construction, the occurrence or nonoccurrence of {S} is determined only by {X^{j_1 + \ldots + j_k}}, so {S} is independent of {R}. Therefore,

\displaystyle  \begin{array}{rcl}  	\Pr\Big(\tau_{k+1}= n |\tau_1 = j_1,\ldots,\tau_k = j_k\Big) &=& \frac{\Pr\left(S \cap \left\{ \tau_{k+1} = n\right\}\right)}{\Pr(S)} \\ 	&=& \frac{\Pr(S \cap R)}{\Pr(S)} \\ 	&=& \Pr(R). \end{array}

Moreover, since the {X_i}‘s are i.i.d., {\Pr(R) = \Pr(\tau_1 = n) =: \pi_n}. Thus, we have shown that, for every {k = 1,2,\ldots} and every {n},

\displaystyle  \Pr(\tau_{k+1} = n |\tau_1,\ldots,\tau_k) = \pi_n,

from which we conclude that {\tau_1,\tau_2,\ldots} is an i.i.d. sequence.

Now let us define, for each {k}, the random variable

\displaystyle  S_k = X_{\tau_1 + \ldots + \tau_{k-1} + 1} + \ldots + X_{\tau_1 + \ldots + \tau_k}

(where {S_1 = X_1 + \ldots + X_{\tau_1} \equiv X_1 + \ldots + X_\tau}). Exactly the same argument as above can be used to show that {S_1,S_2,\ldots} are i.i.d.. Now, noting that

\displaystyle  S_1 + \ldots + S_k = X_1 + X_2 + \ldots + X_{\tau_1 + \ldots + \tau_k},

we can write

\displaystyle  	\frac{S_1 + \ldots + S_k}{k} = \frac{X_1 + \ldots + X_{\tau_1 + \ldots + \tau_k}}{k} = \frac{X_1 + \ldots + X_{\tau_1 + \ldots + \tau_k}}{\tau_1 + \ldots + \tau_k} \cdot \frac{\tau_1 + \ldots + \tau_k}{k}.

Let’s see what happens as {k \rightarrow \infty}. First of all, if we let {Z_n := n^{-1}(X_1 + \ldots + X_n)}, then, by the SLLN, {Z_n \rightarrow \mathop{\mathbb E} X_1 =\mu} a.s. as {n \rightarrow \infty}. Since any subsequence of a convergent sequence converges to the same limit, we have

\displaystyle  \lim_{k \rightarrow \infty} \frac{X_1 + \ldots + X_{\tau_1 + \ldots + \tau_k}}{\tau_1 + \ldots + \tau_k} = \lim_{k \rightarrow \infty} Z_{\tau_1 + \ldots + \tau_k} = \mu, \qquad \text{a.s.}

Moreover, since {\tau_1,\tau_2,\ldots} are i.i.d. with mean {\mathop{\mathbb E} \tau}, {k^{-1}(\tau_1 + \ldots + \tau_k) \rightarrow \mathop{\mathbb E}\tau} almost surely as {k \rightarrow \infty}. Therefore,

\displaystyle  \lim_{k \rightarrow \infty} \frac{S_1 + \ldots + S_k}{k} = \mu \cdot \mathop{\mathbb E}\tau \qquad \text{a.s.}

Now we invoke the converse to the SLLN (more precisely, the contrapositive to the converse): Since {S_1,S_2,\ldots} are i.i.d. and the limit of {k^{-1}(S_1+\ldots+S_k)} exists a.s., then {\mathop{\mathbb E} S_1} is finite and equal to this limit. Hence,

\displaystyle  \mathop{\mathbb E} (X_1 + \ldots + X_\tau) = \mathop{\mathbb E} S_1 = \mu \cdot \mathop{\mathbb E} \tau.

The theorem is proved.


| Posted in Mathematics, Probability | Comments Off

Copyright 2010 FundScience LLC