Next: The Convergence of the Up: Convergence Theorems in Generalized Previous: Convergence Theorems in Generalized

## The Convergence of a General Value Iteration Process

Let be an arbitrary state-space and denote by the set of value functions over (i.e., the set of bounded functions), and let be an arbitrary contraction mapping with (unique) fixed point .

Let be a sequence of stochastic operators. The second argument of is intended to modify the first one, in order to get a better approximation of . Formally, let be an arbitrary value function and let . is said to approximate at with probability one over , if uniformly over .

Theorem A.1 (Szepesvári and Littman)   Let the sequence of random operators approximate at with probability one uniformly over . Let be an arbitrary value function, and define . If there exist functions and satisfying the conditions below with probability one, then converges to with probability one uniformly over :
1. for all and all ,

2. for all and all ,

3. for all , converges to zero uniformly in as increases; and,
4. there exists such that for all and large enough ,

The proof can be found in [Szepesvári and Littman(1996)]. We cite here the lemma, which is the base of the proof, since our generalization concerns this lemma.

Lemma A.2   Let be an arbitrary set, and consider the sequence

where , with probability one for some , and for all , and w.p.1. Assume that for all , uniformly in w.p.1 and w.p.1. Then converges to 0 w.p.1 as well.

Next: The Convergence of the Up: Convergence Theorems in Generalized Previous: Convergence Theorems in Generalized