An inequality for variances of the discounted rewards

[...] and
\[
\begin{aligned}
E\int_0^T r_t\,dt
&= E\int_0^\infty r_t\,\mathbf{1}\{T \ge t\}\,dt
 = \int_0^\infty E[r_t\,\mathbf{1}\{T \ge t\}]\,dt \\
&= \int_0^\infty E[r_t]\,E[\mathbf{1}\{T \ge t\}]\,dt
 = \int_0^\infty E\,r_t\,P\{T \ge t\}\,dt \\
&= E\int_0^\infty e^{-\alpha t} r_t\,dt.
\end{aligned}
\]
In particular, (2.1) holds for deterministic functions $r$ and $R$, and, therefore,
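\[
E[J_1 \mid \mathcal{F}_\infty] = E[J_2 \mid \mathcal{F}_\infty] \quad P\text{-a.s.,} \tag{2.2}
\]
if either $E[|J_1| \mid \mathcal{F}_\infty] < \infty$ or $E[|J_2| \mid \mathcal{F}_\infty] < \infty$ $P$-a.s. However, the second moments can be different. Indeed, we have the following statement.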
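Theorem 2.1. If either $E[|J_1| \mid \mathcal{F}_\infty] < \infty$ or $E[|J_2| \mid \mathcal{F}_\infty] < \infty$ $P$-a.s., then
\[
\operatorname{var}(J_1) \le \operatorname{var}(J_2),
\]
and equality holds if and only if $\operatorname{var}(J_2 \mid \mathcal{F}_\infty) = 0$ $P$-a.s.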
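Proof. By the total variance formula (see [6, p. 83] or [3, p. 454]), for $i = 1, 2$,
\[
\operatorname{var}(J_i) = E[\operatorname{var}(J_i \mid \mathcal{F}_\infty)] + \operatorname{var}(E[J_i \mid \mathcal{F}_\infty]).
\]
Therefore, because of (2.2),
\[
\operatorname{var}(E[J_1 \mid \mathcal{F}_\infty]) = \operatorname{var}(E[J_2 \mid \mathcal{F}_\infty]).
\]
In addition, $E[\operatorname{var}(J_1 \mid \mathcal{F}_\infty)] = 0$ and $E[\operatorname{var}(J_2 \mid \mathcal{F}_\infty)] \ge 0$. Hence,
\[
\operatorname{var}(J_2) - \operatorname{var}(J_1) = E[\operatorname{var}(J_2 \mid \mathcal{F}_\infty)] \ge 0,
\]
i.e. $\operatorname{var}(J_1) \le \operatorname{var}(J_2)$. Equality holds if and only if $E[\operatorname{var}(J_2 \mid \mathcal{F}_\infty)] = 0$, which, since $\operatorname{var}(J_2 \mid \mathcal{F}_\infty) \ge 0$, is equivalent to $\operatorname{var}(J_2 \mid \mathcal{F}_\infty) = 0$ $P$-a.s.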
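The inequality can be checked numerically on a toy case. The following Monte Carlo sketch is an illustration only and is not taken from the text: it assumes a reward rate that is constant in time, $r_t \equiv X$ with $X$ uniform on $(0, 1)$, takes $\mathcal{F}_\infty$ to carry the information about $X$, and draws $T$ as an exponential random variable with rate $\alpha$ independent of $X$, so that $P\{T \ge t\} = e^{-\alpha t}$. Under these assumptions $J_1 = X/\alpha$ and $J_2 = XT$. The names `simulate` and `variance` and the values $\alpha = 0.5$, $n = 100{,}000$ are hypothetical choices.

```python
import random


def variance(xs):
    """Population variance of a list of floats."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def simulate(alpha=0.5, n=100_000, seed=0):
    """Sample (J1, J2) pairs for the toy model r_t = X (an assumption)."""
    rng = random.Random(seed)
    j1, j2 = [], []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)   # random reward rate, determined by F_infinity
        t = rng.expovariate(alpha)  # independent exponential horizon T
        j1.append(x / alpha)        # J1 = int_0^inf e^{-alpha t} x dt = x / alpha
        j2.append(x * t)            # J2 = int_0^T x dt = x * T
    return j1, j2


j1, j2 = simulate()
mean_j1 = sum(j1) / len(j1)
mean_j2 = sum(j2) / len(j2)
var_j1 = variance(j1)
var_j2 = variance(j2)

print(f"E[J1]   ~ {mean_j1:.3f}, E[J2]   ~ {mean_j2:.3f}")
print(f"var(J1) ~ {var_j1:.3f}, var(J2) ~ {var_j2:.3f}")
print(f"var(J2) - var(J1) ~ {var_j2 - var_j1:.3f}")
```

In this toy model the two means agree, while the gap $\operatorname{var}(J_2) - \operatorname{var}(J_1)$ equals $E[\operatorname{var}(J_2 \mid \mathcal{F}_\infty)] = E[X^2]/\alpha^2 = 4/3$ for $\alpha = 0.5$, so the sample estimates should land near that value up to Monte Carlo error.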
Example 2.1. Cons...
This note was uploaded on 02/02/2014 for the course AMS 507 taught by Professor E. Feinberg during the Fall '08 term at SUNY Stony Brook.