ing randomized stationary and switching stationary policies [2, Theorem 5.1]. However,
the variances of the total discounted rewards for the policies can be different. In addition, they
may depend on the definition of discounting.

Received 6 July 2009; revision received 5 October 2009.
∗ Postal address: Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794,
∗∗ Email address: [email protected]
∗∗∗ Email address: [email protected]

E. A. FEINBERG AND J. FEI

2. Main result
Let $(\Omega, \mathcal{F}, P)$ be a probability space with a filtration $\mathcal{F}_t$, $t \in [0, \infty)$, where $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$
for all $0 \le s < t < \infty$. Consider a nondecreasing sequence of stopping times $T_n$, $n = 1, 2, \ldots$.
We consider an $\mathcal{F}_t$-adapted stochastic process $r_t$, $t \in [0, \infty)$, and an $\mathcal{F}_{T_n}$-adapted stochastic
sequence $R_n$, $n = 1, 2, \ldots$. The process $r_t$ can be interpreted as the reward rate at time $t$. In
addition, a lump sum $R_n$ is collected at time $T_n$.
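
To make the reward model concrete, the following Python sketch accumulates a reward rate and lump sums along one sample path and estimates the variance of the resulting total over many paths. It is only an illustration under stated assumptions: this excerpt does not give the discounting formula, so the sketch assumes ordinary exponential discounting with a hypothetical rate `alpha`, a finite `horizon` in place of the infinite time axis, and Poisson arrival times for the lump sums. The function names are likewise hypothetical.

```python
import math
import random
import statistics


def discounted_total(alpha, reward_rate, jump_times, lump_sums, horizon, dt=1e-3):
    """Approximate the discounted total reward on [0, horizon].

    ASSUMPTION: exponential discounting, i.e. the integral of
    exp(-alpha*t) * r_t dt plus the sum of exp(-alpha*T_n) * R_n.
    The integral is approximated with a midpoint Riemann sum of step dt.
    """
    steps = int(round(horizon / dt))
    continuous = sum(
        math.exp(-alpha * (k + 0.5) * dt) * reward_rate((k + 0.5) * dt) * dt
        for k in range(steps)
    )
    lumps = sum(
        math.exp(-alpha * t) * r
        for t, r in zip(jump_times, lump_sums)
        if t <= horizon
    )
    return continuous + lumps


def poisson_times(rate, horizon, rng):
    """Sample increasing arrival times T_1 < T_2 < ... on [0, horizon].

    ASSUMPTION: Poisson arrivals, chosen only to have something to simulate.
    """
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return times
        times.append(t)


def estimate_variance(alpha, n_paths=200, horizon=20.0, seed=0):
    """Monte Carlo estimate of the variance of the discounted total."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_paths):
        times = poisson_times(1.0, horizon, rng)
        lumps = [rng.uniform(0.0, 2.0) for _ in times]  # hypothetical lump sums
        totals.append(
            discounted_total(alpha, lambda t: 1.0, times, lumps, horizon, dt=0.01)
        )
    return statistics.pvariance(totals)
```

Calling `estimate_variance(0.5)` returns a sample variance for this toy model. With a constant reward rate, all of the randomness comes from the lump sums and their arrival times.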
There are two n...