Chapter 0: Discrete Time Dynamic Programming

0.1 The Finite Horizon Case

Time is discrete and indexed by $t = 0, 1, \ldots, T$, where $T < \infty$. An individual is interested in maximizing an objective function given by
$$E_0 \sum_{t=0}^{T} \beta^t u(x_t, a_t), \tag{0.1}$$
where the instantaneous return function (or utility function) $u$ depends on a state variable $x_t$ and a control variable $a_t$ at each date. We assume that $x_t \in X \subset \mathbb{R}^m$, $a_t \in A \subset \mathbb{R}^n$ for all $t$, $\beta \in (0,1)$, and $E_0$ denotes the expectation conditional on information available at date 0. The nature of the uncertainty in the environment will be made explicit below.

There are several constraints faced by the individual. At each date the control variable is constrained to belong to a set that may depend on the state: $a_t \in \Gamma(x_t)$ for all $t$. The initial value of the state $x_0$ is given by nature. Future states evolve according to the transition equation
$$x_{t+1} = f(x_t, a_t, \varepsilon_t), \tag{0.2}$$
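To make the objective (0.1) and transition (0.2) concrete, the following minimal sketch estimates the expected discounted return under a fixed decision rule by Monte Carlo simulation. The specific primitives here (log utility, a linear savings transition with a lognormal shock, and the parameter values) are illustrative assumptions, not part of the text, which allows general $u$, $f$, $\Gamma$, and $F$.

```python
import math
import random

# Illustrative primitives (assumptions, not from the text):
# state x_t = wealth, control a_t = consumption, u = log utility,
# transition x_{t+1} = R*(x_t - a_t)*eps_t with a lognormal shock eps_t.
BETA, R, T = 0.95, 1.04, 10

def u(x, a):
    return math.log(a)

def f(x, a, eps):
    return R * (x - a) * eps

def gamma(x):
    """Feasible control set Gamma(x): consume any fraction of wealth."""
    return (0.0, x)

def simulate_return(x0, alpha, rng, T=T):
    """One sample path of sum_{t=0}^T beta^t u(x_t, a_t)."""
    x, total = x0, 0.0
    for t in range(T + 1):
        a = alpha(x)                 # decision rule a_t = alpha(x_t)
        lo, hi = gamma(x)
        assert lo <= a <= hi         # a_t must lie in Gamma(x_t)
        total += BETA**t * u(x, a)
        eps = rng.lognormvariate(0.0, 0.1)
        x = f(x, a, eps)             # x_{t+1} = f(x_t, a_t, eps_t)
    return total

rng = random.Random(0)
alpha = lambda x: 0.5 * x            # an arbitrary stationary decision rule
paths = [simulate_return(10.0, alpha, rng) for _ in range(2000)]
estimate = sum(paths) / len(paths)   # Monte Carlo estimate of E_0[.]
print(round(estimate, 3))
```

Averaging over simulated shock paths is exactly the sense in which $E_0$ in (0.1) is taken with respect to the distribution over future states induced by (0.2) and the shock distribution.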
" t is a random variable, unobservable at date t , described by a cu- mulative distribution function that may depend on the state and action but does not depend on t F ( " j x; a )= Prob ( " t · " j x t = x; a t = a ) : (0.3) Given any sequence of controls f a t g , we can construct the probability distrib- ution of future states conditional on x 0 from (0.2) and (0.3); the expectation E 0 in (0.1) is with respect to this distribution. 1 Notice that the environment here is stationary , in the sense that u , ¡ , f and F do not depend on t . It would be a straightforward generalization to allow these objects to vary with time, but we adopt a stationary environment because it yields time-invariant decision rules in the in…nite horizon case stud iedbe low .Inanycase , ( x t ;a t ) contains all of the information available at date t that is relevant for the probability distribution of future events. This, together with the additive separability of the objective function in (0.1), implies that the control at date t will depend only on the current state, a t = ® t ( x t ) ,where ® t is referred to as the decision rule . A policy (of length T ) is de…ned to be a sequence of decision rules, ¼ T = ( ® 0 1 ;:::;® T ) ,where ® t : X ! A for all t . The set of feasible policies is ¦ T = f ¼ T =( ® 0 1 ;:::;® T ): ® t ( x ) 2 ¡( x ) 8 x;t g : (0.4) A policy is stationary if it does not depend upon time: ® t ( x ) ´ ® ( x ) .An y given policy generates a stochastic law of motion for the state, x t +1 = f [ x t t ( x t ) ;" t ] ; which will be stationary if ® t is stationary. 1 Alternatively, the uncertainty can be described by starting with a Markov probability transition function, Q ( x;a; ~ X )=Pr( x t +1 2 ~ X j x t = x;a t = a ) ; for ~ X ½ X , as in Stokey, Lucas and Prescott (1989). One can always construct such a Q from our

These lecture notes were uploaded on 04/20/2008 for the course ECON 800, taught by Professor Krueger during the Spring '02 term at Stanford.
