
Dynamic Programming

Peter Ireland
EC720.01 - Math for Economists
Boston College, Department of Economics
Fall 2010

Copyright © 2010 by Peter Ireland. Redistribution is permitted for educational and research purposes, so long as no changes are made. All copies must be provided free of charge and must include this copyright notice.

We have now studied two ways of solving dynamic optimization problems, one based on the Kuhn-Tucker theorem and the other based on the maximum principle. These two methods both lead us to the same sets of optimality conditions; they differ only in terms of how those optimality conditions are derived.

Here, we will consider a third way of solving dynamic optimization problems: the method of dynamic programming. We will see, once again, that dynamic programming leads us to the same set of optimality conditions that the Kuhn-Tucker theorem does; once again, this new method differs from the others only in terms of how the optimality conditions are derived.

While the maximum principle lends itself equally well to dynamic optimization problems set in both discrete time and continuous time, dynamic programming is easiest to apply in discrete time settings. On the other hand, dynamic programming, unlike the Kuhn-Tucker theorem and the maximum principle, can be used quite easily to solve problems in which optimal decisions must be made under conditions of uncertainty. Thus, in our discussion of dynamic programming, we will begin by considering dynamic programming under certainty; later, we will move on to consider stochastic dynamic programming.

References: Dixit, Chapter 11. Acemoglu, Chapters 6 and 16.

1 Dynamic Programming Under Certainty

1.1 A Perfect Foresight Dynamic Optimization Problem in Discrete Time

No uncertainty. Discrete time, infinite horizon: $t = 0, 1, 2, \dots$

$y_t$ = stock, or state, variable
$z_t$ = flow, or control, variable

Objective function:
$$\sum_{t=0}^{\infty} \beta^t F(y_t, z_t; t), \qquad 1 > \beta > 0 \text{ the discount factor}$$

Constraint describing the evolution of the state variable:
$$Q(y_t, z_t; t) \ge y_{t+1} - y_t \quad \text{or} \quad y_t + Q(y_t, z_t; t) \ge y_{t+1} \quad \text{for all } t = 0, 1, 2, \dots$$

Constraint applying to variables within each period:
$$c \ge G(y_t, z_t; t) \quad \text{for all } t = 0, 1, 2, \dots$$

Constraint on the initial value of the state variable: $y_0$ given.

The problem: choose sequences $\{z_t\}_{t=0}^{\infty}$ and $\{y_t\}_{t=1}^{\infty}$ to maximize the objective function subject to all of the constraints.

Notes:

a) It is important for the application of dynamic programming that the problem is additively time separable: that is, the values of $F$, $Q$, and $G$ at time $t$ must depend only on the values of $y_t$ and $z_t$ at time $t$.

b) Once again, it must be emphasized that although the constraint describing the evolution of the state variable and the constraint applying to the variables within each period can each be written in the form of a single equation, these constraints must hold for all $t = 0, 1, 2, \dots$. Thus, each equation actually represents an infinite number of constraints.
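To fix ideas, here is a minimal numerical sketch of the problem's ingredients. The functional forms chosen for $F$, $Q$, and $G$, the value of the bound $c$, and the candidate policy are all illustrative assumptions, not part of the notes; the point is only to show how a truncated version of the additively time-separable objective is evaluated along a feasible path.

```python
import numpy as np

# Illustrative primitives (assumed, not from the notes): log payoff,
# linear accumulation, and a within-period constraint c >= G(y, z).
beta = 0.95
F = lambda y, z, t: np.log(1.0 + z)      # period payoff F(y_t, z_t; t)
Q = lambda y, z, t: 0.1 * y - z          # accumulation: y_{t+1} <= y_t + Q
G = lambda y, z, t: z - 0.5 * y          # within-period constraint: c >= G
c = 0.0

def truncated_objective(y0, policy, T=200):
    """Evaluate sum_{t=0}^{T-1} beta^t F(y_t, z_t; t) along a feasible path,
    where policy(y, t) returns the control z_t."""
    y, total = y0, 0.0
    for t in range(T):
        z = policy(y, t)
        assert G(y, z, t) <= c, "within-period constraint violated"
        total += beta**t * F(y, z, t)
        y = y + Q(y, z, t)               # accumulation constraint, taken binding
    return total

# A candidate (not necessarily optimal) policy: a fixed fraction of the state.
print(truncated_objective(y0=1.0, policy=lambda y, t: 0.3 * y))
```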
1.2 The Kuhn-Tucker Formulation

Let's begin our analysis of this problem by applying the Kuhn-Tucker theorem: that is, by setting up the Lagrangian and taking first-order conditions.

Set up the Lagrangian, recognizing that the constraints must hold for all $t = 0, 1, 2, \dots$:
$$L = \sum_{t=0}^{\infty} \beta^t F(y_t, z_t; t) + \sum_{t=0}^{\infty} \tilde{\mu}_{t+1}[y_t + Q(y_t, z_t; t) - y_{t+1}] + \sum_{t=0}^{\infty} \tilde{\lambda}_t [c - G(y_t, z_t; t)]$$

It will be convenient to define
$$\mu_{t+1} = \beta^{-(t+1)}\tilde{\mu}_{t+1} \Rightarrow \tilde{\mu}_{t+1} = \beta^{t+1}\mu_{t+1}$$
$$\lambda_t = \beta^{-t}\tilde{\lambda}_t \Rightarrow \tilde{\lambda}_t = \beta^t\lambda_t$$
and to rewrite the Lagrangian in terms of $\mu_{t+1}$ and $\lambda_t$ instead of $\tilde{\mu}_{t+1}$ and $\tilde{\lambda}_t$:
$$L = \sum_{t=0}^{\infty} \beta^t F(y_t, z_t; t) + \sum_{t=0}^{\infty} \beta^{t+1}\mu_{t+1}[y_t + Q(y_t, z_t; t) - y_{t+1}] + \sum_{t=0}^{\infty} \beta^t\lambda_t [c - G(y_t, z_t; t)]$$

FOC for $z_t$, $t = 0, 1, 2, \dots$:
$$\beta^t F_2(y_t, z_t; t) + \beta^{t+1}\mu_{t+1}Q_2(y_t, z_t; t) - \beta^t\lambda_t G_2(y_t, z_t; t) = 0$$

FOC for $y_t$, $t = 1, 2, 3, \dots$:
$$\beta^t F_1(y_t, z_t; t) + \beta^{t+1}\mu_{t+1}[1 + Q_1(y_t, z_t; t)] - \beta^t\lambda_t G_1(y_t, z_t; t) - \beta^t\mu_t = 0$$

Now, let's suppose that somehow we could solve for $\mu_t$ as a function of the state variable $y_t$:
$$\mu_t = W(y_t; t)$$
so that
$$\mu_{t+1} = W(y_{t+1}; t+1) = W[y_t + Q(y_t, z_t; t); t+1]$$

Then we could rewrite the FOC as:
$$F_2(y_t, z_t; t) + \beta W[y_t + Q(y_t, z_t; t); t+1]Q_2(y_t, z_t; t) - \lambda_t G_2(y_t, z_t; t) = 0 \quad (1)$$
$$W(y_t; t) = F_1(y_t, z_t; t) + \beta W[y_t + Q(y_t, z_t; t); t+1][1 + Q_1(y_t, z_t; t)] - \lambda_t G_1(y_t, z_t; t) \quad (2)$$

Together with the binding constraint
$$y_{t+1} = y_t + Q(y_t, z_t; t) \quad (3)$$
and the complementary slackness condition
$$\lambda_t[c - G(y_t, z_t; t)] = 0, \quad (4)$$
we can think of (1)-(4) as forming a system of four equations in three unknown variables $y_t$, $z_t$, and $\lambda_t$ and one unknown function $W(\cdot; t)$. This system of equations determines the problem's solution. Note that since (3) is in the form of a difference equation, finding the problem's solution involves solving a difference equation.
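As a quick consistency check on these first-order conditions, here is a small symbolic sketch using sympy. It differentiates the terms of the rescaled Lagrangian that involve $y_t$ and $z_t$, for generic functions $F$, $Q$, and $G$ (written here with two arguments, suppressing the explicit time dependence); the only term from period $t-1$ that matters is $-\beta^t\mu_t y_t$.

```python
import sympy as sp

t = sp.symbols('t', integer=True, nonnegative=True)
beta, c_bar = sp.symbols('beta c', positive=True)
y, z, y_next, lam, mu_next, mu_t = sp.symbols(
    'y_t z_t y_{t+1} lambda_t mu_{t+1} mu_t')

F = sp.Function('F')(y, z)
Q = sp.Function('Q')(y, z)
G = sp.Function('G')(y, z)

# All Lagrangian terms involving y_t and z_t: the period-t payoff and
# constraints, plus -beta^t * mu_t * y_t from the period-(t-1) constraint.
L_terms = (beta**t * F
           + beta**(t + 1) * mu_next * (y + Q - y_next)
           + beta**t * lam * (c_bar - G)
           - beta**t * mu_t * y)

foc_z = sp.powsimp(sp.expand(sp.diff(L_terms, z) / beta**t))
foc_y = sp.powsimp(sp.expand(sp.diff(L_terms, y) / beta**t))
print(foc_z)  # F_z + beta*mu_{t+1}*Q_z - lambda_t*G_z, matching (1)
print(foc_y)  # F_y + beta*mu_{t+1}*(1 + Q_y) - lambda_t*G_y - mu_t, matching (2)
```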
1.3 An Alternative Formulation

Now let's consider the same problem in a slightly different way. For any given value of the initial state variable $y_0$, define the value function
$$v(y_0; 0) = \max_{\{z_t\}_{t=0}^{\infty}, \{y_t\}_{t=1}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(y_t, z_t; t)$$
subject to
$$y_0 \text{ given}$$
$$y_t + Q(y_t, z_t; t) \ge y_{t+1} \text{ for all } t = 0, 1, 2, \dots$$
$$c \ge G(y_t, z_t; t) \text{ for all } t = 0, 1, 2, \dots$$

More generally, for any period $t$ and any value of $y_t$, define
$$v(y_t; t) = \max_{\{z_{t+j}\}_{j=0}^{\infty}, \{y_{t+j}\}_{j=1}^{\infty}} \sum_{j=0}^{\infty} \beta^j F(y_{t+j}, z_{t+j}; t+j)$$
subject to
$$y_t \text{ given}$$
$$y_{t+j} + Q(y_{t+j}, z_{t+j}; t+j) \ge y_{t+j+1} \text{ for all } j = 0, 1, 2, \dots$$
$$c \ge G(y_{t+j}, z_{t+j}; t+j) \text{ for all } j = 0, 1, 2, \dots$$

Note that the value function is a maximum value function.

Now consider expanding the definition of the value function by separating out the time-$t$ components:
$$v(y_t; t) = \max_{z_t, y_{t+1}} \Big[ F(y_t, z_t; t) + \max_{\{z_{t+j}\}_{j=1}^{\infty}, \{y_{t+j}\}_{j=2}^{\infty}} \sum_{j=1}^{\infty} \beta^j F(y_{t+j}, z_{t+j}; t+j) \Big]$$
subject to
$$y_t \text{ given}, \quad y_t + Q(y_t, z_t; t) \ge y_{t+1}, \quad c \ge G(y_t, z_t; t)$$
$$y_{t+j} + Q(y_{t+j}, z_{t+j}; t+j) \ge y_{t+j+1} \text{ and } c \ge G(y_{t+j}, z_{t+j}; t+j) \text{ for all } j = 1, 2, 3, \dots$$

Next, relabel the time indices:
$$v(y_t; t) = \max_{z_t, y_{t+1}} \Big[ F(y_t, z_t; t) + \beta \max_{\{z_{t+1+j}\}_{j=0}^{\infty}, \{y_{t+1+j}\}_{j=1}^{\infty}} \sum_{j=0}^{\infty} \beta^j F(y_{t+1+j}, z_{t+1+j}; t+1+j) \Big]$$
subject to
$$y_t \text{ given}, \quad y_t + Q(y_t, z_t; t) \ge y_{t+1}, \quad c \ge G(y_t, z_t; t)$$
$$y_{t+1+j} + Q(y_{t+1+j}, z_{t+1+j}; t+1+j) \ge y_{t+1+j+1} \text{ and } c \ge G(y_{t+1+j}, z_{t+1+j}; t+1+j) \text{ for all } j = 0, 1, 2, \dots$$

Now notice that, taken together, the components for $t+1+j$, $j = 0, 1, 2, \dots$, define $v(y_{t+1}; t+1)$, enabling us to simplify the statement considerably:
$$v(y_t; t) = \max_{z_t, y_{t+1}} F(y_t, z_t; t) + \beta v(y_{t+1}; t+1)$$
subject to
$$y_t \text{ given}, \quad y_t + Q(y_t, z_t; t) \ge y_{t+1}, \quad c \ge G(y_t, z_t; t)$$

Or, even more simply:
$$v(y_t; t) = \max_{z_t} F(y_t, z_t; t) + \beta v[y_t + Q(y_t, z_t; t); t+1] \quad (5)$$
subject to $y_t$ given and $c \ge G(y_t, z_t; t)$.

Equation (5) is called the Bellman equation for this problem, and it lies at the heart of the dynamic programming approach. Note that the maximization on the right-hand side of (5) is a static optimization problem, involving no dynamic elements.

By the Kuhn-Tucker theorem:
$$v(y_t; t) = \max_{z_t} F(y_t, z_t; t) + \beta v[y_t + Q(y_t, z_t; t); t+1] + \lambda_t[c - G(y_t, z_t; t)]$$

The FOC for $z_t$ is
$$F_2(y_t, z_t; t) + \beta v'[y_t + Q(y_t, z_t; t); t+1]Q_2(y_t, z_t; t) - \lambda_t G_2(y_t, z_t; t) = 0 \quad (6)$$

And by the envelope theorem:
$$v'(y_t; t) = F_1(y_t, z_t; t) + \beta v'[y_t + Q(y_t, z_t; t); t+1][1 + Q_1(y_t, z_t; t)] - \lambda_t G_1(y_t, z_t; t) \quad (7)$$

Together with the binding constraint
$$y_{t+1} = y_t + Q(y_t, z_t; t) \quad (3)$$
and the complementary slackness condition
$$\lambda_t[c - G(y_t, z_t; t)] = 0, \quad (4)$$
we can think of (3), (4), (6), and (7) as forming a system of four equations in three unknown variables $y_t$, $z_t$, and $\lambda_t$ and one unknown function $v(\cdot; t)$. This system of equations determines the problem's solution. Note once again that since (3) is in the form of a difference equation, finding the problem's solution involves solving a difference equation.

But more important, notice that (6) and (7) are equivalent to (1) and (2) with
$$v'(y_t; t) = W(y_t; t).$$

Thus, we have two ways of solving this discrete time dynamic optimization problem, both of which lead us to the same set of optimality conditions:

a) Set up the Lagrangian for the dynamic optimization problem and take first-order conditions for $z_t$, $t = 0, 1, 2, \dots$ and $y_t$, $t = 1, 2, 3, \dots$.

b) Set up the Bellman equation, take the first-order condition for $z_t$, and then derive the envelope condition for $y_t$.

One question remains: how, in practice, can we solve for the unknown value functions $v(\cdot; t)$? To see how to answer this question, consider two examples:

Example 1: Optimal Growth - Here, it will be possible to solve for $v$ explicitly.

Example 2: Saving Under Certainty - Here, it will not be possible to solve for $v$ explicitly, yet we can learn enough about the properties of $v$ to obtain some useful economic insights.
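Before turning to the examples, it may help to see how the Bellman equation is typically handled when no explicit solution is available: by value function iteration, repeatedly applying the right-hand side of (5) to an initial guess for $v$ until it converges. The sketch below does this on a grid for the stationary problem of Example 1 below (full-depreciation growth with log utility); the grid bounds, grid size, and tolerance are arbitrary choices made for illustration.

```python
import numpy as np

alpha, beta = 0.33, 0.99
grid = np.linspace(1e-3, 0.6, 500)            # grid for the state y (here: capital)

# Precompute y' = y^alpha - c for every (state, control) pair on the grid.
y_next = grid[:, None]**alpha - grid[None, :]
feasible = y_next > 0
log_c = np.where(feasible, np.log(grid[None, :]), -np.inf)
clipped = np.clip(y_next, grid[0], grid[-1])

v = np.zeros_like(grid)                       # initial guess v = 0
for _ in range(5000):
    cont = np.where(feasible, np.interp(clipped, grid, v), 0.0)
    candidate = log_c + beta * cont           # F + beta * v(y') for each control
    v_new = candidate.max(axis=1)             # Bellman operator: take the max
    if np.max(np.abs(v_new - v)) < 1e-7:      # stop near the fixed point of (5)
        break
    v = v_new

policy = grid[candidate.argmax(axis=1)]       # optimal control as a function of y
print(policy[250] / grid[250]**alpha)         # should be near 1 - alpha*beta ~ 0.673
```

The printed consumption share anticipates the closed-form answer derived analytically in Example 1.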
2 Example 1: Optimal Growth

Here, we will modify the optimal growth example that we solved earlier using the maximum principle in two ways:

a) We will switch to discrete time in order to facilitate the use of dynamic programming.

b) We will set the depreciation rate for capital equal to $\delta = 1$ in order to obtain a very special case in which an explicit solution for the value function can be found.

Production function:
$$F(k_t) = k_t^{\alpha}, \text{ where } 0 < \alpha < 1$$

$k_t$ = capital (state variable)
$c_t$ = consumption (control variable)

Evolution of the capital stock:
$$k_{t+1} = k_t^{\alpha} - c_t \text{ for all } t = 0, 1, 2, \dots$$

Initial condition: $k_0$ given.

Utility or social welfare:
$$\sum_{t=0}^{\infty} \beta^t \ln(c_t)$$

The social planner's problem: choose sequences $\{c_t\}_{t=0}^{\infty}$ and $\{k_t\}_{t=1}^{\infty}$ to maximize the utility function subject to all of the constraints.

To solve this problem via dynamic programming, use $k_t$ as the state variable and $c_t$ as the control variable, and set up the Bellman equation:
$$v(k_t; t) = \max_{c_t} \ln(c_t) + \beta v(k_t^{\alpha} - c_t; t+1)$$

Now guess that the value function takes the time-invariant form
$$v(k_t; t) = v(k_t) = E + F\ln(k_t),$$
where $E$ and $F$ are constants to be determined. Using the guess for $v$, the Bellman equation becomes
$$E + F\ln(k_t) = \max_{c_t} \ln(c_t) + \beta E + \beta F\ln(k_t^{\alpha} - c_t) \quad (8)$$

FOC for $c_t$:
$$\frac{1}{c_t} - \frac{\beta F}{k_t^{\alpha} - c_t} = 0 \quad (9)$$

Envelope condition for $k_t$:
$$\frac{F}{k_t} = \frac{\alpha\beta F k_t^{\alpha-1}}{k_t^{\alpha} - c_t} \quad (10)$$

Together with the binding constraint $k_{t+1} = k_t^{\alpha} - c_t$, (8)-(10) form a system of four equations in four unknowns: $c_t$, $k_t$, $E$, and $F$.

Equation (9) implies
$$k_t^{\alpha} - c_t = \beta F c_t$$
or
$$c_t = \left(\frac{1}{1+\beta F}\right)k_t^{\alpha} \quad (11)$$

Substitute (11) into the envelope condition (10):
$$F k_t^{\alpha} - \left(\frac{1}{1+\beta F}\right)F k_t^{\alpha} = \alpha\beta F k_t^{\alpha}$$
$$1 - \frac{1}{1+\beta F} = \alpha\beta$$
Hence
$$\frac{1}{1+\beta F} = 1 - \alpha\beta$$
or, equivalently,
$$1 + \beta F = \frac{1}{1-\alpha\beta} \quad (12)$$
$$\beta F = \frac{1}{1-\alpha\beta} - 1 = \frac{\alpha\beta}{1-\alpha\beta}$$
$$F = \frac{\alpha}{1-\alpha\beta} \quad (13)$$

Substitute (12) into (11) to obtain
$$c_t = (1-\alpha\beta)k_t^{\alpha} \quad (14)$$
which shows that it is optimal to consume the fixed fraction $1-\alpha\beta$ of output.

Evolution of capital:
$$k_{t+1} = k_t^{\alpha} - c_t = k_t^{\alpha} - (1-\alpha\beta)k_t^{\alpha} = \alpha\beta k_t^{\alpha} \quad (15)$$
which is in the form of a difference equation for $k_t$.

Equations (14) and (15) show how the optimal values of $c_t$ and $k_{t+1}$ depend on the state variable $k_t$ and the parameters $\alpha$ and $\beta$. Given a value for $k_0$, these two equations can be used to construct the optimal sequences $\{c_t\}_{t=0}^{\infty}$ and $\{k_t\}_{t=1}^{\infty}$.

For the sake of completeness, substitute (14) and (15) back into (8) to solve for $E$:
$$E + F\ln(k_t) = \ln(1-\alpha\beta) + \alpha\ln(k_t) + \beta E + \beta F\ln(\alpha\beta) + \alpha\beta F\ln(k_t)$$
Since (13) implies that $F = \alpha + \alpha\beta F$, this last equality reduces to
$$E = \ln(1-\alpha\beta) + \beta E + \beta F\ln(\alpha\beta)$$
which leads directly to the solution
$$E = \frac{\ln(1-\alpha\beta) + \frac{\alpha\beta}{1-\alpha\beta}\ln(\alpha\beta)}{1-\beta}$$

[Figure: Numerical solutions to the optimal growth model with complete depreciation, generated using equations (14) and (15). Each example sets $\alpha = 0.33$ and $\beta = 0.99$. Example 1 starts from $k(0) = 0.01$; Example 2 starts from $k(0) = 1$. Each panel plots the capital stock and consumption over $t = 0$ to $20$. In both examples, $c(t)$ converges to its steady-state value of 0.388 and $k(t)$ converges to its steady-state value of 0.188.]
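Since (14) and (15) give the optimal policy in closed form, the figure's paths are easy to reproduce. A minimal simulation sketch, using the same parameter values and initial conditions as the figure:

```python
import numpy as np

alpha, beta = 0.33, 0.99

def simulate(k0, T=20):
    """Iterate c_t = (1 - alpha*beta) k_t^alpha and k_{t+1} = alpha*beta*k_t^alpha."""
    k = np.empty(T + 1); c = np.empty(T + 1)
    k[0] = k0
    for t in range(T + 1):
        c[t] = (1 - alpha * beta) * k[t]**alpha      # equation (14)
        if t < T:
            k[t + 1] = alpha * beta * k[t]**alpha    # equation (15)
    return k, c

# Steady state: k* = (alpha*beta)^(1/(1-alpha)), c* = (1 - alpha*beta) k*^alpha
k_ss = (alpha * beta)**(1 / (1 - alpha))
c_ss = (1 - alpha * beta) * k_ss**alpha
print(f"k* = {k_ss:.3f}, c* = {c_ss:.3f}")           # about 0.188 and 0.388

for k0 in (0.01, 1.0):
    k, c = simulate(k0)
    print(f"k0 = {k0}: k(20) = {k[-1]:.3f}, c(20) = {c[-1]:.3f}")
```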
3 Example 2: Saving Under Certainty

Here, a consumer maximizes utility over an infinite horizon, $t = 0, 1, 2, \dots$, earning income from labor and from investments.

$A_t$ = beginning-of-period assets; $A_t$ can be negative, that is, the consumer is allowed to borrow
$y_t$ = labor income (exogenous)
$c_t$ = consumption
saving: $s_t = A_t + y_t - c_t$
$r$ = constant interest rate

Evolution of assets:
$$A_{t+1} = (1+r)s_t = (1+r)(A_t + y_t - c_t)$$

Note that
$$A_t + y_t - c_t = \left(\frac{1}{1+r}\right)A_{t+1}$$
or
$$A_t = \left(\frac{1}{1+r}\right)A_{t+1} + c_t - y_t$$

Similarly,
$$A_{t+1} = \left(\frac{1}{1+r}\right)A_{t+2} + c_{t+1} - y_{t+1}$$

Combining these last two equalities yields
$$A_t = \left(\frac{1}{1+r}\right)^2 A_{t+2} + \left(\frac{1}{1+r}\right)(c_{t+1} - y_{t+1}) + (c_t - y_t)$$

Continuing in this manner yields
$$A_t = \left(\frac{1}{1+r}\right)^T A_{t+T} + \sum_{j=0}^{T-1}\left(\frac{1}{1+r}\right)^j (c_{t+j} - y_{t+j}).$$

Now assume that the sequence $\{A_t\}_{t=0}^{\infty}$ must remain bounded (while borrowing is allowed, unlimited borrowing is ruled out), and take the limit as $T \to \infty$ to obtain
$$A_t = \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j (c_{t+j} - y_{t+j})$$
or
$$A_t + \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j y_{t+j} = \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j c_{t+j}. \quad (16)$$

Equation (16) takes the form of an infinite-horizon budget constraint, indicating that over the infinite horizon beginning at any period $t$, the consumer's sources of funds include assets $A_t$ and the present value of current and future labor income, while the consumer's use of funds is summarized by the present value of current and future consumption.

The consumer's problem: choose the sequences $\{s_t\}_{t=0}^{\infty}$ and $\{A_t\}_{t=1}^{\infty}$ to maximize the utility function
$$\sum_{t=0}^{\infty}\beta^t u(c_t) = \sum_{t=0}^{\infty}\beta^t u(A_t + y_t - s_t)$$
subject to the constraints $A_0$ given and
$$(1+r)s_t \ge A_{t+1} \text{ for all } t = 0, 1, 2, \dots$$

To solve the problem via dynamic programming, note first that $A_t$ is the state variable and $s_t$ is the control variable.

Set up the Bellman equation:
$$v(A_t; t) = \max_{s_t} u(A_t + y_t - s_t) + \beta v(A_{t+1}; t+1) \text{ subject to } (1+r)s_t \ge A_{t+1}$$
or, substituting in the binding constraint,
$$v(A_t; t) = \max_{s_t} u(A_t + y_t - s_t) + \beta v[(1+r)s_t; t+1]$$

FOC for $s_t$:
$$-u'(A_t + y_t - s_t) + \beta(1+r)v'[(1+r)s_t; t+1] = 0$$

Envelope condition for $A_t$:
$$v'(A_t; t) = u'(A_t + y_t - s_t)$$

Use the constraints to rewrite these optimality conditions as
$$u'(c_t) = \beta(1+r)v'(A_{t+1}; t+1) \quad (17)$$
and
$$v'(A_t; t) = u'(c_t) \quad (18)$$

Since (18) must hold for all $t = 0, 1, 2, \dots$, it implies
$$v'(A_{t+1}; t+1) = u'(c_{t+1})$$
Substitute this result into (17) to obtain
$$u'(c_t) = \beta(1+r)u'(c_{t+1}) \quad (19)$$

Now make two extra assumptions:

a) $\beta(1+r) = 1$, or $1+r = 1/\beta$: the interest rate equals the discount rate.

b) $u$ is strictly concave.

Under these two additional assumptions, (19) implies
$$u'(c_t) = u'(c_{t+1}) \quad \text{or} \quad c_t = c_{t+1}$$
And since this last equation must hold for all $t = 0, 1, 2, \dots$, it implies
$$c_t = c_{t+j} \text{ for all } j = 0, 1, 2, \dots$$

Now, return to (16). Since $c_{t+j} = c_t$ for all $j$ and $1/(1+r) = \beta$:
$$A_t + \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j y_{t+j} = c_t\sum_{j=0}^{\infty}\beta^j \quad (20)$$

FACT: Since $|\beta| < 1$,
$$\sum_{j=0}^{\infty}\beta^j = \frac{1}{1-\beta}$$
To see why this is true, multiply the sum by $1-\beta$:
$$(1-\beta)\sum_{j=0}^{\infty}\beta^j = (1 + \beta + \beta^2 + \dots) - (\beta + \beta^2 + \beta^3 + \dots) = 1$$

Use this fact to rewrite (20):
$$A_t + \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j y_{t+j} = \left(\frac{1}{1-\beta}\right)c_t$$
or
$$c_t = (1-\beta)\left[A_t + \sum_{j=0}^{\infty}\left(\frac{1}{1+r}\right)^j y_{t+j}\right] \quad (21)$$

Equation (21) indicates that it is optimal to consume the fixed fraction $1-\beta$ of wealth at each date $t$, where wealth consists of the value of current asset holdings plus the present discounted value of current and future labor income. Thus, (21) describes a version of the permanent income hypothesis.
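Here is a small numerical sketch of the permanent income rule (21). The income path, the value of $\beta$, and the truncation of the infinite horizon are illustrative assumptions; the sketch confirms that consumption is constant over time and that assets stay bounded when the consumer follows (21).

```python
import numpy as np

beta = 0.96
r = 1 / beta - 1                         # assumption (a): beta * (1 + r) = 1

def pih_consumption(A, y_future):
    """c_t = (1 - beta) * [A_t + sum_j (1/(1+r))^j y_{t+j}]  -- equation (21)."""
    discounts = (1.0 / (1.0 + r)) ** np.arange(len(y_future))
    return (1 - beta) * (A + discounts @ y_future)

# Illustrative income path: y = 1 for 40 'working' periods, 0 in 'retirement';
# the infinite horizon is truncated at T = 400, long enough to be negligible.
T = 400
y = np.where(np.arange(T) < 40, 1.0, 0.0)

A, c_path = 0.0, []
for t in range(60):
    c = pih_consumption(A, y[t:])        # the same value at every t, by (19)-(21)
    c_path.append(c)
    A = (1 + r) * (A + y[t] - c)         # evolution of assets

print(f"consumption is flat at {c_path[0]:.4f} "
      f"(min {min(c_path):.4f}, max {max(c_path):.4f})")
print(f"assets after 60 periods: {A:.2f} (bounded, financing c as a perpetuity)")
```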
4 Stochastic Dynamic Programming

4.1 A Dynamic Stochastic Optimization Problem

Discrete time, infinite horizon: $t = 0, 1, 2, \dots$

$y_t$ = state variable
$z_t$ = control variable
$\varepsilon_{t+1}$ = random shock, observed at the beginning of $t+1$

Thus, when $z_t$ is chosen, $\varepsilon_t$ is known, but $\varepsilon_{t+1}$ is still viewed as random.

The shock $\varepsilon_{t+1}$ may be serially correlated, but will be assumed to have the Markov property (i.e., to be generated by a Markov process): the distribution of $\varepsilon_{t+1}$ depends on $\varepsilon_t$, but not on $\varepsilon_{t-1}, \varepsilon_{t-2}, \varepsilon_{t-3}, \dots$. For example, $\varepsilon_{t+1}$ may follow a first-order autoregressive process:
$$\varepsilon_{t+1} = \rho\varepsilon_t + \eta_{t+1}.$$

Now, the full state of the economy at the beginning of each period is described jointly by the pair of values for $y_t$ and $\varepsilon_t$, since the value of $\varepsilon_t$ is relevant for forecasting, that is, forming expectations of, future values of $\varepsilon_{t+j}$, $j = 1, 2, 3, \dots$.

Objective function:
$$E_0\sum_{t=0}^{\infty}\beta^t F(y_t, z_t, \varepsilon_t)$$
where $1 > \beta > 0$ is the discount factor and $E_0$ denotes the expected value as of $t = 0$.

Constraint describing the evolution of the state variable:
$$y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1}$$
for all $t = 0, 1, 2, \dots$ and for all possible realizations of $\varepsilon_{t+1}$.

Note that the constraint implies that the randomness in $\varepsilon_{t+1}$ induces randomness in $y_{t+1}$ as well: in particular, for all $t = 0, 1, 2, \dots$, the value of $y_{t+1}$ does not become known until $\varepsilon_{t+1}$ is observed at the beginning of $t+1$.

Note, too, that the sequential, period-by-period revelation of values for $\varepsilon_t$, $t = 0, 1, 2, \dots$, generates sequential growth in the information set available to the agent solving the problem:

At the beginning of period $t = 0$, the agent knows $I_0 = \{y_0, \varepsilon_0\}$.

At the beginning of period $t = 1$, the agent's information set expands to $I_1 = \{y_1, \varepsilon_1, y_0, \varepsilon_0\}$.

And, more generally, at the beginning of period $t = 0, 1, 2, \dots$, the agent's information set is given by
$$I_t = \{y_t, \varepsilon_t, y_{t-1}, \varepsilon_{t-1}, \dots, y_0, \varepsilon_0\}$$
so that conditional expectations of future variables are defined implicitly with respect to this growing information set: for any variable $X_{t+j}$ whose value becomes known at time $t+j$, $j = 0, 1, 2, \dots$:
$$E_t X_{t+j} = E(X_{t+j}|I_t) = E(X_{t+j}|y_t, \varepsilon_t, y_{t-1}, \varepsilon_{t-1}, \dots, y_0, \varepsilon_0)$$

The role of the additive time separability of the objective function, the similar "additive time separability" that is built into the constraints, and the Markov property of the shocks is to make the most recent values of $y_t$ and $\varepsilon_t$ sufficient statistics for $I_t$, so that within the confines of this problem,
$$E(X_{t+j}|y_t, \varepsilon_t, y_{t-1}, \varepsilon_{t-1}, \dots, y_0, \varepsilon_0) = E(X_{t+j}|y_t, \varepsilon_t).$$

Note, finally, that the randomness in $y_{t+1}$ induced by the randomness in $\varepsilon_{t+1}$ also introduces randomness into the choice of $z_{t+1}$ from the perspective of time $t$:

Given $(y_t, \varepsilon_t)$, choose $z_t$
$\Rightarrow$ Given $(y_t, z_t)$, the realization of $\varepsilon_{t+1}$ determines $(y_{t+1}, \varepsilon_{t+1})$
$\Rightarrow$ Given $(y_{t+1}, \varepsilon_{t+1})$, choose $z_{t+1}$

This makes the number of choice variables, as well as the number of constraints, quite large.

The problem: choose contingency plans for $z_t$, $t = 0, 1, 2, \dots$, and $y_t$, $t = 1, 2, 3, \dots$, to maximize the objective function subject to all of the constraints.

Notes:

a) In order to incorporate uncertainty, we have really only made two adjustments to the problem: first, we have added the shock $\varepsilon_t$ to the objective function for period $t$ and the shock $\varepsilon_{t+1}$ to the constraint linking periods $t$ and $t+1$; and second, we have assumed that the planner cares about the expected value of the objective function.

b) For simplicity, the functions $F$ and $Q$ are now assumed to be time-invariant, although they now depend on the shock as well as on the state and control variables.

c) For simplicity, we have also dropped the second set of constraints, $c \ge G(y_t, z_t)$. Adding them back is straightforward, but complicates the algebra.

d) In the presence of uncertainty, the constraint $y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1}$ must hold not only for all $t = 0, 1, 2, \dots$, but for all possible realizations of $\varepsilon_{t+1}$ as well. Thus, this single equation can actually represent a very large number of constraints.

e) The Kuhn-Tucker theorem can still be used to solve problems that feature uncertainty. But because problems with uncertainty can have a very large number of choice variables and constraints, the Kuhn-Tucker theorem can become very difficult to apply in practice, since one may have to introduce a very large number of Lagrange multipliers. Dynamic programming, therefore, can be an easier and more convenient way to solve dynamic stochastic optimization problems.
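The role of the Markov property can be illustrated by simulation. In the sketch below, the AR(1) specification, the value of $\rho$, and the normal innovations are assumptions made purely for illustration: conditional on $\varepsilon_t$, simulated forecasts of future shocks line up with $\rho^j\varepsilon_t$, regardless of the earlier history.

```python
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                      # illustrative persistence parameter
eps_t = 1.5                    # the current shock: the relevant conditioning info

# Simulate many continuations eps_{t+1} = rho*eps_t + eta_{t+1}, eta ~ N(0, 1).
# The Markov property says the distribution of future shocks depends on eps_t
# only, so the simulated mean should match rho**j * eps_t at each horizon j.
n, horizon = 200_000, 4
eps = np.full(n, eps_t)
for j in range(1, horizon + 1):
    eps = rho * eps + rng.standard_normal(n)
    print(f"E_t eps_(t+{j}): simulated {eps.mean():+.4f}, "
          f"theoretical rho^{j} * eps_t = {rho**j * eps_t:+.4f}")
```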
4.2 The Dynamic Programming Formulation

Once again, for any values of $y_0$ and $\varepsilon_0$, define
$$v(y_0, \varepsilon_0) = \max_{\{z_t\}_{t=0}^{\infty}, \{y_t\}_{t=1}^{\infty}} E_0\sum_{t=0}^{\infty}\beta^t F(y_t, z_t, \varepsilon_t)$$
subject to
$$y_0 \text{ and } \varepsilon_0 \text{ given}$$
$$y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1} \text{ for all } t = 0, 1, 2, \dots \text{ and all } \varepsilon_{t+1}$$

More generally, for any period $t$ and any values of $y_t$ and $\varepsilon_t$, define
$$v(y_t, \varepsilon_t) = \max_{\{z_{t+j}\}_{j=0}^{\infty}, \{y_{t+j}\}_{j=1}^{\infty}} E_t\sum_{j=0}^{\infty}\beta^j F(y_{t+j}, z_{t+j}, \varepsilon_{t+j})$$
subject to
$$y_t \text{ and } \varepsilon_t \text{ given}$$
$$y_{t+j} + Q(y_{t+j}, z_{t+j}, \varepsilon_{t+j+1}) \ge y_{t+j+1} \text{ for all } j = 0, 1, 2, \dots \text{ and all } \varepsilon_{t+j+1}$$

Note once again that the value function is a maximum value function.

Now separate out the time-$t$ components:
$$v(y_t, \varepsilon_t) = \max_{z_t, y_{t+1}}\Big[F(y_t, z_t, \varepsilon_t) + \max_{\{z_{t+j}\}_{j=1}^{\infty}, \{y_{t+j}\}_{j=2}^{\infty}} E_t\sum_{j=1}^{\infty}\beta^j F(y_{t+j}, z_{t+j}, \varepsilon_{t+j})\Big]$$
subject to
$$y_t \text{ and } \varepsilon_t \text{ given}$$
$$y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1} \text{ for all } \varepsilon_{t+1}$$
$$y_{t+j} + Q(y_{t+j}, z_{t+j}, \varepsilon_{t+j+1}) \ge y_{t+j+1} \text{ for all } j = 1, 2, 3, \dots \text{ and all } \varepsilon_{t+j+1}$$

Relabel the time indices:
$$v(y_t, \varepsilon_t) = \max_{z_t, y_{t+1}}\Big[F(y_t, z_t, \varepsilon_t) + \beta\max_{\{z_{t+1+j}\}_{j=0}^{\infty}, \{y_{t+1+j}\}_{j=1}^{\infty}} E_t\sum_{j=0}^{\infty}\beta^j F(y_{t+1+j}, z_{t+1+j}, \varepsilon_{t+1+j})\Big]$$
subject to
$$y_t \text{ and } \varepsilon_t \text{ given}$$
$$y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1} \text{ for all } \varepsilon_{t+1}$$
$$y_{t+1+j} + Q(y_{t+1+j}, z_{t+1+j}, \varepsilon_{t+1+j+1}) \ge y_{t+1+j+1} \text{ for all } j = 0, 1, 2, \dots \text{ and all } \varepsilon_{t+1+j+1}$$

FACT (Law of Iterated Expectations): For any random variable $X_{t+j}$, realized at time $t+j$, $j = 0, 1, 2, \dots$:
$$E_t E_{t+1} X_{t+j} = E_t X_{t+j}.$$

To see why this fact holds true, consider the following example. Suppose $\varepsilon_{t+1}$ follows the first-order autoregression
$$\varepsilon_{t+1} = \rho\varepsilon_t + \eta_{t+1}, \text{ with } E_t\eta_{t+1} = 0.$$
Hence
$$\varepsilon_{t+2} = \rho\varepsilon_{t+1} + \eta_{t+2}, \text{ with } E_{t+1}\eta_{t+2} = 0,$$
or
$$\varepsilon_{t+2} = \rho^2\varepsilon_t + \rho\eta_{t+1} + \eta_{t+2}.$$
It follows that
$$E_{t+1}\varepsilon_{t+2} = E_{t+1}(\rho^2\varepsilon_t + \rho\eta_{t+1} + \eta_{t+2}) = \rho^2\varepsilon_t + \rho\eta_{t+1}$$
and therefore
$$E_t E_{t+1}\varepsilon_{t+2} = E_t(\rho^2\varepsilon_t + \rho\eta_{t+1}) = \rho^2\varepsilon_t.$$
It also follows that
$$E_t\varepsilon_{t+2} = E_t(\rho^2\varepsilon_t + \rho\eta_{t+1} + \eta_{t+2}) = \rho^2\varepsilon_t.$$
So in this case, as in general,
$$E_t E_{t+1}\varepsilon_{t+2} = E_t\varepsilon_{t+2}.$$

Using this fact:
$$v(y_t, \varepsilon_t) = \max_{z_t, y_{t+1}}\Big[F(y_t, z_t, \varepsilon_t) + \beta\max_{\{z_{t+1+j}\}_{j=0}^{\infty}, \{y_{t+1+j}\}_{j=1}^{\infty}} E_t E_{t+1}\sum_{j=0}^{\infty}\beta^j F(y_{t+1+j}, z_{t+1+j}, \varepsilon_{t+1+j})\Big]$$
subject to the same constraints as before.

Now use the definition of $v(y_{t+1}, \varepsilon_{t+1})$ to simplify:
$$v(y_t, \varepsilon_t) = \max_{z_t, y_{t+1}} F(y_t, z_t, \varepsilon_t) + \beta E_t v(y_{t+1}, \varepsilon_{t+1})$$
subject to $y_t$ and $\varepsilon_t$ given and $y_t + Q(y_t, z_t, \varepsilon_{t+1}) \ge y_{t+1}$ for all $\varepsilon_{t+1}$.

Or, even more simply:
$$v(y_t, \varepsilon_t) = \max_{z_t} F(y_t, z_t, \varepsilon_t) + \beta E_t v[y_t + Q(y_t, z_t, \varepsilon_{t+1}), \varepsilon_{t+1}] \quad (22)$$

Equation (22) is the Bellman equation for this stochastic problem. Thus, in order to incorporate uncertainty into the dynamic programming framework, we only need to make two modifications to the Bellman equation:

a) Include the shock $\varepsilon_t$ as an additional argument of the value function.

b) Add the expectation term $E_t$ in front of the value function for $t+1$ on the right-hand side.

Note that the maximization on the right-hand side of (22) is a static optimization problem, involving no dynamic elements. Note also that by substituting the constraint into the value function, we are left with an unconstrained problem. Unlike the Kuhn-Tucker approach, which requires many constraints and many multipliers, dynamic programming in this case has no constraints and no multipliers.

The FOC for $z_t$ is
$$F_2(y_t, z_t, \varepsilon_t) + \beta E_t\{v_1[y_t + Q(y_t, z_t, \varepsilon_{t+1}), \varepsilon_{t+1}]Q_2(y_t, z_t, \varepsilon_{t+1})\} = 0 \quad (23)$$

The envelope condition for $y_t$ is
$$v_1(y_t, \varepsilon_t) = F_1(y_t, z_t, \varepsilon_t) + \beta E_t\{v_1[y_t + Q(y_t, z_t, \varepsilon_{t+1}), \varepsilon_{t+1}][1 + Q_1(y_t, z_t, \varepsilon_{t+1})]\} \quad (24)$$

Equations (23)-(24) coincide exactly with the first-order conditions for $z_t$ and $y_t$ that we would have derived through a direct application of the Kuhn-Tucker theorem to the original dynamic stochastic optimization problem.

Together with the binding constraint
$$y_{t+1} = y_t + Q(y_t, z_t, \varepsilon_{t+1}), \quad (25)$$
we can think of (23)-(25) as forming a system of three equations in two unknown variables $y_t$ and $z_t$ and one unknown function $v$. This system of equations determines the problem's solution, given the behavior of the exogenous shocks $\varepsilon_t$. Note that (25) is in the form of a difference equation; once again, solving a dynamic optimization problem involves solving a difference equation.
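As in the deterministic case, the stochastic Bellman equation (22) can be solved numerically by value function iteration, now carrying the shock as a second argument of $v$ and taking expectations over its transitions. The sketch below does this for a stochastic version of Example 1 (log utility, full depreciation, and a two-state Markov productivity shock; all functional forms and parameter values are illustrative assumptions). Under these assumptions the policy is known in closed form, $c = (1-\alpha\beta)A k^{\alpha}$, which the numerical policy should approximate.

```python
import numpy as np

alpha, beta = 0.33, 0.99
A_states = np.array([0.90, 1.05])            # two-state productivity shock (assumed)
P = np.array([[0.8, 0.2],                    # Markov transition probabilities
              [0.2, 0.8]])
grid = np.linspace(1e-3, 0.6, 400)           # grid for the capital stock

v = np.zeros((2, grid.size))                 # v(eps_i, k), initial guess 0
for _ in range(5000):
    Ev = P @ v                               # E_t v(., eps') for each current state
    v_new, pol = np.empty_like(v), np.empty_like(v)
    for i, A in enumerate(A_states):
        k_next = A * grid[:, None]**alpha - grid[None, :]   # rows: k, cols: c
        feas = k_next > 0
        cont = np.interp(np.clip(k_next, grid[0], grid[-1]), grid, Ev[i])
        val = np.where(feas, np.log(grid[None, :]) + beta * cont, -np.inf)
        v_new[i] = val.max(axis=1)
        pol[i] = grid[val.argmax(axis=1)]
    if np.max(np.abs(v_new - v)) < 1e-7:
        break
    v = v_new

# Compare with the known closed form c = (1 - alpha*beta) * A * k^alpha
k_mid = grid[200]
for i, A in enumerate(A_states):
    print(f"A = {A}: numerical c = {pol[i][200]:.4f}, "
          f"closed form = {(1 - alpha*beta) * A * k_mid**alpha:.4f}")
```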
5 Example 3: Saving with Multiple Random Returns

This example extends Example 2 by:

a) introducing $n \ge 1$ assets, and

b) allowing the returns on each asset to be random.

As in Example 2, we will not be able to solve explicitly for the value function, but we will be able to learn enough about its properties to derive some useful economic results.

Since we are extending the example in two ways, assume for simplicity that the consumer receives no labor income, and therefore must finance all of his or her consumption by investing.

$A_t$ = beginning-of-period financial wealth
$c_t$ = consumption
$s_{it}$ = savings allocated to asset $i = 1, 2, \dots, n$

Hence,
$$A_t = c_t + \sum_{i=1}^{n}s_{it}$$

$R_{i,t+1}$ = random gross return on asset $i$, not known until $t+1$. Hence, when $s_{it}$ is chosen, $R_{it}$ is known, but $R_{i,t+1}$ is still viewed as random. Hence
$$A_{t+1} = \sum_{i=1}^{n}R_{i,t+1}s_{it}$$
does not become known until the beginning of $t+1$, even though the $s_{it}$ must be chosen during $t$.

Utility:
$$E_0\sum_{t=0}^{\infty}\beta^t u(c_t) = E_0\sum_{t=0}^{\infty}\beta^t u\Big(A_t - \sum_{i=1}^{n}s_{it}\Big)$$

The problem can now be stated as: choose contingency plans for $s_{it}$ for all $i = 1, 2, \dots, n$ and $t = 0, 1, 2, \dots$, and for $A_t$ for all $t = 1, 2, 3, \dots$, to maximize
$$E_0\sum_{t=0}^{\infty}\beta^t u\Big(A_t - \sum_{i=1}^{n}s_{it}\Big)$$
subject to $A_0$ given and
$$\sum_{i=1}^{n}R_{i,t+1}s_{it} \ge A_{t+1}$$
for all $t = 0, 1, 2, \dots$ and all possible realizations of $R_{i,t+1}$ for each $i = 1, 2, \dots, n$.

As in the general case, the returns can be serially correlated, but must have the Markov property.

To solve this problem via dynamic programming, let

$A_t$ = state variable
$s_{it}$, $i = 1, 2, \dots, n$ = control variables
$R_t = [R_{1t}, R_{2t}, \dots, R_{nt}]$ = vector of random returns

The Bellman equation is
$$v(A_t, R_t) = \max_{\{s_{it}\}_{i=1}^{n}} u\Big(A_t - \sum_{i=1}^{n}s_{it}\Big) + \beta E_t v\Big(\sum_{i=1}^{n}R_{i,t+1}s_{it}, R_{t+1}\Big)$$

FOC for each $i = 1, 2, \dots, n$:
$$-u'\Big(A_t - \sum_{i=1}^{n}s_{it}\Big) + \beta E_t R_{i,t+1}v_1\Big(\sum_{i=1}^{n}R_{i,t+1}s_{it}, R_{t+1}\Big) = 0$$

Envelope condition:
$$v_1(A_t, R_t) = u'\Big(A_t - \sum_{i=1}^{n}s_{it}\Big)$$

Use the constraints to rewrite the FOC and envelope condition more simply as
$$u'(c_t) = \beta E_t R_{i,t+1}v_1(A_{t+1}, R_{t+1}) \text{ for all } i = 1, 2, \dots, n$$
and
$$v_1(A_t, R_t) = u'(c_t)$$

Since the envelope condition must hold for all $t = 0, 1, 2, \dots$, it implies
$$v_1(A_{t+1}, R_{t+1}) = u'(c_{t+1})$$
Hence, the FOC imply that
$$u'(c_t) = \beta E_t R_{i,t+1}u'(c_{t+1}) \quad (26)$$
must hold for all $i = 1, 2, \dots, n$.

Equation (26) generalizes (19) to the case in which there is more than one asset and the asset returns are random. It must hold for all assets $i = 1, 2, \dots, n$, even though each asset may pay a different return ex post.

In Example 2, we combined (19) with some additional assumptions to derive a version of the permanent income hypothesis. Similarly, we can use (26) to derive a version of the famous capital asset pricing model.

For simplicity, let
$$m_{t+1} = \frac{\beta u'(c_{t+1})}{u'(c_t)}$$
denote the consumer's intertemporal marginal rate of substitution. Then (26) can be written more simply as
$$1 = E_t R_{i,t+1}m_{t+1} \quad (27)$$

Keeping in mind that (27) must hold for all assets, suppose that there is a risk-free asset, with return $R^f_{t+1}$ that is known during period $t$. Then $R^f_{t+1}$ must satisfy
$$1 = R^f_{t+1}E_t m_{t+1}$$
or
$$E_t m_{t+1} = \frac{1}{R^f_{t+1}} \quad (28)$$
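A quick Monte Carlo sketch of (27) and (28), under illustrative assumptions that are not part of the notes (CRRA utility with curvature $\gamma$ and lognormal consumption growth): it builds the intertemporal marginal rate of substitution from simulated consumption growth and backs out the implied risk-free rate.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, gamma = 0.99, 2.0                 # discount factor; CRRA curvature (assumed)

# Assume lognormal consumption growth: ln(c'/c) ~ N(mu, sigma^2)
mu, sigma = 0.02, 0.03
growth = np.exp(mu + sigma * rng.standard_normal(1_000_000))

# m_{t+1} = beta * u'(c_{t+1}) / u'(c_t) = beta * growth^(-gamma) for CRRA u
m = beta * growth**(-gamma)

Rf = 1 / m.mean()                       # equation (28): E_t m = 1 / R^f
print(f"risk-free gross return R^f = {Rf:.4f}")

# Any return satisfying (27) must obey 1 = E[R m]; check it for R = R^f:
print(f"E[R^f * m] = {(Rf * m).mean():.6f}  (should equal 1)")
```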
FACT: For any two random variables $x$ and $y$,
$$\text{cov}(x, y) = E[(x - \mu_x)(y - \mu_y)], \text{ where } \mu_x = E(x) \text{ and } \mu_y = E(y).$$
Hence,
$$\text{cov}(x, y) = E[xy - \mu_x y - x\mu_y + \mu_x\mu_y] = E(xy) - \mu_x\mu_y - \mu_x\mu_y + \mu_x\mu_y = E(xy) - E(x)E(y)$$
or, by rearranging,
$$E(xy) = E(x)E(y) + \text{cov}(x, y)$$

Using this fact, (27) can be rewritten as
$$1 = E_t R_{i,t+1}m_{t+1} = E_t R_{i,t+1}E_t m_{t+1} + \text{cov}_t(R_{i,t+1}, m_{t+1})$$
or, using (28),
$$R^f_{t+1} = E_t R_{i,t+1} + R^f_{t+1}\text{cov}_t(R_{i,t+1}, m_{t+1})$$
$$E_t R_{i,t+1} - R^f_{t+1} = -R^f_{t+1}\text{cov}_t(R_{i,t+1}, m_{t+1}) \quad (29)$$

Equation (29) indicates that the expected return on asset $i$ exceeds the risk-free rate only if $R_{i,t+1}$ is negatively correlated with $m_{t+1}$. Does this make sense?

Consider an asset that acts like insurance: it pays a high return $R_{i,t+1}$ during bad economic times, when consumption $c_{t+1}$ is low. Therefore, for this asset:
$$\text{cov}_t(R_{i,t+1}, c_{t+1}) < 0 \Rightarrow \text{cov}_t[R_{i,t+1}, u'(c_{t+1})] > 0 \Rightarrow \text{cov}_t(R_{i,t+1}, m_{t+1}) > 0 \Rightarrow E_t R_{i,t+1} < R^f_{t+1}.$$
This implication seems reasonable: assets that work like insurance often have expected returns below the risk-free return.

Common stocks, by contrast, tend to pay a high return $R_{i,t+1}$ during good economic times, when consumption $c_{t+1}$ is high. Therefore, for stocks:
$$\text{cov}_t(R_{i,t+1}, c_{t+1}) > 0 \Rightarrow \text{cov}_t[R_{i,t+1}, u'(c_{t+1})] < 0 \Rightarrow \text{cov}_t(R_{i,t+1}, m_{t+1}) < 0 \Rightarrow E_t R_{i,t+1} > R^f_{t+1}.$$
This implication also seems to hold true: historically, stocks have had expected returns above the risk-free return.

Recalling once more that (29) must hold for all assets, consider in particular the asset whose return happens to coincide exactly with the representative consumer's intertemporal marginal rate of substitution:
$$R^m_{t+1} = m_{t+1}.$$
For this asset, equation (29) implies
$$E_t R^m_{t+1} - R^f_{t+1} = -R^f_{t+1}\text{cov}_t(R^m_{t+1}, m_{t+1})$$
or
$$E_t m_{t+1} - R^f_{t+1} = -R^f_{t+1}\text{cov}_t(m_{t+1}, m_{t+1}) = -R^f_{t+1}\text{var}_t(m_{t+1})$$
so that
$$-R^f_{t+1} = \frac{E_t m_{t+1} - R^f_{t+1}}{\text{var}_t(m_{t+1})} \quad (30)$$

Substitute (30) into the right-hand side of (29) to obtain
$$E_t R_{i,t+1} - R^f_{t+1} = \frac{\text{cov}_t(R_{i,t+1}, m_{t+1})}{\text{var}_t(m_{t+1})}(E_t m_{t+1} - R^f_{t+1})$$
or
$$E_t R_{i,t+1} - R^f_{t+1} = b_{it}(E_t m_{t+1} - R^f_{t+1}), \quad (31)$$
where
$$b_{it} = \frac{\text{cov}_t(R_{i,t+1}, m_{t+1})}{\text{var}_t(m_{t+1})}$$
is like the slope coefficient from a regression of $R_{i,t+1}$ on $m_{t+1}$.

Equation (31) is a statement of the consumption-based capital asset pricing model, or consumption CAPM. This model links the expected return on each asset to the risk-free rate and the representative consumer's intertemporal marginal rate of substitution.
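As a closing numerical check on (29)-(31), the sketch below builds an illustrative SDF, prices a pro-cyclical asset so that (27) holds in-sample by construction, and verifies that the expected excess return equals both $-R^f\text{cov}_t(R, m)$ and the beta representation in (31). One detail: the derivation of (30) presumes an asset whose return equals $m$, which by (27) requires $E_t m^2 = 1$, so the illustrative $m$ is rescaled to satisfy that condition. All distributions and parameters here are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Illustrative SDF from lognormal consumption growth with CRRA utility.
g = np.exp(0.02 + 0.03 * rng.standard_normal(n))
m = 0.99 * g**(-2.0)
# Rescale so that m itself is a valid return (E[m^2] = 1), as (30) assumes.
m /= np.sqrt((m**2).mean())

Rf = 1 / m.mean()                                   # equation (28)

# A pro-cyclical 'stock' (assumed payoff process), priced so 1 = E[R m] holds.
payoff = g * np.exp(0.10 * rng.standard_normal(n))
R = payoff / (m * payoff).mean()

cov_Rm = (R * m).mean() - R.mean() * m.mean()
var_m = (m**2).mean() - m.mean()**2
b = cov_Rm / var_m                                  # regression slope of R on m

print(f"E[R] - Rf        = {R.mean() - Rf:+.6f}")
print(f"-Rf cov(R, m)    = {-Rf * cov_Rm:+.6f}   # equation (29)")
print(f"b * (E[m] - Rf)  = {b * (m.mean() - Rf):+.6f}   # equation (31)")
```

All three printed numbers coincide, and the excess return is positive, consistent with the discussion of stocks above.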