Dynamic Programming
Econ720. Fall 2009. Lutz Hendricks.
1 Basic Idea
We are given a dynamic optimization problem in discrete time of the form
\[
\max \sum_{t=1}^{T} u(k_t, c_t, t) + V_{T+1}(k_{T+1})
\]
subject to
\[
k_{t+1} = g(k_t, c_t, t)
\]
with the initial condition $k_0$ given and the terminal value of $k$ given, call it $V_{T+1}(k_{T+1})$. We call the variable with the equation of motion the state variable and the other choice variable the control variable.
One way of solving this problem is to set up a Lagrangean
\[
\Gamma = \max \sum_{t=1}^{T} u(k_t, c_t, t) + \sum_{t=1}^{T} \lambda_t \left[ g(k_t, c_t, t) - k_{t+1} \right] + V_{T+1}(k_{T+1}) \tag{1}
\]
The first sum is simply the objective function, while the second sum is a convenient way of collecting all constraints. The FOCs are
\begin{align*}
u_c(t) &= -\lambda_t g_c(t) \\
u_k(t) &= -\lambda_t g_k(t) + \lambda_{t-1} \\
\lambda_T &= V'_{T+1}(k_{T+1})
\end{align*}
This is a perfectly good approach, but DP is an alternative that is sometimes more convenient. The basic idea is to restart the problem at some date $\tau$. First note that we can think of the maximized Lagrangean as an indirect utility function. After maximizing out all future values of $c$ and $k$, $\Gamma$ becomes a function only of $k_\tau$ and $\tau$, which we write as $V_\tau(k_\tau)$. Now move to period $\tau + 1$ and restart the problem again.
Given a suitable structure of the problem (whatever that means!), the solution to the period $\tau + 1$ problem will be the same as that of the period $\tau$ problem, except for the period $\tau$ values of course, which are in the past from the $\tau + 1$ perspective. In other words, if the solution to the date $\tau$ problem is $(\hat{c}_t, \hat{k}_{t+1})$, $t = \tau, \ldots, T$, then the solution to the date $\tau + 1$ problem is $(\hat{c}_t, \hat{k}_{t+1})$, $t = \tau + 1, \ldots, T$. The decision maker does not revise his date $\tau$ plan at some later time. The problem is time consistent.
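Time consistency can be checked numerically on a small example. The sketch below is not from the notes: it assumes a hypothetical cake-eating problem with $u = \sqrt{c}$, $g = k - c$, $V_{T+1} = 0$, integer consumption, $T = 4$ and an initial stock of 8. It finds the best plan by brute force, then restarts the problem one period later from the stock that plan leaves behind.

```python
from itertools import product
from math import sqrt

T = 4  # last decision date (hypothetical value)

def u(k, c, t):
    """Period utility; the square-root form is an assumption for illustration."""
    return sqrt(c)

def g(k, c, t):
    """Equation of motion: next period's stock is whatever is not consumed."""
    return k - c

def terminal_value(k):
    """V_{T+1}(k_{T+1}); leftover stock is assumed to be worthless."""
    return 0.0

def best_plan(k_start, start):
    """Search all integer consumption paths for dates start..T by brute force."""
    best_value, best_path = float("-inf"), None
    for path in product(range(k_start + 1), repeat=T - start + 1):
        k, value, feasible = k_start, 0.0, True
        for offset, c in enumerate(path):
            if c > k:  # cannot consume more than the current stock
                feasible = False
                break
            value += u(k, c, start + offset)
            k = g(k, c, start + offset)
        if feasible and value + terminal_value(k) > best_value:
            best_value, best_path = value + terminal_value(k), path
    return best_value, best_path

# Plan made at date 1, then the problem restarted at date 2.
_, plan_from_1 = best_plan(8, 1)
k_2 = g(8, plan_from_1[0], 1)
_, plan_from_2 = best_plan(k_2, 2)
print(plan_from_1, plan_from_2, plan_from_2 == plan_from_1[1:])
```

In this example the restarted plan coincides with the tail of the original plan, which is what time consistency requires. The stock of 8 is chosen so that the optimum is unique; with ties, the two searches could return different but equally good plans.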
For this to work, we must be able to rearrange the terms in the Lagrangean so that the problem is divided into two sets of equations. The first set only contains variables prior to date $\tau$ and $k_\tau$. The second set contains only variables from date $\tau$ onwards. Here:
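\[
\begin{aligned}
\Gamma = \max\; & \sum_{t=1}^{\tau-1} u(k_t, c_t, t) + \sum_{t=1}^{\tau-1} \lambda_t \left[ g(k_t, c_t, t) - k_{t+1} \right] \\
& + \sum_{t=\tau}^{T} u(k_t, c_t, t) + \sum_{t=\tau}^{T} \lambda_t \left[ g(k_t, c_t, t) - k_{t+1} \right] \\
& + V_{T+1}(k_{T+1})
\end{aligned}
\]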
More generally, for any earlier date $s < \tau$,
\[
\Gamma_{s}(k_{s}) = \max \sum_{t=s}^{\tau-1} u(k_t, c_t, t) + \sum_{t=s}^{\tau-1} \lambda_t \left[ g(k_t, c_t, t) - k_{t+1} \right] + \Gamma_{\tau}(k_{\tau}) \tag{2}
\]
If this is true, then we can write the date $\tau$ problem as
\[
V_{\tau}(k_{\tau}) = \max_{c_{\tau},\, k_{\tau+1}} \left\{ u(k_{\tau}, c_{\tau}, \tau) + \lambda_{\tau} \left[ g(k_{\tau}, c_{\tau}, \tau) - k_{\tau+1} \right] + V_{\tau+1}(k_{\tau+1}) \right\} \tag{3}
\]
You can convince yourself that this is true by expanding $V(\cdot)$ on the right hand side repeatedly. You will recover exactly (1), except that there is a "max" for each date. But since the decision maker does not change his mind, these "max"s are irrelevant.
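The claim that expanding the recursion recovers (1) can also be checked numerically. The sketch below is an illustration, not the author's method: it assumes a hypothetical cake-eating problem ($u = \sqrt{c}$, $g = k - c$, $V_{T+1} = 0$, integer stocks up to 8, $T = 4$). The constraint is substituted directly into the objective, so the multiplier term in (3) drops out. The recursion is solved backwards from $T$ and compared with a direct maximization over whole consumption paths.

```python
from itertools import product
from math import sqrt, isclose

T = 4      # last decision date (hypothetical value)
K_MAX = 8  # largest stock on the grid (hypothetical value)

def u(k, c, t):
    """Period utility; the square-root form is an assumption for illustration."""
    return sqrt(c)

def g(k, c, t):
    """Equation of motion: next period's stock is whatever is not consumed."""
    return k - c

def terminal_value(k):
    """V_{T+1}(k_{T+1}); leftover stock is assumed to be worthless."""
    return 0.0

def backward_induction():
    """Solve V_t(k) = max_c { u(k, c, t) + V_{t+1}(g(k, c, t)) } for t = T, ..., 1."""
    V = {T + 1: {k: terminal_value(k) for k in range(K_MAX + 1)}}
    policy = {}
    for t in range(T, 0, -1):
        V[t], policy[t] = {}, {}
        for k in range(K_MAX + 1):
            # One "max" per date: current utility plus next period's value.
            candidates = [(u(k, c, t) + V[t + 1][g(k, c, t)], c) for c in range(k + 1)]
            V[t][k], policy[t][k] = max(candidates)
    return V, policy

def sequence_value(k_start):
    """Maximize the whole sum in one step by searching all feasible paths."""
    best = float("-inf")
    for path in product(range(k_start + 1), repeat=T):
        k, value = k_start, 0.0
        for t, c in enumerate(path, start=1):
            if c > k:  # infeasible path, skip it
                break
            value += u(k, c, t)
            k = g(k, c, t)
        else:
            best = max(best, value + terminal_value(k))
    return best

V, policy = backward_induction()
print(isclose(V[1][K_MAX], sequence_value(K_MAX)))
```

The recursion evaluates each (date, stock) pair once, while the direct search enumerates every path, which is one practical reason the recursive form can be more convenient.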
The interpretation is simple. Given that behavior is optimal from date $\tau + 1$ onwards, the value of $k_{\tau+1}$ is given by the indirect utility function $V_{\tau+1}(k_{\tau+1})$. The date $\tau$ decision can then be made by trading off current utility against next period utility.
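This trade-off can be made explicit by taking first-order conditions of (3). The derivation below is a sketch under two assumptions that the text does not state: $V_{\tau+1}$ is differentiable and the solution is interior. The last line is the envelope condition, obtained by differentiating the maximized right hand side of (3) with respect to $k_\tau$.

```latex
\begin{align*}
c_{\tau}: &\quad u_c(\tau) + \lambda_{\tau} g_c(\tau) = 0 \\
k_{\tau+1}: &\quad \lambda_{\tau} = V'_{\tau+1}(k_{\tau+1}) \\
\text{envelope}: &\quad V'_{\tau}(k_{\tau}) = u_k(\tau) + \lambda_{\tau} g_k(\tau)
\end{align*}
```

Dating the second condition one period earlier gives $\lambda_{\tau-1} = V'_{\tau}(k_{\tau})$, so the envelope condition can be written as $u_k(\tau) = -\lambda_{\tau} g_k(\tau) + \lambda_{\tau-1}$. Together with the first line, these are the same FOCs as those of the Lagrangean (1), and at $\tau = T$ the second line reproduces $\lambda_T = V'_{T+1}(k_{T+1})$.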