Dynamic Programming Lecture #14
Outline:
Inventory Control
Inventory Control DP
$x_{k+1} = x_k + u_k - w_k$
x = inventory if $x > 0$, backlog if $x < 0$
u = production
w = demand
Define
$r(z) = \begin{cases} h\,|z| & z > 0 \\ p\,|z| & z < 0 \end{cases}$
Total cost:
$\sum_{k=0}^{N-1} \left[ r(x_k + u_k - w_k) + c\,u_k \right]$
production co
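A minimal sketch of how the total cost above could be evaluated along one trajectory, assuming the sequences of production and demand are known. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def r(z, h, p):
    """Piecewise-linear penalty: h*|z| if z > 0 (holding), p*|z| if z < 0 (backlog)."""
    if z > 0:
        return h * abs(z)
    if z < 0:
        return p * abs(z)
    return 0.0


def total_cost(x0, u, w, h, p, c):
    """Sum of r(x_k + u_k - w_k) + c*u_k for k = 0..N-1 along one trajectory."""
    assert len(u) == len(w)
    x = x0
    cost = 0.0
    for uk, wk in zip(u, w):
        x_next = x + uk - wk              # x_{k+1} = x_k + u_k - w_k
        cost += r(x_next, h, p) + c * uk  # stage penalty plus production cost
        x = x_next
    return cost, x


# Hypothetical data: 3 stages, holding cost 1, backlog cost 3, unit production cost 0.5.
cost, x_final = total_cost(x0=0, u=[2, 0, 1], w=[1, 2, 1], h=1.0, p=3.0, c=0.5)
print(cost, x_final)
```

With these numbers the inventory goes 1, -1, -1, so the backlog penalty dominates the total.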
Dynamic Programming Lecture #10
Outline:
Stochastic DP
LQ Optimal Control
Finite State Markov Chains
Stochastic DP
System:
$x_{k+1} = f_k(x_k, u_k, w_k)$
$x_k \in S_k$,
$u_k \in U_k(x_k)$,
$w_k \in W_k(x_k)$
Assume:
$w_k$ is an RV on some probability space $\Omega_k$
Probability functi
Dynamic Programming Lecture #11
Outline:
Finite State Markov Chains
Viterbi Algorithm
Finite State Markov Chains
Restrict attention to finite state systems:
$x_k \in X = \{1, 2, \ldots, N\}$
Simple example: Machine up/down dynamics:
[Figure: state transition diagram for the up/down machine; surviving labels: fail, no fail, up, no repair]
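A minimal sketch of the up/down machine as a two-state Markov chain. The failure and repair probabilities are not given in this excerpt; the values below are assumptions for illustration only.

```python
def step(dist, P):
    """One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


# States: 0 = up, 1 = down. Both probabilities are assumed, not from the lecture.
P_FAIL = 0.1     # up -> down
P_REPAIR = 0.5   # down -> up
P = [
    [1 - P_FAIL, P_FAIL],        # from up: no fail / fail
    [P_REPAIR, 1 - P_REPAIR],    # from down: repair / no repair
]

dist = [1.0, 0.0]   # start in "up" with certainty
for _ in range(50):
    dist = step(dist, P)
print(dist)
```

Under these assumed values the distribution settles near P_REPAIR / (P_FAIL + P_REPAIR) for the up state.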
Dynamic Programming Lecture #12
Outline:
Viterbi algorithm
Controlled Markov chains
Viterbi Algorithm
Given an observation sequence:
$Z = \{z_1, z_2, \ldots, z_N\}$
what is the most likely state sequence?
$X = \{x_0, x_1, \ldots, x_N\}$
i.e.,
$X^* = \arg\max_X \Pr(X \mid Z)$
P
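A minimal sketch of the maximization above, assuming a hidden Markov model with a prior on $x_0$, a transition matrix, and an observation matrix. None of these are given in this excerpt, so the names `prior`, `A`, `B` and all numbers are hypothetical.

```python
import math


def _log(p):
    """Log-probability, with log(0) mapped to -inf."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(prior, A, B, obs):
    """Most likely x_0..x_N given z_1..z_N (assumed HMM structure).

    prior[i] = P(x_0 = i)
    A[i][j]  = P(x_{k+1} = j | x_k = i)
    B[j][z]  = P(z_{k+1} = z | x_{k+1} = j)
    """
    n = len(prior)
    score = [_log(p) for p in prior]   # best log-prob of a path ending in each state
    back = []                          # back[k][j] = best predecessor of state j
    for z in obs:
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + _log(A[i][j]))
            new_score.append(score[best_i] + _log(A[best_i][j]) + _log(B[j][z]))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=lambda j: score[j])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical two-state model with "sticky" states and noisy observations.
prior = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
path = viterbi(prior, A, B, obs=[0, 0, 1, 1])
print(path)
```

Working in log-probabilities turns the product along a path into a sum, which is what lets the search be organized as a stage-by-stage DP.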
Dynamic Programming Lecture #8
Outline:
Probability review, cont
Expectation
Law of large numbers
Stochastic DP
Expectation
Define: The expected value of X
$E[X] = \sum_x x\, p_X(x)$
Expected value is a weighted sum:
All values $X(\omega)$ can take.
Weighted by th
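A small sketch of the weighted sum for a discrete random variable, using a fair six-sided die as an assumed example. The helper name `expectation` is hypothetical.

```python
from fractions import Fraction


def expectation(pmf):
    """E[X] = sum over x of x * p_X(x), for a PMF given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


# Assumed example: fair die, each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}
mean = expectation(die)
print(mean)
```

Exact fractions are used here so the result is not affected by floating-point rounding.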
Dynamic Programming Lecture #9
Outline:
Stochastic DP
Examples
Repeated Prisoner's Dilemma
Stochastic DP
System:
$x_{k+1} = f_k(x_k, u_k, w_k)$
$x_k \in S_k$,
$u_k \in U_k(x_k)$,
$w_k \in W_k(x_k)$
Assume:
$w_k$ is an RV on some probability space $\Omega_k$
Probability function $p(w_k)$ c
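A minimal sketch of an expected-cost backward recursion for a system of this form. The cost structure is not shown in this excerpt, so the stage cost `g`, the terminal cost `gN`, the time-invariant simplification, and the toy inventory data are all assumptions.

```python
def stochastic_dp(states, controls, pw, f, g, gN, N):
    """Backward recursion J_k(x) = min_u sum_w p(w) [g(x,u,w) + J_{k+1}(f(x,u,w))].

    states      : finite list of states
    controls(x) : admissible controls U(x)
    pw          : dict {w: probability}
    f, g, gN    : dynamics, stage cost, terminal cost (assumed, time-invariant)
    """
    J = {x: gN(x) for x in states}
    policy = []
    for _ in range(N):
        J_new, mu = {}, {}
        for x in states:
            best_u, best_val = None, float("inf")
            for u in controls(x):
                val = sum(p * (g(x, u, w) + J[f(x, u, w)]) for w, p in pw.items())
                if val < best_val:
                    best_u, best_val = u, val
            J_new[x], mu[x] = best_val, best_u
        J = J_new
        policy.insert(0, mu)   # policy[k] maps state -> control at stage k
    return J, policy


# Hypothetical toy problem: stock levels 0..2, order 0 or 1 unit, demand 0 or 1.
states = [0, 1, 2]
controls = lambda x: [u for u in (0, 1) if x + u <= 2]
pw = {0: 0.5, 1: 0.5}
f = lambda x, u, w: min(max(x + u - w, 0), 2)      # unmet demand is lost
g = lambda x, u, w: 1 * u + 4 * max(0, w - x - u)  # order cost + shortage penalty
gN = lambda x: 0

J, policy = stochastic_dp(states, controls, pw, f, g, gN, N=2)
print(J, policy[0])
```

In this toy instance the recursion orders one unit only when the stock is empty, since the assumed shortage penalty outweighs the order cost.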
Dynamic Programming Lecture #7
Outline:
Probability review, cont
Independent events
Conditional independence
Mutual independence
Bernoulli trials
Random variables
Expectation
Independent Events
Define: Events A and B are independent if
$P[A \cap B] = P[A]\,P[B]$
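A small sketch that checks the product condition by enumeration, assuming a sample space of two fair dice. The events `A`, `B`, `C` and the helper names are hypothetical examples, not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: two fair dice, all 36 outcomes equally likely.
omega = list(product(range(1, 7), repeat=2))


def prob(event):
    """P[event] under the uniform measure on omega."""
    return Fraction(sum(1 for o in omega if event(o)), len(omega))


def independent(E, F):
    """True if P[E and F] == P[E] * P[F]."""
    return prob(lambda o: E(o) and F(o)) == prob(E) * prob(F)


A = lambda o: o[0] % 2 == 0        # first die is even
B = lambda o: o[1] >= 5            # second die shows 5 or 6
C = lambda o: o[0] + o[1] >= 10    # sum is at least 10

print(independent(A, B), independent(A, C))
```

Events about different dice pass the check, while an event that couples the two dice (the sum) does not.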
Dynamic Programming Lecture #5
Outline:
Worst case DP
Stochastic DP preview
Probability review
Probability space
Conditional probabilities
Value Iteration
Setup:
$x_{k+1} = f_k(x_k, u_k, w_k)$
$u_k \in U_k(x_k)$
$w_k \in W_k(x_k, u_k)$
Cost of policy:
N 1
J (x0 ) = w
Dynamic Programming Lecture #6
Outline:
Stochastic DP preview
Probability review
Probability space
Conditional probabilities
Total probability
Bayes rule
Independent events
Stochastic DP Preview
Objective: Study systems with random phenomena.
Exa
Dynamic Programming Lecture #4
Outline:
Deterministic DP Review
Variations
Worst case DP
Deterministic DP Review
System:
$x_{k+1} = f_k(x_k, u_k)$
State: $x_k \in S_k$
Control (decision): $u_k \in U_k(x_k)$
Policy shorthand: $\pi = \{\mu_0, \mu_1, \ldots, \mu_{N-1}\}$
$\mu_k : x_k \mapsto u_k \in U_k(x_k)$
C
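A minimal sketch of what a policy means operationally: a list of N maps from state to control, applied stage by stage to the deterministic dynamics. The scalar system and the halving rule below are hypothetical, chosen only to make the rollout easy to follow.

```python
def rollout(x0, policy, f):
    """Apply policy = [mu_0, ..., mu_{N-1}] to x_{k+1} = f(k, x_k, u_k)."""
    xs, us = [x0], []
    for k, mu in enumerate(policy):
        u = mu(xs[-1])              # u_k = mu_k(x_k)
        us.append(u)
        xs.append(f(k, xs[-1], u))  # next state from the dynamics
    return xs, us


# Hypothetical system x_{k+1} = x_k + u_k with a rule that removes half the state.
f = lambda k, x, u: x + u
policy = [lambda x: -(x // 2)] * 3
xs, us = rollout(8, policy, f)
print(xs, us)
```

Because each control is computed from the current state, the same policy produces a different control sequence for a different initial state.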
Dynamic Programming Lecture #1
Outline:
Problem formulation(s)
Examples
Motivation: Staged Optimization
Q: How to formulate optimization for problems that occur in stages?
Standard Optimization:
$\min_\theta J(\theta)$
Cost function: $J(\theta) = J(\theta_1, \ldots, \theta_n)$
Variabl