EE266, Spring 2014-15
Professor S. Lall
EE266 Homework 2 Solutions
1. Monte Carlo integration. Consider a unit circle inscribed in a square, as shown below.
[Figure: unit circle inscribed in a square; horizontal axis x1 and vertical axis x2, each running from -1 to +1]
Each of the small circles drawn on this figure represents a random point that was
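A minimal sketch of the idea behind this problem, assuming the points are drawn independently and uniformly from the square [-1, 1] x [-1, 1] (the function name, sample count, and seed are illustrative and not taken from the assignment). The fraction of points that land inside the circle, multiplied by the square's area of 4, estimates the area of the unit circle.

```python
import random

def estimate_circle_area(num_points, seed=0):
    """Estimate the area of the unit circle by uniform sampling of the square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(num_points):
        x1 = rng.uniform(-1.0, 1.0)
        x2 = rng.uniform(-1.0, 1.0)
        if x1 * x1 + x2 * x2 <= 1.0:  # point falls inside the unit circle
            inside += 1
    # area of square (4) times the fraction of points inside the circle
    return 4.0 * inside / num_points

if __name__ == "__main__":
    print(estimate_circle_area(200_000))
```

The estimate's standard deviation shrinks like 1/sqrt(num_points), so each extra digit of accuracy costs roughly 100 times more samples.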

EE365: The Bellman-Ford Algorithm
Shortest path problems
- given weighted graph and a destination vertex
- find lowest cost path from every vertex to destination
[Figure: example weighted graph with edge costs]
Dynamic programming principle
- let $g_{ij}$ = cost

EE365: Markov Chains
Markov chains
Transition Matrices
Distribution Propagation
Other Models
Markov chains
- a model for dynamical systems with possibly uncertain transitions
- very widely used, in many application areas
- one of a handf

EE365: Costs and Rewards
Costs and rewards
Value iteration
Costs and rewards in a Markov chain
- associate costs (or rewards; more generally, just a function) with Markov chain $x_0, \ldots, x_T$
- $g_t : X \to \mathbf{R}$ is the stage cost function

EE365: Probability and Monte Carlo
Notation
- in this course, random variables will take values in a finite set $X$
- we will use multiple styles of notation
- e.g., we switch between linear algebra notation and function notation
Abstract notation
- ran

EE266, Spring 2015-16
Professor S. Lall
EE266 Homework 2
1. Monte Carlo integration. Consider a unit circle inscribed in a square, as shown below.
[Figure: unit circle inscribed in a square; horizontal axis x1 and vertical axis x2, each running from -1 to +1]
Each of the small circles drawn on this figure represents a random point that was
generate

EE365: Code for Dynamic Programming
Example: Inventory model
- inventory level $x_t \in \{0, 1, \ldots, C\}$
- new stock added $u_t \in \{0, 1, \ldots, C\}$
- $x_{t+1} = x_t - w_t + u_t$
- demand $\mathbf{Prob}(w_t = 0, 1, 2) = (0.7, 0.2, 0.1)$
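The inventory dynamics above can be simulated in a few lines. This is only a sketch under stated assumptions: the ordering policy `refill_when_low` is hypothetical (the excerpt does not show the course's policy), and the clipping of the stock level to the range 0 to C is an added safeguard, not something the excerpt states.

```python
import random

DEMAND_VALUES = (0, 1, 2)
DEMAND_PROBS = (0.7, 0.2, 0.1)  # Prob(w_t = 0, 1, 2) from the model above

def refill_when_low(x, capacity):
    """Hypothetical policy: refill to capacity when stock is 0 or 1."""
    return capacity - x if x <= 1 else 0

def simulate_inventory(x0, capacity, horizon, policy=refill_when_low, seed=0):
    """Simulate x_{t+1} = x_t - w_t + u_t and return [x_0, ..., x_horizon]."""
    rng = random.Random(seed)
    x = x0
    trajectory = [x]
    for _ in range(horizon):
        u = policy(x, capacity)                                  # new stock added
        w = rng.choices(DEMAND_VALUES, weights=DEMAND_PROBS)[0]  # random demand
        # assumption: keep the level inside {0, ..., capacity}
        x = min(max(x - w + u, 0), capacity)
        trajectory.append(x)
    return trajectory

if __name__ == "__main__":
    print(simulate_inventory(x0=6, capacity=6, horizon=20))
```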
Example: Inventory model with ordering pol

EE266, Spring 2015-16
Professor S. Lall
EE266 Homework 3
1. Managing a data center. You are the manager of a data center offering a particular
service to customers (e.g., computing power, file retrieval, serving web pages). In this
problem we consider a ver

EE266, Spring 2015-16
Professor S. Lall
EE266 Homework 1
1. Optimal disposition of a stock. You must sell a total amount B > 0 of a stock in two
rounds. In each round you can sell any nonnegative amount of the stock; by the second
round all of the initial

EE365: Value
Value function
- suppose you will receive a reward $g(x_1)$ depending on the state at $t = 1$
- how much should you pay at time $t = 0$ to be in state $i$?
Define the value of state $i$, denoted $v_i$, to be
$v_i = \mathbf{E}\,(g(x_1) \mid x_0 = i)$
(the term value makes m
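A small worked example of this definition, assuming the state evolves as a Markov chain with transition matrix P (the matrix and reward values below are made up for illustration). In that case the conditional expectation reduces to the sum v_i = sum over j of P_ij g(j), which is the matrix-vector product v = P g.

```python
def one_step_value(P, g):
    """Return v with v[i] = sum_j P[i][j] * g[j], i.e. E(g(x_1) | x_0 = i)."""
    return [sum(p_ij * g_j for p_ij, g_j in zip(row, g)) for row in P]

if __name__ == "__main__":
    # illustrative two-state chain: rows are current states, columns next states
    P = [[0.5, 0.5],
         [0.2, 0.8]]
    g = [10.0, 0.0]  # reward received in each state at t = 1
    print(one_step_value(P, g))
```

In this example, state 0 is worth more than state 1 because it is more likely to lead to the rewarding state at t = 1.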

EE365: Markov Decision Processes
Markov decision processes
Markov decision problem
Examples
Markov decision processes
- add input (or action or control) to Markov chain with costs
- input selects from a set of possible transi

EE266, Spring 2014-15
Professor S. Lall
EE266 Homework 1 Solutions
1. Optimal disposition of a stock. You must sell a total amount B > 0 of a stock in two
rounds. In each round you can sell any nonnegative amount of the stock; by the second
round all of t

EE266, Spring 2014-15
Professor S. Lall
EE266 Homework 4 Solutions
1. The Bellman-Ford algorithm. Consider a directed, weighted graph with vertex set
$\{1, \ldots, n\}$. We can represent such a graph by a matrix $W \in \mathbf{R}^{n \times n}$, where $W_{ij}$ is the
weight of the edge (
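A minimal sketch of Bellman-Ford on the matrix representation described above, with two assumptions that the excerpt does not confirm: missing edges are encoded as `math.inf`, and the graph has no negative-cost cycles. Vertices are 0-indexed here, whereas the problem numbers them from 1 to n. The function name is illustrative.

```python
import math

def bellman_ford_to_destination(W, dest):
    """Lowest cost from every vertex to `dest`.

    W[i][j] is the weight of edge (i, j), or math.inf if there is no edge.
    """
    n = len(W)
    v = [math.inf] * n
    v[dest] = 0.0
    # at most n - 1 passes are needed when there are no negative cycles
    for _ in range(n - 1):
        updated = False
        for i in range(n):
            for j in range(n):
                candidate = W[i][j] + v[j]  # go to j, then follow best path from j
                if candidate < v[i]:
                    v[i] = candidate
                    updated = True
        if not updated:  # costs have converged early
            break
    return v

if __name__ == "__main__":
    inf = math.inf
    W = [[inf, 1, 4, inf],
         [inf, inf, 2, 6],
         [inf, inf, inf, 3],
         [inf, inf, inf, inf]]
    print(bellman_ford_to_destination(W, dest=3))
```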

EE266, Spring 2014-15
Professor S. Lall
EE266 Homework 5 Solutions
1. A refined inventory model. In this problem we consider an inventory model that is
more refined than the one you've seen in the lectures. The amount of inventory at time
t is denoted by qt cf

EE266, Spring 2014-15
Professor S. Lall
EE266 Homework 3 Solutions
1. Second passage time. In this problem we will consider the following Markov chain.
Note that self-loops are omitted from this figure.
is
0.4 0.3 0 0.3
0 0.4 0 0.3
0.3 0 0.1 0
P =
0.3 0
0

EE266 and MS&E251: Introduction
About the course
Optimization
Dynamical systems
Stochastic control
About the course
- EE266 is the same as MS&E251
- Formerly called EE365
- created by Stephen Boyd, Sanjay Lall, and Ben Van Roy in 2012

EE365: Hitting Times
Example: Inventory re-ordering
if we start in state C, how long before we re-order?
$\tau_E(x_0, x_1, \ldots) = \min\{t > 0 \mid x_t \in E\}$
- $\tau_E$ is a random variable, called the first passage time or hitting time to set $E$
- $\tau_E$ is the earliest time
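As a concrete reading of this definition, the sketch below computes the hitting time of a finite, already-simulated trajectory. The function name and the example trajectories are illustrative. Returning `None` when the set is never reached within the given trajectory is a choice made here, since a finite sample cannot tell whether the set is reached later.

```python
def hitting_time(trajectory, target_set):
    """Smallest t > 0 with trajectory[t] in target_set, or None if there is none."""
    for t, x in enumerate(trajectory):
        if t > 0 and x in target_set:  # t = 0 is excluded by the definition
            return t
    return None

if __name__ == "__main__":
    print(hitting_time([2, 3, 1, 0, 1], {0, 1}))  # first entry into the set after t = 0
    print(hitting_time([0, 2, 2], {0}))           # starting inside the set does not count
```

Because t = 0 is excluded, a trajectory that starts inside the set still has to return to it, which is why the same definition also covers first passage (return) times.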

EE365: Example: Dynamic Pricing
Dynamic pricing
$x_{t+1} = \begin{cases} x_t - 1 & \text{if } w_t \ge u_t \text{ and } x_t > 0 \\ x_t & \text{otherwise} \end{cases}$
- $x_t \in X = \{0, 1, \ldots, n\}$ is stock at time $t$
- assume one customer arrives per period (time periods are very short)
- $w_t \in \{0, 1, 2\}$ is the reservation pri
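The stock update can be written as a one-line transition function. This sketch assumes that u_t is the posted price and that a sale happens exactly when the customer's reservation price w_t is at least u_t and stock remains. The function name is illustrative.

```python
def next_stock(x, w, u):
    """One step of the dynamics: sell one unit if w >= u and stock is positive."""
    if w >= u and x > 0:
        return x - 1  # customer accepts the price and one unit is sold
    return x          # no sale: price too high or nothing left to sell

if __name__ == "__main__":
    print(next_stock(3, 2, 1))  # sale occurs
    print(next_stock(3, 0, 1))  # reservation price below posted price
    print(next_stock(0, 2, 1))  # out of stock
```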

EE365: Shortest Path Example
Stochastic shortest path example
[Figure: chain of nodes 1, 2, ..., n-1, n]
- chain of $n = 100$ nodes
- move from node 10 to node 90 in $T = 100$ steps
- can move forward one node, move backward one node, or stay put
- at each time step, lightning strikes with

EE365: Structure of Markov Chains
Distribution propagation
[Figure: Markov chain transition graph with labeled transition probabilities]
- distribution propagation $\pi_{t+1} = \pi_t P$
- to find distribution of final states, compute $\pi_{ss} = \lim_{t \to \infty} \pi_t$
- called the steady-state distri
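Distribution propagation is a repeated row-vector times matrix product, so the limit can be approximated by iterating until the distribution stops changing. The sketch below uses a made-up two-state chain. The function names, tolerance, and step cap are assumptions for illustration. In general the limit can depend on the initial distribution, or fail to exist for periodic chains, so the stopping test is a heuristic.

```python
def propagate(pi, P):
    """One step of distribution propagation: returns the row vector pi P."""
    n = len(P)
    return [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

def approximate_steady_state(pi0, P, tol=1e-12, max_steps=100_000):
    """Iterate pi <- pi P until successive distributions differ by less than tol."""
    pi = list(pi0)
    for _ in range(max_steps):
        nxt = propagate(pi, P)
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi  # may not have converged within max_steps

if __name__ == "__main__":
    P = [[0.9, 0.1],
         [0.5, 0.5]]  # illustrative transition matrix, rows sum to 1
    print(approximate_steady_state([1.0, 0.0], P))
```

For this example the iterates approach (5/6, 1/6), which satisfies pi = pi P.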

EE365: Epidemic Example
Monte Carlo simulation
to approximate
$e = \mathbf{E}\, f(x_0, \ldots, x_T) = \sum_{s_0, \ldots, s_T \in X} f(s_0, \ldots, s_T)\, d_{s_0} P_{s_0 s_1} \cdots P_{s_{T-1} s_T}$
(a sum with $n^{T+1}$ terms)
- simulate $N$ trajectories $x_t^{(i)}$, and let
$\hat e = \frac{1}{N} \sum_{i=1}^{N} f(x_0^{(i)}, \ldots, x_T^{(i)})$
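The Monte Carlo estimate can be sketched as follows, assuming d is the initial distribution and P the transition matrix of the chain, with states numbered 0 to n-1. The function names, the example chain, and the test function f are illustrative and not taken from the epidemic example.

```python
import random

def sample_trajectory(d, P, T, rng):
    """Draw x_0 from d, then x_{t+1} from row P[x_t], for t = 0, ..., T-1."""
    states = range(len(d))
    x = rng.choices(states, weights=d)[0]
    trajectory = [x]
    for _ in range(T):
        x = rng.choices(states, weights=P[x])[0]
        trajectory.append(x)
    return trajectory

def monte_carlo_estimate(f, d, P, T, N, seed=0):
    """Average f over N independently simulated trajectories."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(N):
        total += f(sample_trajectory(d, P, T, rng))
    return total / N

if __name__ == "__main__":
    d = [1.0, 0.0]                 # start in state 0
    P = [[0.5, 0.5],
         [0.0, 1.0]]               # state 1 is absorbing
    ends_in_1 = lambda traj: 1.0 if traj[-1] == 1 else 0.0
    print(monte_carlo_estimate(ends_in_1, d, P, T=1, N=20_000))
```

The cost grows linearly in N and T, in contrast to the exact sum, whose number of terms grows exponentially in T.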

EE266, Spring 2015-16
Professor S. Lall
EE266 Homework 4
1. The Bellman-Ford algorithm. Consider a directed, weighted graph with vertex set
$\{1, \ldots, n\}$. We can represent such a graph by a matrix $W \in \mathbf{R}^{n \times n}$, where $W_{ij}$ is the
weight of the edge (i, j) if