Introduction to Engineering Concepts and Mathematics
ENGR 109

Spring 2006
Lecture 1 Notes: Optimality
1 Markov decision processes
In this class we will study discrete-time stochastic systems. We can describe the evolution
(dynamics) of these systems by the following equatio

Lecture 2 Notes: Decision Processes
1 Summary: Markov Decision Processes
Markov decision processes can be characterized by $(S, A_\cdot, g_\cdot(\cdot), P_\cdot(\cdot, \cdot))$, where
$S$ denotes a finite set of states
$A_x$ denotes a finite

Lecture 5 Notes: Discounted Functions
In this lecture, we will show that optimal policies for discounted-cost problems with a large enough discount factor
are also optimal for average-cost problems. T

Lecture 4 Notes: Average Cost
In average-cost problems, we aim to find a policy $u$ that minimizes
$$J_u(x) = \limsup_{T \to \infty} \frac{1}{T} E\left[ \sum_{t=0}^{T-1} g_u(x_t) \,\middle|\, x_0 = x \right]. \tag{1}$$
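As a rough illustration of the average-cost criterion in (1), the sketch below evaluates the finite-horizon average (1/T) E[sum of g_u(x_t) for t = 0, ..., T-1 | x_0 = x] exactly for a fixed policy, by propagating expected stage costs through the transition matrix. The function name `finite_horizon_average_cost` and the two-state chain are hypothetical and chosen only for illustration; they do not come from the notes.

```python
def finite_horizon_average_cost(P, g, T):
    """Return [(1/T) * E[sum_{t=0}^{T-1} g(x_t) | x_0 = x] for each state x].

    P[x][y] is the transition probability from x to y under a fixed policy,
    and g[x] is the stage cost in state x (both are illustrative inputs).
    """
    n = len(g)
    stage = list(g)       # stage[x] = E[g(x_t) | x_0 = x], starting with t = 0
    total = [0.0] * n     # running sum of expected stage costs
    for _ in range(T):
        total = [total[x] + stage[x] for x in range(n)]
        # Advance one step: E[g(x_{t+1}) | x_0 = x] = sum_y P[x][y] * E[g(x_t) | x_0 = y]
        stage = [sum(P[x][y] * stage[y] for y in range(n)) for x in range(n)]
    return [total[x] / T for x in range(n)]


# Hypothetical two-state chain under a fixed policy.
P = [[0.9, 0.1],
     [0.2, 0.8]]
g = [1.0, 2.0]
avg = finite_horizon_average_cost(P, g, T=10_000)
```

For this particular chain the stationary distribution is (2/3, 1/3), so both entries of `avg` approach 4/3 as T grows, regardless of the initial state.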
Since the state space is finite, it can be s

Lecture 9 Notes: Joint Distribution
In this lecture, we will consider the problem of supervised learning. The setup is as follows. We
have pairs (x, y), distributed according to a joint distribution P

Lecture 6 Notes: Multiclass Networks
In the first part of this lecture, we will discuss the application of dynamic programming to the
queueing network introduced in [1], which illustrates several issu

Lecture 10 Notes: Bellman Error
We now consider the problem of computing an appropriate parameter $r$, so that, given an
approximation architecture $J(x, r)$, $J(\cdot, r) \approx J^*(\cdot)$.
A class of iterative methods ar

Lecture 8 Notes: Contingencies
In this lecture, we want to study the convergence of
$$r_{t+1} = r_t + \gamma_t S(r_t, w_t)$$
to some $r^*$ with $E[S(r^*, w_t)] = 0$. Recall the Lyapunov function analysis in the deterministic case tha

Lecture 3 Notes: Stationary Policy
Using value iteration, starting at an arbitrary $J_0$, we generate a sequence $\{J_k\}$ by
$$J_{k+1} = T J_k, \quad \text{for every integer } k \ge 0.$$
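The value iteration recursion can be sketched in a few lines of Python. This is a minimal sketch that assumes the discounted-cost Bellman operator (T J)(x) = min over a of [g_a(x) + alpha * sum over y of P_a(x, y) J(y)], which this excerpt does not spell out. The names `bellman_operator` and `value_iteration`, the stopping tolerance, and the two-state, two-action example are hypothetical.

```python
def bellman_operator(J, costs, transitions, alpha):
    """Apply the assumed discounted-cost operator T to the cost-to-go vector J.

    costs[a][x] is the stage cost g_a(x); transitions[a][x][y] is P_a(x, y).
    """
    n = len(J)
    return [
        min(
            costs[a][x] + alpha * sum(transitions[a][x][y] * J[y] for y in range(n))
            for a in range(len(costs))
        )
        for x in range(n)
    ]


def value_iteration(costs, transitions, alpha, J0, tol=1e-10, max_iter=10_000):
    """Iterate J_{k+1} = T J_k from J0 until successive iterates differ by less than tol."""
    J = list(J0)
    for _ in range(max_iter):
        J_next = bellman_operator(J, costs, transitions, alpha)
        if max(abs(a - b) for a, b in zip(J_next, J)) < tol:
            return J_next
        J = J_next
    return J


# Hypothetical two-state, two-action example (illustrative numbers only).
costs = [[1.0, 2.0], [1.5, 0.5]]
transitions = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.6, 0.4]],
]
alpha = 0.9
J_star = value_iteration(costs, transitions, alpha, J0=[0.0, 0.0])
```

Because this operator is a contraction for alpha < 1, starting from a different initial vector should lead to the same limit, which is an easy sanity check to run on the example.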
We have shown that the sequence $J_k \to J^*$ as $k \to \infty$, and deriv

Lecture 7 Notes: RTVI Algorithms
Recall the real-time value iteration (RTVI) algorithm:
choose $x_{k+1} = f(x_k, u_k, w_k)$
choose $u_t$ in some fashion
$J_{k+1}(x) = (T J_k)(x)$, $x$
update $J_{k+1}(x_k) = (T J_k)(x_k)$,
We thu