CS221 Problem Set #3 Programming Part
CS 221, Autumn 2009
Problem Set #3 Programming Assignment
Due by 9:30am on Tuesday, November 10. Please see the course information page
on the class website for late homework submission instructions. SCPD students can
also fax their solutions to (650) 725-1449. We will not accept solutions by email or
courier.
Programming part (30 points)
Overview
In this programming assignment, you’ll use the value iteration algorithm to find a policy for
driving a car on a loose-surface road. The car will begin facing down the road in one direction,
traveling at a fixed speed. Your policy will need to learn to spin the car around and then drive
off in the opposite direction as quickly as possible.
You are provided with a simple simulator of the car that, given a (real-valued) state vector,
will simulate forward in time using a specified (real-valued) action vector. Since the simulator
operates on continuous-valued states and actions, we’ve discretized the state and action spaces
for you. In these discretized spaces, you’ll use the simulator to collect data and build up a
probabilistic model of the car’s dynamics. Once you have such a model, you’ll compute the
value function (using value iteration) for the discrete MDP, and finally compute the optimal
policy from the value function. This policy can be used by the provided graphical display
function, so that you can see how the car performs using your policy.
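The pipeline described above (sample the simulator, estimate transition probabilities, run value iteration, read off a greedy policy) can be sketched on a toy problem. This is only an illustration in Python, not the assignment's MATLAB code: the 5-state chain, the `toy_simulate` function, the reward vector, the discount factor, and all function names are hypothetical, and the reward convention V(s) = R(s) + γ max_a Σ P(s'|s,a) V(s') is an assumption.

```python
import random

def learn_transition_model(simulate, n_states, n_actions, samples_per_pair=200, seed=0):
    """Estimate P[s][a][s'] by counting simulator outcomes for each (s, a) pair."""
    rng = random.Random(seed)
    P = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(samples_per_pair):
                P[s][a][simulate(s, a, rng)] += 1.0
            # Normalize the counts into an empirical distribution over next states.
            P[s][a] = [c / samples_per_pair for c in P[s][a]]
    return P

def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iters=10000):
    """Iterate V(s) <- R(s) + gamma * max_a sum_s' P(s'|s,a) V(s') until the update is small."""
    n_states, n_actions = len(P), len(P[0])
    V = [0.0] * n_states
    for _ in range(max_iters):
        V_new = [
            R[s] + gamma * max(
                sum(p * v for p, v in zip(P[s][a], V)) for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    return V

def greedy_policy(P, V):
    """For each state, pick the action with the highest expected next-state value."""
    return [
        max(range(len(P[s])), key=lambda a: sum(p * v for p, v in zip(P[s][a], V)))
        for s in range(len(P))
    ]

# Hypothetical stand-in for the car simulator: a 5-state chain where action 1
# moves right and action 0 moves left; the move succeeds with probability 0.8,
# otherwise the state is unchanged. Moves are clamped to the range 0..4.
def toy_simulate(s, a, rng):
    step = 1 if a == 1 else -1
    if rng.random() < 0.8:
        return min(max(s + step, 0), 4)
    return s

P = learn_transition_model(toy_simulate, n_states=5, n_actions=2)
R = [0.0, 0.0, 0.0, 0.0, 1.0]  # reward only in the rightmost state
V = value_iteration(P, R)
policy = greedy_policy(P, V)
print(policy)
```

On this toy chain the greedy policy should choose "right" in every state, since only the rightmost state is rewarded. In the actual assignment the states and actions would be the provided discretized indices rather than a chain.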
Please keep all of your code confined to learnTransitionModel.m and solveMDP.m. The
code necessary to complete this assignment is not very long, and you shouldn’t need
to modify any other files.
The Continuous State Model
The road on which the car is driving defines the x-axis of the coordinate system (which you can
think of as east). The y-axis points to the left side of the road (north) if the car is facing in
the positive x direction. Angles are measured counterclockwise from the x-axis, so if the car’s
heading is 0, then it is facing directly down the road, and if its heading is π, then it is facing
down the road in the opposite direction.
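As a quick check of this angle convention, the heading can be turned into a unit direction vector. The sketch below is illustrative Python (the assignment itself uses MATLAB), and the helper name `heading_vector` is hypothetical.

```python
import math

def heading_vector(theta):
    """Unit vector (x, y) the car faces for heading theta, in radians,
    measured counterclockwise from the x-axis (the road direction)."""
    return (math.cos(theta), math.sin(theta))

down_road = heading_vector(0.0)          # faces +x, directly down the road
toward_left = heading_vector(math.pi / 2)  # faces +y, the left side of the road
reversed_dir = heading_vector(math.pi)   # faces -x, back the way the car came
```

Under this convention, spinning the car around corresponds to moving the heading from 0 to π.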
The state of the car, s, can be described by an array of 4 real numbers: s = [y, v_x, v_y, θ].
y is the y position of the car relative to the road. v_x and v_y are the car’s velocity along the
x and y axes, and θ is the car’s heading.
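To make the layout of the state array concrete, here is a minimal Python sketch (the assignment's own code is MATLAB). The `CarState` type, the `speed` helper, and the sample numbers are hypothetical and are not part of the provided simulator.

```python
import math
from typing import NamedTuple

class CarState(NamedTuple):
    """State s = [y, v_x, v_y, theta], in the order given above."""
    y: float      # position across the road
    vx: float     # velocity along the x-axis (along the road)
    vy: float     # velocity along the y-axis (across the road)
    theta: float  # heading, radians counterclockwise from the x-axis

def speed(s: CarState) -> float:
    """Magnitude of the velocity vector (v_x, v_y)."""
    return math.hypot(s.vx, s.vy)

# Made-up example values: centered on the road, heading down the road.
s0 = CarState(y=0.0, vx=3.0, vy=4.0, theta=0.0)
y, vx, vy, theta = s0  # unpacks like a plain 4-element array
```

Note that the velocity components are expressed along the road axes, not relative to the car, so the direction of motion and the heading θ can differ.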
In each state, one may choose an action for the car to execute. An action is an array of 2 real
numbers: a = [α, ω]. α is the steering angle (i.e., the angle of the tires relative to the car’s
centerline; positive angles cause a left turn, negative angles cause a right turn), and ω is the
“velocity” of the car’s wheels. As an example, if a = [0, 10], then the car will drive straight, with
the wheels spinning at a rate such that they propel the car at 10m/s (i.e., they spin at a rate
of ω = 10/r radians/sec, where r is the radius of the wheel). The car has an infinite amount
of torque, and will immediately drive the wheels at whatever velocity is commanded. Choosing
ω ≤ 0 will result in applying the car’s brakes; you cannot drive backward.
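The action semantics above can be summarized in a short Python sketch (illustrative only; the assignment uses MATLAB). The `Action` type, the `interpret_action` helper, and the wheel radius of 0.5 m are hypothetical. The spin rate is left as `None` when braking, since the text does not say what the wheels do in that case.

```python
from typing import NamedTuple

class Action(NamedTuple):
    """Action a = [alpha, omega], in the order given above."""
    alpha: float  # steering angle; positive steers left, negative steers right
    omega: float  # commanded wheel "velocity" in m/s; values <= 0 apply the brakes

def interpret_action(a: Action, wheel_radius: float):
    """Return (turn direction, braking flag, wheel spin rate in rad/s or None)."""
    if a.alpha > 0:
        turn = "left"
    elif a.alpha < 0:
        turn = "right"
    else:
        turn = "straight"
    braking = a.omega <= 0
    # A wheel of radius r propelling the car at omega m/s spins at omega / r rad/s.
    spin_rate = None if braking else a.omega / wheel_radius
    return turn, braking, spin_rate

# The example from the text, a = [0, 10], with an assumed 0.5 m wheel radius.
result = interpret_action(Action(alpha=0.0, omega=10.0), wheel_radius=0.5)
```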
 Fall '09
 KOLLER,NG
