Reinforcement Learning: Tutorial 5 (week from 3. 3. 2014)
1. How can particle filters be used in the context of robot localization?
2. The "art" of importance sampling: we are sampling P(x), which may not cover the
interesting aspects of the game. It is
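The point about importance sampling can be made concrete with a small sketch. The distributions and parameters below are illustrative, not from the tutorial: to estimate an expectation under P = N(0, 1) of a rare-event indicator, sampling P directly is wasteful, so we sample a shifted proposal Q = N(3, 1) that does cover the interesting region and correct with the weight P/Q.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def importance_estimate(f, n=100_000, seed=0):
    """Estimate E_p[f(x)] for p = N(0, 1) by sampling from a shifted
    proposal q = N(3, 1) that covers the interesting region, and
    correcting each sample with the importance weight p(x)/q(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(3.0, 1.0)                                # sample from q
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 3.0, 1.0)  # importance weight
        total += w * f(x)
    return total / n

# P(X > 3) under N(0, 1): a rare event, true value about 0.00135,
# so naive sampling of p would see only ~135 hits in 100000 draws.
est = importance_estimate(lambda x: 1.0 if x > 3.0 else 0.0)
```

With the proposal centred on the rare region, about half the samples contribute, so the weighted estimate has far lower variance than naive Monte Carlo.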
Reinforcement Learning: Tutorial 5 (week from 3. 3. 2014)
1. How can particle filters be used in the context of robot localization?
Particle filters sample a probability distribution. The dynamics of the particles can be
used to represent the change of th
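The hint above can be sketched in code. This is a hypothetical 1D example (landmark position, noise levels, and function names are my own, not from the tutorial): each step moves the particles with the control, weights them by the likelihood of the range measurement, and resamples, so the particle cloud tracks the robot's position.

```python
import math
import random

def gaussian(x, mu, sigma):
    # unnormalised Gaussian likelihood; normalisation cancels in resampling
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))

def particle_filter_step(particles, control, measurement, landmark, rng,
                         motion_noise=0.1, sense_noise=0.5):
    # 1. Predict: move every particle by the control plus motion noise;
    #    the particle dynamics represent the change of the belief.
    moved = [p + control + rng.gauss(0, motion_noise) for p in particles]
    # 2. Weight: likelihood of the observed range to the landmark.
    weights = [gaussian(landmark - p, measurement, sense_noise) for p in moved]
    # 3. Resample particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))

rng = random.Random(1)
landmark, true_pos = 10.0, 2.0
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
for _ in range(10):
    true_pos += 0.5                       # robot drives right
    meas = landmark - true_pos            # noiseless range, for simplicity
    particles = particle_filter_step(particles, 0.5, meas, landmark, rng)
estimate = sum(particles) / len(particles)  # posterior mean tracks true_pos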
Reinforcement Learning 2013/2014: Tutorial 7
1. RL aims at the related tasks of optimising the value function, the policy and the behaviour based
on the reward signal. The reward is used only locally (in simple RL algorithms), such that models are
u
Reinforcement Learning 2014: Tutorial 6 (week from 10. 3. 2014)
1. Recall the three types of errors in RL (hint: value, policy, exploration). How are they
represented in the definition of the global utility measure (or global reward average)?
Is it possib
Reinforcement Learning 2014: Tutorial 6 (week from 10. 3. 2014)
These are just a few hints, please do not distribute.
1. Errors in value estimation, errors in policy estimation and "region errors", i.e. has
the agent arrived (sufficiently often/at all) in
Reinforcement Learning: Tutorial 4
(week from 24. 2. 2014)
1. Is the reinforcement learning framework adequate to usefully represent all goal-directed
learning tasks? Can you think of any clear exceptions?
[this and other problems on this sheet are from S
Reinforcement Learning 2013/2014
Tutorial 2 (week 4)
1. Discuss your solution of the 1D walker problem (see homework in lecture RL3,
21/1/2014). How do
initialisation
alternative reward definitions
exploration variants
parameters and parameter decay schedules
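The discussion points above, initialisation in particular, can be probed with a minimal tabular Q-learning sketch. The environment here is a hypothetical 1D chain, not necessarily the walker from lecture RL3; the optimistic initialisation of Q (to 1.0) is one of the initialisation choices worth discussing, since it drives systematic early exploration even with a small ε.

```python
import random

def q_learn_chain(n_states=10, episodes=300, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1D chain: start in state 0, action 0
    steps left, action 1 steps right, reward 1 only on reaching the
    rightmost state.  Q is initialised optimistically (to 1.0)."""
    rng = random.Random(seed)
    q = [[1.0, 1.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:
            if rng.random() < epsilon:
                a = rng.randrange(2)                  # explore
            else:
                a = 0 if q[s][0] > q[s][1] else 1     # greedy, ties -> right
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            target = r if s2 == n_states - 1 else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q

q = q_learn_chain()
greedy_policy = [0 if q[s][0] > q[s][1] else 1 for s in range(9)]  # 1 = step right
```

Varying the initial Q values, the reward definition, ε, α, or a decay schedule for ε in this sketch reproduces the trade-offs the question asks about.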
Reinforcement Learning: Tutorial 4
Problems and hints for solutions for the week from 24. 2. 2014
1. Is the reinforcement learning framework adequate to usefully represent all goal-directed learning tasks? Can you think of any clear exceptions?
This proble
Reinforcement Learning 2013
Tutorial 3: Hints and solutions
1. Consider the gambler's problem (example 4.3 in S+B). Why does the optimal policy for the gambler's
problem have such a curious form? In particular, for capital of 50 it bets it all on one flip, bu
Reinforcement Learning 2013
Tutorial 2: Hints and solutions
1. Discuss your solution of the 1D walker problem (see homework in lecture RL3, 21/1/2014). How do
initialisation
alternative reward definitions
exploration variants
parameters and parameter decay schedules
Reinforcement Learning 2013/2014
Tutorial 3
1. Consider the gambler's problem (example 4.3 in S+B). Why does the optimal policy for the gambler's
problem have such a curious form? In particular, for capital of 50 it bets it all on one flip, but for
capital of
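For the "curious form" question it helps to compute the optimal values yourself. Below is a sketch of value iteration for example 4.3 (the helper name is my own; p_h = 0.4, goal = 100, reward 1 only on reaching the goal, ties in the greedy stake broken toward the smallest bet):

```python
def gamblers_vi(p_h=0.4, goal=100, theta=1e-9):
    """Value iteration for the gambler's problem (S+B example 4.3).
    State s = capital in 1..goal-1; stake a in 1..min(s, goal-s)."""
    v = [0.0] * (goal + 1)
    v[goal] = 1.0   # absorbing win state carries the unit reward
    while True:
        delta = 0.0
        for s in range(1, goal):
            best = max(p_h * v[s + a] + (1 - p_h) * v[s - a]
                       for a in range(1, min(s, goal - s) + 1))
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < theta:
            break
    # Greedy stakes; ties broken toward the smallest stake.
    policy = [0] + [max(range(1, min(s, goal - s) + 1),
                        key=lambda a, s=s: p_h * v[s + a] + (1 - p_h) * v[s - a])
                    for s in range(1, goal)]
    return v, policy

v, policy = gamblers_vi()
# v[50] equals p_h = 0.4: at capital 50 the unique optimum is all-in,
# since with p_h < 0.5 splitting the bet only gives the odds more chances to bite.
```

Printing `policy` reproduces the spiky optimal-stake plot from the book and makes the argmax tie-breaking issue behind its shape visible.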
Reinforcement Learning
Tutorial 1 (week 3, 27/1/14 and 31/1/14)
Questions
1. Consider the comparison between ε-greedy methods shown in Figure 2.1 in
the Sutton and Barto book. Which method will perform best in the long run
in terms of cumulative rewards and
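The comparison in Figure 2.1 can be reproduced in miniature. A sketch under my own choice of testbed parameters and names: sample-average ε-greedy action values on a 10-armed Gaussian bandit; averaging total reward over several random testbeds shows ε = 0.1 overtaking pure greedy in the long run.

```python
import random

def run_bandit(epsilon, steps=2000, n_arms=10, seed=0):
    """Sample-average ε-greedy on one 10-armed Gaussian testbed;
    returns the total reward collected over `steps` pulls."""
    rng = random.Random(seed)
    true_means = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    q = [0.0] * n_arms            # action-value estimates
    n = [0] * n_arms              # pull counts
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(n_arms)                   # explore
        else:
            a = max(range(n_arms), key=lambda i: q[i])  # exploit
        r = rng.gauss(true_means[a], 1.0)
        n[a] += 1
        q[a] += (r - q[a]) / n[a]   # incremental sample-average update
        total += r
    return total

# Average over several random testbeds: ε = 0.1 wins in the long run,
# because pure greedy tends to lock onto a suboptimal arm early.
greedy_avg = sum(run_bandit(0.0, seed=s) for s in range(30)) / 30
eps_avg = sum(run_bandit(0.1, seed=s) for s in range(30)) / 30
```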
Reinforcement Learning
Tutorial 1 (week 4, 4/2/13 and 7/2/13)
Questions and Hints for Solutions
1. Consider the comparison between ε-greedy methods shown in Figure 2.1 in the Sutton
and Barto book. Which method will perform best in the long run in terms of
Reinforcement Learning: Coursework Assignment 2 (Semester 2, 2014)
Instructions
This homework assignment is to be done individually, without help from your classmates or
others. Plagiarism will be dealt with strictly as per University policy.
Solve all pr
Reinforcement Learning: Coursework Assignment 1 (Semester 2, 2013)
Instructions
This homework assignment is to be done individually, without help from your classmates or
others. Plagiarism will be dealt with strictly as per University policy.
Solve all pr
Lecture 8: Dimensionality Reduction
Contents:
Subset Selection & Shrinkage
Ridge regression, Lasso
PCA, PCR, PLS
Comparison of Methods
Lecture 8: RLSC - Prof. Sethu Vijayakumar
Data From Human Movement
Measure arm movement and full-body movement of hum
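As a concrete instance of the shrinkage methods listed in the lecture contents, here is a one-coefficient ridge estimate (the data values are made up for illustration): the penalty λ is added to the curvature term, shrinking the least-squares coefficient toward zero.

```python
def ridge_1d(xs, ys, lam):
    """Closed-form ridge estimate for a single coefficient in y ≈ w*x:
    w = Σ xy / (Σ x² + λ).  λ = 0 gives ordinary least squares;
    increasing λ shrinks w toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.0]        # roughly y = 2x, made-up data
w_ols = ridge_1d(xs, ys, 0.0)    # ordinary least squares, ≈ 2.017
w_ridge = ridge_1d(xs, ys, 10.0) # shrunk toward zero, = 1.5125
```

The Lasso replaces the quadratic penalty with |w|, which can set coefficients exactly to zero (subset selection); ridge only shrinks them.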
RLSC Homework
You have been given example code (https://db.tt/HUYqjOd1) that you may use to complete
this homework. The code allows you to load a kinematic model of a Baxter robot as well as to
compute the forward kinematics and Jacobian of the end-effectors
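Without the Baxter model from the link, the forward kinematics and Jacobian the homework mentions can still be sketched for a hypothetical planar 2-link arm (link lengths and function names are my own, not from the example code):

```python
import math

def fk_2link(q, l1=1.0, l2=1.0):
    """Forward kinematics of a planar 2-link arm:
    joint angles q = (q1, q2) -> end-effector position (x, y)."""
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def jacobian_2link(q, l1=1.0, l2=1.0):
    """Analytic Jacobian d(x, y)/d(q1, q2) of fk_2link."""
    q1, q2 = q
    s1, s12 = math.sin(q1), math.sin(q1 + q2)
    c1, c12 = math.cos(q1), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [ l1 * c1 + l2 * c12,  l2 * c12]]

# At the stretched-out pose (0, 0) both links point along x:
pos = fk_2link((0.0, 0.0))   # (2.0, 0.0)
```

The Jacobian maps joint velocities to end-effector velocities, which is the quantity the homework's resolved-rate and inverse-kinematics style computations build on.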