Multi-arm Bandits
Sutton and Barto, Chapter 2
The simplest reinforcement learning problem
Learning: Evaluative vs. Instructive
Evaluative: Evaluates actions
Depends entirely on actions taken
Key feature of RL
Instructive: Instructs by giving correct a
Chapter 3: Markov Decision Process (MDP)
The Reinforcement Learning Problem
Objectives of this chapter:
present Markov decision processes, an idealized form of
the AI problem for which we have precise theoretical
results
introduce key components of the ma
Chapter 4: Dynamic Programming
Objectives of this chapter:
Overview of a collection of classical solution methods
for MDPs known as dynamic programming (DP)
Show how DP can be used to compute value functions,
and hence, optimal policies
Discuss efficie
Reduction in
action
Practical DP & NPC Problems
CSCI 3160 Tutorial 11
Nov 16th, 2016
Knapsack problem
Unbounded knapsack problem:
The supply of each item is unbounded.
Item   Weight (w)   Value (v)
1      6            30
2      3            14
3      4            16
4      2            9
Optimal solution: 1x Item 1 + 2x I
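The unbounded knapsack above can be solved with a standard bottom-up dynamic program. The following is a minimal sketch, not the tutorial's own code: the function name `unbounded_knapsack` is hypothetical, and the capacity of 10 is an assumed value, since the excerpt does not state the capacity.

```python
def unbounded_knapsack(capacity, items):
    """Return the best total value when each item may be used any number of times.

    items is a list of (weight, value) pairs with positive integer weights.
    """
    # best[c] = maximum value achievable with total weight at most c
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for weight, value in items:
            if weight <= c:
                best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# (weight, value) pairs for the four items listed above
ITEMS = [(6, 30), (3, 14), (4, 16), (2, 9)]

if __name__ == "__main__":
    # capacity = 10 is an assumption; the excerpt does not give the capacity
    print(unbounded_knapsack(10, ITEMS))
```

The table has one entry per capacity value and each entry scans every item, so the running time is proportional to capacity times the number of items.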
Solution to Assignment 3
Chen Zitong
Ex 5.9
(a) If graph G has more than |V|-1 edges, and there
is a unique heaviest edge, then this edge cannot be
part of a minimum spanning tree
(False, e can be a bridge)
(b) If G has a cycle with a unique heaviest edge
CSCI3160 Tutorial (9th week)
Midterm
LAI Qiuxia, SHB 905, Email:
[email protected]
Office hour: Tue 10:00am-12:00pm
Outlines
Q1
Q2
Q3
Q4
Q5
Q6
Q1
(a)
True.
Suppose
=
Q1
(b)
The product of two n-digit integers can be computed in time using
Homework 2
Solution Guide / Tips
CSCI 3160 Tutorial 7
Oct 19th, 2016
Pouring Water
[Figure: three containers with capacities 10, 7 and 4. Start: 0/10, 7/7, 4/4. Goal: ?/10, 2/7, ?/4 or ?/10, ?/7 (rest of figure not shown)]
So the problem is about states and transitions.
State: volumes of the containers <Va,Vb,Vc>
Transitions: the pouring operation
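Once the problem is phrased as states and transitions, a breadth-first search over the state graph finds a shortest sequence of pourings. The sketch below is one possible way to write it, under stated assumptions: the capacities (10, 7, 4) and the start state (0, 7, 4) are read from the figure, the goal of exactly 2 units in the 7- or 4-unit container is assumed, and the names `pour` and `shortest_pouring` are hypothetical.

```python
from collections import deque

CAPACITIES = (10, 7, 4)  # container sizes as read from the figure


def pour(state, src, dst):
    """Pour from src into dst until src is empty or dst is full."""
    amount = min(state[src], CAPACITIES[dst] - state[dst])
    nxt = list(state)
    nxt[src] -= amount
    nxt[dst] += amount
    return tuple(nxt)


def shortest_pouring(start, is_goal):
    """BFS over states <Va, Vb, Vc>; return a shortest list of states, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if is_goal(state):
            # Walk the parent links back to the start state.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for src in range(3):
            for dst in range(3):
                if src == dst:
                    continue
                nxt = pour(state, src, dst)
                if nxt not in parent:  # also skips pours that change nothing
                    parent[nxt] = state
                    queue.append(nxt)
    return None


# Assumed goal: exactly 2 units in the 7-unit or the 4-unit container.
path = shortest_pouring((0, 7, 4), lambda s: s[1] == 2 or s[2] == 2)
```

Because BFS expands states in order of distance from the start, the first goal state it dequeues is reached with the fewest pourings.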
Homework 1
CHEN Zitong
2.4
A
dividing into five subproblems of half the size, recursively solving, then combining in linear time.
T (n) = 5T (n/2) + O(n)
a = 5, b = 2, d = 1
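With these values, the master theorem case can be worked out as follows. This is a sketch of the standard case analysis and may not match the wording of the original solution.

```latex
\log_b a = \log_2 5 \approx 2.32 > d = 1
\quad\Longrightarrow\quad
T(n) = O\!\left(n^{\log_b a}\right) = O\!\left(n^{\log_2 5}\right)
```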
2.4
B
recursively solving two subproblems of size n - 1 and then combining
BFS / DFS
And its applications
CSCI 3160 Tutorial 3
Sep 21st, 2016
DFS
Search the unvisited nodes as deep as possible.
Pass 1:
1 2 3 4 6
# go back to 4, all neighbors visited
# go back to 3, all neighbors visited
# go back to 2, all neighbors visited
# go back
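The "go as deep as possible, then go back" behaviour can be sketched with a short recursive DFS. The graph below is hypothetical (the slide's actual graph is not part of this excerpt); it was chosen only so that the first pass also visits 1, 2, 3, 4, 6 before backtracking.

```python
def dfs(graph, start):
    """Return the nodes in the order a recursive DFS first visits them."""
    visited = []
    seen = set()

    def visit(node):
        seen.add(node)
        visited.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                visit(neighbor)  # go as deep as possible first
        # All neighbors visited: returning here is the "go back" step.

    visit(start)
    return visited


# Hypothetical undirected graph given as adjacency lists.
GRAPH = {
    1: [2, 5],
    2: [1, 3, 5],
    3: [2, 4],
    4: [3, 6],
    5: [1, 2],
    6: [4],
}

order = dfs(GRAPH, 1)
```

On this graph the search runs 1, 2, 3, 4, 6, backs up to node 2, and only then reaches node 5.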
CSCI3320: Homework 2, Spring 2017
Teacher: John C.S. Lui
CHAP. 8] ELEMENTARY SAMPLING THEORY 209
1. Sampling distribution of means. A population consists of the five numbers 2, 3, 6, 8, and 11. Consider all possible samples of size 2 that can be drawn with
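For a population this small, the sampling distribution of the mean can be enumerated directly. The sketch below assumes the samples are ordered pairs drawn with replacement, which the truncated sentence does not confirm; the helper names `mean` and `variance` are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

POPULATION = [2, 3, 6, 8, 11]


def mean(values):
    values = list(values)
    return Fraction(sum(values), len(values))


def variance(values):
    """Population variance (divide by the number of values)."""
    values = list(values)
    mu = mean(values)
    return sum((Fraction(v) - mu) ** 2 for v in values) / len(values)


# Assumption: ordered samples of size 2 drawn WITH replacement (5 * 5 = 25 samples).
sample_means = [mean(pair) for pair in product(POPULATION, repeat=2)]

pop_mean, pop_var = mean(POPULATION), variance(POPULATION)
mean_of_means = mean(sample_means)
var_of_means = variance(sample_means)
```

Under that assumption, the mean of the sample means equals the population mean, and their variance equals the population variance divided by the sample size.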
CSCI3320: Homework 01, Spring 2017
Teacher: John C.S. Lui
1. Assume we are given the task of building a system to distinguish junk email. What is in a junk
email that lets us know that it is junk? How can the computer detect junk through a syntactic
analy
Course outline (the second half term)
Part 8 Basic issues
- The law of large numbers and statistical consistency
- Bias-Variance Tradeoff
- Five model selection related topics
- Major streams of model selection studies
- Two approximate implementations of integral
Machine Learning Theory
Lei Xu
http://www.cse.cuhk.edu.hk/~lxu/
Department of Computer Science and Engineering,
The Chinese University of Hong Kong
Dean's reminder
1) To brief students about the "Student/Faculty Expectations on Teaching and
Lea
Assignment 3 of CSCI5030 - Machine Learning Theory
Deadline: 14:30pm, 19 Nov., 2014
Assignment 3
1. Given two Gaussians $G(x|\mu_1, \Sigma_1)$, $G(x|\mu_2, \Sigma_2)$, and a set of i.i.d. samples $X = \{x_t\}_{t=1}^{N}$, with each $x_t$ coming from
either $G(x|\mu_1, \Sigma_1)$ with a prior proba
Assignment 4 of CSCI5030 - Machine Learning Theory
Deadline: 14:30pm, 10 Dec., 2014
Assignment 4
1. The Fisher information measures the amount of information that an observable random variable X carries about
an unknown parameter θ upon which the probabili
Assignment 2 of CSCI5030 - Machine Learning Theory
Deadline: 14:30pm, 22 Oct., 2014
Assignment 2
1. A learning system consists of three basic ingredients, namely, Learner, Theory and Implementation.
(1.1): Please describe the key points for each ingredien
Assignment 1 of CSCI5030 - Machine Learning Theory
Deadline: 14:30pm, 8 Oct., 2014
Assignment 1
1. Let the random variable x have continuous cumulative distribution function $F(x) = \int_{-\infty}^{x} p(x)\,dx$, and define the
random variable y as y = F(x), please prove th
Assignment 5 of CSCI5030 - Machine Learning Theory
Deadline: 14:30pm, 12 Dec., 2014
Assignment 5
1. Kullback-Leibler (KL) divergence is a non-symmetric measure of the difference between two probability distributions
p(X) and q(X). The KL divergence of q(X) fro
CSCI3160-15F CSE-CUHK-HK-CHN
Design and Analysis of Algorithms
Homework 4
Due: 5pm Dec 7, 2015
1. Exercises 8.4.
2. Use the restriction method to prove the NP-completeness of the following problems:
(a) Tree Subgraph
Instance: Graph G and tree T .
Questio
#include<stdio.h>
#include<string.h>
#include<ctype.h>
int check(char x[]);
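int main(void){
    char add[50]={0};
    printf("Enter email address: ");
    /* fgets bounds the read to the buffer size, unlike gets */
    if(fgets(add, sizeof add, stdin)==NULL)
        return 1;
    add[strcspn(add, "\n")]='\0'; /* drop the trailing newline */
    int judge=check(add);
    if(judge==0) /* compare with ==; a single = would assign */
        printf("This email address is not valid.");
    else if(judge==1)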
p