1. Bayes rule.
Suppose that in answering a question in a multiple choice test, a student either knows the
answer with probability p, or he guesses with probability 1-p. Assume that the probability
of answering a question correctly is 1 for a student who k
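
The setup above is a direct application of Bayes' rule. The sketch below computes the probability that the student knew the answer given that the answer was correct. Because the problem statement is cut off here, the guessing model is an assumption: `m` answer choices and a guesser who is right with probability 1/m. The function name `prob_knows_given_correct` is hypothetical.

```python
from fractions import Fraction


def prob_knows_given_correct(p, m):
    """Return P(knows | correct) by Bayes' rule.

    p: prior probability that the student knows the answer (from the text).
    m: number of answer choices. ASSUMPTION: a guessing student is
       correct with probability 1/m (the source text is truncated).
    """
    p = Fraction(p)
    p_correct_given_knows = Fraction(1)     # stated in the problem
    p_correct_given_guess = Fraction(1, m)  # assumption, see docstring
    # Total probability of a correct answer.
    evidence = p * p_correct_given_knows + (1 - p) * p_correct_given_guess
    return p * p_correct_given_knows / evidence


# Example: the student knows half the material, four choices per question.
print(prob_knows_given_correct(Fraction(1, 2), 4))  # prints 4/5
```

Exact fractions are used so that the result can be compared with a hand derivation, which under the same assumption gives mp / (1 + (m - 1)p).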
CS 6375 Machine Learning
2009 Spring
Homework 1
Due: 01/28/2009 (tentative), 2:30pm
Part I: Written questions. 30 points.
1. [15 points]. (based on an exercise from Terran Lane) The following is the training data for a
binary classification task.
Attr 1
A
CS 6375 Machine Learning
Homework 5
Due: 04/11/2008, 11:59pm
1. Bias and Variance. (15 pts)
Alpaydin book, Chapter 4, problem 9.
Let us say, given the samples $X_i = \{x_i^t, r_i^t\}$, where $t$ is the index for the instances in the training
set, we define $g_i$
CS 6375 Machine Learning, Spring 2009
Homework 2. Total points: 50
Due: 02/10/2009 11:59pm
Note: The following answers may not have all the details. Let me know if you have any
questions.
1. Bayes rules. [10 pts]
Part of exercise 13.11 in R&N book.
Suppos
CS 6375 Machine Learning, Spring 2009
HW5
Gang LIU
SID:11458407
Apr. 10, 2009
Email: gxl083000@utdallas.edu
Solution(*)
$\bar{g}(x) = E[g_i(x)] = \frac{1}{M} \sum_{i=1}^{M} g_i(x)$,
$\mathrm{Bias}^2(g) = \frac{1}{N} \sum_{t} [\bar{g}(x^t) - f(x^t)]^2$
$\mathrm{Variance}(g) = \frac{1}{NM} \sum_{t} \sum_{i} [g_i(x^t) - \bar{g}(x^t)]^2$
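
As a rough illustration of these quantities, the sketch below estimates squared bias and variance from M fitted models evaluated at N points, using the average prediction as the estimate of the expected fit. The toy target function and the two constant models are made up for the example and are not taken from the homework.

```python
def bias_variance(models, xs, f):
    """Estimate squared bias and variance of a family of fitted models.

    models: list of M callables g_i, one per training sample.
    xs:     list of N evaluation points x^t.
    f:      the true function (known only in a simulation).
    """
    M, N = len(models), len(xs)
    # Average prediction over the M models at each evaluation point.
    g_bar = [sum(g(x) for g in models) / M for x in xs]
    # Squared bias: average squared gap between the mean fit and f.
    bias_sq = sum((gb - f(x)) ** 2 for gb, x in zip(g_bar, xs)) / N
    # Variance: average squared spread of each g_i around the mean fit.
    variance = sum(
        (g(x) - gb) ** 2 for gb, x in zip(g_bar, xs) for g in models
    ) / (N * M)
    return bias_sq, variance


# Toy setup (illustrative only): f(x) = 2x and two constant predictors.
true_f = lambda x: 2.0 * x
constant_models = [lambda x, c=c: c for c in (1.0, 3.0)]
bias_sq, variance = bias_variance(constant_models, [0.0, 1.0, 2.0], true_f)
print(bias_sq, variance)
```

Constant predictors ignore x, so most of their error shows up as bias, while the spread between the two constants shows up as variance.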
CS 6375 Machine Learning, Spring 2009
Homework 2. Total points: 50
Due: 02/10/2009 11:59pm
1. Bayes rules. [10 pts]
Part of exercise 13.11 in R&N book.
Suppose you are given a bag containing n unbiased coins. You are told that n-1 of these coins
are norma
Machine Learning
Homework 3 Solutions
Problem 1
sketch of decision boundaries using K-NN classifiers.
N=1
N=3
Problem 2
The key point of this problem is to understand that training samples are randomly generated, and
that training set is going to determin
CS 6375 Homework 1
Chenxi Zeng, UTD ID: 11124236
Part I:
1. Let S0 be the original data table given in the question, then Entropy( S0 ) = 1.
Let I i (i=1,2,3,4) be the information gain if we split by Attr i, i.e., I i =IG( S0 |Attr i).
3
Stage 1: We have
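
The entropy and information-gain quantities used in this solution can be computed as in the sketch below. The four-row table is a made-up example chosen so that the root entropy is 1, as in the solution. It is not the homework's data table, and the attribute names `a` and `b` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Illustrative table: attribute "a" separates the classes, "b" does not.
rows = [
    {"a": "x", "b": "u"},
    {"a": "x", "b": "v"},
    {"a": "y", "b": "u"},
    {"a": "y", "b": "v"},
]
labels = [1, 1, 0, 0]
print(entropy(labels))                     # balanced classes
print(information_gain(rows, labels, "a"))
print(information_gain(rows, labels, "b"))
```

A decision-tree learner would evaluate this gain for every candidate attribute at each stage and split on the largest.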
Regularization
The problem of
overfitting
Machine Learning
Example: Linear regression (housing prices)
[Figure: three panels plotting Price against Size]
Overfitting: If we have too many features, the learned hypothesis
may fit the training set very well (
), but fail
to g
Linear Regression with
multiple variables
Multiple
features
Machine Learning
Multiple features (variables).
Size (feet^2)    Price ($1000)
2104             460
1416             232
1534             315
852              178
Andrew Ng
Multiple features (variables).
Size (feet^2)    Number of bedrooms    Number of fl
Linear Algebra
review (optional)
Matrices and
vectors
Machine Learning
Matrix: Rectangular array of numbers:
Dimension of matrix: number of rows x number of
columns
Matrix Elements (entries of matrix)
$A_{ij}$ = "$i, j$ entry" in the $i$-th row, $j$-th column.
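
A small sketch may make the dimension and indexing conventions concrete. It stores a matrix as a list of rows and uses 1-based row and column numbers, as is usual in linear algebra. The helper names `dimension` and `entry` and the numeric values are illustrative assumptions.

```python
# A 4 x 2 matrix stored as a list of rows (values are illustrative).
A = [
    [1402, 191],
    [1371, 821],
    [949, 1437],
    [147, 1448],
]


def dimension(matrix):
    """Return (number of rows, number of columns)."""
    return (len(matrix), len(matrix[0]))


def entry(matrix, i, j):
    """Return the entry in row i, column j, counting from 1."""
    return matrix[i - 1][j - 1]


print(dimension(A))    # prints (4, 2)
print(entry(A, 3, 2))  # prints 1437
```

Python lists are 0-indexed, so the helper subtracts 1 from each index. Octave, which these slides use elsewhere, indexes from 1 directly.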
Octave Tutorial: Basic operations
Octave Tutorial: Moving data around
Octave Tutorial: Computing on data
Octave Tutorial: Plotting data
Octave Tutorial: Control statements: for, while, if stat
Machine learning
system design
Machine Learning
Prioritizing what to
work on: Spam
classification example
Building a spam classifier
From: cheapsales@buystufffromme.com
To: ang@cs.stanford.edu
Subject: Buy now!
From: Alfred Ng
To: ang@c
Neural Networks:
Representation
Non-linear
hypotheses
Machine Learning
Non-linear Classification
[Figure: scatter plot with axes x1 and x2]
size, # bedrooms, # floors, age
What is this?
You see this:
But the camera sees this:
Computer Vision: Car detection
Not a car
Cars
Tes
Introduction
Welcome
Machine Learning
Machine Learning
Grew out of work in AI
New capability for computers
Examples:
Database mining
Large datasets from growth of automation/web.
E.g., Web click data, medical records, bi
Dimensionality
Reduction
Motivation I:
Data Compression
Machine Learning
Data Compression
Reduce data from 2D to 1D
[Figure: scatter plot with axes (cm) and (inches)]
Data Compression
Reduce data from 3D to 2D
Recommender Systems
Problem formulation
Machine Learning
Example: Predicting movie ratings
User rates movies using one to five stars
Movie                   Alice (1)    Bob (2)    Carol (3)    Dave (4)
Love at last
Romance forever
Cute puppies of love
Nonstop car chases
Swords vs
Anomaly
detection
Problem
motivation
Machine Learning
Anomaly detection example
Aircraft engine features:
$x_1$ = heat generated
$x_2$ = vibration intensity
Dataset:
New engine:
[Figure: scatter plot of $x_2$ (vibration) against $x_1$ (heat)]
Density estimation
Dataset:
Is $x_{test}$ anomalous?
[Figure: the same (heat) and (vibration) axes]
Linear regression
with one variable
Model
representation
Machine Learning
Housing Prices (Portland, OR)
[Figure: Price (in 1000s of dollars), 0 to 500, plotted against Size, 0 to 3000]
Advice for
applying machine
learning
Deciding what
to try next
Machine Learning
Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict
housing prices.
However, when you test your hypothesis on a new set of ho
A CHECKERS LEARNING PROBLEM
Machine Learning By Tom Mitchell
PROBLEM
Task T: playing checkers
Performance measure P: percent of games won in the world
tournament
Training experience E: games played against itself
APPROACH
1. The exact type of knowledge
Assignment 2
CS 6375: Machine Learning
Naive Bayes and Logistic Regression for Text Classification
In this homework you will implement and evaluate Naive Bayes and Logistic Regression for text
classification. It is acceptable to look at WEKA's Java code.
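
For orientation, a minimal sketch of one of the two classifiers named above follows. It assumes a multinomial Naive Bayes model with add-one (Laplace) smoothing, scored in log space. The assignment's exact requirements are not shown here, so the model choice, the function names `train_naive_bayes` and `predict`, and the toy documents are all assumptions.

```python
from collections import Counter, defaultdict
from math import log


def train_naive_bayes(docs, labels):
    """Fit multinomial Naive Bayes with add-one smoothing.

    docs:   list of token lists.
    labels: parallel list of class names.
    """
    vocab = {tok for doc in docs for tok in doc}
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(doc)
    log_prior = {c: log(n / len(labels)) for c, n in class_counts.items()}
    log_likelihood = {}
    for c in class_counts:
        total = sum(token_counts[c].values())
        # Add-one smoothing keeps unseen (class, token) pairs non-zero.
        log_likelihood[c] = {
            tok: log((token_counts[c][tok] + 1) / (total + len(vocab)))
            for tok in vocab
        }
    return vocab, log_prior, log_likelihood


def predict(model, doc):
    """Return the class with the highest log posterior score."""
    vocab, log_prior, log_likelihood = model
    scores = {
        c: log_prior[c]
        + sum(log_likelihood[c][tok] for tok in doc if tok in vocab)
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy corpus (illustrative only).
docs = [
    ["buy", "now", "cheap"],
    ["cheap", "sale", "now"],
    ["meeting", "at", "noon"],
    ["lunch", "at", "noon"],
]
labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(docs, labels)
print(predict(model, ["cheap", "sale"]))     # prints spam
print(predict(model, ["lunch", "meeting"]))  # prints ham
```

Tokens outside the training vocabulary are skipped at prediction time. Summing log probabilities instead of multiplying raw probabilities avoids numerical underflow on long documents.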
Midterm Review
CS 6375: Machine Learning
The University of Texas at Dallas
Machine Learning
Supervised Learning
Unsupervised Learning
Reinforcement Learning
Parametric
Non-parametric
Y Continuous
Y Discrete
Gaussians
Learned in closed form
Linear Function
Decision Trees
The University of Texas at Dallas
Choosing the best Attribute?
Fundamental principle underlying tree
creation
Simplicity
Occam's Razor: the simplest model that explains
the data should be preferred
Each node divides the data into subsets
Ma