IEOR E4525
Machine Learning
Feb. 9th, 2015
Solutions to Assignment 2
1. Naive Bayes and spam filtering
(a) MATLAB code for this exercise is attached to the solution. Note that NaiveBayes
functionality
IEOR E4525
Machine Learning
Feb. 23rd, 2015
Solutions to Assignment 3
1. Regression analysis of the CEO Pay Data Set
The R code for this exercise is enclosed (see HW3.R); here we outline only the main
# Chapter 3 Lab: Linear Regression
library(MASS)
library(ISLR)
#
# Simple Linear Regression
fix(Boston) # invokes `edit` on Boston and then assigns the new edited version in the workspace
names(Boston)
?Bosto
IEOR E4525
Machine Learning
April 8th, 2015
Solutions to Assignment 6
1. PCA Via Optimization
Induction base:
First, we need to show that 1 , the first eigenvector of a covariance matrix , solves a prob
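As a numerical illustration of the induction base, here is a minimal sketch. It assumes the problem in question is the standard one of maximizing the variance w' S w over unit vectors w, since the text above is cut off before stating it. The matrix `sigma` and all function names are hypothetical. The sketch finds the leading eigenvector by power iteration and checks that no random unit direction attains a larger variance.

```python
import math
import random


def mat_vec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def normalize(v):
    """Rescale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def quad_form(a, w):
    """Return w' a w, the variance of the data projected onto w."""
    return sum(w_i * y_i for w_i, y_i in zip(w, mat_vec(a, w)))


def top_eigenvector(a, iters=500):
    """Power iteration: repeatedly apply a and renormalize."""
    v = normalize([1.0] * len(a))
    for _ in range(iters):
        v = normalize(mat_vec(a, v))
    return v


# Hypothetical covariance matrix (symmetric, positive definite).
sigma = [[4.0, 1.0, 0.5],
         [1.0, 3.0, 0.2],
         [0.5, 0.2, 1.0]]

w1 = top_eigenvector(sigma)
best = quad_form(sigma, w1)  # approximates the largest eigenvalue

# Compare against random unit directions.
rng = random.Random(0)
trials = [normalize([rng.gauss(0.0, 1.0) for _ in sigma]) for _ in range(1000)]
never_beaten = all(quad_form(sigma, w) <= best + 1e-9 for w in trials)
print(best, never_beaten)
```

Power iteration is used only to keep the sketch free of external libraries; any eigenvalue routine would serve the same purpose.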
IEOR E4525
Machine Learning
Mar. 9th, 2015
Solutions to Assignment 4
1. Scaling the inputs
True. In general it is a good idea to scale the inputs; otherwise not all components
of an input vector x con
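One common way to scale inputs is to standardize each component to zero mean and unit variance. The sketch below is an assumption about what "scaling" means here, since the solution text is truncated. The data and the function name `standardize` are hypothetical.

```python
import statistics


def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation.

    Assumes no column is constant; a constant column would give a zero divisor.
    """
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical inputs whose two components live on very different scales.
data = [[1.0, 1000.0],
        [2.0, 3000.0],
        [3.0, 2000.0]]

scaled = standardize(data)
for row in scaled:
    print(row)
```

After scaling, both components have the same spread, so neither dominates a distance or an inner product purely because of its units.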
IEOR E4525
Machine Learning
April 29th, 2015
Solutions to Assignment 8
1. The Party Animal
(a) The Bayes network is shown below:
Figure 1: Bayes network example
The key here is to expand P(H = 1, A =
# Chapter 4 Lab: Logistic Regression, LDA, QDA, and KNN
# The Stock Market Data
library(ISLR)
names(Smarket)
# Contains % returns for S&P 500 for 1250 days from 2001 to 2005
dim(Smarket)
# It also con
IEOR E4525: Machine Learning for OR and FE (Spring 2017)
Syllabus and Course Logistics
Instructors: Martin Haugh and Garud Iyengar
Email: [email protected] and [email protected]
URL: http:/www
IEOR E4525
Martin Haugh
Due: Thursday Feb. 23rd
Assignment 3
1. (Some Properties of Ridge Regression)
Do Exercise 4 in Chapter 6 of ISLR.
2. (Exercise 3.4 in Bishop: Error in the Predictor Regularizat
Machine Learning for OR & FE
Resampling Methods
Martin Haugh
Department of Industrial Engineering and Operations Research
Columbia University
Email: [email protected]
Some of the figures in thi
IEOR E4525
Martin Haugh and Garud Iyengar
Due: Tuesday 31 Jan 2017
Solutions to Assignment 1
2. (EDA with the Spam Filtering Data Set)
The csv file spam.csv contains a data set for emails that were ca
% solves the question / problem in the lectures
% probability of the classes
K = 2; % # of classes
pi = rand(1,K);
pi = pi/sum(pi);
% probability of the answers
Q = 5; % # of questions
sigma = zeros(K,Q
% testing the multinomial EM model
theta = 0.25;
% set seed
seed = 10;
rng(seed);
% samples of Z
pz = [0.5, 0.25*theta,0.25*(1-theta), 0.25*(1-theta), 0.25*theta];
n = 10;
m = length(pz);
y = mnrnd(n,
IEOR E4525
Martin Haugh & Garud Iyengar
Due: Thursday 9 February 2017
Assignment 2
You might want to work through some of the examples in Section 5.3 Lab: Cross-Validation
and the Bootstrap in ISLR be
# Chapter 3 Lab: Linear Regression
library(MASS)
library(ISLR)
#
# Simple Linear Regression
fix(Boston)
# invokes `edit` on Boston and then assigns the new edited
# version of Boston in the workspace
require(DAAG)
require(ggplot2)
require(MASS)
# This code implements reduced rank LDA (Fisher Discriminant Analysis)
# It can reproduce the subplots of Figure 4.8 in HTF by specifying coordinates a,b
#
Lab 10 - Ridge Regression and the Lasso in Python
March 9, 2016
This lab on Ridge Regression and the Lasso is a Python adaptation of p. 251-255 of Introduction to
Statistical Learning with Application
Lab 8 - Subset Selection in Python
March 2, 2016
This lab on Subset Selection is a Python adaptation of p. 244-247 of Introduction to Statistical Learning
with Applications in R by Gareth James, Danie
Lab 2 - Linear Regression in Python
February 24, 2016
This lab on Linear Regression is a Python adaptation of p. 109-119 of Introduction to Statistical Learning
with Applications in R by Gareth James,
spamdata <- read.csv("spam.csv")
names(spamdata) #variable column headings
str(unique(spamdata$spampct)) # view the unique values of spampct
sum(is.na(spamdata$spampct)) # count of NA values in spampct
s
IEOR E4525
Martin Haugh
Due: Tuesday 31 Jan 2017
Assignment 1
You do not need to submit anything for Questions 1 and 3.
You only need to do one of Questions 4 and 5.
1. (Implications of Big Data)
The
IEOR E4525
Iyengar
Due: Tuesday April 11th
Assignment 6
1. Binomial-Poisson mixture
Suppose a random variable X is distributed as follows
$$X = \begin{cases} 0 & \text{with probability } p, \\ \text{Poisson}(\lambda) & \text{with probability } 1 - p, \end{cases}$$
where
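To make the mixture concrete, the following is a minimal simulation sketch. It writes the Poisson rate as `lam`, a name chosen here for illustration, and the values of `p`, `lam` and `n` are hypothetical. It compares the sample mean and the fraction of zeros against their theoretical values, (1 - p) * lam and p + (1 - p) * exp(-lam).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_mixture(p, lam, rng):
    """Return 0 with probability p, otherwise a Poisson(lam) draw."""
    if rng.random() < p:
        return 0
    return sample_poisson(lam, rng)


rng = random.Random(0)
p, lam, n = 0.3, 2.0, 20000  # hypothetical parameter values

draws = [sample_mixture(p, lam, rng) for _ in range(n)]
sample_mean = sum(draws) / n
zero_fraction = draws.count(0) / n

theory_mean = (1 - p) * lam
theory_zero = p + (1 - p) * math.exp(-lam)

print(sample_mean, theory_mean)
print(zero_fraction, theory_zero)
```

Note that a zero can come from either component, which is why the observed fraction of zeros exceeds p.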
IEOR E4525
M. Haugh and G. Iyengar
March 24th, 2017
IEOR E4525 Midterm
Instructions :
200pts
1. There are 5 questions in all.
2. You have 2 hours and 30 minutes to do the exam.
3. The exam is closed b
IEOR E4525
M. Haugh
Due Thursday April 27th 2017
Assignment 8
1. (Barber Exercise 23.3)
Consider an HMM with three states (K = 3) and two output symbols (M = 2), with
a left-to-right state transition