DM_Assignment_1

ISyE 7406, Spring-2007
Instructor: Kwok-Leung Tsui
Homework # 1 (due 1/28/09)

1. Use R to reproduce the Home Equity Loan exercise in the first unit of notes, “Introduction to Data Mining”. The data file, “hmeq.csv”, is given on our course website.

2. (Ex 2.2 in the textbook) Show how to compute the Bayes decision boundary for the simulation example in Figure 2.5 of the textbook. The simulation example is generated according to two bivariate Gaussian distributions with uncorrelated components and different means, i.e. $Y_0 \sim G(\mu_0, \Sigma_0)$, $Y_1 \sim G(\mu_1, \Sigma_1)$, where $Y_0$ and $Y_1$ are independent.

3. (Ex 2.7 in the textbook) Compare the classification performance of linear regression and k-nearest neighbor classification on the zipcode data. In particular, consider only the 2's and 3's, and k = 1, 3, 5, 7 and 15. Show both the training and test error for each choice. The zipcode data are available from the book website: www-stat.stanford.edu/ElemStatLearn. You can also find the data on our course website.

Data Description: for the training data, every column is a covariate (x), and the % in the file name train%.txt or train.% is the response (the digit); for the test data, the first column is the response and the other columns are covariates.
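For Problem 2, one possible way to set up the boundary is sketched here. It assumes class priors $\pi_0$ and $\pi_1$ (not stated in the assignment) and writes $\phi(x; \mu, \Sigma)$ for the bivariate Gaussian density with mean $\mu$ and covariance $\Sigma$. The Bayes rule assigns a point $x$ to the class with the larger posterior probability, so the boundary is the set of points where the two posteriors are equal.

```latex
% Boundary: equal posterior probability for the two classes
\pi_0 \, \phi(x; \mu_0, \Sigma_0) = \pi_1 \, \phi(x; \mu_1, \Sigma_1)

% Taking logs of both sides (the common factor 1/(2\pi) cancels):
-\tfrac{1}{2} (x - \mu_0)^\top \Sigma_0^{-1} (x - \mu_0)
  - \tfrac{1}{2} \log |\Sigma_0| + \log \pi_0
=
-\tfrac{1}{2} (x - \mu_1)^\top \Sigma_1^{-1} (x - \mu_1)
  - \tfrac{1}{2} \log |\Sigma_1| + \log \pi_1

% Special case \Sigma_0 = \Sigma_1 = \Sigma: the quadratic terms cancel
% and the boundary is a straight line in x.
x^\top \Sigma^{-1} (\mu_1 - \mu_0)
=
\tfrac{1}{2} \left( \mu_1^\top \Sigma^{-1} \mu_1 - \mu_0^\top \Sigma^{-1} \mu_0 \right)
  + \log \frac{\pi_0}{\pi_1}
```

With uncorrelated components, $\Sigma_0$ and $\Sigma_1$ are diagonal, so each quadratic form reduces to a sum of squared, variance-scaled coordinate differences. When the covariances differ, the boundary is in general a quadratic curve.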
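The comparison in Problem 3 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the course's reference solution. The function names (`knn_predict`, `linreg_predict`, `compare`, `make_data`) are hypothetical, the 0/1 class coding with a 0.5 threshold is an assumption, and synthetic two-class data stands in for the zipcode files because their exact layout is only partly described above.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b,
    # which avoids building an (n_query, n_train, n_features) array.
    sq_q = (X_query ** 2).sum(axis=1)[:, None]
    sq_t = (X_train ** 2).sum(axis=1)[None, :]
    d2 = sq_q + sq_t - 2.0 * (X_query @ X_train.T)
    # Indices of the k closest training points for each query row.
    nearest = np.argsort(d2, axis=1, kind="stable")[:, :k]
    votes = y_train[nearest]                    # shape (n_query, k), entries 0/1
    return (votes.mean(axis=1) > 0.5).astype(int)  # odd k, so no ties

def linreg_predict(X_train, y_train, X_query):
    """Least-squares fit to the 0/1 labels, classified by thresholding at 0.5."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train.astype(float), rcond=None)
    A_q = np.column_stack([np.ones(len(X_query)), X_query])
    return (A_q @ beta > 0.5).astype(int)

def error_rate(pred, y):
    """Fraction of misclassified points."""
    return float(np.mean(pred != y))

def compare(X_tr, y_tr, X_te, y_te, ks=(1, 3, 5, 7, 15)):
    """Return {method: (training error, test error)} for each method."""
    results = {
        "linear regression": (
            error_rate(linreg_predict(X_tr, y_tr, X_tr), y_tr),
            error_rate(linreg_predict(X_tr, y_tr, X_te), y_te),
        )
    }
    for k in ks:
        results[f"{k}-NN"] = (
            error_rate(knn_predict(X_tr, y_tr, X_tr, k), y_tr),
            error_rate(knn_predict(X_tr, y_tr, X_te, k), y_te),
        )
    return results

# Synthetic stand-in data (assumption): two Gaussian classes in 5 dimensions.
# For the real task, X would hold the pixel columns of the 2's and 3's
# and y would be 0 for a "2" and 1 for a "3".
rng = np.random.default_rng(0)

def make_data(n):
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 5)) + 1.5 * y[:, None]
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(100)
for name, (train_err, test_err) in compare(X_train, y_train, X_test, y_test).items():
    print(f"{name:>18}: train error = {train_err:.3f}, test error = {test_err:.3f}")
```

With k = 1 each training point is its own nearest neighbor, so the 1-NN training error is zero whenever the training points are distinct. That is why the test error is the more informative number when comparing the choices of k.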

This note was uploaded on 11/13/2010 for the course ISE 680 taught by Professor Santanu during the Spring '10 term at Purdue University Calumet.
