HW4


Problem 1

Answer the following questions briefly:

(i) For K-means clustering, let W_K be the within-cluster variation if K clusters are used. Give a formula for W_K. Explain how to use a plot of W_K against K to determine the optimal number of clusters to use.

(ii) Why is the predictive accuracy on training data not a good estimator of performance on future data?

Problem 2

Classification trees in SAS EM. This problem uses the data sets TargetKnown.xls and Unclassified.xls from HW3. Recall that TargetKnown has ten variables: a binary target and nine predictor variables. You should use a stratified random sample containing all responders and an equal number of non-responders. Unclassified has the nine predictor variables, but the target is missing.

(i) Using TargetKnown, find a good decision tree model for predicting the target. Write a short report describing your model....
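For Problem 1(i), the quantity W_K can be made concrete with a small sketch. It assumes one common definition, which may differ from the one used in the course: W_K is the sum, over clusters, of the squared Euclidean distances from each point to its cluster mean. The functions `kmeans`, `within_cluster_variation`, and `w_curve`, and the toy data, are illustrative names and values that do not come from the assignment. The clustering routine is a basic Lloyd-style iteration with random restarts, not the SAS EM procedure.

```python
import random


def within_cluster_variation(points, labels, centers):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(p, centers[l]))
        for p, l in zip(points, labels)
    )


def kmeans(points, k, n_iter=50, seed=0):
    """Basic Lloyd-style K-means; returns (labels, centers)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centers[j])))
            for p in points
        ]
        # Update step: each non-empty cluster's center becomes its mean.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def w_curve(points, k_max, n_restarts=5):
    """Map each K in 1..k_max to the smallest W_K found over several restarts."""
    curve = {}
    for k in range(1, k_max + 1):
        curve[k] = min(
            within_cluster_variation(points, *kmeans(points, k, seed=s))
            for s in range(n_restarts)
        )
    return curve


# Hypothetical toy data: three tight groups in the plane.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.0),
    (10.0, 0.0), (10.1, 0.2), (10.2, 0.0),
]

for k, w in w_curve(points, k_max=6).items():
    print(k, round(w, 4))
```

Plotting the printed values against K gives the curve the question refers to. W_K can never exceed W_1 and generally shrinks as K grows, so its minimum is not informative; the usual heuristic is to look for the "elbow", the value of K after which further increases reduce W_K only slightly.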

This note was uploaded on 02/06/2011 for the course ORIE 474 taught by Professor Apanasovich during the Spring '07 term at Cornell University (Engineering School).
