CS404_Final_Exam_121603

CSc 401 – Data Mining
Final Exam
December 18, 2003

Name:______________________________
Score:_________________/100

Directions: Carefully answer each of the following questions. This is an open-book, open-note exam. You may use calculators but you may NOT use computers. You are NOT to get help from others. Points will be assigned on answer quality as well as answer correctness. CLEARLY show all work. PUT YOUR NAME AT THE TOP OF EACH PAGE.

1. Whenever a continuous-valued attribute is used as a decision node in a C4.5 tree, Quinlan’s approach is to bisect the range of values of the attribute. Indicate a different way to handle continuous attributes in a C4.5 decision tree when they are being used as decision nodes. Do you believe your method produces more interpretable trees? Explain. [5 pts.]

   Convert the values to nominal values (binning)
   Develop binary thresholds within the set of values
   Use fuzzy logic

   I would suggest using binning. Binning would probably produce more accurate trees; however, they might not be as easy to read as the trees that Quinlan’s approach produces.

2. Consider the ellipse shown plotted on the w-z coordinate axes. Also shown in the diagram are the x-y eigenvectors from the PCA, which also form a basis for the space. [6 pts.]

   w-z are the original coordinate axes
   x-y are the eigenvectors

   a.) Onto which eigenvector should we project the ellipse if we want to keep the maximum variation? Why?

       X. The variance along this vector is much greater (the ellipse extends over a larger distance) than the variance along Y.

   b.) In the diagram at the left, indicate the set of points that result from the projection.

   [Figure: ellipse plotted on the original W-Z axes, with the eigenvector axes X and Y overlaid.]
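The reasoning in question 2 can be checked numerically. The following is a minimal sketch, not part of the exam: the exam's figure gives no numbers, so the ellipse here (semi-axes 3 and 1, rotated 30 degrees from the w axis) and the function names `principal_axis` and `project` are assumptions chosen for illustration. It computes the 2 x 2 sample covariance, takes the eigenvector with the larger eigenvalue in closed form, and projects every point onto it.

```python
import math


def principal_axis(points):
    """Return (mean, unit axis, major variance, minor variance) for 2-D points."""
    n = len(points)
    mw = sum(p[0] for p in points) / n
    mz = sum(p[1] for p in points) / n
    # Entries of the 2 x 2 sample covariance matrix.
    sww = sum((p[0] - mw) ** 2 for p in points) / (n - 1)
    szz = sum((p[1] - mz) ** 2 for p in points) / (n - 1)
    swz = sum((p[0] - mw) * (p[1] - mz) for p in points) / (n - 1)
    # Closed-form eigen-decomposition of a symmetric 2 x 2 matrix:
    # the axis of maximum variance makes this angle with the w axis.
    angle = 0.5 * math.atan2(2.0 * swz, sww - szz)
    axis = (math.cos(angle), math.sin(angle))
    major = 0.5 * (sww + szz) + math.hypot(0.5 * (sww - szz), swz)
    minor = (sww + szz) - major  # the trace is the sum of the eigenvalues
    return (mw, mz), axis, major, minor


def project(points, mean, axis):
    """Orthogonally project each point onto the line through mean along axis."""
    result = []
    for w, z in points:
        score = (w - mean[0]) * axis[0] + (z - mean[1]) * axis[1]
        result.append((mean[0] + score * axis[0], mean[1] + score * axis[1]))
    return result


# Hypothetical data: points on an ellipse with semi-axes 3 and 1,
# rotated 30 degrees in the w-z plane.
theta = math.radians(30)
n_points = 200
points = []
for k in range(n_points):
    t = 2.0 * math.pi * k / n_points
    ex, ey = 3.0 * math.cos(t), math.sin(t)
    points.append((ex * math.cos(theta) - ey * math.sin(theta),
                   ex * math.sin(theta) + ey * math.cos(theta)))

mean, axis, major_var, minor_var = principal_axis(points)
projected = project(points, mean, axis)
print("axis of maximum variance:", axis)
print("variance along it:", major_var, "versus", minor_var, "along the other axis")
```

Under these assumptions the recovered axis coincides with the long axis of the ellipse, and every projected point lies on the line through the center along that axis, so the projection collapses the ellipse to a line segment.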

2 b.) ??? The points within the ellipse?

3. Consider the 3 x 3 gridtop SOM map whose weight vectors are shown below. [The underlined numbers represent the location of the neuron on the grid according to Matlab’s notation scheme. The three values under each of these numbers are the weight vectors corresponding to that neuron.]

This note was uploaded on 08/25/2011 for the course CPE 404 taught by Professor Merz during the Fall '09 term at Missouri S&T.
