CSc 401 – Data Mining
Name:______________________________
Final Exam
December 18, 2003
Score:_________________/100
Directions:
Carefully answer each of the following questions.
This is an open-book, open-note exam.
You may use calculators but you may NOT use computers.
You are NOT to get help from others.
Points will be assigned on answer quality as well as answer correctness.
CLEARLY show all work.
PUT YOUR NAME AT THE TOP OF EACH PAGE.
1.
Whenever a continuous-valued attribute is used as a decision node in a C4.5 tree, Quinlan’s approach
is to bisect the range of values of the attribute.
Indicate a different way to handle continuous
attributes in a C4.5 decision tree when they are being used as decision nodes.
Do you believe your
method produces more interpretable trees?
Explain.
[5 pts.]
Convert the values to nominal values (binning).
Develop binary thresholds within the set of values.
Use fuzzy logic.
I would suggest using binning.
Binning would probably produce more accurate trees;
however, the trees might not be as easy to read as those produced by Quinlan’s approach.
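The binning option named in the answer can be illustrated with a minimal sketch. It assumes equal-width bins, which is only one of several possible binning schemes; the function name `equal_width_bins`, the temperature values, and the labels are made up for illustration and are not part of the exam.

```python
def equal_width_bins(values, n_bins=3):
    """Map each continuous value to a bin index in 0..n_bins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls in a single bin
        return [0] * len(values)
    # Clamp so the maximum value lands in the last bin rather than one past it
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical continuous attribute (illustrative values only)
temps = [64, 65, 68, 69, 70, 71, 72, 75, 80, 85]
labels = ["low", "medium", "high"]

bin_ids = equal_width_bins(temps, n_bins=3)
nominal = [labels[i] for i in bin_ids]
print(list(zip(temps, nominal)))
```

A decision node built on the binned attribute would then branch once per nominal value (three branches here) instead of using a single numeric threshold.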
2.
Consider the ellipse shown plotted on the w-z coordinate axes.
Also shown in the diagram are the x-y
eigenvectors from the PCA, which also form a basis for the space.
[6 pts.]
w and z are the original coordinate axes
x and y are the eigenvectors
a.)
Onto which eigenvector should we project
the ellipse if we want to keep the
maximum variation?
Why?
X: the variance along this vector is much greater
(the ellipse extends farther along it) than along Y.
b.)
In the diagram at the left, indicate the set
of points that result from the projection.
[Figure: ellipse drawn with the original axes W and Z and the eigenvector axes X and Y]
The projected points form a line segment along X, spanning the full extent of the ellipse in that direction.
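A small numeric check can make the projection in question 2 concrete. The sketch below assumes a hypothetical ellipse with semi-axes 3 and 1 whose major axis is rotated 30 degrees from the w axis; these numbers are not taken from the exam diagram. It uses the closed-form principal-axis angle for a symmetric 2 x 2 covariance matrix rather than a general eigen-solver.

```python
import math

# Hypothetical ellipse (assumed values): semi-axes 3 and 1, rotated 30 degrees
phi = math.radians(30)
pts = []
for k in range(360):
    t = math.radians(k)
    u, v = 3 * math.cos(t), math.sin(t)
    # Rotate the axis-aligned ellipse point into w-z coordinates
    pts.append((u * math.cos(phi) - v * math.sin(phi),
                u * math.sin(phi) + v * math.cos(phi)))

n = len(pts)
mw = sum(p[0] for p in pts) / n
mz = sum(p[1] for p in pts) / n

# Entries of the 2x2 covariance matrix [[a, b], [b, c]]
a = sum((p[0] - mw) ** 2 for p in pts) / n
c = sum((p[1] - mz) ** 2 for p in pts) / n
b = sum((p[0] - mw) * (p[1] - mz) for p in pts) / n

# Angle of the leading eigenvector of a symmetric 2x2 matrix
theta = 0.5 * math.atan2(2 * b, a - c)
x_dir = (math.cos(theta), math.sin(theta))    # first eigenvector (X)
y_dir = (-math.sin(theta), math.cos(theta))   # second eigenvector (Y)


def scores(direction):
    """Signed coordinate of each centered point along the given unit vector."""
    return [(p[0] - mw) * direction[0] + (p[1] - mz) * direction[1] for p in pts]


def variance(values):
    """Variance of values that are already centered on zero."""
    return sum(v * v for v in values) / len(values)


x_scores = scores(x_dir)
y_scores = scores(y_dir)
var_x, var_y = variance(x_scores), variance(y_scores)

# Projection onto X: every point collapses onto one line through the center
projected = [(mw + s * x_dir[0], mz + s * x_dir[1]) for s in x_scores]

print(math.degrees(theta), var_x, var_y, min(x_scores), max(x_scores))
```

Under these assumptions the variance along X is much larger than along Y, and the projected points all lie on a segment along X whose half-length equals the major semi-axis.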
3.
Consider the 3 x 3 gridtop SOM map whose weight vectors are shown below.
[The underlined
numbers represent the location of the neuron on the grid according to Matlab’s notation scheme.