1)
a. problem1a.m results in plots for Order 1, Order 2, Order 3, and Order 4,
with the accuracy reported for each.
b. According to these graphs, the 3rd order would yield the best performance. The
accuracy values show that orders 3 and 4 have the highest accuracy. According to the grap
Chris Andersen
Data Mining
PS7
1)
a.
As the number of hidden nodes increases from 1 to 5, the accuracy increases.
Between 5 and 10 hidden nodes the accuracy fluctuates up and down, but only by
fractions of an exponent.
The run time continuo
1)
a.
There was a 71.6016% accuracy rate.
According to the confusion matrix, the classes that were most confused were
atheism, comp.graphics, sci.electronics, sci.med, and talk.politics.guns.
The accuracy from problem set 5 was 84%, which is a large statis
1)
a. True. There are 3772 instances in hypothyroid. If we predict negative for
every instance, we would have a 0.9229 success rate.
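
The baseline above can be checked with a short sketch. The count of 3481 negative instances is an assumption inferred from the stated figures (0.9229 x 3772 is approximately 3481); it is not given in the answer itself, and lumping the remaining instances into a single "other" class is only for illustration.

```python
def majority_baseline(class_counts):
    """Return the majority label and the accuracy of always predicting it."""
    total = sum(class_counts.values())
    label = max(class_counts, key=class_counts.get)
    return label, class_counts[label] / total


# ASSUMPTION: 3481 negatives, inferred from 0.9229 * 3772; the rest are
# grouped into a hypothetical "other" bucket.
counts = {"negative": 3481, "other": 3772 - 3481}

label, accuracy = majority_baseline(counts)
print(label, round(accuracy, 4))
```

Any classifier evaluated on this data needs to beat this figure to be doing better than always guessing the majority class.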
b. False, overfitting is indicated by high description lengths. High description
lengths mean either a small tree with man
Christopher Andersen
Data Mining
HW 3
1)
a.
The fruit-spots attribute was selected by the 1R classifier.
b.
The leafspot-size was used as the root node.
c. Both the 1R classifier and the J48 decision tree selected a best first
attribute. They did not, however, select
Chris Andersen
Data Mining
Problem set 2
1)
a.
The mean vector is [0.1323 -0.0778].
The Euclidean distance from the mean vector to [8 15] is 17.0071.
The Euclidean distance from the mean vector to [3 -10] is 10.3283.
[3 -10] is closer to the mean vector.
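
As a check on the numbers above, here is a minimal sketch of the distance calculation. It assumes the distances are plain Euclidean distances, which is consistent with the reported values; the variable and function names are illustrative only.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


mean_vec = (0.1323, -0.0778)   # mean vector reported in the answer
points = [(8, 15), (3, -10)]   # the two query points

distances = [euclidean(mean_vec, p) for p in points]
closest = points[distances.index(min(distances))]

for p, d in zip(points, distances):
    print(p, round(d, 4))
print("closest:", closest)
```

Written out for the first point: sqrt((8 - 0.1323)^2 + (15 + 0.0778)^2) = sqrt(61.9007 + 227.3401), which is approximately 17.0071.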
b.
c.
The second point ([3 -10]) is most like the mean vector.
d.
Chris Andersen
Data Mining PS1
Due 1/29/2013
1)
a. This task would be supervised because of all of the available information about
the weather. The task would also be one of regression because we are predicting a
maximum temperature. You can evaluate the