a. Running problem1a.m produces the following plots:
b. According to these graphs, the 3rd order would yield the best performance. The
accuracy plot shows that orders 3 and 4 have the highest accuracy. According to the graph,
as the number of hidden nodes increases from 1 to 5, the accuracy increases.
Between 5 and 10 hidden nodes, the accuracy fluctuates up and down,
but only by fractions of an exponent.
The run time continuo
There was a 71.6016% accuracy rate.
According to the confusion matrix, the classes that were most confused were
atheism, comp.graphics, sci.electronics, sci.med, and talk.politics.guns.
The accuracy from problem set 5 was 84%, which is a large statis
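As a rough illustration of how an accuracy rate and the most confused classes can be read off a confusion matrix, here is a minimal Python sketch. The 3x3 matrix, the class names, and the helper functions are all hypothetical and are not the results reported here. It also assumes rows are true classes and columns are predicted classes.

```python
def accuracy(cm):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def most_confused(cm, labels):
    """Largest off-diagonal cell as (count, true_label, predicted_label)."""
    n = len(cm)
    return max(
        (cm[i][j], labels[i], labels[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Hypothetical toy data, NOT the newsgroup results discussed above.
labels = ["class_a", "class_b", "class_c"]
toy_cm = [
    [8, 2, 0],  # true class_a
    [1, 7, 2],  # true class_b
    [0, 3, 7],  # true class_c
]

acc = accuracy(toy_cm)
pair = most_confused(toy_cm, labels)
print(acc, pair)
```

The same two helpers would apply to a larger matrix; only the labels list and the matrix itself change.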
a. True. There are 3772 instances in hypothyroid. If we guess every instance to be
negative, we would have a 0.9229 success rate.
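The 0.9229 figure is a majority-class baseline: the fraction of instances that belong to the most common class. A minimal Python sketch of that arithmetic follows. The count of 3481 negative instances is an assumption, back-calculated from the stated rate and the 3772 total rather than read from the data file, and the "other" bucket simply lumps together every non-negative class.

```python
def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    total = sum(class_counts.values())
    return max(class_counts.values()) / total


# Assumed counts: 3481 is inferred from 0.9229 * 3772, not taken from the dataset.
counts = {"negative": 3481, "other": 3772 - 3481}

baseline = majority_baseline(counts)
print(round(baseline, 4))
```

Any classifier worth using on this data would need to beat this baseline, since it requires no learning at all.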
b. False. Overfitting is indicated by high description lengths. High description
lengths mean either a small tree with man
The fruit-spots attribute was selected by the 1R classifier.
The leafspot-size attribute was used as the root node.
c. Both the 1R classifier and the J48 decision tree selected a best first attribute. They did
not, however, select
Problem set 2
The mean vector is [0.1323 -0.0778].
The distance to [8 15] is 17.0071.
The distance to [3 -10] is 10.3283.
[3 -10] is closer to the mean vector.
The second point ([3 -10]) is most like the mean vector.
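The reported distances are consistent with the Euclidean distance between each point and the mean vector. The Python sketch below reproduces the comparison under that assumption. The function and variable names are illustrative only.

```python
import math


def euclidean(p, q):
    """Euclidean (straight-line) distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


mean_vec = (0.1323, -0.0778)
points = [(8, 15), (3, -10)]

# Distance from each candidate point to the mean vector.
distances = [euclidean(p, mean_vec) for p in points]

# The point with the smallest distance is the one most like the mean.
closest = points[distances.index(min(distances))]

for p, d in zip(points, distances):
    print(p, round(d, 4))
print("closest:", closest)
```

Under this metric the smaller distance identifies the point that is most like the mean vector, which matches the conclusion above.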
Data Mining PS1
a. This task would be supervised because of all the available information about
the weather. The task would also be one of regression because we are predicting a
maximum temperature. You can evaluate the