In the next four questions, you will be asked to write short essay responses to the
prompts in Section 2.4.2.
2.4.2 Discussion Questions

Note that two of these require you to run additional experiments that examine the
effects of different parameters on the SGD methods.

(a) We learn a classifier using SGD by iterating until the error on test data is below
a certain threshold. What do you expect will happen to the performance of the
classifier on the test data as we make this threshold lower?

Run experiments with your own code for SGD, but set different convergence
thresholds. Describe the experiments you ran to examine the effect of the threshold
on test accuracy, and discuss your results. Do you observe any patterns? If so, can
you explain what's going on?

(b) What is the effect of varying the learning rate for your Stochastic Gradient Descent
methods (both the baseline and the decision-stumps-as-features baseline)?

Re-run the SGD experiments using a variety of learning rates (perhaps α = 10^i,
where i ranges from −1 to −5), and compare the average accuracies you see for each.
(A sketch of one possible harness for these experiments appears at the end of this
section.)

For each of the SGD methods, does a higher or a lower learning rate perform better?
Is the function monotonic? Why do you think you got the results you did?

(c) Based on your original results and the experiments you ran to answer the previous
questions, which learning algorithm would you prefer to use to classify new examples
from the same Badges Game?

When providing evidence to support your choice, consider the average cross-validated
classification accuracy, the variance of that accuracy, and whether the algorithm has
parameters that need to be set (and how sensitive the algorithm's performance is to
its parameters).

You might also want to think about the expressiveness of the different algorithms,
and to what degree the (unknown) concept that generated the data is separable in
your feature representation.

Finally, you will be required to submit your source code, a README, and displays of
your best-performing trees.

• The README should contain your name and email address, and should provide
enough information for someone to compile and run your code. Place all source
files (excluding executables and object files) and the README into a directory
called userID-hw1. Pack the directory so that when we unpack it, the userID-hw1
directory is created with all of your files in it. The name of the packed file should
be userID-hw1.zip or userID-hw1.tar.gz.
• For the last part of the assignment, we'd like to see the string representation of
your best-performing decision trees (see the output of WekaTester for an example).
For each decision tree variant you experimented with (except for the last one on
decision stumps as features), include in a separate text file the tree created during
cross validation that had the best performance. At the top of the file, give the
number of correct and incorrect predictions made by the tree, and indicate which
fold served as the test set.
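For questions (a) and (b), the sketch below shows one way such an experiment harness
might look. It is only a sketch: the SgdExperiment class name, the perceptron-style
update rule, the {−1, +1} labels, and the synthetic data in main are all illustrative
assumptions, not part of the provided code; substitute your own SGD implementation
and your Badges feature vectors.

    // Sketch of a possible experiment harness for questions (a) and (b).
    // Everything here (class name, update rule, synthetic data) is
    // illustrative; plug in your own SGD code and Badges features.
    import java.util.Random;

    public class SgdExperiment {

        // Train a linear classifier by SGD until the error on (x, y)
        // drops below `threshold`, or an epoch cap is reached. Whether
        // the stopping error should be measured on training or test
        // data is exactly what question (a) asks you to think about.
        static double[] trainSgd(double[][] x, int[] y,
                                 double alpha, double threshold) {
            int n = x.length, d = x[0].length;
            double[] w = new double[d];
            Random rng = new Random(0);
            for (int epoch = 0; epoch < 10_000; epoch++) {
                for (int t = 0; t < n; t++) {
                    int i = rng.nextInt(n);              // stochastic example choice
                    if (predict(w, x[i]) != y[i]) {      // mistake-driven update
                        for (int j = 0; j < d; j++) w[j] += alpha * y[i] * x[i][j];
                    }
                }
                if (errorRate(w, x, y) < threshold) break;  // convergence check
            }
            return w;
        }

        static int predict(double[] w, double[] xi) {
            double dot = 0;
            for (int j = 0; j < w.length; j++) dot += w[j] * xi[j];
            return dot >= 0 ? 1 : -1;
        }

        static double errorRate(double[] w, double[][] x, int[] y) {
            int wrong = 0;
            for (int i = 0; i < x.length; i++) if (predict(w, x[i]) != y[i]) wrong++;
            return (double) wrong / x.length;
        }

        public static void main(String[] args) {
            // Tiny synthetic linearly separable dataset standing in for
            // the Badges features; replace with your own data loading.
            Random rng = new Random(42);
            int n = 200, d = 10;
            double[] trueW = new double[d];
            for (int j = 0; j < d; j++) trueW[j] = rng.nextGaussian();
            double[][] x = new double[n][d];
            int[] y = new int[n];
            for (int i = 0; i < n; i++) {
                double dot = 0;
                for (int j = 0; j < d; j++) {
                    x[i][j] = rng.nextGaussian();
                    dot += trueW[j] * x[i][j];
                }
                y[i] = dot >= 0 ? 1 : -1;
            }
            // Question (b): sweep alpha = 10^i for i = -1 .. -5 at a fixed
            // threshold. For question (a), hold alpha fixed and sweep the
            // threshold instead, recording accuracy for each setting.
            double threshold = 0.05;
            for (int i = -1; i >= -5; i--) {
                double alpha = Math.pow(10, i);
                double[] w = trainSgd(x, y, alpha, threshold);
                System.out.printf("alpha=1e%d  error=%.3f%n", i, errorRate(w, x, y));
            }
        }
    }

Averaging each configuration over your cross-validation folds, as in the main
experiments, will give you the accuracies and variances that question (c) asks you
to compare.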

