
In the next four questions, you will be asked to write short essay responses to four prompts (see 2.4.2).

2.4.2 Discussion Questions

Note that two of these require you to run additional experiments that examine the effects of different parameters on the SGD methods.

(a) We learn a classifier using SGD by iterating until the error on test data is below a certain threshold. What do you expect will happen to the performance of the classifier on the test data as we make this threshold lower? Run experiments with your own code for SGD, but set different convergence thresholds. Describe the experiments you ran to examine the effect of the threshold on test accuracy, and discuss your results. Do you observe any patterns? If so, can you explain what's going on?

(b) What is the effect of varying the learning rate for your Stochastic Gradient Descent methods (both the baseline and decision-stumps-as-features baseline)? Re-run the SGD experiments using a variety of learning rates (perhaps $\alpha = 10^{i}$, where $i$ ranges from -1 to -5), and compare the average accuracies you see for each. For each of the SGD methods, does a higher or a lower learning rate perform better? Is the function monotonic? Why do you think you got the results you did?

(c) Based on your original results and the experiments you ran to answer the previous questions, which learning algorithm would you prefer to use to classify new examples from the same Badges Game? When providing evidence to support your choice, consider the average cross-validated classification accuracy, the variance of that accuracy, and whether the algorithm has parameters that need to be set (and how sensitive the algorithm's performance is to its parameters). You might also want to think about the expressiveness of the different algorithms, and to what degree the (unknown) concept that generated the data is separable in your feature representation.

Finally, you will be required to submit your source code, a README, and displays of your best-performing trees.
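One way the learning-rate sweep in question (b) might be organized is sketched below. Everything in it is an assumption made for illustration: the LMS (Widrow-Hoff) update rule, the fixed number of epochs used in place of a convergence threshold, the 5-fold split, the synthetic two-feature data, and the function names `sgd_train`, `cross_validate`, and `sweep`. It is not the assignment's reference implementation and does not use the Badges data.

```python
import random


def sgd_train(examples, alpha, epochs=20, seed=0):
    """Train a linear unit with the LMS update (an assumed update rule)."""
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    b = 0.0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit examples in a fresh random order each epoch
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = y - pred
            w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
            b += alpha * err
    return w, b


def accuracy(model, examples):
    """Fraction of examples whose thresholded prediction matches the label."""
    w, b = model
    correct = 0
    for x, y in examples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        correct += (1 if score >= 0 else -1) == y
    return correct / len(examples)


def cross_validate(examples, alpha, k=5):
    """Average test accuracy over k folds (fold i is held out in round i)."""
    folds = [examples[i::k] for i in range(k)]
    accs = []
    for i in range(k):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        accs.append(accuracy(sgd_train(train, alpha), folds[i]))
    return sum(accs) / k


def make_toy_data(n=200, seed=1):
    """Synthetic, linearly separable stand-in for the real feature vectors."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1 if x[0] + x[1] >= 0 else -1))
    return data


def sweep(examples, exponents=range(-1, -6, -1)):
    """Map each learning rate 10**i to its cross-validated accuracy."""
    return {10.0 ** i: cross_validate(examples, 10.0 ** i) for i in exponents}


if __name__ == "__main__":
    for alpha, acc in sweep(make_toy_data()).items():
        print(f"alpha={alpha:g}  mean accuracy={acc:.3f}")
```

Holding the epoch count and the fold assignment fixed across learning rates keeps the comparison about the learning rate alone. Collecting the per-fold accuracies rather than only their mean would also give the variance that question (c) asks about.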
The README should contain your name and email address, and should provide enough information for someone to compile and run your code. Place all source files (excluding executables and object files) and the README into a directory called userID-hw1. Pack the directory so that when we unpack it, the userID-hw1 directory is created with all of your files in it. The name of the packed file should be userID-hw1.zip or userID-hw1.tar.gz. For the last part of the assignment, we’d like to see the string representation of your best-performing decision trees (see the output of WekaTester for an example). For each decision tree variant you experimented with (except for the last one on decision stumps as features), include in a separate text file the tree created during cross validation that had the best performance. At the top of the file, give the number of correct and incorrect predictions made by the tree, and indicate which fold served as the test set.
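The packing step described above could be scripted roughly as follows. This is a sketch under assumptions: the helper name `pack_submission` is hypothetical, and the set of file suffixes treated as executables or object files (`.class`, `.o`, `.exe`) is a guess that should be adjusted to whatever your build actually produces.

```python
import tarfile
from pathlib import Path

# Assumed suffixes for compiled artifacts; adjust for your own toolchain.
SKIPPED_SUFFIXES = {".class", ".o", ".exe"}


def pack_submission(user_id, parent_dir="."):
    """Create <user_id>-hw1.tar.gz that unpacks into a <user_id>-hw1 directory."""
    parent = Path(parent_dir)
    folder = parent / f"{user_id}-hw1"
    if not folder.is_dir():
        raise FileNotFoundError(f"missing submission directory: {folder}")
    archive = parent / f"{user_id}-hw1.tar.gz"

    def keep(info):
        # Returning None tells tarfile to leave this member out of the archive.
        return None if Path(info.name).suffix in SKIPPED_SUFFIXES else info

    with tarfile.open(archive, "w:gz") as tar:
        # arcname makes the top-level directory appear when unpacking.
        tar.add(folder, arcname=folder.name, filter=keep)
    return archive
```

Calling `pack_submission("userID")` from the directory that contains `userID-hw1` would write `userID-hw1.tar.gz` next to it. Listing the archive contents before submitting is a quick way to confirm that the README and all source files are present.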