# CSCI 4150 Learning Theory - Learning Theory 8.1 CS221...


Learning Theory 8, 8.1: CS221 Lecture notes by Andrew Ng, No. 7; see also CS229 Lecture notes by Andrew Ng, Part VI


## CS221 Lecture notes #7: Supervised learning summary

In the previous sets of notes on supervised learning, we discussed many specific algorithms for supervised learning. Now, we’re going to take a step back and discuss some of the principles of how to use these learning algorithms to achieve good performance.

### 1 Multi-class classification

When discussing logistic regression and decision trees, we simplified our task by focusing on binary classification tasks, where there are only two categories to distinguish. However, many problems require us to distinguish more than two categories. Many binary classification algorithms can be extended to deal with multiple classes directly, but there is one general approach we can take even for algorithms that don’t have straightforward multiclass extensions.

In one-vs.-all (also called one-vs.-many or one-vs.-rest), if we are trying to distinguish between N different classes, we train N different classifiers, each of which tries to distinguish one class from all the rest. For instance, suppose we are given the three-class data shown in Figure 1 (a). We construct three different classification problems, each of which uses one of the three classes for the positive examples and the other two classes for the negative examples. The resulting classifiers are shown in Figure 1 (b-d).

Figure 1: (a) A multiclass classification problem with three categories. (b-d) Learned classifiers for each of the binary classification subproblems in one-vs.-all.

How do we combine these classifiers to get a prediction on a novel example $x$? Each of the classifiers outputs some sort of confidence score that it sees a positive example. For instance, with logistic regression, the confidence score is given by $h_\theta(x)$. For decision trees, it is the probability estimate associated with the corresponding leaf node. Our prediction on the new example $x$ will simply be the class whose classifier returns the highest confidence score that the example is a member of that class.

### 2 Bias, variance, and generalization error

In the first machine learning lecture, we introduced the ideas of overfitting and underfitting. Recall that we said a model underfits the training data if, like the first model in Figure 2 (b), it does not capture all of the structure available from the data. On the other hand, a model overfits if it captures too many of the idiosyncrasies of the training data, as in Figure 2 (d). In this section, we define more formally what we mean by overfitting and underfitting.
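Returning to the one-vs.-all scheme from Section 1, the following is a minimal sketch of how the N binary classifiers might be trained and combined. It assumes logistic regression fit by plain batch gradient descent as the base classifier; the function names (`train_logistic`, `train_one_vs_all`, `predict_one_vs_all`), the learning rate, the step count, and the synthetic three-cluster data are all illustrative choices that do not come from the notes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_intercept(X):
    # Prepend a column of ones so theta[0] acts as the intercept term.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def train_logistic(X, y, lr=0.3, steps=5000):
    """Fit binary logistic regression (y in {0, 1}) by batch gradient descent.

    lr and steps are hypothetical defaults chosen for this toy example.
    """
    Xb = add_intercept(X)
    theta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        grad = Xb.T @ (sigmoid(Xb @ theta) - y) / len(y)
        theta -= lr * grad
    return theta

def train_one_vs_all(X, labels):
    """Train one binary classifier per class: that class vs. all the rest."""
    classes = np.unique(labels)
    thetas = np.stack(
        [train_logistic(X, (labels == c).astype(float)) for c in classes]
    )
    return classes, thetas

def predict_one_vs_all(classes, thetas, X):
    """Predict the class whose classifier gives the highest confidence h_theta(x)."""
    scores = sigmoid(add_intercept(X) @ thetas.T)  # shape: (n_examples, n_classes)
    return classes[np.argmax(scores, axis=1)]

# Synthetic three-class data (assumed for illustration): three tight clusters.
rng = np.random.default_rng(0)
centers = np.array([[1.0, 1.0], [3.0, 1.0], [2.0, 3.0]])
X = np.vstack([c + 0.2 * rng.standard_normal((30, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 30)

classes, thetas = train_one_vs_all(X, labels)
pred = predict_one_vs_all(classes, thetas, centers)  # labels for the cluster centers
train_acc = np.mean(predict_one_vs_all(classes, thetas, X) == labels)
```

Each row of `thetas` is one binary classifier, and the argmax over the columns of `scores` implements the "highest confidence wins" rule. For a decision tree base classifier, the sigmoid score would be replaced by the leaf's probability estimate.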
