
Advice for Applying Machine Learning
Andrew Ng, Stanford University

Today’s Lecture

Advice on how to get learning algorithms to work in different applications. Most of today’s material is not very mathematical, but it is also some of the hardest material in this class to understand. Some of what I’ll say today is debatable, and some of it is not good advice for doing novel machine learning research.

• Key ideas:
1. Diagnostics for debugging learning algorithms.
2. Error analysis and ablative analysis.
3. How to get started on a machine learning problem.
   – Premature (statistical) optimization.
Debugging Learning Algorithms

Debugging learning algorithms

Motivating example:
• Anti-spam. You carefully choose a small set of 100 words to use as features (instead of using all 50,000+ words in English).
• Bayesian logistic regression, implemented with gradient descent, gets 20% test error, which is unacceptably high.
• What should you do next?
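The feature representation described above can be sketched as follows. This is a minimal illustration, not the lecture’s actual pipeline; the vocabulary shown is a hypothetical stand-in for the 100 chosen words.

```python
import numpy as np

def extract_features(email_tokens, vocab):
    """Map an email (a list of word tokens) to a binary feature vector:
    x[j] = 1 if vocab[j] appears in the email, else 0."""
    x = np.zeros(len(vocab))
    for j, word in enumerate(vocab):
        if word in email_tokens:
            x[j] = 1.0
    return x

# Hypothetical 4-word vocabulary standing in for the 100 chosen words.
vocab = ["free", "viagra", "meeting", "deadline"]
x = extract_features(["free", "money", "now"], vocab)
# → array([1., 0., 0., 0.])
```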
Fixing the learning algorithm

• Common approach: try improving the Bayesian logistic regression algorithm in different ways.
– Try getting more training examples.
– Try a smaller set of features.
– Try a larger set of features.
– Try changing the features: email header vs. email body features.
– Run gradient descent for more iterations.
– Try Newton’s method.
– Use a different value for λ.
– Try using an SVM.
• This approach might work, but it is very time-consuming, and it is largely a matter of luck whether you end up fixing what the problem really is.

Diagnostic for bias vs. variance

Better approach:
– Run diagnostics to figure out what the problem is.
– Fix whatever the problem is.
Bayesian logistic regression’s test error is 20% (unacceptably high). Suppose you suspect the problem is either:
– Overfitting (high variance), or
– Too few features to classify spam (high bias).
Diagnostic:
– Variance: training error will be much lower than test error.
– Bias: training error will also be high.
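The diagnostic above can be written down as a crude decision rule. The comparison thresholds here are illustrative assumptions (the lecture gives rules of thumb, not numeric cutoffs).

```python
def diagnose(train_err, test_err, target_err):
    """Crude bias/variance diagnostic following the slide's rules of thumb.
    target_err is the desired performance; using it as the gap threshold
    is an illustrative assumption."""
    if train_err > target_err:
        return "high bias"        # even training error is unacceptably high
    if test_err - train_err > target_err:
        return "high variance"    # large gap between training and test error
    return "neither"
```

For the anti-spam example: a 19% training error next to the 20% test error would point to high bias, while a 1% training error would point to high variance.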
More on bias vs. variance

Typical learning curve for high variance:
[Figure: training error and test error vs. m (training set size), with a horizontal line marking desired performance]
• Test error is still decreasing as m increases. This suggests a larger training set will help.
• There is a large gap between training and test error.

More on bias vs. variance

Typical learning curve for high bias:
[Figure: training error and test error vs. m (training set size), with a horizontal line marking desired performance]
• Even the training error is unacceptably high.
• There is a small gap between training and test error.
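Learning curves like the two above can be generated by training on growing prefixes of the training set and recording both errors at each size m. This helper is a sketch; `fit` and `error` are caller-supplied, and the mean-predictor usage below is a hypothetical stand-in rather than anything from the lecture.

```python
def learning_curve(fit, error, X_train, y_train, X_test, y_test, sizes):
    """For each m in sizes, train on the first m training examples and
    record training error (on those m examples) and test error."""
    train_errs, test_errs = [], []
    for m in sizes:
        model = fit(X_train[:m], y_train[:m])
        train_errs.append(error(model, X_train[:m], y_train[:m]))
        test_errs.append(error(model, X_test, y_test))
    return train_errs, test_errs

# Toy usage: a mean-value predictor with squared error (hypothetical).
fit = lambda X, y: sum(y) / len(y)
error = lambda model, X, y: sum((yi - model) ** 2 for yi in y) / len(y)
tr, te = learning_curve(fit, error, [0]*4, [0, 0, 1, 1], [0]*2, [0, 1], [2, 4])
```

Plotting `tr` and `te` against `sizes`, together with a horizontal line at the desired error, reproduces the shape of the curves on these slides.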
Diagnostics tell you what to try next

Bayesian logistic regression, implemented with gradient descent. Fixes to try:
– Try getting more training examples.
