Week 5: Quiz 1 Redux, Learning, Vision

Week 5: Quiz 1 Redux; Learning; Vision (Chapters 20, 24)

Quiz 1
- In general, grades were a bit lower than I expected
- Average: 68 (non-CSE-grads: 57)
- Min: 35
- Max: 100 (non-CSE-grads: 88)
- Spread is quite high
- Problems 1 and 5 were the worst offenders

Stats
[Chart: score distribution for everybody vs. non-CSE-grads]

What went right
- Almost everyone got the inference cloud problem (Problem 4)
- Many people got:
  - figuring out how to break down probabilities into components
  - probability distributions

What went wrong
- Many people had trouble with:
  - the Double Markov Model problem (Problem 1)
  - values versus variables (Problem 5)

Question-by-question breakdown

Back to machine learning

Evaluating Learning
- Training data must be selected so as to reflect the global data pool
- Testing on unseen data is crucial to prevent over-fitting to the training data:
  - unintended correlations between input and output (e.g., all the photos with tanks were taken on a sunny day)
  - correlations specific to the set of training data (e.g., language processing trained on Wall Street Journal articles may not extend to spoken dialog)

Training vs. Test Data
- Learning agents are presented with a collection of training examples
- Modifications are made to the learning algorithm until it performs well on the training data
- Test data is held out from this process
- When performance on the training data is acceptable, the algorithm is run on the test data
- Only performance on the test data is reported

Test Data Methodologies
- Single pass: reserve x% of the data for testing
- Cross-validation:
  - each fold reserves x% of the data, trains on the rest, and tests on the held-out data
  - performance is averaged across folds
- Statistical tests tell how many test cases are needed for reliable conclusions
[Diagram: Fold 1 / Fold 2 train-test splits]

Integrating CV with Decision Trees
- As a tree gets deeper, its hypothesis becomes more specific
- If too specific, the hypothesis overfits the training data
- For each CV fold, train a full decision tree on the training set
- Then prune the tree back to minimize error on the CV set
- Find the best depth d across all folds
- Regrow the tree using the full data set, but stop at depth d

Ensemble Learning
- Develop a suite of classifiers and combine their votes (pick the majority classification)
- Train a new classifier on the errors from another classifier
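The cross-validation procedure above (split the data into folds, hold each fold out in turn, average the scores) can be sketched as follows. This is a minimal illustration, not the course's code; `train_fn` and `score_fn` are hypothetical stand-ins for whatever learner and metric are being evaluated.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(train_fn, score_fn, data, k=5):
    """Each fold holds out one chunk as test data, trains on the rest,
    and the per-fold scores are averaged -- as described in the slides."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        test_set = [data[j] for j in test_idx]
        train_set = [data[j] for j in range(len(data)) if j not in held_out]
        model = train_fn(train_set)
        scores.append(score_fn(model, test_set))
    return sum(scores) / k
```

Note that the folds partition the data: every example is used for testing exactly once, which is what makes the averaged estimate use all of the data without ever testing on a training point within a fold.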
Boosting
- Training examples are assigned a weight, or importance
- A classifier's vote is weighted in proportion to the weight of the examples it learned
- Initially, all examples are weighted the same
- The weights of misclassified examples are boosted

Non-parametric Learning
- Neural networks (and Gaussians, as we will see later) are parametric learners:
  - a restricted number of parameters according to a particular form (weights, means, variances)
- Non-parametric learners use the data directly to derive classifications
- Example: nearest neighbor / k-nearest neighbors

Nearest Neighbor / k-Nearest Neighbor
- Idea: points are likely to be clustered together
  - Classification: similar classes cluster together
  - Probability density: instances likely to cluster together
- To figure out what to do with a point, look at its nearest neighbors
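The k-nearest-neighbor rule above can be sketched in a few lines: classify a query point by majority vote among its k closest training points. This is a toy illustration, assuming Euclidean distance and a made-up two-cluster data set; the slides do not specify a distance metric.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.
    Each training example is a (point, label) pair."""
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data: class 'a' clusters near the origin, class 'b' near (5, 5).
train = [((0, 0), 'a'), ((0, 1), 'a'), ((1, 0), 'a'),
         ((5, 5), 'b'), ((5, 6), 'b'), ((6, 5), 'b')]
print(knn_classify(train, (0.5, 0.5)))   # prints a
print(knn_classify(train, (5.5, 5.5)))   # prints b
```

Because the query near each cluster picks up three neighbors from that cluster, the vote is unanimous; with overlapping classes, k trades off noise tolerance (larger k) against locality (smaller k).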

This note was uploaded on 04/13/2010 for the course CSE 730, taught by Professor Eric Fosler-Lussier during the Fall '08 term at Ohio State.


