13-PracticalMachineLearning

Rand index percentage of correct classifications compare


[Diagram labels: sample, bootstrap sample, train, filter unused data points, test]

Out of Bag Error
•  Very similar to cross-validation
•  Measured during training
•  Can be too optimistic

Variable Importance
•  Again use out of bag samples
•  Predict class for these samples
•  Randomly permute values of one feature
•  Predict classes again
•  Measure decrease in accuracy

Tempting Scenario
•  Run random forest with all features
•  Reduce number of features based on importance weights
•  Run again with reduced feature set and report out of bag error
This does not measure test performance!

Unbalanced Classes
•  The Problem:
•  Oversample:
•  Subsample:
•  Subsample for each tree!

Random Forest Subsampling
[Diagram labels: sample, train]

Random Forest
•  Similar to Bagging
•  Easy to parallelize
•  Packaged with some neat functions:
   –  Out...
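
The out-of-bag error and variable importance procedures described above can be sketched in a few lines. This is a minimal illustration, not the implementation from the slides: it uses only the Python standard library, replaces full decision trees with one-split stumps to stay short, assumes binary labels 0/1, and runs on synthetic data in which only feature 0 carries signal. All function names (`fit_stump`, `fit_forest`, `oob_accuracy`, `permutation_importance`) are hypothetical.

```python
import random

def fit_stump(X, y, features):
    """Return the (feature, threshold, left_label, right_label) split with fewest training errors."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if row[f] <= t else right) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]

def predict_stump(stump, row):
    f, t, left, right = stump
    return left if row[f] <= t else right

def fit_forest(X, y, n_trees, mtry, rng):
    """Train stumps on bootstrap samples; remember which points each stump never saw."""
    n, n_features = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample (with replacement)
        oob = set(range(n)) - set(idx)                  # unused points = out of bag
        feats = rng.sample(range(n_features), mtry)     # random feature subset per stump
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        forest.append((stump, oob))
    return forest

def oob_accuracy(forest, X, y):
    """Each point is predicted only by the stumps for which it was out of bag."""
    correct = total = 0
    for i, (row, label) in enumerate(zip(X, y)):
        votes = [predict_stump(s, row) for s, oob in forest if i in oob]
        if not votes:
            continue                                    # point was in every bootstrap sample
        pred = 1 if 2 * sum(votes) > len(votes) else 0  # majority vote
        correct += pred == label
        total += 1
    return correct / total if total else float("nan")

def permutation_importance(forest, X, y, feature, rng):
    """Decrease in out-of-bag accuracy after shuffling one feature column."""
    baseline = oob_accuracy(forest, X, y)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    return baseline - oob_accuracy(forest, X_perm, y)

# Synthetic data (an assumption for illustration): feature 0 decides the label,
# features 1 and 2 are pure noise.
rng = random.Random(0)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(150)]
y = [1 if row[0] > 0.5 else 0 for row in X]

forest = fit_forest(X, y, n_trees=100, mtry=2, rng=rng)
acc = oob_accuracy(forest, X, y)
importances = [permutation_importance(forest, X, y, f, rng) for f in range(3)]
print("out-of-bag accuracy:", acc)
print("importance per feature:", importances)
```

On this data the informative feature should show a much larger accuracy drop than the noise features. Note that if features were dropped based on these importances and the forest refit, the new out-of-bag accuracy would have been influenced by the same samples used for selection, which is why the slides warn that it does not measure test performance; a separate held-out test set is needed for that.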