[Figure labels: sample, bootstrap sample, train, filter unused data points, test]

Out of Bag Error
•  Very similar to cross-validation
•  Measured during training
•  Can be too optimistic

Variable Importance
•  Again use out of bag samples
•  Predict class for these samples
•  Randomly permute values of one feature
•  Predict classes again
•  Measure decrease in accuracy

Tempting Scenario
•  Run random forest with all features
•  Reduce number of features based on importance weights
•  Run again with reduced feature set and report out of bag error
This does not measure test performance!

Unbalanced Classes
•  The Problem:
•  Oversample:
•  Subsample:
•  Subsample for each tree!

Random Forest Subsampling
[Figure labels: sample, train]

Random Forest
•  Similar to Bagging
•  Easy to parallelize
•  Packaged with some neat functions:
   –  Out...
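
The out of bag and variable importance steps from the slides can be sketched in a few lines of plain Python. This is a minimal illustration, not a random forest implementation: the names bootstrap_split, permutation_importance, and toy_predict are hypothetical, the data is synthetic, and toy_predict is a fixed rule standing in for a trained tree. In a real forest each tree would be scored on its own out of bag samples and the results averaged.

```python
import random

def bootstrap_split(n, rng):
    """Draw n indices with replacement; return (in_bag, out_of_bag) index lists."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in chosen]
    return in_bag, out_of_bag

def accuracy(predict, X, y):
    """Fraction of rows whose predicted class matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, feature, rng):
    """Decrease in accuracy after randomly permuting one feature column."""
    baseline = accuracy(predict, X, y)
    column = [row[feature] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(X, column)]
    return baseline - accuracy(predict, X_perm, y)

rng = random.Random(0)

# Synthetic data (assumption): the class depends only on feature 0; feature 1 is noise.
X = [[rng.random(), rng.random()] for _ in range(500)]
y = [int(row[0] > 0.5) for row in X]

# The bootstrap sample would train one tree; the unused points are its out of bag set.
in_bag, oob = bootstrap_split(len(X), rng)
X_oob = [X[i] for i in oob]
y_oob = [y[i] for i in oob]
oob_fraction = len(oob) / len(X)  # roughly 1/e of the data for large samples

# Hypothetical stand-in for a tree trained on the in-bag rows.
def toy_predict(row):
    return int(row[0] > 0.5)

importance_0 = permutation_importance(toy_predict, X_oob, y_oob, 0, rng)
importance_1 = permutation_importance(toy_predict, X_oob, y_oob, 1, rng)

print("out of bag fraction:", oob_fraction)
print("importance of feature 0:", importance_0)
print("importance of feature 1:", importance_1)
```

Permuting the informative feature should cost accuracy, while permuting the noise feature leaves the stand-in predictor unchanged. The sketch also hints at why the "Tempting Scenario" is misleading: the out of bag samples that rank the features are the same samples later used to report the error, so they are no longer an independent test set.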
