Classification and Regression Trees

- What if you have many X variables?
- Could imagine estimating f(X1, X2, ..., Xk)
- But increasingly difficult beyond k = 2 or k = 3
- Curse of dimensionality: in high dimensions, every point is isolated (see next slide)
- Assume effects are additive: E Y = f1(X1) + f2(X2) + ... + fk(Xk)
- What if you expect interactions, or a contingent response to one of the variables?
- Contingent response: if X1 >= 25 (say), Y is unrelated to X2; if X1 < 25, Y depends on both X1 and X2
- Tree models provide a way to model contingent responses parsimoniously

(c) 2011 Dept. Statistics (Iowa State University), Stat 511 section 33

[Figure: points plotted along a single axis X, and the same points in the (X1, X2) plane, illustrating how observations become isolated as the dimension grows]

- Best-known method is the Classification and Regression Tree (CART)
- Breiman et al. (1984), Classification and Regression Trees
- Concept (for one X variable): any E Y = f(X) can be approximated by a sequence of means:

    f(X) = mu_1      if X < k_1
         = mu_2      if k_1 <= X < k_2
           ...
         = mu_K      if k_{K-1} <= X < k_K
         = mu_{K+1}  if k_K <= X

[Figure: step-function (sequence of means) approximation to a smooth curve of y against x]

[Figure: partition of the (X1, X2) plane into rectangular regions numbered 1-5]

- Given Y, assumed continuous, and a set of p potentially useful covariates {X1, X2, ...,
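The "sequence of means" idea above can be sketched in code. The snippet below is a minimal illustration, not the CART algorithm from Breiman et al.: for a single X variable it searches for the one split point k that minimizes the within-group sum of squares, then predicts with the mean of each side. All function names (`sse`, `best_split`, `step_predict`) are illustrative choices, not from the slides.

```python
def sse(ys):
    """Sum of squared deviations of ys from their mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """Greedy search over candidate split points (midpoints between
    adjacent sorted x values); return the k minimizing
    SSE(left group) + SSE(right group)."""
    pairs = sorted(zip(xs, ys))
    best_k, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        k = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= k]
        right = [y for x, y in pairs if x > k]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

def step_predict(xs, ys, k, x_new):
    """Two-piece step function: predict with the mean of whichever
    side of k the new point falls on."""
    side = [y for x, y in zip(xs, ys) if (x <= k) == (x_new <= k)]
    return sum(side) / len(side)
```

Applied recursively within each resulting group, this split search produces exactly the step-function approximation f(X) = mu_1, ..., mu_{K+1} shown above; CART adds rules for when to stop splitting and how to prune.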