class_09_19 - Statistical Data Mining ORIE 474 Fall 2006...

Statistical Data Mining
ORIE 474, Fall 2006
Tatiyana Apanasovich
09/25/06
Model Structures for Prediction & Curse of Dimensionality
6.3 Model Structure for Prediction
Main model classes used in DM:
A. Regression models w/ linear structure
B. Local piecewise model structures for regression
C. Nonparametric “memory-based” local models
D. Stochastic components of model structures
E. Predictive models for classification
A. Regression Models w/ Linear Structure
- Model structure: $\hat{Y} = a_0 + \sum_{j=1}^{p} a_j X_j$, with parameters $\Theta = \{a_0, \ldots, a_p\}$
- Geometric interpretation: p-dim. hyperplane embedded in a (p+1)-dim. space, with slope parameters $a_1, \ldots, a_p$ and intercept $a_0$
- Features: additive (individual contributions)
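The additive feature can be checked numerically. The sketch below is a minimal illustration, not part of the slides: the function name `predict_linear` and the parameter values are hypothetical, chosen only to show that a one-unit change in one predictor shifts the prediction by that predictor's slope, whatever the other predictors are.

```python
from typing import Sequence


def predict_linear(a0: float, a: Sequence[float], x: Sequence[float]) -> float:
    """Return a0 + sum_j a[j] * x[j] for a single observation x."""
    if len(a) != len(x):
        raise ValueError("a and x must both have length p")
    return a0 + sum(aj * xj for aj, xj in zip(a, x))


# Hypothetical parameters for p = 2 predictors (illustrative values only).
a0, a = 1.0, [2.0, -3.0]

base = predict_linear(a0, a, [4.0, 5.0])    # prediction at X_1 = 4, X_2 = 5
bumped = predict_linear(a0, a, [5.0, 5.0])  # same point with X_1 raised by one unit

# Additivity: the difference equals the slope a_1, independent of X_2.
print(base, bumped, bumped - base)
```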
Linear Regression Models (cont’d)
- Generalized Additive Models: $\hat{Y} = a_0 + \sum_{j=1}^{p} a_j f_j(X_j)$
- The $f_j$’s are smooth (e.g. log(x), sqrt(x), etc.)
- Functions are nonlinear in X, but still linear in the parameters
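One way to read "linear in the parameters" is to treat each transformed predictor as a new variable. The symbol $Z_j$ is introduced here for illustration and does not appear on the slides, and the $f_j$ are assumed to be fixed, known functions:

```latex
Z_j = f_j(X_j), \qquad
\hat{Y} = a_0 + \sum_{j=1}^{p} a_j f_j(X_j) = a_0 + \sum_{j=1}^{p} a_j Z_j
```

Under that assumption the model has the same form as the linear structure above with $Z_j$ in place of $X_j$, so the $a_j$ can be estimated in the same way.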
Linear Regression Models: Ex
$y = 0.001x^3 - 0.05x^2 + x + N(0,3)$
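A rough sketch of how this example could be simulated and fitted by least squares, using only the standard library. Several things here are assumptions, not taken from the slides: the x range of 0 to 50, reading N(0,3) as noise with standard deviation 3, and the helper names `simulate`, `solve` and `fit_cubic`. The rescaling of x inside `fit_cubic` is a numerical convenience only.

```python
import random


def solve(A, b):
    """Solve the square system A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - tail) / M[r][r]
    return z


def fit_cubic(xs, ys):
    """Least-squares fit of y = a0 + a1*x + a2*x^2 + a3*x^3 via the normal equations."""
    scale = max(abs(x) for x in xs) or 1.0  # rescale x into [-1, 1] for stability
    rows = [[(x / scale) ** k for k in range(4)] for x in xs]  # basis 1, u, u^2, u^3
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    Aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(4)]
    b = solve(AtA, Aty)
    return [bk / scale ** k for k, bk in enumerate(b)]  # back to the original x scale


def simulate(n, seed=0, x_max=50.0, sigma=3.0):
    """Draw n points (n >= 2) from the slide's cubic plus Gaussian noise (assumed sd = sigma)."""
    rng = random.Random(seed)
    xs = [x_max * i / (n - 1) for i in range(n)]
    ys = [0.001 * x**3 - 0.05 * x**2 + x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


xs, ys = simulate(200)
print(fit_cubic(xs, ys))  # estimates of [a0, a1, a2, a3]
```

The fitted curve is nonlinear in x, but the estimation is an ordinary linear least-squares problem because the model is linear in the four coefficients.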
Parametric Models
- assume a particular, relatively simple, functional form
  – e.g., uniform distribution, normal distribution, exponential, Poisson
- typically relatively small number of parameters
- often closed form solutions for parameter estimates that require a single pass through the data
- important to test the assumptions made by the model:
  – using simple visualizations
  – using statistical goodness-of-fit tests
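To illustrate the closed-form, single-pass point, here is a minimal sketch for the normal distribution. The maximum-likelihood estimates are the sample mean and the variance with divisor n. The function name `fit_normal_one_pass` and the data are hypothetical, and the running update (Welford's method) is one choice among several for computing these in a single pass.

```python
from typing import Iterable, Tuple


def fit_normal_one_pass(data: Iterable[float]) -> Tuple[float, float]:
    """Return (mean, variance) maximum-likelihood estimates for a normal model.

    Uses Welford's running update, so the data are read exactly once.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in data:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n == 0:
        raise ValueError("need at least one observation")
    return mean, m2 / n  # the MLE divides by n, not n - 1


# Illustrative data, passed as an iterator to emphasise the single pass.
mu_hat, var_hat = fit_normal_one_pass(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(mu_hat, var_hat)
```

Only two parameters are estimated, which is the "small number of parameters" feature. Whether the normal form is appropriate still has to be checked separately, for example with a plot or a goodness-of-fit test.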
Nonparametric Models
- take a local, data-driven weighted average of the data around the point of interest
- simplest version: histogram
  – estimate for density is just (scaled) number of points in bin
  – problems: not smooth
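The histogram estimate can be sketched in a few lines. In this illustration the scaling is count / (n × bin width), so the estimate integrates to one when all points fall inside the chosen range. The function names, the equal-width bins and the data are assumptions made for the example.

```python
from typing import List, Sequence


def histogram_density(data: Sequence[float], lo: float, hi: float, bins: int) -> List[float]:
    """Return one density estimate per equal-width bin: count / (n * bin_width)."""
    if bins <= 0 or hi <= lo or not data:
        raise ValueError("need bins > 0, hi > lo and a non-empty sample")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x <= hi:  # points outside [lo, hi] are not assigned to any bin
            k = min(int((x - lo) / width), bins - 1)  # x == hi goes in the last bin
            counts[k] += 1
    n = len(data)
    return [c / (n * width) for c in counts]


def density_at(x: float, dens: Sequence[float], lo: float, hi: float) -> float:
    """Look up the histogram density estimate at the point x."""
    if not lo <= x <= hi:
        return 0.0
    width = (hi - lo) / len(dens)
    return dens[min(int((x - lo) / width), len(dens) - 1)]


# Illustrative data: three bins of width 1 on [0, 3].
sample = [0.1, 0.2, 0.4, 1.1, 1.5, 2.9]
dens = histogram_density(sample, lo=0.0, hi=3.0, bins=3)
print(dens)

# The "not smooth" problem: the estimate jumps as x crosses a bin edge.
print(density_at(0.99, dens, 0.0, 3.0), density_at(1.01, dens, 0.0, 3.0))
```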