Statistical Data Mining ORIE 474 Fall 2007 Tatiyana Apanasovich 10/24/07 Classification Modeling

10.1 Predictive Modeling
Aim: predict the unknown value of a variable of interest from the known values of other variables
Learn a mapping from input X (n × p) to scalar output Y
Supervised Learning
Partition the data set {(x(i), y(i)) : i = 1, …, n} into
  Training data set D_train = {(x(i), y(i)) : i = 1, …, m}
  Validation data set D_validation = {(x(i), y(i)) : i = m+1, …, n}
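The partition into D_train and D_validation can be sketched as follows. This is a minimal illustration, not the course's code: the function name `partition`, the toy data, and the shuffling step are assumptions (the slide simply takes the first m objects; shuffling first is a common precaution when the data may be ordered).

```python
import random

def partition(data, m, seed=0):
    """Split a list of (x, y) pairs into a training set of size m
    and a validation set holding the remaining n - m pairs."""
    if not 0 < m < len(data):
        raise ValueError("m must satisfy 0 < m < n")
    shuffled = list(data)                      # copy, leave the input untouched
    random.Random(seed).shuffle(shuffled)      # assumption: shuffle before splitting
    return shuffled[:m], shuffled[m:]

# Hypothetical toy data: n = 10 objects, p = 2 inputs, scalar output.
data = [((float(i), float(i % 3)), 2.0 * i) for i in range(10)]
d_train, d_validation = partition(data, m=7)
```

Every object ends up in exactly one of the two sets, so the validation set is never seen while θ is being estimated.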
Predictive Modeling (cont’d)
From the training data, estimate a mapping (function) f such that y = f(x; θ), where
  f is the functional form of the model structure
  θ is a vector of model parameters that have to be estimated
Input by the data miner
  Model structure(s)
  Score function
  Search method

Predictive Modeling (cont’d)
Score function
  $S(\theta) = \sum_{D_{train}} d(y(i), \hat{y}(i)) = \sum_{D_{train}} d(y(i), f(x(i); \theta))$
  where $\hat{y}(i) = f(x(i); \theta)$ and d( , ) is a distance measure
Search method
  Minimize S as a function of θ on the training data set
To compare different predictive models (e.g. different f's), evaluate S_f(θ*) at the optimal θ* for each model structure f on the validation data set
Choose f* such that S_f(θ*) is minimal
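The score-and-select procedure can be sketched in a few lines. Everything specific here is an assumption made for illustration: squared error as the distance d, two hypothetical model structures (a constant and a line through the origin) with scalar x and scalar θ, closed-form least squares as the search method, and the toy data.

```python
def squared_error(y, y_hat):
    """One possible distance measure d(y, y_hat)."""
    return (y - y_hat) ** 2

def score(f, theta, data, d=squared_error):
    """S(theta): sum over the data set of d(y(i), f(x(i); theta))."""
    return sum(d(y, f(x, theta)) for x, y in data)

# Two hypothetical model structures f(x; theta).
def constant_model(x, theta):
    return theta

def linear_model(x, theta):
    return theta * x

# Search methods: closed-form minimizers of S on the training data.
def fit_constant(data):
    # The mean of y minimizes the sum of squared errors for a constant.
    return sum(y for _, y in data) / len(data)

def fit_linear(data):
    # Least-squares slope for a line through the origin.
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# Hypothetical toy data, already partitioned.
d_train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
d_validation = [(5.0, 10.1), (6.0, 11.8)]

candidates = {
    "constant": (constant_model, fit_constant),
    "linear": (linear_model, fit_linear),
}

validation_scores = {}
for name, (f, fit) in candidates.items():
    theta_star = fit(d_train)                                   # minimize S on D_train
    validation_scores[name] = score(f, theta_star, d_validation)  # evaluate on D_validation

# f* is the model structure with the smallest validation score.
best = min(validation_scores, key=validation_scores.get)
```

Note that θ* is found using only the training data; the validation data enter only when the fitted models are compared.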
10.2 Classification Modeling
Target variable Y is categorical, and is often called the class variable
Notation
  Instead of Y we will use C, taking values in {c_1, …, c_m}
  Input variables X_1, …, X_p
  x(i) = (x_1(i), …, x_p(i))^T is the input vector for the i-th object
Concepts
  Discriminative viewpoint (decision boundaries)
  Probabilistic viewpoint

Ex: Red blood cell data
Source: http://www.ics.uci.edu/~smyth/courses/ics274/
182 individuals; classes: healthy, iron deficient anemia
Discriminative Classification: Ex

Probabilistic Classification: Ex
[Figure; labeled values: 0.01, 0.50, 0.99]
A. Discriminative Classification
f: x(i) → {c_1, …, c_m}
If m = 2, f produces a piecewise constant surface over the {X_1, X_2} plane
[Figure: C plotted over the (X_1, X_2) plane]
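A small sketch of a two-class discriminative classifier may help make the "piecewise constant surface" concrete. The linear decision boundary, the helper name `make_linear_classifier`, and the weights below are assumptions chosen for illustration; the slide does not commit to any particular form of boundary.

```python
def make_linear_classifier(w1, w2, b, classes=("c1", "c2")):
    """Return f: (x1, x2) -> class label.
    The boundary is the line w1*x1 + w2*x2 + b = 0 (an assumed form);
    f is constant on each side of it."""
    def f(x):
        x1, x2 = x
        return classes[1] if w1 * x1 + w2 * x2 + b > 0 else classes[0]
    return f

# Hypothetical boundary: x1 + x2 = 1.
f = make_linear_classifier(1.0, 1.0, -1.0)

points = [(0.0, 0.0), (0.2, 0.3), (1.0, 1.0), (2.0, 0.5)]
labels = [f(x) for x in points]
```

The output of f takes only two values over the whole plane and changes only when a point crosses the boundary, which is what makes the surface piecewise constant.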

Discriminant Analysis
