# class_10_24 - Statistical Data Mining ORIE 474 Fall 2007...

Statistical Data Mining, ORIE 474, Fall 2007. Tatiyana Apanasovich, 10/24/07. Classification Modeling.

## 10.1 Predictive Modeling

- Aims to predict the unknown value of a variable of interest based on the known values of other variables.
- Learn a mapping: input $X_{n \times p}$ → scalar output $Y$.
- Supervised learning.
- Partition the data set $\{(x(i), y(i)) : i = 1, \dots, n\}$ into
  - Training data set $D_{train} = \{(x(i), y(i)) : i = 1, \dots, m\}$
  - Validation data set $D_{validation} = \{(x(i), y(i)) : i = m+1, \dots, n\}$
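The partition into training and validation sets can be sketched in a few lines of Python. This is a minimal illustration, not the course's code: the helper name `partition`, the toy data values, and the shuffle before splitting are all assumptions (the slide simply splits by index, with the first m objects for training and the rest for validation).

```python
import random

def partition(data, m, seed=0):
    """Split (x, y) pairs into a training set of size m and a validation set.

    Shuffling first is an assumption made here so that the split does not
    depend on the original ordering of the objects.
    """
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:m], shuffled[m:]

# Toy data set of n = 10 pairs (x(i), y(i)); the values are made up.
data = [(float(i), 2.0 * i + 1.0) for i in range(10)]
d_train, d_validation = partition(data, m=7)
print(len(d_train), len(d_validation))  # prints "7 3"
```

Every object ends up in exactly one of the two sets, which is what lets the validation set give an honest comparison of models fitted on the training set.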
## Predictive Modeling (cont'd)

- From the training data, estimate a mapping/function $f$ such that $y = f(x; \theta)$, where
  - $f$ is the functional form of the model structure
  - $\theta$ is a vector of model parameters that have to be estimated
- Input by data miner:
  - Model structure(s)
  - Score function
  - Search method

## Predictive Modeling (cont'd)

- Score function:

  $$S(\theta) = \sum_{D_{train}} d\big(y(i), \hat{y}(i)\big) = \sum_{D_{train}} d\big(y(i), f(x(i); \theta)\big)$$

  where $\hat{y}(i) = f(x(i); \theta)$ and $d(\cdot, \cdot)$ is a distance measure.
- Search method: minimize $S$ as a function of $\theta$ on the training data set.
- To compare different predictive models (e.g. $f$'s), evaluate $S_f(\theta^*)$ at the optimal $\theta^*$ for each model structure $f$ on the validation data set.
- Choose $f^*$ such that $S_f(\theta^*)$ is minimal.
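The fit-then-compare procedure above can be sketched as follows. Everything specific here is an assumption for illustration: squared error as the distance d, a constant and a linear model as the two candidate structures f, closed-form least squares as the search method, and the made-up data values.

```python
def squared_error(y, y_hat):
    """One possible distance measure d(y, y_hat)."""
    return (y - y_hat) ** 2

def score(f, theta, data, d=squared_error):
    """S(theta): sum over the data set of d(y(i), f(x(i); theta))."""
    return sum(d(y, f(x, theta)) for x, y in data)

# Two hypothetical model structures f(x; theta).
def constant(x, theta):
    return theta[0]

def linear(x, theta):
    return theta[0] + theta[1] * x

# Search methods: closed-form minimizers of the squared-error score.
def fit_constant(train):
    return (sum(y for _, y in train) / len(train),)

def fit_linear(train):
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return (mean_y - slope * mean_x, slope)

# Made-up data, roughly y = 2x + 1.
d_train = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8), (4.0, 9.1)]
d_validation = [(5.0, 11.0), (6.0, 12.9)]

candidates = {"constant": (constant, fit_constant),
              "linear": (linear, fit_linear)}

validation_scores = {}
for name, (f, fit) in candidates.items():
    theta_star = fit(d_train)  # minimize S on the training set
    validation_scores[name] = score(f, theta_star, d_validation)

# Choose the model structure with the smallest validation score.
best = min(validation_scores, key=validation_scores.get)
print(best)  # prints "linear"
```

Note that the parameters are estimated on the training set only; the validation set is used solely to compare the fitted model structures.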
## 10.2 Classification Modeling

- Target variable $Y$ is categorical, and is often called the class variable.
- Notation:
  - Instead of $Y$ we will use $C$, taking values in $\{c_1, \dots, c_m\}$
  - Input variables $X_1, \dots, X_p$
  - $x(i) = (x_1(i), \dots, x_p(i))^T$ is the input vector for the $i$th object
- Concepts:
  - Discriminative viewpoint (decision boundaries)
  - Probabilistic viewpoint

## Ex: Red blood cell data

- Source: http://www.ics.uci.edu/~smyth/courses/ics274/
- 182 individuals
  - healthy
  - iron deficient anemia

## Discriminative Classification: Ex

## Probabilistic Classification: Ex

Figure labels: 0.01, 0.50, 0.99.
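To make the probabilistic viewpoint concrete, here is a minimal sketch in which a classifier returns a class probability rather than only a label. The logistic form, the function name `p_class1`, and the weights are assumptions chosen for illustration; they are not taken from the slides and are not fitted to the red blood cell data.

```python
import math

def p_class1(x, w, b):
    """P(C = c1 | x) under a logistic model with hypothetical parameters."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hand-picked parameters, not estimated from any data.
w, b = (1.0, 1.0), -7.0

for x in [(1.0, 1.0), (3.5, 3.5), (6.0, 6.0)]:
    p = p_class1(x, w, b)
    # Thresholding the probability at 0.5 recovers a hard class label.
    label = "c1" if p >= 0.5 else "c2"
    print(x, round(p, 3), label)
```

Under this sketch, the set of points where the probability equals 0.5 plays the role of a decision boundary, while points far from it get probabilities close to 0 or 1.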
## A. Discriminative Classification

- $f: x(i) \to \{c_1, \dots, c_m\}$
- If $m = 2$, $f$ produces a piecewise constant surface over the $\{X_1, X_2\}$ plane.

Figure axes: $X_1$, $X_2$, $C$.
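A discriminative classifier of this kind maps each input vector directly to a class label, so its output is constant within each decision region. The sketch below uses a nearest-centroid rule purely as an example of such an f; that choice, the helper names, and the toy points are assumptions and not the method from the slides.

```python
def fit_centroids(train):
    """Compute the mean (x1, x2) point of each class from ((x1, x2), c) pairs."""
    sums = {}
    for (x1, x2), c in train:
        s = sums.setdefault(c, [0.0, 0.0, 0])
        s[0] += x1
        s[1] += x2
        s[2] += 1
    return {c: (s[0] / s[2], s[1] / s[2]) for c, s in sums.items()}

def f(x, centroids):
    """Map x to the label of the nearest class centroid (squared Euclidean)."""
    def sq_dist(c):
        cx, cy = centroids[c]
        return (x[0] - cx) ** 2 + (x[1] - cy) ** 2
    return min(centroids, key=sq_dist)

# Made-up training points for two classes (m = 2) in the (X1, X2) plane.
train = [((1.0, 1.0), "c1"), ((1.5, 2.0), "c1"), ((2.0, 1.5), "c1"),
         ((5.0, 5.0), "c2"), ((6.0, 5.5), "c2"), ((5.5, 6.5), "c2")]
centroids = fit_centroids(train)

print(f((0.0, 0.0), centroids), f((2.0, 2.0), centroids), f((6.0, 6.0), centroids))
# prints "c1 c1 c2"
```

The first two query points fall in the same region and receive the same label, which is the piecewise constant behavior described above; the label changes only when a point crosses the boundary between the two regions.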

## Discriminant Analysis