# LEC5 - Categorical Variable Recoding

Categorical Variable Recoding. Morgan C. Wang, Department of Statistics, University of Central Florida. 2/28/2011

## Outline

- Introduction
- Representations with Dummy Variables
- Unary Variable with Missing Values
- Binary Variable with Missing Values
- High Level Nominal or Ordinal Variables
- Nominal Scale Variables with Many Levels
- Ordinal Scale Variables with Many Levels
- Periodical Variables
- Case Study

## Introduction

Why can categorical variables with many levels be a problem for methods such as regression and neural networks?

Transformation:

- Target-dependent transformation
- Target-independent transformation

## Representation with Dummy Variables

- Unary variable with missing values
- Binary variables with missing values
- Categorical variables with many levels
  - Nominal scale
  - Ordinal scale

## Unary Variable with Missing Values

A unary variable that has missing values can be treated as a binary variable with two categories, “Yes” and “Unknown” (the latter representing the missing category), and then used as a missing-value indicator variable.
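
The recoding described here can be sketched in a few lines of plain Python. The variable name `responded` and the sample values are hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical unary input: the only recorded value is "Yes"; None marks a missing value.
responded = ["Yes", None, "Yes", None, None]

# Recode as a binary variable with the categories "Yes" and "Unknown".
recoded = ["Unknown" if value is None else value for value in responded]

# The same information as a 0/1 missing-value indicator (1 = value was missing).
missing_indicator = [1 if value is None else 0 for value in responded]

print(recoded)
print(missing_indicator)
```

Both lists carry the same information; the 0/1 form is the one a regression or neural network would typically consume.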

## Dummy Variable

A dummy variable is a numeric variable that takes the value 1 to indicate the presence of a given level of a categorical variable and the value 0 to indicate its absence. A categorical variable with n levels can be represented by n-1 dummy variables, with the remaining level coded as all zeros.
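
As an illustration of the n-1 rule, the sketch below dummy-codes a three-level variable with two 0/1 columns. The helper `dummy_code`, the sample data, and the default choice of the first level in sorted order as the reference level are all assumptions made for this example, not part of the lecture.

```python
def dummy_code(values, reference=None):
    """Return (column_names, rows) using n-1 dummy variables for n levels.

    The reference level is represented by a row of all zeros. Defaulting to
    the first level in sorted order is an assumption of this sketch.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical three-level nominal variable.
colors = ["red", "green", "blue", "green"]
names, rows = dummy_code(colors)

print(names)
for color, row in zip(colors, rows):
    print(color, row)
```

Here "blue" is the reference level, so it appears as `[0, 0]`; a third column would be redundant because it is fully determined by the other two.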

## Binary Variable with Missing Values

A binary variable with missing values can be handled in one of three ways:

- Treat it as one binary variable by imputing the missing values into either the “Yes” or the “No” category.
- Treat it as a nominal scale variable with three categories.
- Treat it as an ordinal scale variable that can be represented by two binary variables.

The precise way in which we treat a binary variable with missing values depends on the proportion of missing values, the predictive capability of the binary variable, and the necessity of limiting the dimensionality of the problem.
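
The three treatments can be compared side by side in a small sketch. The variable `owns_home` and its values are hypothetical. The lecture does not specify an ordering or a coding scheme for the ordinal case, so the ordering No < Unknown < Yes and the cumulative ("at least") coding used below are assumptions chosen only for illustration.

```python
# Hypothetical binary input with missing values (None).
owns_home = ["Yes", "No", None, "Yes", None]

# Option 1: one binary variable, with missing values imputed into "No"
# (imputing into "Yes" would be equally valid; the choice here is arbitrary).
imputed = [1 if value == "Yes" else 0 for value in owns_home]

# Option 2: nominal variable with three categories, coded with two dummies
# (is_yes, is_unknown); "No" is the reference level coded as (0, 0).
nominal = [
    (1 if value == "Yes" else 0, 1 if value is None else 0)
    for value in owns_home
]

# Option 3: ordinal variable under the ASSUMED ordering No < Unknown < Yes,
# coded cumulatively as (at_least_unknown, at_least_yes).
rank = {"No": 0, None: 1, "Yes": 2}
ordinal = [
    (1 if rank[value] >= 1 else 0, 1 if rank[value] >= 2 else 0)
    for value in owns_home
]

for value, a, b, c in zip(owns_home, imputed, nominal, ordinal):
    print(value, a, b, c)
```

Option 1 keeps a single column, while options 2 and 3 each add a second column, which is the dimensionality trade-off mentioned above.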

## Impute the Missing Value

If the proportion of missing values for the binary input variable is small (say, under 5%) and the target binary variable has a consistent frequency across the two levels of the input variable, then imputing either a “Yes” or a “No” for the input variable will have nearly negligible consequences. If the binary target values are overwhelmingly “Yes” (say, in the 99% range; the same holds if they were overwhelmingly “No”), then it is unlikely that imputation would accomplish much.
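
A minimal sketch of how these two checks might be carried out before imputing. The 5% missing-rate threshold comes from the text; the data, the helper names, the 0.05 tolerance used to judge whether the target frequency is "consistent", and the choice to impute the most frequent level are all assumptions of this example.

```python
def missing_rate(values):
    """Proportion of values that are missing (None)."""
    return sum(value is None for value in values) / len(values)


def target_rate_by_level(inputs, targets):
    """Proportion of target == 1 within each observed level of the input."""
    rates = {}
    for level in sorted({value for value in inputs if value is not None}):
        matched = [t for value, t in zip(inputs, targets) if value == level]
        rates[level] = sum(matched) / len(matched)
    return rates


def impute_mode(values):
    """Replace missing values with the most frequent observed level."""
    observed = [value for value in values if value is not None]
    mode = max(sorted(set(observed)), key=observed.count)
    return [mode if value is None else value for value in values]


# Hypothetical data: 40 records, one of which has a missing input value.
inputs = ["Yes"] * 20 + ["No"] * 19 + [None]
targets = [1, 0] * 10 + [1, 0] * 9 + [1] + [0]

rates = target_rate_by_level(inputs, targets)
rate_gap = max(rates.values()) - min(rates.values())

# Impute only when few values are missing and the target rate is similar
# across the two input levels (the 0.05 tolerance is a hypothetical choice).
if missing_rate(inputs) < 0.05 and rate_gap < 0.05:
    imputed = impute_mode(inputs)
else:
    imputed = list(inputs)

print(missing_rate(inputs), rates, imputed[-1])
```

When either check fails, one of the other two treatments of a binary variable with missing values may be a better fit than imputation.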

## This note was uploaded on 09/22/2011 for the course STA 6714 taught by Professor Staff during the Spring '11 term at University of Central Florida.
