Statistics 191: Introduction to Applied Statistics
Qualitative Variables, Interactions & ANOVA
Jonathan Taylor, Department of Statistics, Stanford University
February 17, 2010

Qualitative variables + interactions: Outline
- Qualitative / categorical variables.
- Regression equations differing by group.
- Interactions.
- Analysis of Variance Models.

Categorical variables
- Most variables we have looked at so far were continuous: height, rating, etc.
- In many situations, we record a categorical variable: sex, state, country, etc.
- How do we include this in our model?

Categorical variables: A simple example
One example that we have looked at does have categorical variables.
Two-sample problem with equal variances: suppose $Y = (Z_1, \dots, Z_m, W_1, \dots, W_n)$ with
$$Z_j \sim N(\mu_1, \sigma^2), \quad 1 \le j \le m, \qquad W_j \sim N(\mu_2, \sigma^2), \quad 1 \le j \le n.$$
For $1 \le i \le n + m$, let
$$X_i = \begin{cases} 1 & 1 \le i \le m \\ 0 & \text{otherwise.} \end{cases}$$

Categorical variables: A simple example (design matrix)
The first $m$ rows correspond to the $Z$ sample and the last $n$ rows to the $W$ sample:
$$X_{(n+m) \times 2} = \begin{pmatrix} 1 & 1 \\ \vdots & \vdots \\ 1 & 1 \\ 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \end{pmatrix}$$

Example: IT salary data
- Outcome: S, salaries for IT staff in a corporation.
- Predictors:
  X, experience (years);
  E, education (3 levels): 1 = Bachelor's, 2 = Master's, 3 = Ph.D.;
  M, management (2 levels): 1 = management, 0 = not management.
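Returning to the two-sample example above, the following is a minimal Python sketch of how the design matrix with an intercept column and a group indicator reproduces the two sample means. The data values and the helper names `two_sample_design` and `ols_two_columns` are hypothetical, chosen only for illustration; they are not part of the course material.

```python
# Two-sample problem written as a regression on an indicator variable.
# All data values below are made up purely for illustration.

def two_sample_design(m, n):
    """Return the (n + m) x 2 design matrix: intercept column, then group indicator.

    The first m rows (the Z sample) have indicator 1, the last n rows (the W sample) have 0.
    """
    return [[1.0, 1.0] for _ in range(m)] + [[1.0, 0.0] for _ in range(n)]


def ols_two_columns(X, y):
    """Least squares for a two-column design, solving the 2 x 2 normal equations directly."""
    a = sum(r[0] * r[0] for r in X)          # entry (1,1) of X'X
    b = sum(r[0] * r[1] for r in X)          # entry (1,2) of X'X
    d = sum(r[1] * r[1] for r in X)          # entry (2,2) of X'X
    u = sum(r[0] * yi for r, yi in zip(X, y))  # first entry of X'y
    v = sum(r[1] * yi for r, yi in zip(X, y))  # second entry of X'y
    det = a * d - b * b
    beta0 = (d * u - b * v) / det
    beta1 = (a * v - b * u) / det
    return beta0, beta1


z = [5.1, 4.9, 5.6, 5.0]   # hypothetical sample Z_1, ..., Z_m
w = [4.2, 4.0, 4.4]        # hypothetical sample W_1, ..., W_n

X = two_sample_design(len(z), len(w))
y = z + w
beta0, beta1 = ols_two_columns(X, y)

mean_z = sum(z) / len(z)
mean_w = sum(w) / len(w)

# The intercept estimates the mean of the group coded 0 (the W sample);
# the indicator coefficient estimates the difference in group means.
print("intercept:", beta0, "mean of W:", mean_w)
print("slope:", beta1, "mean of Z minus mean of W:", mean_z - mean_w)
```

With this coding, the fitted intercept equals the mean of the W sample and the indicator coefficient equals the difference of the two sample means, so testing whether that coefficient is zero corresponds to the usual equal-variance two-sample comparison.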
IT salary: R code
[R code for the IT salary data: not preserved in this extract]

Two solutions
Solution #1: stratification
- One solution is to "stratify" the data set by this categorical variable.
- We could break the data set up into groups by education and management, and fit the model $S_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ in each group.
- Problem: this results in smaller samples in each group: we lose degrees of freedom for estimating $\sigma^2$ within each group....
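The stratification idea can be sketched in Python as follows: split the rows by (education, management), fit a separate line of salary on experience in each group, and compare the residual degrees of freedom with a single pooled fit. The rows below are hypothetical and are not the course's IT salary data; `fit_line` and `stratified_fits` are illustrative helper names.

```python
from collections import defaultdict


def fit_line(x, s):
    """Simple linear regression s = b0 + b1 * x.

    Returns (b0, b1, residual degrees of freedom).
    Assumes the group contains at least two distinct x values.
    """
    k = len(x)
    xbar = sum(x) / k
    sbar = sum(s) / k
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxs = sum((xi - xbar) * (si - sbar) for xi, si in zip(x, s))
    b1 = sxs / sxx
    b0 = sbar - b1 * xbar
    return b0, b1, k - 2


def stratified_fits(rows):
    """Group rows (S, X, E, M) by (E, M) and fit a separate line in each group."""
    groups = defaultdict(list)
    for s, x, e, m in rows:
        groups[(e, m)].append((x, s))
    return {
        key: fit_line([p[0] for p in pts], [p[1] for p in pts])
        for key, pts in sorted(groups.items())
    }


# Hypothetical rows (S, X, E, M): salary in arbitrary units, years of experience,
# education level, management indicator.
rows = [
    (10.0, 1, 1, 0), (11.0, 2, 1, 0), (12.0, 3, 1, 0),
    (20.0, 1, 2, 1), (22.0, 2, 2, 1), (24.0, 3, 2, 1), (26.0, 4, 2, 1),
]

fits = stratified_fits(rows)
pooled_df = len(rows) - 2  # residual df if a single line were fit to all rows

for key, (b0, b1, df) in fits.items():
    print("group (E, M) =", key, "intercept:", b0, "slope:", b1, "residual df:", df)
print("residual df for a single pooled line:", pooled_df)
```

Each stratum estimates its own error variance from only its own residual degrees of freedom, which is the cost the slide points to.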
This document was uploaded on 03/16/2010.
 Winter '09
 Statistics
