STP452topic7

# STP452topic7 - STAT 512: Applied Regression Analysis Topic...

STAT 512: Applied Regression Analysis, Topic 7, Spring 2008

Topic Overview:

- Two-way Analysis of Variance (ANOVA)
- Interactions

## Two-Way ANOVA

- The response variable $Y$ is continuous.
- There are now two categorical explanatory variables (factors). Call them factor $A$ and factor $B$ instead of $X_1$ and $X_2$.

## Data for two-way ANOVA

- $Y$, the response variable
- Factor $A$ with levels $i = 1$ to $a$
- Factor $B$ with levels $j = 1$ to $b$
- A particular combination of levels is called a treatment or a cell. There are $ab$ treatments.
- $Y_{ijk}$ is the $k$-th observation for treatment $(i,j)$, $k = 1, 2, \ldots, n$

In Chapter 19, we assume for now an equal sample size in each treatment combination ($n_{ij} = n > 1$; $n_T = abn$). This is called a balanced design. Later chapters deal with unequal sample sizes, which is more complicated.

## Notation

For $Y_{ijk}$ the subscripts are interpreted as follows:

- $i = 1, 2, \ldots, a$ denotes the level of factor $A$
- $j = 1, 2, \ldots, b$ denotes the level of factor $B$
- $k = 1, 2, \ldots, n$ denotes the $k$-th observation in cell or treatment $(i,j)$

## Example

KNNL p 833 (nknw817.sas)

- The response $Y$ is the number of cases of bread sold.
- Factor $A$ is the height of the shelf display; $a = 3$ levels: bottom, middle, top.
- Factor $B$ is the width of the shelf display; $b = 2$ levels: regular, wide.
- $n = 2$ stores for each of the $3 \times 2 = 6$ treatment combinations ($n_T = 12$)

## Read the data

    /* File: nknw817.sas */
    data bread;
      infile 'ch19ta07.dat';
      input sales height width;
    proc print data=bread;
    run;

## Model Assumptions

We assume that the response variable observations are independent and normally distributed, with a mean that may depend on the levels of the factors $A$ and $B$, and a variance that does not (it is constant).

## Cell Means Model

$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}$$

- $\mu_{ij}$ is the theoretical mean or expected value of all observations in cell $(i,j)$
- the $\epsilon_{ijk}$ are iid $N(0, \sigma^2)$
- $Y_{ijk} \sim N(\mu_{ij}, \sigma^2)$, independent

There are $ab + 1$ parameters of the model: $\mu_{ij}$, for $i = 1$ to $a$ and $j = 1$ to $b$; and $\sigma^2$.
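The balanced-design assumption stated for the bread example can be checked directly from the data. The following is a minimal sketch, assuming the `bread` data set and its `height` and `width` variables from the data step in the notes. The choice of `PROC FREQ` is an illustration, not something the notes themselves show.

```sas
/* Sketch only: assumes the data set bread (sales, height, width) was read in as in the notes */
/* Cross-tabulate the two factors; in a balanced design every cell has the same count */
proc freq data=bread;
  tables height*width / norow nocol nopercent;
run;
```

With $a = 3$ heights, $b = 2$ widths and $n = 2$ stores per treatment, each of the 6 cells in the resulting table should show a frequency of 2, for a total of $n_T = 12$.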
## Parameter Estimation

Estimate $\mu_{ij}$ by the mean of the observations in cell $(i,j)$:

$$\bar{Y}_{ij\cdot} = \frac{1}{n} \sum_k Y_{ijk}$$

For each $(i,j)$ combination, we can get an estimate of the variance $\sigma^2_{ij}$:

$$s^2_{ij} = \frac{1}{n-1} \sum_k \left( Y_{ijk} - \bar{Y}_{ij\cdot} \right)^2$$

Combine these to get an estimate of $\sigma^2$, since we assume they are all equal. In general we pool the $s^2_{ij}$, using weights proportional to the df, $n_{ij} - 1$. The pooled estimate is

$$s^2 = \frac{\sum_{ij} (n_{ij} - 1)\, s^2_{ij}}{\sum_{ij} (n_{ij} - 1)} = \frac{\sum_{ij} (n_{ij} - 1)\, s^2_{ij}}{n_T - ab}.$$

Here $n_{ij} = n$, so every weight equals $n - 1$ and cancels against the denominator $ab(n - 1)$, leaving

$$s^2 = \frac{\sum_{ij} s^2_{ij}}{ab} = MSE.$$

## Investigate with SAS

Note that we are including an interaction term, which is denoted as the product of $A$ and $B$. It is not literally the product of the levels, but it would be if we used indicator variables and did regression...
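A minimal sketch of how these estimates could be obtained in SAS, assuming the `bread` data set read in earlier in the notes. The notes' own listing is not visible in this preview, so the procedure calls below are an assumption about a reasonable approach, not the course's actual code.

```sas
/* Sketch only: assumes the data set bread (sales, height, width) was read in as in the notes */

/* Cell sizes, cell means, and cell variances for each height-by-width combination */
proc means data=bread n mean var;
  class height width;
  var sales;
run;

/* Two-way ANOVA with interaction; height*width denotes the A-by-B interaction term */
proc glm data=bread;
  class height width;
  model sales = height width height*width;
run;
quit;
```

Because the design is balanced, the Error mean square in the `PROC GLM` output should equal the simple average of the six cell variances printed by `PROC MEANS`, with $n_T - ab = 12 - 6 = 6$ error degrees of freedom.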