Lecture 26

26.1 Test of independence.

In this lecture we consider the situation when the data comes from a sample space $\mathcal{X}$ that consists of pairs of two features, where each feature has a finite number of categories, or, simply,

$$\mathcal{X} = \{ (i, j) : i = 1, \ldots, a, \; j = 1, \ldots, b \}.$$

If we have an i.i.d. sample $X_1, \ldots, X_n$ with some distribution on $\mathcal{X}$, then each $X_i$ is a pair $(X_i^1, X_i^2)$, where $X_i^1$ can take $a$ different values and $X_i^2$ can take $b$ different values. Let $N_{ij}$ be the count of all observations equal to $(i, j)$, i.e. with the first feature equal to $i$ and the second feature equal to $j$, as shown in the table below.

Table 26.1: Contingency table.

$$
\begin{array}{c|cccc}
\text{Feature 1} \,\backslash\, \text{Feature 2} & 1 & 2 & \cdots & b \\
\hline
1 & N_{11} & N_{12} & \cdots & N_{1b} \\
2 & N_{21} & N_{22} & \cdots & N_{2b} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a & N_{a1} & N_{a2} & \cdots & N_{ab}
\end{array}
$$

We would like to test the independence of the two features, which means that

$$\mathbb{P}(X = (i, j)) = \mathbb{P}(X^1 = i) \, \mathbb{P}(X^2 = j).$$

If we introduce the notations

$$\mathbb{P}(X = (i, j)) = \theta_{ij}, \quad \mathbb{P}(X^1 = i) = p_i \quad \text{and} \quad \mathbb{P}(X^2 = j) = q_j,$$

then we want to test that for all $i$ and $j$ we have $\theta_{ij} = p_i q_j$. Therefore, our hypotheses can be formulated as follows:

$H_1 : \theta_{ij} = p_i q_j$ for some $(p_1, \ldots, p$ ...
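As a concrete illustration of the counts $N_{ij}$ in Table 26.1, the following is a minimal Python sketch that tabulates a sample of pairs and computes the empirical frequencies $N_{ij}/n$ together with the empirical marginals. The function names and the toy sample are assumptions made for this example and do not come from the lecture. Comparing $N_{ij}/n$ with the product of the marginals is only an informal look at the data, not the test itself.

```python
from collections import Counter

def contingency_table(sample, a, b):
    """Return the a-by-b table whose entry [i-1][j-1] is N_ij,
    the number of observations equal to (i, j)."""
    counts = Counter(sample)  # missing pairs count as 0
    return [[counts[(i, j)] for j in range(1, b + 1)] for i in range(1, a + 1)]

def empirical_frequencies(table):
    """Return (N_ij / n, row marginals, column marginals) for a table of counts."""
    n = sum(sum(row) for row in table)
    joint = [[n_ij / n for n_ij in row] for row in table]
    row_marginals = [sum(row) / n for row in table]        # estimates of p_i
    col_marginals = [sum(col) / n for col in zip(*table)]  # estimates of q_j
    return joint, row_marginals, col_marginals

# Hypothetical sample with a = 2 categories for feature 1 and b = 3 for feature 2.
sample = [(1, 1), (1, 2), (2, 1), (2, 2), (1, 1), (2, 2), (1, 3), (2, 3)]
table = contingency_table(sample, a=2, b=3)
joint, p_hat, q_hat = empirical_frequencies(table)

for i, row in enumerate(joint):
    for j, freq in enumerate(row):
        # Under independence, freq should be close to p_hat[i] * q_hat[j].
        print(i + 1, j + 1, freq, p_hat[i] * q_hat[j])
```

Storing the table as a list of rows mirrors the layout of Table 26.1, with rows indexed by the first feature and columns by the second.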
 Spring '09
 Dmitry Panchenko
 Statistics
