Lec_14 - k-Sample Test Pawel Polak GU4222 GR5222...


k-Sample Test
Pawel Polak (Columbia University)
March 27, 2017
GU4222 & GR5222: Nonparametric Statistics - Lecture 14

K-Sample tests

In the two-sample tests, we considered $X_1, X_2, \ldots, X_n \sim F$ and $Y_1, Y_2, \ldots, Y_m \sim G$, and asked questions regarding the relation of $F$ and $G$.

Example
What if we would like to show that, under the same education / family income, the crime rate is independent of race?

Start by collecting statistics of the crime rates in different states/neighborhoods for people that have the same education / family income, and construct tables of the form:

Race                Statistics
Hispanic            $X_{11}, X_{12}, \ldots, X_{1n_1}$
White               $X_{21}, X_{22}, \ldots, X_{2n_2}$
African American    $X_{31}, X_{32}, \ldots, X_{3n_3}$
Asian               $X_{41}, X_{42}, \ldots, X_{4n_4}$

Each $X_{ij}$ in this table represents the crime rate of a certain race in a certain neighborhood. Based on this data we would like to test the hypothesis that the crime rates are the same for different races.

k-sample parametric tests

One way to cast this problem as a testing problem is the following:
• Suppose that $X_{11}, X_{12}, \ldots, X_{1n_1} \sim N(\mu_1, \sigma^2)$, ..., $X_{k1}, X_{k2}, \ldots, X_{kn_k} \sim N(\mu_k, \sigma^2)$.
• Note that all the samples have the same variance.
• If our claim (that different races have the same crime rate) is true, we will have $\mu_1 = \mu_2 = \ldots = \mu_k$.
• Therefore, we test for
  $H_0: \mu_1 = \mu_2 = \ldots = \mu_k$ vs. $H_1:$ for at least one pair $i, j$, $\mu_i \neq \mu_j$.
• How do we test this hypothesis?
• Is it any different from the two-sample test?
• Can we use the two-sample test to address this problem as well?

k-sample parametric tests

Let's start with a simple example in which we have only three groups.

group1    1      2      3
group2    1.5    0.5    2.1
group3    1.2    2.3    2.9

• The first idea that we would like to explore is inspired by the T-test.
• The hypothesis $\mu_1 = \mu_2 = \mu_3$ gives us three two-sample hypotheses, each of which can be checked with a two-sample T-test.
• These three hypotheses are $H_0': \mu_1 = \mu_2$, $H_0'': \mu_2 = \mu_3$, and $H_0''': \mu_1 = \mu_3$. If we accept all three of these hypotheses, then we can accept $H_0$ as well. Otherwise we should reject $H_0$.
• Therefore, we can perform the two-sample T-test to evaluate the validity of $H_0'$, $H_0''$, and $H_0'''$.
• Is this a good approach?
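
To make the pairwise idea concrete, here is a minimal Python sketch that applies it to the three-group example above. It rests on assumptions not stated on the slide: the equal-variance (pooled) form of the two-sample T statistic, a two-sided 5% level for each pairwise test, and the corresponding critical value 2.776 for 4 degrees of freedom. The names `pooled_t_statistic`, `T_CRIT`, `pairwise_t`, and `reject_h0` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import mean, variance

# Data from the three-group example.
groups = {
    "group1": [1, 2, 3],
    "group2": [1.5, 0.5, 2.1],
    "group3": [1.2, 2.3, 2.9],
}

def pooled_t_statistic(x, y):
    """Two-sample T statistic assuming both samples share one variance."""
    nx, ny = len(x), len(y)
    # Pooled estimate of the common variance.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / (sp2 * (1 / nx + 1 / ny)) ** 0.5

# Assumed level: two-sided 5% per test; each pair has 3 + 3 - 2 = 4 degrees of freedom.
T_CRIT = 2.776

# One T statistic per pair of groups.
pairwise_t = {
    (a, b): pooled_t_statistic(groups[a], groups[b])
    for a, b in combinations(groups, 2)
}

# Reject H0 as soon as any one pairwise hypothesis is rejected.
reject_h0 = any(abs(t) > T_CRIT for t in pairwise_t.values())

for pair, t in pairwise_t.items():
    print(pair, round(t, 3))
print("reject H0:", reject_h0)
```

With these data none of the three statistics exceeds the critical value in absolute value, so the pairwise procedure does not reject $H_0$ at this level.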
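
k-sample parametric tests

• We can characterize the significance level of this test.
• Since $H_0'$, $H_0''$, and $H_0'''$ are similar, we set the significance level to $p$ for all of them.
• Now we can characterize the significance level of testing $H_0$ with these pairwise tests:

$$
\begin{aligned}
P(\text{rejecting } H_0 \mid H_0) &= 1 - P(\text{accepting } H_0 \mid H_0) \\
&= 1 - P(\text{accepting } H_0' \cap \text{accepting } H_0'' \cap \text{accepting } H_0''' \mid H_0) \\
&\overset{(a)}{=} 1 - P(\text{accepting } H_0' \mid H_0)\, P(\text{accepting } H_0'' \mid H_0)\, P(\text{accepting } H_0''' \mid H_0) \\
&= 1 - (1 - p)^3.
\end{aligned}
$$

Clearly, equality (a) is not exactly true. To obtain it we assume that the three events are independent. However, they are not independent, since every pair of tests shares one of the samples.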
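
The following Python sketch evaluates $1 - (1 - p)^3$ and compares it with a simulated rejection rate under $H_0$. It is only an illustration under simplifying assumptions that are not in the slides: the variance is taken as known ($\sigma = 1$), so each pairwise test is a two-sided z-test rather than a T-test, and $p = 0.05$ with three groups of three observations each. All function names and parameters are hypothetical.

```python
import random
from statistics import NormalDist

def familywise_rate_if_independent(p, n_tests):
    """Value of 1 - (1 - p)^n_tests, i.e. the rate if the tests were independent."""
    return 1 - (1 - p) ** n_tests

def simulated_familywise_rate(p=0.05, n=3, k=3, n_sims=20000, seed=0):
    """Monte Carlo estimate of P(at least one pairwise rejection | H0).

    Assumption: sigma = 1 is known, so each pair is compared with a
    two-sided z-test instead of the T-test used in the slides.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - p / 2)
    se = (2 / n) ** 0.5  # standard deviation of a difference of two group means
    rejections = 0
    for _ in range(n_sims):
        # Under H0 every group is drawn from the same N(0, 1) distribution.
        means = [sum(rng.gauss(0, 1) for _ in range(n)) / n for _ in range(k)]
        if any(abs(means[i] - means[j]) / se > z_crit
               for i in range(k) for j in range(i + 1, k)):
            rejections += 1
    return rejections / n_sims

formula_rate = familywise_rate_if_independent(0.05, 3)
simulated_rate = simulated_familywise_rate()
print("1 - (1 - p)^3 :", round(formula_rate, 4))
print("simulated rate:", round(simulated_rate, 4))
```

In this simplified setting the simulated rate is well above the per-test level $p$ but does not coincide with $1 - (1 - p)^3$, which is consistent with the remark that equality (a) holds only approximately.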