EXST7005 Fall2010 10a t-tests


Statistical Methods I (EXST 7005)

Probability distribution tables in general

The tables we will use will ALL give the area in the tail (α). However, if you examine tables from other sources you will find that this is not always true. Even when it is true, some tables give the value of α as if it were in two tails, and some as if it were in one tail.

For example, suppose we want to conduct a two-tailed Z test at the α = 0.05 level. We happen to know that the critical value is Z = 1.96. If we look up this value in our Z tables we expect to see 0.025, or α/2. But many tables would show the probability for 1.96 as 0.975, and some as 0.05. Why the difference? It just depends on how the tables are presented. Some of the alternatives are described below.

Some tables give the cumulative distribution starting at –∞. In these you look for the probability corresponding to 1 – α/2. The table value for the Z that leaves 0.025 in the upper tail would be 0.975.

Some tables start at zero (0.0) and give the cumulative area from that point for the upper half of the distribution. This is less common. The table value for the Z that leaves 0.025 in the upper tail would be 0.475.

Among the tables like ours, which give the area in the tail, some are called two-tailed tables and some are one-tailed tables.

[Figure: one-tailed table. Table value 0.025; α = 0.025 in the upper tail and 1 – α = 0.975 below it.]
[Figure: two-tailed table. Table value 0.050; α/2 in each tail and 1 – α between them.]

Why the extra confusion at this point? All our tables give the area in the tail. The Z tables we used gave the area in one tail, so for a two-tailed test you needed to double the probability. For the F tables and Chi square tables covered later, this area will also be a single tail, as with the Z tables, because these distributions are not symmetric. Traditionally, however, many t-tables have given the area in TWO TAILS instead of in one tail. Many textbooks have this type of table.
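The table conventions described above can be compared numerically. The following is a minimal Python sketch, not part of the course materials, that uses only the standard library's statistics.NormalDist. The function name table_values is a hypothetical helper introduced for illustration. It reports what each style of Z table would list for a given positive Z.

```python
from statistics import NormalDist


def table_values(z):
    """Return the value each style of Z table would list for a positive z.

    Hypothetical helper for illustration; not from the course notes.
    """
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    cumulative = std_normal.cdf(z)          # area from -infinity up to z
    one_tail = 1.0 - cumulative             # area in the upper tail only
    return {
        "cumulative_from_minus_inf": cumulative,
        "cumulative_from_zero": cumulative - 0.5,  # area between 0 and z
        "one_tail": one_tail,
        "two_tail": 2.0 * one_tail,         # both tails, by symmetry
    }


if __name__ == "__main__":
    for name, value in table_values(1.96).items():
        print(f"{name}: {value:.3f}")
```

For Z = 1.96 the four values round to 0.975, 0.475, 0.025 and 0.050, the same numbers quoted in the text, so one known Z value is enough to identify which convention a table follows.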
SAS will also usually give two-tailed values for t-tests.

James P. Geaghan Copyright 2010

Our tables will have both two-tailed probabilities (top row) and one-tailed probabilities (bottom row), so you may use either.

The same patterns are true for many of the computer programs that you may use to get probabilities. For example, in EXCEL:
   NORMSDIST(1.96) returns 0.975; this is cumulative from –∞, leaving 0.025 in one tail.
   NORMSINV(0.025) returns –1.96; the argument is a cumulative probability from –∞, so 0.025 gives the lower critical value for a two-tailed test at α = 0.05.
   TINV(0.05,9999) returns 1.96, so it is two-tailed.
   TDIST(1.96,9999,1) lets you specify 1 or 2 tails in the function call (the third argument).

The t tables

My t-tables are created in EXCEL, but patterned after Steel & Torrie, 1980, pg. 577.

The degrees of freedom, “d.f.” or γ, are given on the left side of the table. The probability of randomly selecting a larger value of t is given at the top (and bottom) of the page.
   P(t ≥ t0) is given at the bottom; this is a one-tailed probability.
   P(|t| ≥ t0) is given at the top; this is a two-tailed probability (note the absolute value signs).

Each row represents a different t distribution (with different d.f.). The Z table had many probabilities, corresponding to Z values of 0.00, 0.01, 0.02, 0.03, etc. About 400 probabilities occurred in the tables we used. They all fit on one page because the whole Z table was a single distribution.

The t table covers many different distributions, so less information is given about each one. To fit many t distributions on one page we give only a few selected probabilities, the ones we are most likely to use, e.g., 0.10, 0.05, 0.025, 0.01, 0.005.

Only the POSITIVE side of the table is given but, as with the Z distribution, the t distribution is symmetric, so the lower half of the table can be determined from the upper half.

Our t-tables

Partial t-table – 1 or 2 tails?
   df     0.100    0.050    0.025    0.010    0.005
    1     3.078    6.314   12.71    31.82    63.656
    2     1.886    2.920    4.303    6.965    9.925
    3     1.638    2.353    3.182    4.541    5.841
    4     1.533    2.132    2.776    3.747    4.604
    5     1.476    2.015    2.571    3.365    4.032
    6     1.440    1.943    2.447    3.143    3.707
    7     1.415    1.895    2.365    2.998    3.499
    8     1.397    1.860    2.306    2.896    3.355
    9     1.383    1.833    2.262    2.821    3.250
   10     1.372    1.812    2.228    2.764    3.169
    ∞     1.282    1.645    1.960    2.326    2.576

Note the selected d.f. on the left side. The table stabilizes fairly quickly. Many tables don't go over about d.f. = 30. The Z tables give a good approximation for larger d.f.

Our tables will give d.f. as follows down the leftmost column of the table:
   1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 34, 36, 38, 40, 45, 50, 75, 100, ∞

Selected probabilities

In the topmost row of the table selected probabilities will be given as α for a TWO TAILED TEST. In the bottom-most row of the table selected probabilities will be given as α for a ONE TAILED TEST. Probabilities in our tables are:

   Top row:      0.50   0.40   0.30   0.20   0.10   0.050   0.02   0.010   0.002   0.0010
   Bottom row:   0.25   0.20   0.15   0.10   0.05   0.025   0.01   0.005   0.001   0.0005

HELPFUL HINT: Don't try to memorize “two tail top, one tail bottom”; just recall the characteristics of the distribution: when d.f. = ∞, then t = 1.96. This leaves 5% in both tails and 2.5% in one tail. So take any t-table and look to see what probability corresponds to d.f. = ∞ and t = 1.96. If the value is 0.025, it is the area in one tail of the distribution, and if it is 0.050 it is a two-tailed table. If the area is 0.975 it is cumulative from –∞, etc.

This trick of recalling 1.96 also works for Z tables.
   The tables we use give the area in the tail of the distribution; Z = 1.96 corresponds to a probability of 0.025.
   Some Z tables give the cumulative area under the curve starting at –∞; the probability at Z = 1.96 would be 0.975.
   Other Z tables give the cumulative area starting at 0; the probability at Z = 1.96 would be 0.475.

Working with our t-tables

Example 1. Let d.f. = γ = 10, H0: μ = μ0 versus H1: μ ≠ μ0, and α = 0.05.
   P(|t| ≥ t0) = 0.05; 2P(t ≥ t0) = 0.05; P(t ≥ t0) = 0.025 (probabilities at the top of the table)
   t0 = 2.228
[Figure: t distribution with t0 = 2.228 marked.]

Example 2. Let d.f. = γ = 10, H0: μ = μ0 versus H1: μ > μ0, and α = 0.05.
   P(t ≥ t0) = 0.05 (probabilities at the bottom of the table)
   t0 = 1.812
[Figure: t distribution with t0 = 1.812 marked.]

Look up the following values (answers at the right).
   Find the t value for H1: μ ≠ μ0, α = 0.050, d.f. = ∞        1.960
   Find the t value for H1: μ > μ0, α = 0.025, d.f. = ∞        1.960
   Find the t value for H1: μ ≠ μ0, α = 0.010, d.f. = 12       3.055
   Find the t value for H1: μ > μ0, α = 0.025, d.f. = 22       2.074
   Find the t value for H1: μ ≠ μ0, α = 0.200, d.f. = 35       1.306
   Find the t value for H1: μ ≠ μ0, α = 0.002, d.f. = 5        5.894
   Find the t value for H1: μ < μ0, α = 0.100, d.f. = 8       –1.397
   Find the t value for H1: μ < μ0, α = 0.010, d.f. = 75      –2.377
   Find the P value for t = –1.740, H1: μ < μ0, d.f. = 17      0.050
   Find the P value for t = 4.587, H1: μ ≠ μ0, d.f. = 10       0.001

t-test of Hypothesis

We want to determine if a new drug has an effect on blood pressure of rhesus monkeys before and after treatment. We are looking for a net change in pressure, either up or down (two-tailed test).

Example 1 of the t-test

We obtain a random sample of 10 individuals. Note: n = 10, but d.f. = γ = 9.

   1) H0: μ = μ0 (here μ0 = 0, no net change)
   2) H1: μ ≠ μ0
   3) Assume: Independence (randomly selected sample) and that the CHANGE in blood pressure is normally distributed.
   4) We set α = 0.01, but split between two tails (to meet the alternate hypothesis).
      P(|t| ≥ t0) = 0.01; 2P(t ≥ t0) = 0.01; P(t ≥ t0) = 0.005 in each tail
      The critical value of t: given that it is a 2-tailed test with 9 d.f. (n = 10, but d.f. = γ = 9) and that we set α = 0.01, the critical limit from the t-table is t0 = 3.250.
   5) Obtain values from the sample of 10 individuals (n = 10). The values for change in blood pressure were: 0, 4, –3, 2, 0, 1, –4, 5, –1, 4

      $$\sum_{i=1}^{n} Y_i = 0 + 4 - 3 + 2 + 0 + 1 - 4 + 5 - 1 + 4 = 8$$

      $$\sum_{i=1}^{n} Y_i^2 = 0 + 16 + 9 + 4 + 0 + 1 + 16 + 25 + 1 + 16 = 88$$

      $$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} = \frac{8}{10} = 0.8$$

      $$S_Y^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n - 1} = \frac{\sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2 / n}{n - 1} = \frac{88 - 64/10}{9} = \frac{88 - 6.4}{9} = 9.067$$

      $$S_Y = \sqrt{9.067} = 3.011$$

      $$S_{\bar{Y}} = \frac{S_Y}{\sqrt{n}} = \frac{3.011}{\sqrt{10}} = 0.952$$

      Finally, the value of the test statistic, a t value in this case, is

      $$t = \frac{\bar{Y} - \mu_0}{S_{\bar{Y}}} = \frac{0.8 - 0}{0.952} = 0.840 \quad \text{with 9 d.f.}$$

   6) Compare the critical limit to the test statistic and decide to reject or fail to reject.
      The critical limit from the t-table is t0 = 3.250.
      The test statistic calculated from the sample was 0.840 (9 d.f.).
      The area leaving 0.005 in each tail is almost too small to show on our usual graphs. The test statistic is clearly in the region of “acceptance”, so we fail to reject H0.
   7) Conclude that the data give no evidence that the new drug affects the blood pressure of rhesus monkeys.

Is there an error? Maybe a Type II error, but not a Type I error, since we did not reject the null hypothesis.

Example 2 of the t-test

A company manufacturing environmental monitoring equipment claims that their thermograph (a machine that records temperature) requires (on the average) no more than 0.8 amps to operate under normal conditions. We wish to test this claim before buying their equipment. We want to reject the equipment if the electricity demand exceeds 0.8 amps.

   1) H0: μ = μ0, where μ0 = 0.8
   2) H1: μ > μ0
   3) Assume (1) independence and (2) a normal distribution of amp values, or at least of the mean that we will test. We do not assume a known variance with the t-test; we use a variance calculated from the sample.
   4) We set α = 0.05.
      The critical value of t reflects that we are doing a 1-tailed test (see H1) with 15 d.f. (n = 16, but d.f. = γ = 15) and α = 0.05.
      P(t ≥ t0) = 0.05; from the table, t0 = 1.753.
   5) Draw a sample. We have 16 machines for testing. The individual values for amp readings were not recorded. Summary statistics are given below:

      $$\bar{Y} = 0.96 \qquad S_Y = 0.32$$

      $$S_{\bar{Y}} = \frac{S_Y}{\sqrt{n}} = \frac{0.32}{\sqrt{16}} = 0.08$$

      $$t = \frac{\bar{Y} - \mu_0}{S_{\bar{Y}}} = \frac{0.96 - 0.8}{0.08} = 2.00 \quad \text{with 15 d.f.}$$

   6) Compare the critical limit to the test statistic. The critical limit from the table is t0 = 1.753 and the calculated test statistic was t = 2.00 (with 15 d.f.).
[Figure: t distribution with the one-tailed critical limit t0 = 1.753 marked.]
      Clearly, the test statistic exceeds the one-tailed critical limit and falls in the upper tail of the distribution, in the area of rejection.
   7) Conclusion: We would conclude that the machines require more electricity than the claimed 0.8 amperes. Of course, there is a possibility of a Type I error.

t test with SAS

SAS example (#2a)

Recall our test of blood pressure change of Rhesus monkeys. We can take the values of blood pressure change and enter them in SAS PROC UNIVARIATE.
   Values: 0, 4, –3, 2, 0, 1, –4, 5, –1, 4

SAS PROGRAM DATA step

   OPTIONS NOCENTER NODATE NONUMBER LS=78 PS=61;
   TITLE1 't-tests with SAS PROC UNIVARIATE';
   DATA monkeys;
      INFILE CARDS MISSOVER;
      TITLE2 'Analysis of Blood Pressure change in Rhesus Monkeys';
      INPUT BPChange;
   CARDS;
   0
   4
   -3
   2
   0
   1
   -4
   5
   -1
   4
   ;
   RUN;

The data follow the CARDS statement, one value per line, and end with a semicolon.

   PROC PRINT DATA=monkeys; RUN;
   PROC UNIVARIATE DATA=monkeys PLOT;
      VAR BPChange;
      TITLE2 'PROC Univariate on Blood Pressure Change';
   RUN;

PROC UNIVARIATE from SAS® will perform a two-tailed t-test of the mean against zero. See SAS PROGRAM output.

Notes on SAS PROC Univariate

Note that all values we calculated match the values given by SAS. Note that the standard error is called the “Std Error Mean”.
This is unusual; it is called the “Std Error” in most other SAS procedures.

The test statistic value matches our calculated value (0.840). SAS also provides a “Pr>|t| 0.4226”.

The value provided by SAS is a P value (Pr>|t| = 0.4226), meaning that the calculated value of t = 0.840 would leave 0.4226 (or 42.26 percent) of the distribution in the 2 tails (half in each tail). The two-tailed split is indicated by the absolute value signs around t, so the proportion in each tail is 0.2113 (or 21.13%).

[Figure: t distribution showing the lower and upper critical regions and the calculated value.]

The P-value indicates our calculated value would leave 21.13% in each tail, while our critical region has only 0.5% in each tail. Clearly we are in the region of “acceptance”.

Example 2b with SAS

Testing the thermographs using SAS PROC UNIVARIATE. We didn't have the individual data values, so we cannot test with SAS.

A NOTE. SAS automatically tests the mean of the values in PROC UNIVARIATE against 0. In the thermograph example our hypothesized value was 0.8, not 0.0. But from what we know of transformations, we can subtract 0.8 from each value without changing the shape or spread of the distribution. Only the mean shifts, so testing the shifted values against 0 is equivalent to testing the original values against 0.8.

SAS Example 2c – Freund & Wilson (1993) Example 4.2

We receive a shipment of apples that are supposed to be “premium apples”, with a diameter of at least 2.5 inches. We will take a sample of 12 apples and test the hypothesis that the mean size is equal to 2.5 inches, and thus that the apples qualify as premium. If the mean is LESS THAN 2.5 inches, we reject.

   1) H0: μ = μ0, where μ0 = 2.5
   2) H1: μ < μ0
   3) Assume: Independence (randomly selected sample). Apple size is normally distributed.
   4) α = 0.05. We have a one-tailed test (H1: μ < μ0), and we chose α = 0.05. The critical limit would be a t value with 11 d.f. This value is –1.796.
   5) Draw a sample. We will take 12 apples, and let SAS do the calculations.
      The sample values for the 12 apples are: 2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3, 3.4
      As mentioned, SAS automatically tests against zero, and we want to test against 2.5. So we subtract 2.5 from each value and test against zero. The test should give the same results.
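The claim that subtracting 2.5 and testing against zero gives the same result can be checked with a short calculation. This is a hedged Python sketch using only the standard library, not the SAS run from the notes. The function name one_sample_t is a hypothetical helper introduced for illustration, and the critical limit –1.796 is the value quoted in step 4.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0=0.0):
    """t statistic for H0: mu = mu0 (hypothetical helper for illustration)."""
    n = len(values)
    std_error = stdev(values) / sqrt(n)     # stdev uses the n - 1 divisor
    return (mean(values) - mu0) / std_error


apples = [2.9, 2.1, 2.4, 2.8, 3.1, 2.8, 2.7, 3.0, 2.4, 3.2, 2.3, 3.4]

# Test the raw diameters against 2.5 ...
t_raw = one_sample_t(apples, mu0=2.5)

# ... and the shifted diameters against 0, as described for PROC UNIVARIATE.
shifted = [y - 2.5 for y in apples]
t_shifted = one_sample_t(shifted, mu0=0.0)

critical_limit = -1.796   # one-tailed, 11 d.f., alpha = 0.05 (from step 4)

if __name__ == "__main__":
    print(f"t (raw vs 2.5)   = {t_raw:.3f}")
    print(f"t (shifted vs 0) = {t_shifted:.3f}")
    print("in lower-tail rejection region:", t_shifted < critical_limit)
```

Both calls give the same t statistic up to floating-point rounding, which is the point of the transformation. For these sample values the statistic is positive, so in this sketch it does not fall in the lower-tail rejection region.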

This note was uploaded on 12/29/2011 for the course EXST 7005 taught by Professor Geaghan,j during the Fall '08 term at LSU.
