Lecture20standard

Course: STAT 511, Fall 2008
School: Purdue

Statistics 511: Statistical Methods
Dr. Levine, Purdue University, Fall 2006

Lecture 19: Analysis of Paired Data
Devore: Section 9.3
Aug, 2006

Analysis of Paired Data

The data consist of $n$ independently selected pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$. The differences $D_i = X_i - Y_i$ are assumed to be normally distributed with mean value $\mu_D = \mu_1 - \mu_2$ and variance $\sigma_D^2$. The last requirement is usually a consequence of the $X$s and $Y$s being normally distributed themselves.

Example

Consider Ex. 9.8 from Devore. Six river locations were selected and the concentration of zinc in mg/L determined for both surface water and bottom water at each location. Presumably, there is some connection between surface water and bottom water concentrations...

The Paired t Test

The test considered is $H_0: \mu_D = \Delta_0$, where $D = X - Y$. The test statistic is

$$t = \frac{\bar{d} - \Delta_0}{s_D / \sqrt{n}}$$

where $\bar{d}$ and $s_D$ are the sample mean and standard deviation of the $d_i$s. Note that the old method of computing the variance of the difference does not work anymore, since $X$ and $Y$ are NOT independent.

The differences themselves, however, are independent. Thus, hypotheses about $\mu_D = \mu_1 - \mu_2$ can be tested using a one-sample t test with the $D_i$s as data: the null hypothesis is $H_0: \mu_D = \Delta_0$ and the test statistic value is the $t$ given above.

Possible alternatives are $H_a: \mu_D > \Delta_0$, $H_a: \mu_D < \Delta_0$ and $H_a: \mu_D \neq \Delta_0$. Their respective rejection regions are $t \geq t_{\alpha, n-1}$, $t \leq -t_{\alpha, n-1}$, and either $t \geq t_{\alpha/2, n-1}$ or $t \leq -t_{\alpha/2, n-1}$.

Example

Consider Ex. 9.9 in Devore. Note that the differences closely adhere to the straight line in the normal probability plot, but the mean appears to be nonzero. The formal hypothesis is $H_0: \mu_D = 0$ vs. $H_a: \mu_D \neq 0$. The value of the test statistic is

$$t = \frac{\bar{d} - 0}{s_D / \sqrt{n}} = \frac{6.75}{8.234 / \sqrt{16}} = 3.28$$

The P-value for a t curve with 15 df is .004. Note that for this sample size the normality assumption is important.

A confidence interval for $\mu_D$

The paired t confidence interval for $\mu_D$ is

$$\bar{d} \pm t_{\alpha/2, n-1} \, s_D / \sqrt{n}$$

Note that for large $n$ this interval is valid without any restrictions on the distribution of the differences. The same is not true if $n$ is relatively small.

Example

Consider Example 9.10 in Devore. Yet again, the exploratory data analysis suggests normality of the differences. The sample size is $n = 13$, so this is important. The needed summary statistics are $\bar{d} = 20.5$ sec and $s_D = 11.96$, while the number of df is $n - 1 = 12$. The 95% confidence interval is

$$\bar{d} \pm t_{\alpha/2, n-1} \frac{s_D}{\sqrt{n}} = 20.5 \pm (2.179) \frac{11.96}{\sqrt{13}} = (13.3, 27.7)$$

Paired Data and Two-Sample t Procedures

The main difference between the paired data t test and the standard t test lies in how we estimate $V(\bar{X} - \bar{Y})$. In the independent case we have $V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y})$, but in the paired data case

$$V(\bar{X} - \bar{Y}) = V(\bar{D}) = V\left(\frac{1}{n} \sum D_i\right) = \frac{V(D_i)}{n} = \frac{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}{n}$$

In the above, $\rho = \mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y) / [\sqrt{V(X)} \sqrt{V(Y)}]$; in general,

$$V(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$$

In the independence case, $\rho = 0$ and, therefore, $V(X - Y) = V(X) + V(Y)$. Thus, when the correlation within pairs is positive ($\rho > 0$), using a regular two-sample t test on paired data means overestimating the variance of $\bar{X} - \bar{Y}$ and, consequently, underestimating the significance of the data.

Pros and Cons of Pairing

For great heterogeneity and large correlation within experimental units, the loss in degrees of freedom will be compensated for by the increased precision associated with pairing (use pairing).

If the units are relatively homogeneous and the correlation within pairs is not large, the gain in precision due to pairing will be outweighed by the decrease in degrees of freedom (use independent samples).
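
The paired t statistic, the paired t confidence interval, and the variance formula for a difference discussed in this lecture can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the function names are hypothetical, the critical value 2.179 is taken from the lecture rather than computed, and only the summary statistics quoted for Ex. 9.9 and Ex. 9.10 are used (the raw data from Devore are not reproduced here).

```python
import math
from statistics import mean, stdev


def paired_t_statistic(d_bar, s_d, n, delta0=0.0):
    """Paired t statistic (d_bar - delta0) / (s_d / sqrt(n)); it has n - 1 df."""
    return (d_bar - delta0) / (s_d / math.sqrt(n))


def paired_t_interval(d_bar, s_d, n, t_crit):
    """Interval d_bar +/- t_crit * s_d / sqrt(n).

    t_crit is the tabulated critical value t_{alpha/2, n-1}; it is supplied
    by the caller (looked up in a t table), not computed here.
    """
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


def summarize_differences(x, y):
    """Return (d_bar, s_d, n) for the differences d_i = x_i - y_i."""
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d), stdev(d), len(d)


def var_of_difference(sigma1, sigma2, rho):
    """V(X - Y) = sigma1^2 + sigma2^2 - 2 * rho * sigma1 * sigma2."""
    return sigma1 ** 2 + sigma2 ** 2 - 2 * rho * sigma1 * sigma2


# Summary statistics quoted in the lecture for Ex. 9.9 (n = 16, Delta_0 = 0).
t_99 = paired_t_statistic(d_bar=6.75, s_d=8.234, n=16)
print(round(t_99, 2))  # 3.28

# Summary statistics quoted in the lecture for Ex. 9.10 (n = 13, 12 df).
lo, hi = paired_t_interval(d_bar=20.5, s_d=11.96, n=13, t_crit=2.179)
print(round(lo, 1), round(hi, 1))  # 13.3 27.7

# Made-up standard deviations: positive correlation shrinks V(X - Y).
print(var_of_difference(2, 3, 0), var_of_difference(2, 3, 0.5))  # 13 7.0
```

With raw paired measurements, `summarize_differences(x, y)` would supply the `d_bar`, `s_d` and `n` arguments. The last line illustrates the point of the final slides: with the same marginal variances, a positive within-pair correlation makes the variance of the difference smaller than the independent-samples value.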