1. P-Value Hypothesis Test: The critical flicker frequency (cff) is the highest frequency at which a person can detect the flicker in a flickering light source. At frequencies above the cff, the light source appears to be continuous even though it is actually flickering. An investigation carried out to see whether true average cff depends upon iris color yielded the following data¹:

[Data table not recoverable from the source; the two surviving values are sample means of 28.17 and 26.92.]

Does the average cff depend upon whether a person has blue or green irises? You will conduct a test to decide this issue in this problem.

a. State the null and alternative hypotheses.
   Handwritten answer:
   H0: μ_B = μ_G
   Ha: μ_B ≠ μ_G

b. The statistic that you compute in this problem has, depending upon assumptions made, either an approximate normal density or a t density. Tell me which one: normal or t. If you say t, then also give me the degrees of freedom.
   [handwritten answer illegible]

c. Compute the test statistic (the z or t-value).
   [handwritten answer illegible]

d. Using the appropriate table, compute, as best you can, a p-value. (Hint: Though it's not necessary, I would suggest drawing a picture.)
   [handwritten answer illegible]

e. What is your conclusion? That is, are you going to accept or reject H0?
   Handwritten answer, partly legible: p-value > α = 0.05 [remainder illegible]

¹ Based upon the article “The effects of iris color on critical flicker frequency”, Journal of General Psychology, 1973, pp. 91-95.

2. (18 pts.) Fill in the blanks/Multiple choice:
a. ________ is a measure to flag influential observations in least squares/regression. When an influential observation is detected this measure will be large / small (circle one).
   [handwritten answer partly legible, ending in "D"]

b. In a test/retest situation, the tendency for individuals who score high on the test to, on average, fall back on the retest and for individuals who score low on the test to improve, on average, on the retest is called the ________.
   [handwritten answer illegible]

c. In a least squares/regression model, an analyst would like to include the categorical variable “day of the week”. She will need to create ____ [give me a number here] dummy variables to handle this categorical variable.
   [handwritten answer illegible]

d. If two variables are correlated, then a change in one causes a change in the other. True / False (circle one).

e. If y = αx^β is to be a reasonable model for a dataset (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), then confirmation of this would be obtained by looking for linearity in a plot of (circle one):
   i.   (x_i, y_i), i = 1, 2, ..., n
   ii.  (ln x_i, y_i), i = 1, 2, ..., n
   iii. (x_i, ln y_i), i = 1, 2, ..., n
   iv.  (ln x_i, ln y_i), i = 1, 2, ..., n

f. Suppose a response variable is being fit by several predictor variables. If a particular predictor variable is not contained in the final fitted equation for the response, then that variable must not be useful in helping predict the response (assume that the model was “properly” developed by a skilled analyst). True / False (circle one).

3. (9 pts.) Weight, using a dataset of 507 individuals², was fit as linear in height, hip girth, thigh girth, and
waist girth. In particular, the following model was fit by Minitab:

Weight = β0 + β1 (Height) + β2 (Hip Girth) + β3 (Thigh Girth) + β4 (Waist Girth) + ε

Here is the associated Minitab output:

Regression Analysis: Weight versus Height, Hip Girth, Thigh Girth, Waist Girth

The regression equation is
Weight = - 120 + 0.504 Height + 0.113 Hip Girth + 0.688 Thigh Girth + 0.693 Waist Girth

Predictor        Coef       SE Coef    T        P
Constant        -120.443    3.387     -35.56    0.000
Height           0.50414    0.01843    27.35    0.000
Hip Girth        0.11253    0.05189     2.17    0.031
Thigh Girth      0.68762    0.06241    11.02    0.000
Waist Girth      0.69285    0.02108    32.87    0.000
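
As a quick sanity check on the coefficient table, each T value should equal Coef divided by SE Coef. The short Python sketch below recomputes those ratios from the printed values. The negative sign on the constant is an assumption taken from the fitted equation (Weight = -120 + ...), and the variable names are illustrative only.

```python
# Values copied from the Minitab coefficient table: (Coef, SE Coef, reported T).
# Assumption: the constant is negative, as in the fitted equation "Weight = -120 + ...".
rows = {
    "Constant":    (-120.443, 3.387,   -35.56),
    "Height":      (0.50414,  0.01843,  27.35),
    "Hip Girth":   (0.11253,  0.05189,   2.17),
    "Thigh Girth": (0.68762,  0.06241,  11.02),
    "Waist Girth": (0.69285,  0.02108,  32.87),
}

# T ratio for each predictor: estimated coefficient divided by its standard error.
t_ratios = {name: coef / se for name, (coef, se, _) in rows.items()}

for name, (coef, se, t_reported) in rows.items():
    print(f"{name:12s} recomputed T = {t_ratios[name]:8.2f}   reported T = {t_reported:8.2f}")
```

The predictor with the smallest |T| (equivalently, the largest p-value) is the one a backward elimination step would consider removing first.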

S = 3.197   R-Sq = 94.3%   R-Sq(adj) = 94.3%

Analysis of Variance

Source            DF    SS      MS      F         P
Regression         4    84992   21248   2078.88   0.000
Residual Error   502     5131      10
Total            506

a. The most important variable in predicting Weight is necessarily Waist Girth since this variable has the largest coefficient (0.693) in the fitted model. True / False (circle one).

b. If we were in the midst of performing a Backward Regression procedure, then which of the four variables would leave the least squares/regression model? (Comment: Assume the stopping criterion has not yet been satisfied so that you must remove a variable.)
   Handwritten answer, partly legible: Hip Girth [remainder illegible]

c. The bold, underlined p-value 0.000 under the “ANalysis Of VAriance” (ANOVA) section corresponds to which one of the following hypothesis tests (circle one)?
   i.  H0: β1 = β2 = β3 = β4 = 0
       HA: At least one βi (i = 1, 2, 3, 4) is nonzero
   ii. H0: β0 = β1 = β2 = β3 = β4 = 0
       HA: At least one βi (i = 1, 2, 3, 4) is nonzero

² Associated with the article by Heinz, G., Peterson, L., Johnson, R., and Kerk, C. (2003), “Exploring relationships in body dimensions”, Journal of Statistics Education, vol. 11, no. 2, datasets.heinz.html.

4. Critical Region Hypothesis Test: In one study³ smokers attempting to quit smoking were randomly assigned to one of two groups. Each group had the same contact time with staff, but only one of the groups
included “vigorous exercise”. Does vigorous exercise seem, on average, to improve the ability to abstain from cigarette smoking? To help answer this question, use the summary data in the following table. The variable measured was abstinence from cigarettes, in days (i.e., on average, the people in the exercise group abstained from cigarettes 30.1 days).

Group                 Sample Size   Sample Mean   Sample Standard Deviation
Exercise Group        [table values not recoverable from the source]
Non-Exercise Group    [table values not recoverable from the source]

b. The statistic that you compute in this problem has, depending upon assumptions made, either an approximate normal density or a t density. Tell me which one: normal or t. If you say t, then also give me the degrees of freedom.
   [handwritten answer illegible]

c. Compute the test statistic (the z or t-value).
   [handwritten answer illegible]

d. Tell me a decision rule. In other words, carefully describe when the null hypothesis is accepted and when it is rejected. (Hint: Though it's not necessary, I would suggest drawing a picture.)
   [handwritten answer illegible]

e. What is your conclusion? That is, are you going to accept or reject H0?
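
A decision rule of the kind asked for in part d can be sketched as follows. This is a minimal Python illustration, assuming large samples (so the test statistic is approximately normal), an upper-tailed alternative (exercise increases mean abstinence time), and α = 0.05; the exam text shown here fixes none of these. The summary statistics in the example call are hypothetical placeholders, not the exam's table values.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic for H0: mu1 - mu2 = 0."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


def upper_tail_decision(z, alpha=0.05):
    """Critical-region rule for Ha: mu1 > mu2. Reject H0 when z > z_crit."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return z > z_crit, z_crit


# Hypothetical summary statistics, for illustration only.
z = two_sample_z(mean1=30.0, sd1=20.0, n1=100, mean2=20.0, sd2=20.0, n2=100)
reject, z_crit = upper_tail_decision(z, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With small samples the same rule would use a t critical value with the appropriate degrees of freedom in place of `z_crit`.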

³ Marcus, Bess et al. (1999), “The efficacy of exercise as an aid for smoking cessation in women: A randomized controlled trial”, Archives of Internal Medicine, vol. 159, June 14, pp. 1229-1234.

5. Assumptions
Problem 1:

a. Are any assumptions about cff values for the population of people with blue irises needed in problem 1? If so, what are they?
   [handwritten answer illegible]

b. If you listed any assumptions in part a, how can you reasonably check such? (If you had none, skip this part.)
   [handwritten answer illegible]

c. Are any assumptions about cff values for the population of people with green irises needed in problem 1? If so, what are they?
   Handwritten answer, partly legible: Yes, [remainder illegible]

d. If you listed any assumptions in part c, how can you reasonably check such? (If you had none, skip this part.)
   [handwritten answer illegible]

Problem 4:

e. Are any assumptions needed about abstinence times for the population of people trying to quit smoking while involved in a vigorous exercise program? If so, what are they?
   [handwritten answer illegible]

f. If you listed any assumptions in part e, how can you reasonably check such? (If you had none, skip this part.)

g. Are any assumptions needed about abstinence times for the population of people trying to quit smoking while not involved in a vigorous exercise program? If so, what are they?
   [handwritten answer illegible]

h. If you listed any assumptions in part g, how can you reasonably check such? (If you had none, skip this part.)

6. Returning to problem 4, determine an approximate 95% confidence interval for μ_E − μ_N, where
   μ_E = Mean abstinence time from cigarettes for the Exercise population
   μ_N = Mean abstinence time from cigarettes for the Non-exercise population
   Handwritten answer, partly legible: (30.1 − 17.3) ± [illegible] ...
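
The usual large-sample interval for a difference in means is (x̄_E − x̄_N) ± z* · sqrt(s_E²/n_E + s_N²/n_N), with z* ≈ 1.96 for 95% confidence. The Python sketch below implements that formula under the assumption that both samples are large enough for the normal approximation. The function name and the numbers in the example call are hypothetical and are not the exam's data.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Approximate large-sample confidence interval for mu1 - mu2."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 when level = 0.95
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)  # SE of the difference in sample means
    diff = mean1 - mean2
    return diff - z_star * standard_error, diff + z_star * standard_error


# Hypothetical summary statistics, for illustration only.
lower, upper = diff_means_ci(30.0, 20.0, 100, 20.0, 20.0, 100)
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 0 points the same way as rejecting H0: μ_E − μ_N = 0 in a two-sided test at the matching significance level.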
 Spring '04
 JOHNSON
 Math, Statistics, Probability
