Homework 3 for STA 5166 (Assigned: Oct. 8)
Statistics in Applications I
Due: Oct. 17, 2007 (Wednesday)

1: BHH Ch. 2; Problems 10, 12, 13(a,b,c,d); Pages 62-65. (40)

2: BHH Ch. 3; Problem 2 (Pages 124-125). Submit both your summary results and R/Splus program for the problem. (20)

3: BHH Ch. 3; Problems 4, 7, and 13 (Pages 125-128). For each of the three problems, perform a t-test on the difference of the two means and perform a test based on a randomization distribution (use R/Splus to generate 10000 samples and plot the histogram of the differences). Submit both your summary results and R/Splus program for each of the problems. (40)
[Handwritten pages follow in the original scan; their text is not recoverable.]
Jaime Frade
STA5166

Chapter 3.2

Summary:

I test whether there is a significant difference between the mean levels of asbestos fiber in the air of the industrial plant with and without the chemical S-142. In the comparative trial in the plant, four consecutive readings were taken; the mean with S-142 minus the mean without it was -3.5. The null hypothesis is that the asbestos level is the same with or without S-142; the alternative is that the level decreases with S-142, since the observed mean difference is negative. As a reference set I used the 112 past observations of asbestos levels without S-142. In this data set, 1 of the 109 differences between averages of adjacent pairs of readings was less than or equal to the difference observed in the comparative trial, a probability of 1/109 (= 0.009174312). Since this probability is less than 5%, we reject the null hypothesis and accept the salesman's claim that S-142 reduces the level of asbestos in the air of the industrial plant.
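The reference-distribution calculation described in the summary can be sketched as a small function. This is a minimal illustration, not the program that was submitted: the helper name `reference_test` is hypothetical, and the 112 readings are typed in from the printed output rather than read from the original data file.

```r
# Past readings without S-142 (112 values, typed in from the printed output).
asbestos <- c(
   9, 10,  8,  9,  8,  8,  8,  7,  6,  9, 10, 11,  9, 10, 11, 11, 11, 11, 10, 11, 12, 13, 12, 13, 12,
  14, 15, 14, 12, 13, 13, 12, 13, 13, 13, 13, 13, 10,  8,  9,  8,  6,  7,  7,  6,  5,  6,  5,  6,  4,
   5,  4,  4,  2,  4,  5,  4,  5,  6,  5,  5,  6,  5,  6,  7,  8,  8,  8,  7,  9, 10,  9, 10,  9,  8,
   9,  8,  7,  7,  8,  7,  7,  7,  8,  8,  8,  8,  7,  6,  5,  6,  5,  6,  7,  6,  6,  5,  6,  6,  5,
   4,  3,  4,  4,  5,  5,  6,  5,  6,  7,  6,  5
)

# Hypothetical helper: compare an observed difference between two adjacent
# pair-averages with all such differences found in a historical series.
reference_test <- function(series, observed_diff) {
  n <- length(series)
  # Average of each adjacent pair of readings: (s[i] + s[i + 1]) / 2
  pair_means <- (series[-n] + series[-1]) / 2
  k <- length(pair_means)
  # Difference between each pair-average and the one two positions earlier,
  # so that the two averages share no readings
  diffs <- pair_means[3:k] - pair_means[1:(k - 2)]
  count <- sum(diffs <= observed_diff)
  list(diffs = diffs, count = count, p_value = count / length(diffs))
}

# Comparative trial: readings 8, 6 without S-142 and 3, 4 with S-142
observed_diff <- mean(c(3, 4)) - mean(c(8, 6))
result <- reference_test(asbestos, observed_diff)
print(result$count)
print(result$p_value)
```

Because the readings are whole numbers, every difference is a multiple of 0.5 and the comparison `diffs <= observed_diff` is exact in floating point.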
CODE

# Past asbestos readings without S-142 (112 values)
data = scan("C:/Documents and Settings/Jaime/Desktop/FALL07/STA5166/BHH2-Data/datahw3.dat")
data
n1 = 0
# Comparative trial: means without and with S-142
Mean1wout = mean(c(8,6))
Mean2with = mean(c(3,4))
diff_means = Mean2with - Mean1wout
y = c(rep(NA, (109)))
x = c(rep(NA, (109)))
# Averages of adjacent pairs of readings (y is extended to length 111)
for(i in 1:111){ y[i] = (data[i]+data[i+1])/2}
# Differences between pair-averages two positions apart; count those
# at least as extreme as the observed difference
for(j in 1:109){ x[j] = y[j+2] - y[j];
if(x[j] <= diff_means) n1 = n1+1}
x
sort(x)
n1
diff_means
n1/109

OUTPUT

> data = scan("C:/Documents and Settings/Jaime/Desktop/FALL07/STA5166/BHH2-Data/datahw3.dat")
Read 112 items
> data
  [1]  9 10  8  9  8  8  8  7  6  9 10 11  9 10 11 11 11 11 10 11 12 13 12 13 12
 [26] 14 15 14 12 13 13 12 13 13 13 13 13 10  8  9  8  6  7  7  6  5  6  5  6  4
 [51]  5  4  4  2  4  5  4  5  6  5  5  6  5  6  7  8  8  8  7  9 10  9 10  9  8
 [76]  9  8  7  7  8  7  7  7  8  8  8  8  7  6  5  6  5  6  7  6  6  5  6  6  5
[101]  4  3  4  4  5  5  6  5  6  7  6  5
> n1 = 0
> Mean1wout = mean(c(8,6))
> Mean2with = mean(c(3,4))
> diff_means = Mean2with - Mean1wout
> y = c(rep(NA, (109)))
> x = c(rep(NA, (109)))
> for(i in 1:111){ y[i] = (data[i]+data[i+1])/2}
> for(j in 1:109){ x[j] = y[j+2] - y[j];
+ if(x[j] <= diff_means) n1 = n1+1}
> x
  [1] -1.0 -0.5 -0.5 -0.5 -0.5 -1.5  0.0  3.0  3.0  0.5 -1.0  0.5  1.5  0.5  0.0
 [16] -0.5 -0.5  1.0  2.0  1.0  0.0  0.0  0.5  2.0  1.5 -1.5 -2.0  0.0  0.0 -0.5
 [31]  0.5  0.5  0.0  0.0 -1.5 -4.0 -3.0 -0.5 -1.5 -2.0  0.0  0.0 -1.5 -1.0  0.0
 [46]  0.0 -0.5 -1.0 -0.5 -0.5 -1.5 -1.0  1.5  1.5  0.0  1.0  1.0 -0.5  0.0  0.5
 [61]  0.0  1.0  2.0  1.5  0.5 -0.5  0.0  2.0  1.5  0.0  0.0 -1.0 -1.0  0.0 -1.0
 [76] -1.5  0.0  0.5 -0.5 -0.5  0.5  1.0  0.5  0.0 -0.5 -1.5 -2.0 -1.0  0.0  0.0
 [91]  1.0  1.0 -0.5 -1.0 -0.5  0.5  0.0 -1.5 -2.0 -1.0  0.5  1.0  1.0  1.0  0.5
[106]  0.0  1.0  1.0 -1.0
> sort(x)
  [1] -4.0 -3.0 -2.0 -2.0 -2.0 -2.0 -1.5 -1.5 -1.5 -1.5 -1.5 -1.5 -1.5 -1.5 -1.5
 [16] -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -0.5 -0.5 -0.5
 [31] -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5 -0.5
 [46]  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 [61]  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.5  0.5  0.5  0.5  0.5
 [76]  0.5  0.5  0.5  0.5  0.5  0.5  0.5  0.5  0.5  1.0  1.0  1.0  1.0  1.0  1.0
 [91]  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.5  1.5  1.5  1.5  1.5  1.5  2.0  2.0
[106]  2.0  2.0  3.0  3.0
> n1
[1] 1
> diff_means
[1] -3.5
> n1/109
[1] 0.009174312
Chapter 3.4

Assumptions:

The ratings for both brands are approximately normally distributed. The two samples, A and B, are independent. Ratings within each brand are i.i.d.

Summary:

To test the null hypothesis η_A = η_B against the alternative η_A ≠ η_B, I used a t-test of whether the difference in means is different from zero. The p-value obtained was 0.3316. Therefore there is not sufficient evidence to reject the null hypothesis. The assumptions above are needed to conduct this test.

Using a randomization distribution, I also tested the above hypothesis. Here, I did not make assumptions about the distributions of the ratings. The mean of brand A is 3.875 and the mean of brand B is 5.285714, a difference of 1.4107.

There are 6435 possible ways to split the 15 ratings into 8 ratings for brand A and 7 ratings for brand B. Under the null hypothesis there is no difference between the ratings of brand A and brand B, so every such rearrangement is equally likely. For each rearrangement, compute the difference in means, and count the rearrangements whose difference is at least 1.4107 in absolute value. Dividing this count by the number of rearrangements gives the p-value.

The p-value obtained from a large number of random rearrangements should be approximately equal to the p-value obtained from the t-test above. I obtained the p-value 0.3551. This also leads to the conclusion that one cannot reject the null hypothesis.
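Since the summary notes that there are only 6435 distinct splits, the randomization distribution can also be enumerated in full instead of sampled. The following is a minimal sketch of that alternative, not the approach used in the submitted program; the variable names and the tolerance `tol` are assumptions made for illustration.

```r
# Ratings for the two brands, as given in the program for this problem
brandA <- c(2, 4, 2, 1, 9, 9, 2, 2)
brandB <- c(8, 3, 5, 3, 7, 7, 4)
pooled <- c(brandA, brandB)

# Every way to choose which 8 of the 15 ratings are labeled "A";
# combn() returns one choice per column
splits <- combn(length(pooled), length(brandA))

# Difference in means (B minus A) for each relabeling
diffs <- apply(splits, 2, function(idx) mean(pooled[-idx]) - mean(pooled[idx]))

# Observed difference and two-sided exact randomization p-value.
# The small tolerance (an assumption) keeps relabelings that tie the
# observed difference from being lost to floating-point rounding.
observed <- mean(brandB) - mean(brandA)
tol <- 1e-9
p_exact <- mean(abs(diffs) >= abs(observed) - tol)

print(ncol(splits))
print(p_exact)
```

A Monte Carlo estimate from 10000 random relabelings approximates this exact proportion, so the two values should agree to about two decimal places.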
CODE

# Ratings for the two brands
brandA = c(2,4,2,1,9,9,2,2)
brandB = c(8,3,5,3,7,7,4)

# Welch two-sample t-test
y = t.test(brandA, brandB)
y

# Randomization test: 10000 random relabelings of the 15 ratings
n1=0
h1=0   # collects the differences for the histogram (the initial 0 is also plotted)
y1= c(2,4,2,1,9,9,2,2,8,3,5,3,7,7,4)
c1=c(rep("A", 8), rep("B", 7))
d1 = c(rep(0,10000))
diff= 5.285714-3.875
for(i in 1:10000){
c2=sample(c1);
x1=y1[c2=="A"];
x2=y1[c2=="B"];
m1 = mean(x1);
m2 = mean(x2);
d1[i] = m2-m1;
h1=c(h1,d1[i]);
if(abs(d1[i]) >= 1.4107)n1=n1+1
}
n1
hist(h1, main="Randomization Distribution")
pvalue= n1/10000
pvalue

OUTPUT

> brandA = c(2,4,2,1,9,9,2,2)
> brandB = c(8,3,5,3,7,7,4)
>
> y = t.test(brandA, brandB)
> y

        Welch Two Sample t-test

data:  brandA and brandB
t = -1.0122, df = 11.923, p-value = 0.3316
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.449587  1.628159
sample estimates:
mean of x mean of y
 3.875000  5.285714

> n1=0
> h1=0
> y1= c(2,4,2,1,9,9,2,2,8,3,5,3,7,7,4)
> c1=c(rep("A", 8), rep("B", 7))
> d1 = c(rep(0,10000))
> diff= 5.285714-3.875
> for(i in 1:10000){
+ c2=sample(c1);
+ x1=y1[c2=="A"];
+ x2=y1[c2=="B"];
+ m1 = mean(x1);
+ m2 = mean(x2);
+ d1[i] = m2-m1;
+ h1=c(h1,d1[i]);
+ if(abs(d1[i]) >= 1.4107)n1=n1+1
+ }
> n1
[1] 3551
> hist(h1, main="Randomization Distribution")
> pvalue= n1/10000
> pvalue
[1] 0.3551

[Figure: histogram titled "Randomization Distribution"; vertical axis: Frequency.]
Chapter 3.7

Assumptions:

The results for both designs are approximately normally distributed. The two samples, designs A and B, are independent. Results within each design are i.i.d.

Summary:

I test whether there is a significant difference between the mean values of the power attainable for the two designs. The null hypothesis is that there is no difference in the mean values. I used a t-test of whether the difference in means is different from zero. The p-value obtained was 0.4343. Therefore there is not sufficient evidence to reject the null hypothesis. The assumptions above are needed to conduct this test.

Using a randomization distribution, I also tested the above hypothesis. Here, I did not make assumptions about the distributions of the results. The mean of design A is 1.55 and the mean of design B is 1.75, a difference of 0.2.

The p-value obtained from a large number of random rearrangements should be approximately equal to the p-value obtained from the t-test above. I obtained the p-value 0.4454. This also leads to the conclusion that one cannot reject the null hypothesis.
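With four results per design there are only choose(8, 4) = 70 distinct splits, and some of them tie the observed difference of 0.2 exactly. A comparison of floating-point means using `>=` may or may not count those ties, depending on rounding. The sketch below enumerates all splits and uses a small tolerance so that ties are counted consistently; the names and the tolerance value are illustrative assumptions, not part of the submitted program.

```r
# Results for the two designs, as given in the program for this problem
designA <- c(1.8, 1.9, 1.1, 1.4)
designB <- c(1.9, 2.1, 1.5, 1.5)
pooled <- c(designA, designB)

# All ways to choose which 4 of the 8 results are labeled "A" (one per column)
splits <- combn(length(pooled), length(designA))

# Difference in means (B minus A) for each relabeling
diffs <- apply(splits, 2, function(idx) mean(pooled[-idx]) - mean(pooled[idx]))
observed <- mean(designB) - mean(designA)

# Tolerance (an assumption) so that differences equal to the observed one
# up to rounding error are treated as ties and counted as "at least as extreme"
tol <- 1e-8
n_ties <- sum(abs(abs(diffs) - abs(observed)) < tol)
p_exact <- mean(abs(diffs) >= abs(observed) - tol)

print(n_ties)
print(p_exact)
```

Under these assumptions, 8 of the 70 splits tie the observed difference and the exact two-sided proportion is 32/70, about 0.457. Counting ties as extreme is the usual convention; it also means a Monte Carlo estimate that drops some ties to rounding would come out slightly lower.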
CODE

# Results for the two designs
designA = c(1.8, 1.9, 1.1, 1.4)
designB = c(1.9, 2.1, 1.5, 1.5)

# Welch two-sample t-test
y = t.test(designA, designB)
y

# Randomization test: 10000 random relabelings of the 8 results
n1=0
h1=NULL
y1= c(1.8, 1.9, 1.1, 1.4, 1.9, 2.1, 1.5, 1.5)
c1=c(rep("A", 4), rep("B", 4))
diff=1.75-1.55
d1 = rep(0, 10000)
for(i in 1:10000){
c2=sample(c1);
x1=y1[c2=="A"]; x2=y1[c2=="B"]
m1 = mean(x1); m2 = mean(x2);
d1[i] = m2-m1;
h1=c(h1,d1[i]);
if(abs(d1[i]) >= diff)n1=n1+1
}
n1
hist(h1, main="Randomization Distribution")
pvalue= n1/10000
pvalue

OUTPUT

> designA = c(1.8, 1.9, 1.1, 1.4)
> designB = c(1.9, 2.1, 1.5, 1.5)
>
> y = t.test(designA, designB)
> y

        Welch Two Sample t-test

data:  designA and designB
t = -0.8402, df = 5.756, p-value = 0.4343
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.7885191  0.3885191
sample estimates:
mean of x mean of y
     1.55      1.75

> n1=0
> h1=NULL
> y1= c(1.8, 1.9, 1.1, 1.4, 1.9, 2.1, 1.5, 1.5)
> c1=c(rep("A", 4), rep("B", 4))
> diff=1.75-1.55
> d1 = rep(0, 10000)
> for(i in 1:10000){
+ c2=sample(c1);
+ x1=y1[c2=="A"]; x2=y1[c2=="B"]
+ m1 = mean(x1); m2 = mean(x2);
+ d1[i] = m2-m1;
+ h1=c(h1,d1[i]);
+ if(abs(d1[i]) >= diff)n1=n1+1
+ }
> n1
[1] 4454
> hist(h1, main="Randomization Distribution")
> pvalue= n1/10000
> pvalue
[1] 0.4454

[Figure: histogram titled "Randomization Distribution"; vertical axis: Frequency.]
Chapter 3.13

Assumptions:

The production results for both diets are approximately normally distributed. The two samples, diets A and B, are independent. Results within each diet are i.i.d.

Summary:

I test whether there is a significant difference between the mean production under the two diets. The null hypothesis is that there is no difference in the mean values. I used a t-test of whether the difference in means is different from zero. The p-value obtained was 0.07842. Therefore there is not sufficient evidence to reject the null hypothesis at the 5% level. The assumptions above are needed to conduct this test.

Using a randomization distribution, I also tested the above hypothesis. Here, I did not make assumptions about the distributions of the results. The mean of diet A is 166.5 and the mean of diet B is 156.6667, an absolute difference of 9.83.

The p-value obtained from a large number of random rearrangements should be approximately equal to the p-value obtained from the t-test above. I obtained the p-value 0.0913. This also leads to the conclusion that one cannot reject the null hypothesis.

A 95% confidence interval for the mean difference: [-9.399669, 29.05967]

This is the 95% confidence interval for the difference in mean hen production between diet A and diet B. Thus, not only do we estimate the difference to be 9.83 mg/dl, but we are 95% confident that it is no less than the lower bound and no greater than the upper bound.
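The interval quoted in the summary is computed as 9.83 ± qt(0.975, 10) times the pooled standard deviation. The usual pooled two-sample interval also multiplies the pooled standard deviation by sqrt(1/n_A + 1/n_B) to get the standard error of the difference in means. The sketch below shows that standard form and cross-checks it against `t.test(..., var.equal = TRUE)`; the variable names are illustrative, and this is offered as a comparison rather than as the calculation that was submitted.

```r
# Production results for the two diets, as given in the program for this problem
dietA <- c(166, 174, 150, 166, 165, 178)
dietB <- c(158, 159, 142, 163, 161, 157)
nA <- length(dietA)
nB <- length(dietB)

# Observed difference in means (A minus B)
observed <- mean(dietA) - mean(dietB)

# Pooled variance with nA + nB - 2 degrees of freedom
pooled_var <- ((nA - 1) * var(dietA) + (nB - 1) * var(dietB)) / (nA + nB - 2)

# Standard error of the difference in means: includes sqrt(1/nA + 1/nB)
std_err <- sqrt(pooled_var * (1 / nA + 1 / nB))

# 95% interval: difference +/- t quantile * standard error
half_width <- qt(0.975, nA + nB - 2) * std_err
ci <- c(observed - half_width, observed + half_width)

# Cross-check with the built-in equal-variance t-test
ci_builtin <- as.numeric(t.test(dietA, dietB, var.equal = TRUE)$conf.int)

print(ci)
print(ci_builtin)
```

With the standard-error factor included, the interval is roughly -1.3 to 20.9. It is narrower than the interval in the summary but still contains zero, so the conclusion about the null hypothesis is unchanged.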
CODE

# Production results for the two diets
dietA = c(166,174,150,166,165,178)
dietB = c(158,159,142,163,161,157)

# Welch two-sample t-test
y = t.test(dietA, dietB)
y

# Randomization test: 10000 random relabelings of the 12 results
n1=0
h1=NULL
y1= c(166,174,150,166,165,178,158,159,142,163,161,157)
c1=c(rep("A", 6), rep("B", 6))
diff=166.5 - 156.6667
d1 = rep(0,10000)
for(i in 1:10000){
c2=sample(c1);
x1=y1[c2=="A"];
x2=y1[c2=="B"];
m1 = mean(x1);
m2 = mean(x2);
d1[i] = m2-m1;
h1=c(h1,d1[i]);
if(abs(d1[i])>=9.83)n1=n1+1
}
n1
hist(h1, main="Randomization Distribution")
pvalue= n1/10000
pvalue

# Interval: difference +/- t quantile * pooled standard deviation
9.83-qt(0.975,10)* sqrt((5*var(dietA)+5*var(dietB))/10)
9.83+qt(0.975,10)* sqrt((5*var(dietA)+5*var(dietB))/10)

OUTPUT

> dietA = c(166,174,150,166,165,178)
> dietB = c(158,159,142,163,161,157)
>
> y = t.test(dietA, dietB)
> y

        Welch Two Sample t-test

data:  dietA and dietB
t = 1.9735, df = 9.436, p-value = 0.07842
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.359600 21.026267
sample estimates:
mean of x mean of y
 166.5000  156.6667

> n1=0
> h1=NULL
> y1= c(166,174,150,166,165,178,158,159,142,163,161,157)
> c1=c(rep("A", 6), rep("B", 6))
> diff=166.5 - 156.6667
> d1 = rep(0,10000)
> for(i in 1:10000){
+ c2=sample(c1);
+ x1=y1[c2=="A"];
+ x2=y1[c2=="B"];
+ m1 = mean(x1);
+ m2 = mean(x2);
+ d1[i] = m2-m1;
+ h1=c(h1,d1[i]);
+ if(abs(d1[i])>=9.83)n1=n1+1
+ }
> n1
[1] 913
> hist(h1, main="Randomization Distribution")
> pvalue= n1/10000
> pvalue
[1] 0.0913
> 9.83-qt(0.975,10)* sqrt((5*var(dietA)+5*var(dietB))/10)
[1] -9.399669
> 9.83+qt(0.975,10)* sqrt((5*var(dietA)+5*var(dietB))/10)
[1] 29.05967

[Figure: histogram titled "Randomization Distribution"; vertical axis: Frequency.]