Stat 231 Exam III
December 5, 1997
Prof. Vardeman

Attached to this exam are 5 pages of Minitab regression analysis printout. Use them in answering the following questions. You need NOT compute by hand anything that you can get from the printout. (Indeed, you will be wise to avoid wasting time doing hand calculations for things that are obtainable from the printout.)

The data used in creating the printout come from a study of a nitride etch process on a single wafer plasma etcher. The process variables studied were

x1 = power applied to the cathode (W)
x2 = pressure in the reaction chamber (mTorr)
x3 = gap between the anode and the cathode (cm)
x4 = flow of the reactant gas C2F6

and the response variable was

y = selectivity of the process (SiN/polysilicon).

The first 2 1/2 pages of the printout concern a simple linear regression analysis based on the model

y_i = β0 + β3 x3i + ε_i.

Until further notice, give answers based on (or related to) this SLR model.

(a) What fraction of the raw variability in y is explained using x3 as a predictor variable?

(b) What is the sample correlation between y and x3? (Give a number.)

(c) There are two plots on page 2 of the printout. These are plots of standardized residuals versus ŷ and standardized residuals versus x3. What difficulty with the simple linear regression model do they reveal? Explain.

For purposes of answering the questions (d)-(h) ignore the difficulty discovered in part (c).

(d) What is indicated by the plot on the top of page 3 of the printout?

(e) If one uses α = .05, will one reject H0: μ_y|x3 = β0 + β3 x3 based on a formal lack of fit test? Explain.

Reject? yes no (circle one)
Explanation:

(f) Give a 90% two-sided confidence interval for the increase in mean value of selectivity (y) that accompanies a 1 cm increase in the gap between the anode and the cathode. (No need to simplify after plugging in.)

(g) Give a 95% two-sided prediction interval for the next selectivity (y) that accompanies a .8 cm gap (x3).

(h) As it turns out, the data have x̄3 = .9636 and s²_x3 = .030545. Use these facts and find a 95% confidence interval for the mean selectivity that accompanies a .9 cm gap. (No need to simplify.)

Beginning in the middle of page 3 of the printout, there is a multiple linear regression analysis of the data. Note that there are both an "all possible subsets" regression and a regression of y on x1, x2, x3 and x4 (and some plots and some sample correlations). Use these to answer questions (i) through (r).

(i) Based on the "all possible subsets regression" output, what reduced model seems like one that should be investigated as a possible "simple" explanation of y? Explain in terms of R², s and Cp.

Model: __________________
Explanation:

(j) What degrees of freedom would be used in a formal test for lack of fit to the multiple linear regression model including all 4 predictors x1, x2, x3 and x4?

numerator df = _____ denominator df = _____

(k) Give the value of an F statistic, its degrees of freedom and the corresponding p-value for testing whether the variables x1, x2, x3 and x4 together provide any ability to predict or explain y.

f = __________ df = _____ , _____ p-value = __________

(l) Find the value of an F statistic and its degrees of freedom for testing whether after taking x1 and x3 into account, the variables x2 and x4 provide significant additional ability to explain or predict y. (This is not an easy question, but there is enough information on the printout to allow you to find this f.)

f = __________ df = _____ , _____

(m) Find the value of a t statistic, its degrees of freedom and the corresponding p-value for testing whether after taking x1, x2 and x3 into account, x4 adds important explanatory power to the multiple regression model.

t = __________ df = __________ p-value = __________

(n) Give a 97.5% lower prediction bound for the next value of y when x1 = 325, x2 = 500, x3 = .8 and x4 = 200.

(o) Comment on the appearance of the plot of e* versus ŷ given at the bottom of page 4 of the printout.

(p) The first plot on page 5 of the printout is a plot of y versus ŷ. It appears to be fairly linear. What sample correlation should be associated with this plot? (Give a number.)

(q) Based on the MLR model involving all 4 x's, give a 90% two-sided confidence interval for the increase in mean y that accompanies a 1 cm increase in gap (x3) if x1, x2 and x4 are held fixed. (There is no need to simplify after plugging in.)

(r) Notice that from the first (SLR) regression run b3 = -1.0458 while from the last (MLR) regression run b3 = -1.0412. This difference is not simply numerical error in Minitab. Explain why you are not surprised that there is a difference in these two figures. (What do the correlations given on page 5 of the printout have to say about this?)

MTB > print c1-c5

Data Display
Row      y    x1    x2    x3    x4
  1   1.63   275   450   0.8   125
  2   1.37   275   500   1.0   160
  3   1.10   275   550   1.2   200
  4   1.58   300   450   1.0   200
  5   1.26   300   500   1.2   125
  6   1.72   275   450   0.8   125
  7   1.65   300   550   0.8   160
  8   1.42   325   450   1.2   160
  9   1.69   325   500   0.8   200
 10   1.54   325   550   1.0   125
 11   1.72   275   450   0.8   125

MTB > Name c6 = 'FITS1' c7 = 'SRES1' c8 = 'PFIT1'
MTB > Regress 'y' 1 'x3';
SUBC> Fits 'FITS1';
SUBC> SResiduals 'SRES1';
SUBC> Constant;
SUBC> Pure;
SUBC> Predict 'x3';
SUBC> PFits 'PFIT1'.

Regression Analysis
The regression equation is
y = 2.52 - 1.05 x3

Predictor      Coef     StDev        T        P
Constant     2.5242    0.1711    14.75    0.000
x3          -1.0458    0.1750    -5.98    0.000

S = 0.09670   R-Sq = 79.9%   R-Sq(adj) = 77.6%

Analysis of Variance

Source        DF        SS        MS        F        P
Regression     1   0.33410   0.33410    35.73    0.000
Error          9   0.08416   0.00935
Total         10   0.41825

Unusual Observations
Obs     x3        y      Fit   StDev Fit   Residual   St Resid
  3   1.20   1.1000   1.2692      0.0506    -0.1692     -2.05R

R denotes an observation with a large standardized residual

    Fit   StDev Fit         95.0% CI             95.0% PI
 1.6875      0.0409   (1.5950, 1.7800)   (1.4500, 1.9250)
 1.4783      0.0298   (1.4108, 1.5459)   (1.2493, 1.7073)
 1.2692      0.0506   (1.1547, 1.3837)   (1.0222, 1.5161)
 1.4783      0.0298   (1.4108, 1.5459)   (1.2493, 1.7073)
 1.2692      0.0506   (1.1547, 1.3837)   (1.0222, 1.5161)
 1.6875      0.0409   (1.5950, 1.7800)   (1.4500, 1.9250)
 1.6875      0.0409   (1.5950, 1.7800)   (1.4500, 1.9250)
 1.2692      0.0506   (1.1547, 1.3837)   (1.0222, 1.5161)
 1.6875      0.0409   (1.5950, 1.7800)   (1.4500, 1.9250)
 1.4783      0.0298   (1.4108, 1.5459)   (1.2493, 1.7073)
 1.6875      0.0409   (1.5950, 1.7800)   (1.4500, 1.9250)

Pure error test - F = 0.14   P = 0.7214   DF(pure error) = 8

MTB > Plot 'SRES1' 'FITS1';
SUBC> Symbol '*'.

Character Plot
[Character plot: SRES1 (vertical axis, -1.5 to 1.5) versus FITS1 (horizontal axis, 1.280 to 1.680)]

MTB > Plot 'SRES1' 'x3';
SUBC> Symbol '*'.

Character Plot

[Character plot: SRES1 (vertical axis, -1.5 to 1.5) versus x3 (horizontal axis, 0.800 to 1.200)]

MTB > Sort 'SRES1' c8;
SUBC> By 'SRES1'.
MTB > Set c9
DATA> 1( 1 : 11 / 1 )1
DATA> End.
MTB > sub .5 c9 c9
MTB > div c9 11.0 c9
MTB > InvCDF C9 C9;
SUBC> Normal 0.0 1.0.
MTB > Plot C8 C9;
SUBC> Symbol '*';
SUBC> XLabel 'standardized residual quantile';
SUBC> YLabel 'std normal quantile'.

Character Plot
[Character plot: std normal quantile (vertical axis, -1.5 to 1.5) versus standardized residual quantile (horizontal axis, -1.40 to 1.40)]

MTB > BReg 'y' 'x1' 'x2' 'x3' 'x4';
SUBC> NVars 1 4;
SUBC> Best 4.

Best Subsets Regression
Response is y

                                              x  x  x  x
Vars   R-Sq   R-Sq(adj)     C-p          S    1  2  3  4
   1   79.9        77.6    26.6   0.096700          X
   1   17.5         8.3   130.7   0.19583        X
   1    6.7         0.0   148.7   0.20825              X
   1    0.6         0.0   158.8   0.21489     X
   2   87.6        84.5    15.7   0.080524    X     X
   2   85.3        81.6    19.6   0.087782       X  X
   2   80.4        75.5    27.7   0.10123           X  X
   2   20.5         0.7   127.6   0.20384     X  X
   3   95.7        93.9     4.1   0.050394    X  X  X
   3   89.1        84.4    15.2   0.080755    X     X  X
   3   85.4        79.1    21.4   0.093509       X  X  X
   3   24.8         0.0   122.5   0.21197     X  X     X
   4   96.4        94.0     5.0   0.050069    X  X  X  X

MTB > Name c10 = 'FITS2' c11 = 'SRES2'
MTB > Regress 'y' 4 'x1' 'x2' 'x3' 'x4';
SUBC> Fits 'FITS2';
SUBC> SResiduals 'SRES2';
SUBC> Constant;
SUBC> Predict 'x1' 'x2' 'x3' 'x4'.

Regression Analysis
The regression equation is
y = 2.29 + 0.00327 x1 - 0.00133 x2 - 1.04 x3 - 0.000532 x4

Predictor         Coef       StDev         T        P
Constant        2.2896      0.2544      9.00    0.000
x1           0.0032704   0.0007621      4.29    0.005
x2          -0.0013315   0.0003811     -3.49    0.013
x3            -1.04120     0.09526    -10.93    0.000
x4          -0.0005321   0.0005094     -1.04    0.337

S = 0.05007   R-Sq = 96.4%   R-Sq(adj) = 94.0%

Analysis of Variance

Source        DF        SS        MS        F        P
Regression     4   0.40321   0.10080    40.21    0.000
Error          6   0.01504   0.00251
Total         10   0.41825

Source   DF    Seq SS
x1        1   0.00267
x2        1   0.08305
x3        1   0.31476
x4        1   0.00273

    Fit   StDev Fit         95.0% CI             95.0% PI
 1.6903      0.0276   (1.6228, 1.7578)   (1.5504, 1.8302)
 1.3968      0.0232   (1.3401, 1.4536)   (1.2618, 1.5319)
 1.1007      0.0433   (0.9949, 1.2066)   (0.9388, 1.2627)
 1.5239      0.0333   (1.4424, 1.6054)   (1.3767, 1.6710)
 1.2890      0.0328   (1.2088, 1.3692)   (1.1425, 1.4355)
 1.6903      0.0276   (1.6228, 1.7578)   (1.5504, 1.8302)
 1.6203      0.0330   (1.5396, 1.7010)   (1.4735, 1.7670)
 1.4187      0.0391   (1.3230, 1.5144)   (1.2632, 1.5742)
 1.7473      0.0393   (1.6511, 1.8435)   (1.5915, 1.9031)
 1.5124      0.0388   (1.4175, 1.6074)   (1.3574, 1.6674)
 1.6903      0.0276   (1.6228, 1.7578)   (1.5504, 1.8302)

MTB > Plot 'SRES2' 'FITS2';
SUBC> Symbol '*';
SUBC> XLabel 'yhat';
SUBC> YLabel 'standardized residual'.

Character Plot
[Character plot: standardized residual (vertical axis, -1.2 to 1.2) versus yhat (horizontal axis, 1.08 to 1.68)]

MTB > Plot 'y' 'FITS2';
SUBC> Symbol '*';
SUBC> XLabel 'yhat';
SUBC> YLabel 'y'.

Character Plot

[Character plot: y (vertical axis, 1.20 to 1.60) versus yhat (horizontal axis, 1.08 to 1.68)]
MTB > Correlation 'x1' 'x2' 'x3' 'x4'.

Correlations (Pearson)

         x1      x2      x3
x2    0.214
x3    0.214   0.214
x4    0.210   0.210   0.210
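The simple linear regression figures on pages 1 and 2 of the printout can be cross-checked directly from the Data Display. What follows is a minimal Python sketch added for that purpose, not part of the exam or of the Minitab session. It refits y on x3 by least squares and forms the kinds of intervals asked about in (f), (g) and (h). The variable and function names are illustrative, and the t quantiles are hard-coded from a standard t table (9 degrees of freedom) rather than computed.

```python
import math

# Data transcribed from the Minitab "Data Display"
# (y = selectivity, x3 = anode-cathode gap in cm).
y = [1.63, 1.37, 1.10, 1.58, 1.26, 1.72, 1.65, 1.42, 1.69, 1.54, 1.72]
x3 = [0.8, 1.0, 1.2, 1.0, 1.2, 0.8, 0.8, 1.2, 0.8, 1.0, 0.8]

n = len(y)
x_bar = sum(x3) / n
y_bar = sum(y) / n
s_xx = sum((x - x_bar) ** 2 for x in x3)          # equals (n - 1) * s^2 for x3
s_xy = sum((x - x_bar) * (v - y_bar) for x, v in zip(x3, y))
ss_tot = sum((v - y_bar) ** 2 for v in y)

# Least squares estimates for the model y_i = beta0 + beta3 * x3i + eps_i.
b3 = s_xy / s_xx
b0 = y_bar - b3 * x_bar

ss_err = sum((v - (b0 + b3 * x)) ** 2 for x, v in zip(x3, y))
s = math.sqrt(ss_err / (n - 2))                    # Minitab's "S"
r_sq = 1.0 - ss_err / ss_tot                       # Minitab's "R-Sq" as a fraction
r = math.copysign(math.sqrt(r_sq), b3)             # correlation takes the sign of b3
se_b3 = s / math.sqrt(s_xx)                        # "StDev" of the x3 coefficient

# Assumption: upper t quantiles for 9 df, copied from a t table.
T_95_9DF = 1.833    # for 90% two-sided intervals
T_975_9DF = 2.262   # for 95% two-sided intervals


def fit(x_new):
    """Fitted mean response at x3 = x_new."""
    return b0 + b3 * x_new


def se_mean(x_new):
    """Standard error of the fitted mean at x_new (Minitab's "StDev Fit")."""
    return s * math.sqrt(1.0 / n + (x_new - x_bar) ** 2 / s_xx)


def se_pred(x_new):
    """Standard error for predicting one new response at x_new."""
    return s * math.sqrt(1.0 + 1.0 / n + (x_new - x_bar) ** 2 / s_xx)


slope_ci_90 = (b3 - T_95_9DF * se_b3, b3 + T_95_9DF * se_b3)
pred_int_95 = (fit(0.8) - T_975_9DF * se_pred(0.8),
               fit(0.8) + T_975_9DF * se_pred(0.8))
mean_ci_95 = (fit(0.9) - T_975_9DF * se_mean(0.9),
              fit(0.9) + T_975_9DF * se_mean(0.9))

print("b0, b3:", round(b0, 4), round(b3, 4))
print("s, R-Sq, r:", round(s, 5), round(r_sq, 3), round(r, 3))
print("90% CI for slope:", slope_ci_90)
print("95% PI at x3 = 0.8:", pred_int_95)
print("95% CI for mean at x3 = 0.9:", mean_ci_95)
```

The prediction interval at x3 = 0.8 can be compared with the 95.0% PI column of the printout. The interval at x3 = 0.9 is not on the printout, which is why (h) supplies the mean and sample variance of x3.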
Key

(a) [illegible]

(b) b3 is negative, so r is negative: r = -√.799

(c) [illegible]

(d) [illegible]

(e) Reject? no. The p-value for the lack of fit test is .72, which is larger than .05.

(f) Use b3 ± t (std. error of b3) [values illegible]

(g) [illegible]

(h) Σ(x3i - x̄3)² = (n - 1)s² = 10(.030545) = .30545
    ŷ = 2.52 - 1.05(.9); use (2.52 - 1.05(.9)) ± [illegible]

(i) [illegible]

(j) numerator df = [illegible], denominator df = 2

(k) f = 40.21, df = 4, 6, p-value = .000

(l) SSR(x1, x3) = R²(x1, x3) · SSTot = .876(.41825) = .36639
    f = [(SSR(full) - SSR(reduced))/2] / [SSE(full)/6] = [(.40321 - .36639)/2] / (.01504/6) = 7.34

(m) t = [illegible], df = 6, p-value = [illegible]

(n) 1.5915

(o) [illegible]

(p) [illegible]

(q) Use b3 ± t (std. error of b3) = -1.0412 ± 1.943(.09526)

(r) [illegible]
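The arithmetic behind questions (l) and (q) uses only figures printed in the Minitab output. The Python sketch below is an illustration added here, not part of the original exam. It assumes that the best two-variable subset with R-Sq = 87.6% is the (x1, x3) model, which is consistent with the printed t statistics and sequential sums of squares. The t quantile for 6 degrees of freedom is hard-coded from a t table, and all names are hypothetical.

```python
# Figures read off the Minitab printout (assumed transcribed correctly).
SS_TOT = 0.41825        # total sum of squares
SSR_FULL = 0.40321      # regression SS, model with x1, x2, x3, x4
SSE_FULL = 0.01504      # error SS, same model
N_OBS = 11
N_PRED_FULL = 4
R_SQ_X1_X3 = 0.876      # assumption: R-Sq of the (x1, x3) subset model


def partial_f(ssr_full, ssr_reduced, sse_full, n_dropped, df_error):
    """F statistic for testing that the dropped predictors add nothing.

    Numerator: extra regression SS per dropped predictor.
    Denominator: mean squared error of the full model.
    """
    return ((ssr_full - ssr_reduced) / n_dropped) / (sse_full / df_error)


# (l) Do x2 and x4 add anything after x1 and x3 are in the model?
ssr_reduced = R_SQ_X1_X3 * SS_TOT            # SSR = R-Sq * SSTot
df_error = N_OBS - N_PRED_FULL - 1           # 11 - 4 - 1
f_stat = partial_f(SSR_FULL, ssr_reduced, SSE_FULL, 2, df_error)

# (q) 90% two-sided interval for the x3 coefficient in the full model.
B3_MLR = -1.0412
SE_B3_MLR = 0.09526
T_95_6DF = 1.943        # assumption: upper 5% point of t with 6 df, from a table
ci_b3_90 = (B3_MLR - T_95_6DF * SE_B3_MLR, B3_MLR + T_95_6DF * SE_B3_MLR)

print("SSR(x1, x3) =", round(ssr_reduced, 5))
print("f =", round(f_stat, 2), "on", 2, "and", df_error, "df")
print("90% CI for beta3:", ci_b3_90)
```

Because the reduced-model sum of squares is recovered from a rounded R-Sq value, the resulting f is accurate only to about two decimal places.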