Announcements:
- Homework 1 due today before you leave.
- Remember to sign up for STAT 100-0a on the classes server AND for whichever section you are in!
- TA office hours and locations are listed online.
- Open session with JDRS every Monday, 2:30-3:30 in OML 202.
- Homework solution sessions every Tuesday at 2:30 and 7pm in the STATLAB.

STAT 101-106 Introduction to Statistics

Lurking Variables in Regression (things hidden...)

A lurking variable is a variable that has an important effect but was overlooked.

DANGER: Confounding! This is when we think an effect is due to one variable, but it is really due to another, lurking variable.

Example: A 1970 study showed coffee drinkers had a higher incidence of bladder cancer. However, a 1993 study showed that, once smoking was also considered, there was no evidence of a link between coffee and bladder cancer (i.e. people who drink lots of coffee also tend to smoke!).

Example: There is a strong association between GNP per capita and fertility rates. This does not mean that getting paid less CAUSES women to have more babies!

Lurking variables can actually cause a reversal in the apparent direction of an effect.

Example: Examine how many times people can click a counter at two different temperatures. At what temperature do people get more counts?

Boxplot of results: 50 deg seems to have higher counts than 80 deg.

[Figure: Boxplot of Counts vs Temperature Group. Horizontal axis: Temperature Group (80 deg, 50 deg); vertical axis: Counts (50 to 100).]
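One way such a reversal could arise can be sketched with simulated data. Everything here is an assumption made for illustration, not the class data: the lurking variable is taken to be the subject's age, the model (one click lost per year of age, 80 deg adding 5 clicks) is made up, and the names `simulate_group`, `centered_sums`, `raw_gap` and `adjusted_gap` are hypothetical.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

def simulate_group(temp_effect, age_low, age_high, n=200):
    """Simulate (age, count) pairs for one temperature group."""
    rows = []
    for _ in range(n):
        age = random.uniform(age_low, age_high)
        # Hypothetical model: each year of age costs one click.
        count = 120 - 1.0 * age + temp_effect + random.gauss(0, 3)
        rows.append((age, count))
    return rows

def centered_sums(rows):
    """Return (Sxx, Sxy) for age (x) and count (y) within one group."""
    a_bar = mean(a for a, _ in rows)
    c_bar = mean(c for _, c in rows)
    sxx = sum((a - a_bar) ** 2 for a, _ in rows)
    sxy = sum((a - a_bar) * (c - c_bar) for a, c in rows)
    return sxx, sxy

# Assumed truth: 80 deg adds 5 clicks, but the 80 deg subjects are older.
deg50 = simulate_group(temp_effect=0, age_low=20, age_high=40)
deg80 = simulate_group(temp_effect=5, age_low=45, age_high=65)

# Raw comparison, ignoring age (this is what a boxplot by group shows).
raw_gap = mean(c for _, c in deg80) - mean(c for _, c in deg50)

# Age-adjusted comparison: estimate a common within-group slope, then
# remove the part of the gap explained by the difference in mean age.
sxx50, sxy50 = centered_sums(deg50)
sxx80, sxy80 = centered_sums(deg80)
slope = (sxy50 + sxy80) / (sxx50 + sxx80)
age_gap = mean(a for a, _ in deg80) - mean(a for a, _ in deg50)
adjusted_gap = raw_gap - slope * age_gap

print(f"raw 80-minus-50 gap: {raw_gap:6.2f}")
print(f"age-adjusted gap:    {adjusted_gap:6.2f}")
```

Under these assumptions the raw gap comes out negative (80 deg looks worse) while the age-adjusted gap is positive, so the sign of the apparent temperature effect flips once the lurking variable is taken into account.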
However: look at the plot of counts by temperature group with the age of the subject included (age is the covariate). 50 deg actually has lower counts than 80 deg. BUT age has a strong impact on response time, and the average age is quite different in each treatment group ($\bar{x} = 32.1$, $\bar{x} = 57.2$), so the age effect becomes confounded with the temperature effect.

One Last Regression Warning: Regression estimates are valid ONLY over the region of explanatory variables where you have data!

Example: Speed of Ants (see http://www.scs.uiuc.edu/~che390/report2/rep2prog11.htm ). Ants are cold-blooded and their speed is temperature dependent. Experiments have shown that ant speed can in fact be modeled according to known facts about the rate of chemical reactions:

$$\ln(\text{Speed}) = b_0 + b_1 \cdot \frac{1}{\text{Temp}}$$

That is, the natural log of ant speed is linearly related to the inverse of the temperature. Here is a plot of some experimental data.

[Figure: Fitted Line Plot, Ln(cm/sec) = 21.16 - 5610 (1/Temp). Horizontal axis: 1/Temp (0.00325 to 0.00355); vertical axis: Ln(cm/sec) (1.0 to 3.0). Annotations: 35 deg C, 9 deg C. S = 0.101393, R-Sq = 97.5%, R-Sq(adj) = 97.0%.]

Now think about what happens as temperature increases or decreases: is the linear estimate still valid?
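To make the warning concrete, here is a minimal sketch that plugs temperatures into the fitted line Ln(cm/sec) = 21.16 - 5610 (1/Temp) and flags which predictions fall outside the data. It assumes Temp in the fit is measured in kelvin (the plotted 1/Temp range of roughly 0.00325 to 0.00355 matches about 35 deg C down to 9 deg C on that scale) and that 9 to 35 deg C is the range covered by the data. The function name `predict_speed` and the test temperatures are hypothetical.

```python
import math

B0, B1 = 21.16, -5610.0        # coefficients from the fitted line plot
T_MIN_C, T_MAX_C = 9.0, 35.0   # assumed range of the experimental data

def predict_speed(temp_c):
    """Return (speed in cm/sec, True if temp_c is inside the data range)."""
    temp_k = temp_c + 273.15   # assumption: Temp in the fit is in kelvin
    log_speed = B0 + B1 / temp_k
    return math.exp(log_speed), T_MIN_C <= temp_c <= T_MAX_C

for temp_c in (-10, 9, 20, 35, 100):
    speed, in_range = predict_speed(temp_c)
    note = "within data" if in_range else "EXTRAPOLATION"
    print(f"{temp_c:>4} deg C: {speed:8.2f} cm/sec  ({note})")
```

The formula happily returns a speed at -10 deg C and at 100 deg C, and the speed at 100 deg C is far larger than anything in the observed range. Nothing in the fit says whether ants can move at all at those temperatures, which is exactly why estimates outside the region of the data should not be trusted.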
Sampling and Experiments

Research Design: Some General Advice

Decide What you Want to Know: Explicitly define the parameters of your study. Make sure the things you

