lab4 - # Try running these regression ideas on the two...

# Try running these regression ideas on the two continuous variables
# you collected in an earlier hw. Do this on your own time - not in the lab,
# but if you have questions/trouble, ask me or your TA.

###########################################################
# In dealing with two continuous variables, we've learned that the first thing
# to do is to look at their scatterplot. The correlation coefficient
# summarizes the strength of the relationship between them, but it does not
# allow us to predict y from x. What does that is "regression" or the
# line that fits "through" the scatterplot.

# 1) Regression:
# The function lm() in R does regression, i.e., it fits a line thru a
# scatterplot, or a surface thru higher-dimensional data (later!).
# Let us just do simple linear regression, i.e. a fit of just one pair of
# variables, x and y.

# Consider a fake/simulated example:
rm(list=ls(all=TRUE))
set.seed(123)
x = runif(100,0,1)     # x is uniform between 0 and 1.
e = rnorm(100,0,1)     # error is normal with mean=0, sigma=1.
y = 10 + 2*x + e       # The real/true line is y = 10 + 2x.
plot(x,y)              # Here is the scatterplot,
cor(x,y)               # and the correlation between x and y.
lm.1 = lm(y ~ x)       # lm stands for linear model.
lm.1                   # Note that the estimated coefficients are pretty close
                       # to the true ones (i.e., intercept=10, slope=2)
abline(lm.1)           # This draws the fit on the scatterplot.

# If you want to know what else is contained in lm.1, do this:
names(lm.1)

#################################################
# 2) Now, the example data from lecture: Compare answers there with those below.
x = c(72,70,65,68,70)
y = c(200,180,120,118,190)
plot(x,y)
cor(x,y)
lm.1 = lm(y~x)
abline(lm.1)           # Draws the fit
lm.1                   # Gives you intercept and slope.
summary(lm.1)          # Gives that, and R-sqd, and more (for later).

# The following does anova, i.e. decomposing SST into explained & unexplained.
# Make sure you can identify the two pieces.
anova(lm.1)            # SS_explained=4942.3, SSE = 1308.9, (R^2 = 0.7906) .
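# A minimal sketch, not part of the original lab: the intercept, slope, and the
# two anova pieces for the lecture data can be checked by hand from the usual
# least-squares formulas. The names Sxx, Sxy, y.hat, SS.explained, and fit are
# illustrative choices made here, not names used in lecture.

```r
# Lecture data from section 2 (repeated so this block runs on its own).
x = c(72,70,65,68,70)
y = c(200,180,120,118,190)

# Least-squares slope and intercept from sums of squares.
Sxx = sum((x - mean(x))^2)
Sxy = sum((x - mean(x)) * (y - mean(y)))
slope = Sxy / Sxx
intercept = mean(y) - slope * mean(x)

# Decompose SST into the explained and unexplained pieces.
y.hat = intercept + slope * x            # fitted (predicted) values
SST = sum((y - mean(y))^2)               # total sum of squares
SS.explained = sum((y.hat - mean(y))^2)  # explained by the line
SSE = sum((y - y.hat)^2)                 # unexplained (residual) part
R2 = SS.explained / SST

# Compare the by-hand numbers with what lm() and anova() report.
fit = lm(y ~ x)
print(c(intercept, slope))
print(coef(fit))
print(c(SS.explained, SSE, R2))
print(anova(fit))
```

# In the anova() table, the "x" row of the Sum Sq column is the explained piece
# and the "Residuals" row is SSE; R^2 is also equal to cor(x,y)^2 here.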
# By the way, the residuals (i.e. errors) and the fitted (predicted) values
# are also contained in the object returned by lm(). For example, here is the residual plot:
plot(lm.1$fitted.values, lm.1$residuals)   # A random scatter of points around
abline(h=0)                                # the horiz line is a GOOD thing.

#############################################
# 3) Now, do regression on hail data:
# In reality, divergence is measured by Doppler radar, and so
# if we can predict hail size from divergence, then we can predict hail size
# from Doppler radar. That's useful!
# Do simple linear regression for predicting size from divergence.
#