# hw5soln - Biostatistics 100B Solutions To Homework...


Biostatistics 100B Homework Solutions 5, February 12th, 2007

Solutions To Homework Assignment 5

Warmup Problems

(1) Interpreting A Multiple Regression Equation:

(a) False. It is never safe to say that a change in X causes a change in Y. Just because X and Y are related does not mean one causes the other. The correct statement would be that a change of one unit in X₁ is associated with a 14 unit increase in Y, assuming X₂ is held fixed. Remember that you do not even know whether X₂ can be held fixed when X₁ is increased!

(b) False: The sign of the coefficient has nothing to do with the strength of the relationship between X and Y. It simply tells you the direction of the relationship. If the coefficient is positive, increases in X are associated with increases in Y. If the coefficient is negative, increases in X are associated with decreases in Y. Even the size of the coefficient does not really tell you the strength of the relationship. Suppose X₁ is measured in inches. If I change the units to feet, I will multiply the coefficient by 12 but nothing will have changed. Beware of comparing magnitudes of coefficients!

(c) True (maybe): Suppose X₁ is held fixed, say at 0. Then if X₂ is large enough (say 3 or greater), Ŷ will be negative. This is not necessarily bad; Y may be a variable that takes on negative values. (Note that this does assume that X₁ can take on the value 0 when X₂ is 3, which need not be possible. Technically you would need to be sure it was realistically possible to get a pair of X₁ and X₂ values that would make Y negative.)

(2) Special Issues In Multiple Regression:

(a) Overfitting occurs when you add lots of predictor variables to your model that are not really significantly related to your response variable. Adding more X variables makes your model look better. For instance, R² can never go down and SSE can never go up as you add more predictors, because the larger model cannot explain less variability.
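
The claim that R² cannot go down and SSE cannot go up when predictors are added can be checked numerically. The following is a minimal sketch on simulated data, not part of the original solutions: the sample size, the coefficients, the seed, and the helper name `fit_r2_sse` are all made up for illustration. Only `x1` is actually related to `y`; the other ten predictors are pure noise.

```python
import numpy as np

def fit_r2_sse(X, y):
    """Least-squares fit with an intercept; returns (R^2, SSE)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    return 1.0 - sse / sst, sse

# Simulated data (an assumption for illustration, not course data).
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
y = 2.0 + 3.0 * x1 + rng.normal(size=n)   # only x1 is truly related to y
noise = rng.normal(size=(n, 10))          # 10 predictors unrelated to y

r2_path, sse_path = [], []
for k in range(0, 11):
    # Model k uses x1 plus the first k noise predictors (nested models).
    X = np.column_stack([x1, noise[:, :k]])
    r2, sse = fit_r2_sse(X, y)
    r2_path.append(r2)
    sse_path.append(sse)
    print(f"{k:2d} noise predictors: R^2 = {r2:.4f}, SSE = {sse:.3f}")
```

Each model contains the previous one as a special case (set the new coefficient to 0), which is why least squares can never do worse on the data it was fitted to.
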
This may make it seem as if adding more predictors can only help and not hurt. However, if the X's are not really related to Y then your model may give lousy predictions for new data points even if it does a good job on your current data points. This is where the term overfitting comes from. The model is constructed to "fit" your existing data points very well, but it may have to go through so many contortions and wiggles to do it that it won't fit new data points well: it is "overfitted" to the original points. In fact, it turns out that if you have as many predictor variables as you have data points your model will fit the data perfectly (you will get an R² of 100%) even if the predictors you are using have nothing to do with Y!

There are many ways to avoid overfitting. First, you shouldn't include in your model predictor variables that are not significantly related to Y. You can check this by looking at the individual t tests for the variables. Second, if you add lots of useless predictors, adjusted R² will go down even though R² will continue to go up. This is because R²
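
The overfitting behaviour described in this part can be sketched on simulated data. This is an illustration under stated assumptions, not the course's own example: every predictor here is generated independently of Y, the sample size and seed are arbitrary, and adjusted R² is computed with the standard textbook formula 1 − (1 − R²)(n − 1)/(n − p − 1), which is assumed to match the definition used in the course. With an intercept in the model, n − 1 predictors already give the design matrix as many columns as there are data points.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 15
X_train = rng.normal(size=(n, n - 1))   # predictors generated independently of y
y_train = rng.normal(size=n)
X_new = rng.normal(size=(n, n - 1))     # fresh data from the same process
y_new = rng.normal(size=n)

def design(X, p):
    """Intercept column plus the first p predictors."""
    return np.column_stack([np.ones(len(X)), X[:, :p]])

def r_squared(y, y_hat):
    sse = float(((y - y_hat) ** 2).sum())
    sst = float(((y - y.mean()) ** 2).sum())
    return 1.0 - sse / sst

results = {}
for p in (1, 5, 10, n - 1):
    beta, *_ = np.linalg.lstsq(design(X_train, p), y_train, rcond=None)
    r2_train = r_squared(y_train, design(X_train, p) @ beta)
    r2_new = r_squared(y_new, design(X_new, p) @ beta)
    # Adjusted R^2 is undefined when n - p - 1 == 0 (no residual degrees of freedom).
    r2_adj = 1 - (1 - r2_train) * (n - 1) / (n - p - 1) if p < n - 1 else float("nan")
    results[p] = {"r2_train": r2_train, "r2_adj": r2_adj, "r2_new": r2_new}
    print(f"p={p:2d}  R^2(train)={r2_train:.3f}  "
          f"R^2(adj)={r2_adj:.3f}  R^2(new data)={r2_new:.3f}")
```

With p = n − 1 the fit to the original points is perfect even though no predictor is related to Y, while the R² computed on new data stays far from 1 and can be negative, meaning the model predicts worse than simply using the mean of Y.
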
