Lecture6_R2_F_ANOVA-2012

Lecture 6 Stat102 2012
• Assessing the importance of having a sloped regression line (Chapter 3.4)
  – R squared
  – ANOVA table
  – F-test
NOTE: Your book titles this section: "Assessing the fit of the regression line". But it's not really about how well the regression line fits. It's really about how much better a sloped line fits than does a horizontal one.
Example: College Contribution Rate
Data: For a sample of major American Colleges & Universities,
  x = student/faculty ratio
  Y = % of alumni who contribute to the institution
Fitted L-S line: Giving Rate = 53.01 - 2.057 × Stud/Fac Ratio
[Scatterplot: Alumni Giving Rate (axis 0 to 70) vs. Student/Faculty Ratio (axis 5 to 25). The darker point is U of Penn.]
Which school do you think is at Y = 68%? Which one is at x = 23 and Y = 19%?
"How well does the L-S line fit the data?"
NOTE: Really should say: "How much better does it fit than does a horizontal line?"
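To make the fitted equation concrete, here is a small sketch that plugs the school at x = 23, Y = 19% into the line quoted on the slide. The function and variable names are illustrative and not from the lecture; only the coefficients and the one data point come from the slide.

```python
# Fitted line quoted on the slide:
#   Giving Rate = 53.01 - 2.057 * (Stud/Fac Ratio)
INTERCEPT = 53.01
SLOPE = -2.057


def predicted_giving_rate(ratio):
    """Predicted % of alumni who contribute, for a given student/faculty ratio."""
    return INTERCEPT + SLOPE * ratio


# The school the slide asks about: x = 23, Y = 19%.
x_obs, y_obs = 23, 19.0
y_hat = predicted_giving_rate(x_obs)
residual = y_obs - y_hat  # positive means the point sits above the line

print(f"predicted = {y_hat:.2f}%, residual = {residual:.2f} points")
```

The positive residual says this school gives at a noticeably higher rate than the line predicts for its student/faculty ratio.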
Key parts of the ANOVA TABLE
• Total Sum of Squares: $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
• Error Sum of Squares: $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
• "Regression" Sum of Squares: $SSR = SST - SSE$
• ANOVA Identity: $SSE + SSR = SST$
• $R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
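The definitions above can be sketched in a few lines of Python. The helper name `sums_of_squares` is hypothetical. The observed values and fitted values are the ones used in the lecture's 4-point example. Here SSR is computed directly as the sum of squared distances of the fitted values from the mean, a standard equivalent form for a least-squares line with an intercept (the slide itself defines SSR as SST - SSE), so that the ANOVA identity can be checked rather than assumed.

```python
def sums_of_squares(y, y_hat):
    """Return (SST, SSE, SSR) for observed values y and L-S fitted values y_hat.

    SSR is computed directly as sum((y_hat_i - y_bar)^2). This matches
    SST - SSE only when y_hat comes from a least-squares line with an intercept.
    """
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)
    return sst, sse, ssr


# Observed and fitted values from the lecture's 4-point example.
y = [2, 8, 1, 5]
y_hat = [3.7, 3.9, 4.1, 4.3]

sst, sse, ssr = sums_of_squares(y, y_hat)
print(f"SST = {sst:.1f}, SSE = {sse:.1f}, SSR = {ssr:.1f}")

# ANOVA identity: SSE + SSR = SST (up to floating-point rounding).
assert abs(sse + ssr - sst) < 1e-9
```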
Simple (4-point) Example: The Best Horizontal Line
[Plot: 4 data points and the best horizontal line fitting those points. Y axis 0 to 10, x axis 0 to 5.]
The best line here has the equation $y = \bar{Y}$, and here $\bar{Y} = 4$.
The Total Squared Distance from this line is
$SST = (2-4)^2 + (8-4)^2 + (1-4)^2 + (5-4)^2 = 30$.
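A quick numerical check of why the horizontal line at the mean is "best": the sketch below evaluates the sum of squared distances about several candidate heights c. The four Y values are the ones on the slide. The helper name `ss_about` and the candidate heights are illustrative choices.

```python
y = [2, 8, 1, 5]  # the four Y values from the slide


def ss_about(c, values):
    """Sum of squared vertical distances from the horizontal line y = c."""
    return sum((v - c) ** 2 for v in values)


y_bar = sum(y) / len(y)   # height of the best horizontal line
sst = ss_about(y_bar, y)  # total sum of squares

# Try a few other heights: none should beat the mean.
candidates = [3.0, 3.5, 4.0, 4.5, 5.0]
for c in candidates:
    print(f"c = {c:.1f}: sum of squares = {ss_about(c, y):.2f}")

assert all(ss_about(c, y) >= sst for c in candidates)
```

Moving the line away from the mean by d adds n·d² to the sum of squares, which is why the minimum sits exactly at the mean.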
Simple Example (cont): The L-S Line
[Plot: 4 data points and the L-S line fitting those points. Y axis 0 to 10, x axis 0 to 5.]
The L-S line here has the equation $\hat{y} = 3.5 + 0.2x$.
The Total Squared Distance from this line is
$SSE = (2-3.7)^2 + (8-3.9)^2 + (1-4.1)^2 + (5-4.3)^2 = 29.8$.
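The L-S line and its SSE can be reproduced with the usual closed-form slope and intercept formulas. One assumption to flag: the slide does not list the x values. The values x = 1, 2, 3, 4 used below are inferred from the fitted values 3.7, 3.9, 4.1, 4.3 and the line 3.5 + 0.2x. The function name `least_squares_line` is illustrative.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# x values are an inference from the slide's fitted values (see lead-in).
x = [1, 2, 3, 4]
y = [2, 8, 1, 5]

b0, b1 = least_squares_line(x, y)
fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

print(f"y_hat = {b0:.1f} + {b1:.1f} x")
print("fitted values:", [round(f, 1) for f in fitted])
print(f"SSE = {sse:.1f}")
```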
NOTE THAT SSE < SST
{It must always be that SSE ≤ SST. WHY?}
The difference is called SSRegression or SSModel. Thus,
SSM = SST - SSE
In our example, SSM = 30 - 29.8 = 0.2.
Thus, SSM measures the reduction in the sum of squares. It tells us how much smaller the sum of squares about the L-S line is than the sum of squares about the horizontal line.
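One way to answer the "WHY?" (a sketch of the standard argument, not taken from the slides): the L-S line minimizes the sum of squares over all lines, and the horizontal line at the mean is just one of those lines, with slope 0. Here $b_0$ and $b_1$ denote a candidate intercept and slope.

```latex
\begin{align*}
SSE &= \min_{b_0,\, b_1} \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \\
    &\le \sum_{i=1}^{n} (y_i - \bar{y} - 0 \cdot x_i)^2 \\
    &= \sum_{i=1}^{n} (y_i - \bar{y})^2 = SST .
\end{align*}
```

Equality holds exactly when the L-S slope is 0, in which case the sloped line and the horizontal line coincide and SSM = 0.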
R²
R² {aka: Coefficient of Determination} measures the proportion by which the L-S line reduces the Total Sum of Squares. Thus,
$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$.
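Applied to the 4-point example, the two forms of the formula give the same (very small) value. This is a minimal sketch using the SST and SSE reported in the example. The function name `r_squared` is illustrative.

```python
def r_squared(sst, sse):
    """Proportion by which the L-S line reduces the total sum of squares."""
    return 1 - sse / sst


# Values from the 4-point example: SST = 30, SSE = 29.8.
sst, sse = 30.0, 29.8
ssr = sst - sse

r2 = r_squared(sst, sse)
print(f"R^2 = {r2:.4f}")

# Both forms of the formula agree (up to floating-point rounding).
assert abs(r2 - ssr / sst) < 1e-9
```

An R² this close to 0 says the sloped line reduces the sum of squares by less than 1%, so in this example it fits hardly any better than the horizontal line.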