**Introduction to Probability and Statistics, Thirteenth Edition**

**Chapter 12: Linear Regression and Correlation**

**Introduction**

In Chapter 11, we used ANOVA to investigate the effect of various factor-level combinations (treatments) on a response $x$. Our objective was to see whether the treatment means were different.

In Chapters 12 and 13, we investigate a response $y$ which is affected by various independent variables, $x_i$. Our objective is to use the information provided by the $x_i$ to predict the value of $y$.

**Example**

Let $y$ be a student's college achievement, measured by his/her GPA. This might be a function of several variables:

- $x_1$ = rank in high school class
- $x_2$ = high school's overall rating
- $x_3$ = high school GPA

**Example**

Let $y$ be the monthly sales revenue for a company. This might be a function of several variables:

- $x_1$ = advertising expenditure
- $x_2$ = time of year
- $x_3$ = state of economy

**Some Questions**

- Which of the independent variables are useful and which are not?
- How could we create a prediction equation to allow us to predict $y$ using knowledge of $x_1$, $x_2$, $x_3$, etc.?
- How good is this prediction?

We start with the simplest case, in which the response $y$ is a function of a single independent variable, $x$.

**A Simple Linear Model**

In Chapter 3, we used the equation of a line to describe the relationship between $y$ and $x$ for a sample of $n$ pairs, $(x, y)$.
If we want to describe the relationship between $y$ and $x$ for the whole population, there are two models we can choose:

- Deterministic model: $y = \alpha + \beta x$
- Probabilistic model: $y$ = deterministic model + random error, that is, $y = \alpha + \beta x + \varepsilon$

Since the bivariate measurements that we observe do not generally fall exactly on a straight line, we choose to use the probabilistic model:

$$y = \alpha + \beta x + \varepsilon, \qquad E(y) = \alpha + \beta x$$

Points deviate from the line of means by an amount $\varepsilon$, where $\varepsilon$ has a normal distribution with mean 0 and variance $\sigma^2$.

**The Random Error**

The line of means, $E(y) = \alpha + \beta x$, describes the average value of $y$ for any fixed value of $x$. The population of measurements is generated as $y$ deviates from the population line by $\varepsilon$. We estimate $\alpha$ and $\beta$ using sample information.

**The Method of Least Squares**

The equation of the best-fitting line is calculated using a set of $n$ pairs $(x_i, y_i)$. We choose our estimates $a$ and $b$ to estimate $\alpha$ and $\beta$ so that the sum of the squared vertical distances of the points from the line is minimized....
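
As a rough illustration of the least-squares idea, the sketch below generates data from a probabilistic straight-line model and then fits a line to it. It makes several assumptions that are not in the text above: the intercept and slope of the population line are called `alpha` and `beta`, the error standard deviation is called `sigma`, and the function names (`simulate_responses`, `least_squares_line`, `sum_squared_errors`) are hypothetical. The closed-form estimates $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$ are the standard least-squares solution and are not taken from this preview.

```python
import random


def simulate_responses(alpha, beta, sigma, xs, seed=0):
    """Draw y = alpha + beta*x + error, with error ~ Normal(0, sigma^2).

    alpha, beta and sigma are illustrative population values (assumed).
    """
    rng = random.Random(seed)
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]


def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y_hat = a + b*x.

    a and b minimize the sum of squared vertical deviations.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = s_xy / s_xx          # estimated slope
    a = y_bar - b * x_bar    # estimated intercept
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances from the points to the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Usage: simulate 20 points around an assumed population line, then fit.
xs = [float(i) for i in range(1, 21)]
ys = simulate_responses(alpha=1.0, beta=2.0, sigma=1.5, xs=xs, seed=42)
a, b = least_squares_line(xs, ys)
print(f"fitted line: y_hat = {a:.3f} + {b:.3f} x")
print(f"SSE at fitted line: {sum_squared_errors(xs, ys, a, b):.3f}")
```

The fitted `a` and `b` will usually be close to, but not equal to, the assumed `alpha` and `beta`, because each simulated point deviates from the population line by its own random error. Nudging `a` or `b` away from the fitted values should never decrease the value returned by `sum_squared_errors` for the same data.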
