Introduction to Probability and Statistics
Thirteenth Edition

Chapter 12
Linear Regression and Correlation

Introduction
• In Chapter 11, we used ANOVA to investigate the effect of various factor-level combinations (treatments) on a response x.
• Our objective was to see whether the treatment means were different.
• In Chapters 12 and 13, we investigate a response y which is affected by various independent variables, x_i.
• Our objective is to use the information provided by the x_i to predict the value of y.

Example
• Let y be a student’s college achievement, measured by his/her GPA. This might be a function of several variables:
– x_1 = rank in high school class
– x_2 = high school’s overall rating
– x_3 = high school GPA

Example
• Let y be the monthly sales revenue for a company. This might be a function of several variables:
– x_1 = advertising expenditure
– x_2 = time of year
– x_3 = state of economy

Some Questions
• Which of the independent variables are useful and which are not?
• How could we create a prediction equation to allow us to predict y using knowledge of x_1, x_2, x_3, etc.?
• How good is this prediction?
We start with the simplest case, in which the response y is a function of a single independent variable, x.

A Simple Linear Model
• In Chapter 3, we used the equation of a line to describe the relationship between y and x for a sample of n pairs, (x, y).
• If we want to describe the relationship between y and x for the whole population, there are two models we can choose:
• Deterministic Model: y = α + β x
• Probabilistic Model:
– y = deterministic model + random error
– y = α + β x + ε

A Simple Linear Model
• Since the bivariate measurements that we observe do not generally fall exactly on a straight line, we choose to use:
• Probabilistic Model:
– y = α + β x + ε
– E(y) = α + β x
Points deviate from the line of means by an amount ε, where ε has a normal distribution with mean 0 and ...

The Random Error
• The line of means, E(y) = α + β x, describes the average value of y for any fixed value of x.
• The population of measurements is generated as y deviates from the population line by ε. We estimate α and β using sample information.

The Method of Least Squares
• The equation of the best-fitting line is calculated using a set of n pairs (x_i, y_i).
• We choose our estimates a and b to estimate α and β so that the vertical distances of the points from the line are minimized....
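
The probabilistic model and the least squares idea above can be illustrated with a short Python sketch. It is not taken from the slides: the function names (`simulate`, `least_squares_line`), the sample values of α and β, and the error spread `sigma` are all assumptions made for illustration, since the preview cuts off before stating the error variance or the estimating formulas. The sketch uses the standard closed-form result for a line that minimizes the sum of squared vertical deviations, b = S_xy / S_xx and a = ȳ − b x̄.

```python
import random


def simulate(alpha, beta, sigma, xs, seed=0):
    """Generate y = alpha + beta * x + eps for each x.

    eps is drawn from a normal distribution with mean 0.
    sigma (the standard deviation of eps) is an assumed parameter;
    the source text is cut off before it states the spread.
    """
    rng = random.Random(seed)
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]


def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y-hat = a + b * x.

    Uses the standard closed-form least squares estimates:
    b = S_xy / S_xx and a = y_bar - b * x_bar.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if s_xx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


if __name__ == "__main__":
    # Hypothetical population line: alpha = 2.0, beta = 0.5.
    xs = [float(i) for i in range(1, 21)]
    ys = simulate(alpha=2.0, beta=0.5, sigma=1.0, xs=xs)
    a, b = least_squares_line(xs, ys)
    print(f"estimated intercept a = {a:.3f}, estimated slope b = {b:.3f}")
```

With the random error switched off (`sigma=0.0`) the points fall exactly on the line and the estimates a and b reproduce α and β; with a positive `sigma` they only approximate them, which is the distinction the slides draw between the deterministic and probabilistic models.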
 Fall '08
 Any
 Statistics, Correlation, Linear Regression, Probability, Regression Analysis, Total SS
