supplementary_regression_notes

# supplementary_regression_notes - Simple and Multiple...


Simple and Multiple Regression

10.1 and 10.2 Probabilistic Model

The equation of a straight-line model is represented as:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

$E(y) = \beta_0 + \beta_1 x$ is the deterministic component of the model and is referred to as the line of means.

$\varepsilon$ is the random error component of the model. The errors are unknown, unobservable random variables that represent the deviations of the observed values of y from the line of means.

The coefficients $\beta_0$ and $\beta_1$ of the assumed model are unknown parameters that must be estimated from the sample data, and we will use StatCrunch in this effort.

The equation of the least squares line is:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$

$\hat{\beta}_0$, the y-intercept, has no meaning unless the data used to develop the least squares line include at least one sample point with x = 0, that is, a point of the form (x = 0, y = c).

$\hat{\beta}_1$, the slope, is the amount by which y changes, on average, for every unit increase in x. If the slope is positive, then y will increase on average by the value of the slope for every unit increase in x.
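To make the split between the deterministic component and the random error concrete, here is a minimal simulation sketch. The parameter values, the function name `simulate_straight_line`, and the choice of normally distributed errors are illustrative assumptions and do not come from the notes.

```python
import random


def simulate_straight_line(beta0, beta1, xs, sigma, seed=0):
    """Return (x, mean_y, eps, y) tuples drawn from y = beta0 + beta1*x + eps.

    Assumption (not from the notes): eps ~ Normal(0, sigma).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rows = []
    for x in xs:
        mean_y = beta0 + beta1 * x   # deterministic component: the line of means
        eps = rng.gauss(0.0, sigma)  # random error component (unobservable in practice)
        rows.append((x, mean_y, eps, mean_y + eps))
    return rows


# Hypothetical parameter values chosen only for illustration.
for x, mean_y, eps, y in simulate_straight_line(1.0, 2.0, [0, 1, 2, 3, 4], sigma=0.5):
    print(f"x={x}  E(y)={mean_y:.2f}  eps={eps:+.2f}  y={y:.2f}")
```

In this sketch the `E(y)` column rises by exactly the slope (2.0) for each unit increase in x, while the observed `y` values scatter around it. With real data only x and y are observed, which is why the coefficients have to be estimated.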

If the slope is negative, then y will decrease on average by the absolute value of the slope for every unit increase in x.

The least squares line is used to predict the mean value of y, $E(y)$, for a given value of x.

10.3

A straight line can be “fit” to the points in a scatter plot using the method of least squares. The fitted line is referred to as the least squares line. It passes through the scatter such that, at the observed x-values, the sum of the squares of the vertical distances of the points from the line is a minimum.

The difference between the actual observed value and the predicted value (on the line) is called the error or residual:

$$\text{Residual (Error)} = \text{Observed} - \text{Fitted} = y - \hat{y}$$

If the fit captures all the information in the data about the relationship between the explanatory variable and the response variable, then the residuals should be “informationless.” That is, the residuals should not contain any systematic patterns or be associated with variables in the current model. The residuals should appear to be completely random.

The least squares line is unique in that it is the line for which the Sum of Squared Errors (SSE) is minimized.

| X | 2 | 4 | 6 | 7 |
|---|---|---|---|---|
| Y | 1 | 8 | 9 | 4 |

The least squares line for these points can be shown to be $\hat{y} \approx 1.7966 + 0.7797x$.

We call the estimate made from a model the predicted value or fitted value, $\hat{y}$. For example, when $x = 2$ the predicted/fitted value is $\hat{y} \approx 3.36$.

| $x$ | $y$ | $\hat{y}$ | Residual (Error) = Observed - Fitted | Residual$^2$ |
|---|---|---|---|---|
| 2 | 1 | 3.36 | -2.36 | 5.57 |
| 4 | 8 | 4.92 | 3.08 | 9.49 |
| 6 | 9 | 6.47 | 2.53 | 6.40 |
| 7 | 4 | 7.25 | -3.25 | 10.56 |
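The fitted values and residuals for the four data points can be reproduced with a short sketch. It uses the standard closed-form least squares formulas for a single predictor. The notes themselves rely on StatCrunch, so the function name `least_squares_line` and the plain-Python approach are illustrative assumptions.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) of the simple least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-products and squares about the means.
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


xs = [2, 4, 6, 7]
ys = [1, 8, 9, 4]

b0, b1 = least_squares_line(xs, ys)
fitted = [b0 + b1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]  # observed - fitted
sse = sum(r ** 2 for r in residuals)             # sum of squared errors

print(f"y_hat = {b0:.4f} + {b1:.4f} x")
for x, y, f, r in zip(xs, ys, fitted, residuals):
    print(f"x={x}  y={y}  fitted={f:.2f}  residual={r:+.2f}  squared={r * r:.2f}")
print(f"SSE = {sse:.2f}")
```

Squaring the unrounded residuals gives an SSE of about 32.03. The squared residuals listed in the notes are computed from residuals already rounded to two decimals, so they sum to 32.02 instead. The residuals also sum to zero up to floating-point error, which is a general property of a least squares line that includes an intercept.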

An estimate of $\sigma^2$ is

$$s^2 = \text{MSE} = \frac{\text{SSE}}{n - (p + 1)}$$

where MSE is the Mean Square Error, n is the number of observations, and p is the number of independent variables (in simple linear regression p = 1, so the denominator is n - 2).

The standard deviation of the residuals is $s = \sqrt{\text{MSE}}$. This value s is an estimator of $\sigma$, and approximately 95% of the observed values of y will fall within 2s of their least squares predicted values. A deviation much larger than 2s would therefore be unusual.
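As a worked illustration of the MSE and s calculations, the sketch below applies them to the four-point example from these notes, using the fitted values as rounded there. The variable names are illustrative. With only n - 2 = 2 degrees of freedom the estimate is very rough, so this is an arithmetic check rather than a meaningful inference.

```python
import math

ys = [1, 8, 9, 4]                  # observed values from the example
fitted = [3.36, 4.92, 6.47, 7.25]  # fitted values as rounded in the notes
p = 1                              # one independent variable (simple regression)
n = len(ys)

residuals = [y - f for y, f in zip(ys, fitted)]
sse = sum(r ** 2 for r in residuals)
mse = sse / (n - (p + 1))          # s^2, the estimate of sigma^2
s = math.sqrt(mse)                 # estimate of sigma

# Count observations whose residual is within 2s of the fitted value.
within = sum(abs(r) <= 2 * s for r in residuals)

print(f"SSE = {sse:.2f}, MSE = {mse:.2f}, s = {s:.2f}, 2s = {2 * s:.2f}")
print(f"{within} of {n} observations lie within 2s of their fitted values")
```

Here s comes out at roughly 4.00, so the 2s band is about 8.00 units on either side of the line, and all four residuals (the largest in magnitude is 3.25) fall inside it.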

## This note was uploaded on 05/02/2011 for the course STATISTICS 2103 taught by Professor Zhao during the Spring '11 term at Temple.
