Business Statistics Lecture Notes 08

MN1025 – Business Statistics

Lecture 8, Friday 29/2/2008

LINEAR REGRESSION

Reference: Lind et al., Chapter 13.

8.1 Regression: introduction

In the last lecture we introduced the concept of the best-fit line, which is an approximation to the data. The closeness of this approximation is measured by the correlation coefficient r. In this lecture we will see how the best-fit line can be used for prediction.

Example: Suppose the College wishes to save money and asks: can we predict exam results well from weekly work? If the answer is yes, we dispense with exams. So to test this we need a sample of students already examined and see if, for each student, their average weekly mark predicts their exam result. To estimate the predictive power, one uses Linear Regression.

8.2 Back to sales and scores

Back to Example 7.7 (sales and scores). Here are the data again:

Data Display

| Row | scores | sales |
|-----|--------|-------|
| 1   | 4      | 5     |
| 2   | 7      | 12    |
| 3   | 3      | 4     |
| 4   | 6      | 8     |
| 5   | 10     | 11    |

We wish to analyse how good an approximation to the data the best-fit line is. We use STAT > REGRESSION > REGRESSION. We are asked to choose a RESPONSE column and a PREDICTOR column. In this case the only reasonable choice is “scores” as predictor (or cause) and “sales” as response (or effect). We get the Regression Analysis table shown below.

Regression Analysis: sales versus scores

The regression equation is

sales = 1.20 + 1.13 scores

| Predictor | Coef   | SE Coef | T    | P     |
|-----------|--------|---------|------|-------|
| Constant  | 1.200  | 2.313   | 0.52 | 0.640 |
| scores    | 1.1333 | 0.3569  | 3.18 | 0.050 |

S = 1.955   R-Sq = 77.1%   R-Sq(adj) = 69.4%

Analysis of Variance

| Source         | DF | SS    | MS    | F     | P     |
|----------------|----|-------|-------|-------|-------|
| Regression     | 1  | 38.53 | 38.53 | 10.08 | 0.050 |
| Residual Error | 3  | 11.47 | 3.82  |       |       |
| Total          | 4  | 50.00 |       |       |       |

The regression equation sales = 1.20 + 1.13 scores in the printout is the equation of the best-fit line. We can plot this on the scatter plot or get Minitab to plot it for us: we use STAT > REGRESSION > FITTED LINE PLOT and enter again “sales” as response, “scores” as predictor.
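The main figures in the Minitab printout can be checked by hand from the five data points. The following is a minimal Python sketch, not part of the original notes: it assumes the usual least-squares formulas for a single predictor, and the names `fit_line` and `predict` are hypothetical helpers introduced only for illustration.

```python
import math

# Data from the "Data Display" table in the notes.
scores = [4, 7, 3, 6, 10]   # predictor
sales = [5, 12, 4, 8, 11]   # response


def fit_line(x, y):
    """Least-squares fit of y = c + m * x (hypothetical helper).

    Returns the slope, intercept and the summary figures that appear
    in the regression printout (S, R-Sq, R-Sq(adj), sums of squares).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_regression = slope * sxy           # variation explained by the line
    ss_error = ss_total - ss_regression   # residual variation

    r_sq = ss_regression / ss_total
    r_sq_adj = 1 - (1 - r_sq) * (n - 1) / (n - 2)
    s = math.sqrt(ss_error / (n - 2))     # residual standard deviation

    return {
        "slope": slope, "intercept": intercept, "s": s,
        "r_sq": r_sq, "r_sq_adj": r_sq_adj,
        "ss_regression": ss_regression, "ss_error": ss_error,
        "ss_total": ss_total,
    }


def predict(fit, x_new):
    """Predicted response on the fitted line for a new predictor value."""
    return fit["intercept"] + fit["slope"] * x_new


fit = fit_line(scores, sales)
print(f"sales = {fit['intercept']:.2f} + {fit['slope']:.2f} scores")
print(f"S = {fit['s']:.3f}  R-Sq = {100 * fit['r_sq']:.1f}%  "
      f"R-Sq(adj) = {100 * fit['r_sq_adj']:.1f}%")
print(f"SS: regression = {fit['ss_regression']:.2f}, "
      f"error = {fit['ss_error']:.2f}, total = {fit['ss_total']:.2f}")
print(f"predicted sales for a score of 8: {predict(fit, 8):.2f}")
```

The printed coefficients and sums of squares should agree with the Minitab table up to rounding. The last line illustrates using the line for prediction; the score of 8 is an arbitrary value chosen here, not one taken from the notes.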
In all these examples we assume that the underlying populations have an approximately normal distribution, and that a relation of the form

$$\text{sales} = m \times \text{scores} + c + \text{random error}$$

is reasonable. In general, there could be more than one predictor. For instance, we could think that staff experience was a relevant factor and get a relation of the form

$$\text{sales} = m_1 \times \text{scores} + m_2 \times \text{experience} + c + \text{random error}.$$

Here we have two predictors, “scores” and “experience”. Generally, by a suitable choice of additional predictors we can reduce the random error. In this course, we will always use a single predictor only.

8.3 Testing if the slope is nonzero

For the population of scores and sales there is an underlying (population) regression line:

$$\text{sales} = m_{\text{population}} \times \text{scores} + c_{\text{population}}.$$

In this equation, $m_{\text{population}}$ is the slope of the (population) regression line, and $c_{\text{population}}$ is its intercept. The sample slope of $m = 1.13$ is our estimate for $m_{\text{population}}$, and the sample intercept of $c = 1.20$ is our estimate for $c_{\text{population}}$.
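The T and P entries in the “scores” row of the printout can be reproduced from the data. The notes break off before the test itself is described, so the sketch below rests on an assumption: that the test meant is the standard t test of the null hypothesis that the population slope is zero, with n − 2 = 3 degrees of freedom. The function `two_sided_p_df3` is a hypothetical helper that uses the closed form of the t distribution for exactly 3 degrees of freedom, so it does not apply to other sample sizes.

```python
import math

# Data from the "Data Display" table in the notes.
scores = [4, 7, 3, 6, 10]   # predictor
sales = [5, 12, 4, 8, 11]   # response

n = len(scores)
x_bar = sum(scores) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in scores)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(scores, sales))
syy = sum((y - y_bar) ** 2 for y in sales)

m = sxy / sxx                               # sample slope
s = math.sqrt((syy - m * sxy) / (n - 2))    # residual standard deviation
se_m = s / math.sqrt(sxx)                   # standard error of the slope
t_stat = m / se_m                           # tests m_population = 0 (assumed)


def two_sided_p_df3(t):
    """Two-sided p-value for a t statistic with exactly 3 degrees of freedom.

    Uses the closed-form CDF for 3 degrees of freedom:
    F(t) = 1/2 + (u / (1 + u^2) + atan(u)) / pi, where u = t / sqrt(3).
    """
    u = abs(t) / math.sqrt(3)
    cdf = 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi
    return 2 * (1 - cdf)


p_value = two_sided_p_df3(t_stat)
print(f"slope = {m:.4f}, SE = {se_m:.4f}, T = {t_stat:.2f}, P = {p_value:.3f}")
```

Under these assumptions the slope, standard error, T and P values should match the “scores” row of the printout up to rounding. For other sample sizes a statistics library's t distribution would be the natural replacement for the closed-form helper.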
