Business Statistics Lecture Notes 08

MN1025 – Business Statistics
Lecture 8, Friday 29/2/2008

LINEAR REGRESSION

Reference: Lind et al., Chapter 13.

8.1 Regression: introduction

In the last lecture we introduced the concept of the best-fit line, which is an approximation to the data. The closeness of this approximation is measured by the correlation coefficient r. In this lecture we will see how the best-fit line can be used for prediction.

Example: Suppose the College wishes to save money and asks: can we predict exam results well from weekly work? If the answer is yes, we dispense with exams. To test this we need a sample of students who have already sat the exam, and we check whether, for each student, the average weekly mark predicts the exam result. To estimate the predictive power, one uses Linear Regression.

8.2 Back to sales and scores

Back to Example 7.7 (sales and scores). Here are the data again:

Data Display

Row  scores  sales
  1       4      5
  2       7     12
  3       3      4
  4       6      8
  5      10     11

We wish to analyse how good an approximation to the data the best-fit line is. We use STAT > REGRESSION > REGRESSION and are asked to choose a RESPONSE column and a PREDICTOR column. In this case the only reasonable choice is "scores" as predictor (or cause) and "sales" as response (or effect). We get the Regression Analysis table shown below.

Regression Analysis: sales versus scores

The regression equation is
sales = 1.20 + 1.13 scores

Predictor     Coef  SE Coef     T      P
Constant     1.200    2.313  0.52  0.640
scores      1.1333   0.3569  3.18  0.050

S = 1.955   R-Sq = 77.1%   R-Sq(adj) = 69.4%

Analysis of Variance

Source          DF     SS     MS      F      P
Regression       1  38.53  38.53  10.08  0.050
Residual Error   3  11.47   3.82
Total            4  50.00

The regression equation sales = 1.20 + 1.13 scores in the printout is the equation of the best-fit line. We can plot this on the scatter plot, or get Minitab to plot it for us: we use STAT > REGRESSION > FITTED LINE PLOT and again enter "sales" as response and "scores" as predictor. (Short Python sketches that reproduce this printout, the fitted-line plot and the slope test are given at the end of these notes.)

In all these examples we assume that the underlying populations have an approximately normal distribution, and that a relation of the form

sales = m × scores + c + random error

is reasonable. In general, there could be more than one predictor. For instance, we could think that staff experience was a relevant factor and get a relation of the form

sales = m1 × scores + m2 × experience + c + random error.

Here we have two predictors, "scores" and "experience". Generally, by a suitable choice of additional predictors we can reduce the random error. In this course we will always use a single predictor only.

8.3 Testing if the slope is nonzero

For the population of scores and sales there is an underlying (population) regression line:

sales = m_population × scores + c_population.

In this equation, m_population is the slope of the (population) regression line, and c_population is its intercept. The sample slope m = 1.13 is our estimate for m_population, and the sample intercept c = 1.20 is our estimate for c_population.
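Appendix: Python sketches

The Minitab printout above can be checked independently. The following is a minimal sketch, not part of the original notes: it assumes Python with NumPy and SciPy rather than Minitab, and uses scipy.stats.linregress on the sales/scores data to recover the regression equation, the standard error of the slope, S and R-Sq.

# Sketch only: reproduce the Minitab regression output for sales versus scores.
# Assumes Python with numpy and scipy installed (not part of the original notes).
import numpy as np
from scipy import stats

scores = np.array([4, 7, 3, 6, 10], dtype=float)   # predictor
sales  = np.array([5, 12, 4, 8, 11], dtype=float)  # response

fit = stats.linregress(scores, sales)

n = len(scores)
residuals = sales - (fit.intercept + fit.slope * scores)
s = np.sqrt(np.sum(residuals ** 2) / (n - 2))      # residual standard error

print(f"sales = {fit.intercept:.2f} + {fit.slope:.2f} scores")   # 1.20 + 1.13 scores
print(f"SE Coef (scores) = {fit.stderr:.4f}")                    # 0.3569
print(f"S = {s:.3f}")                                            # 1.955
print(f"R-Sq = {100 * fit.rvalue ** 2:.1f}%")                    # 77.1%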
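For the fitted-line plot, a minimal matplotlib sketch (again an assumption, not Minitab's own STAT > REGRESSION > FITTED LINE PLOT output) is:

# Sketch only: scatter plot of the data with the best-fit line overlaid,
# analogous to Minitab's fitted line plot.
import numpy as np
import matplotlib.pyplot as plt

scores = np.array([4, 7, 3, 6, 10], dtype=float)
sales  = np.array([5, 12, 4, 8, 11], dtype=float)

slope, intercept = np.polyfit(scores, sales, 1)    # degree-1 (linear) least-squares fit

grid = np.linspace(scores.min(), scores.max(), 100)
plt.scatter(scores, sales, label="data")
plt.plot(grid, intercept + slope * grid,
         label=f"sales = {intercept:.2f} + {slope:.2f} scores")
plt.xlabel("scores")
plt.ylabel("sales")
plt.legend()
plt.show()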

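For Section 8.3, the question is whether m_population could be zero. The sketch below is an assumption about how the T = 3.18 and P = 0.050 entries of the printout arise: a two-sided t test on the slope with n - 2 degrees of freedom.

# Sketch only: two-sided t test of the hypothesis that the population slope is zero,
# using n - 2 degrees of freedom; reproduces T = 3.18, P = 0.050.
import numpy as np
from scipy import stats

scores = np.array([4, 7, 3, 6, 10], dtype=float)
sales  = np.array([5, 12, 4, 8, 11], dtype=float)

fit = stats.linregress(scores, sales)
n = len(scores)

t_stat = fit.slope / fit.stderr                    # 1.1333 / 0.3569 = 3.18
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)    # two-sided p-value = 0.050

print(f"T = {t_stat:.2f}, P = {p_value:.3f}")
# fit.pvalue returns the same two-sided p-value directly.

A small P value here is evidence that the population slope is nonzero, i.e. that "scores" has some predictive power for "sales".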