Business Statistics Lecture Notes 08



MN1025 – Business Statistics
Lecture 8, Friday 29/2/2008

LINEAR REGRESSION

Reference: Lind et al., Chapter 13.

8.1 Regression: introduction

In the last lecture we introduced the concept of the best-fit line, which is an approximation to the data. The closeness of this approximation is measured by the correlation coefficient r. In this lecture we will see how the best-fit line can be used for prediction.

Example: Suppose the College wishes to save money and asks: can we predict exam results well from weekly work? If the answer is yes, we dispense with exams. So to test this we need a sample of students already examined and see if, for each student, their average weekly mark predicts their exam result. To estimate the predictive power, one uses Linear Regression.

8.2 Back to sales and scores

Back to Example 7.7 (sales and scores). Here are the data again:

Data Display

Row  scores  sales
  1       4      5
  2       7     12
  3       3      4
  4       6      8
  5      10     11

We wish to analyse how good an approximation to the data the best-fit line is. We use STAT → REGRESSION → REGRESSION. We are asked to choose a RESPONSE column and a PREDICTOR column. In this case the only reasonable choice is “scores” as predictor (or cause) and “sales” as response (or effect). We get the Regression Analysis table shown below.

Regression Analysis: sales versus scores

The regression equation is
sales = 1.20 + 1.13 scores

Predictor    Coef  SE Coef     T      P
Constant    1.200    2.313  0.52  0.640
scores     1.1333   0.3569  3.18  0.050

S = 1.955   R-Sq = 77.1%   R-Sq(adj) = 69.4%

Analysis of Variance

Source          DF     SS     MS      F      P
Regression       1  38.53  38.53  10.08  0.050
Residual Error   3  11.47   3.82
Total            4  50.00

The regression equation

    sales = 1.20 + 1.13 scores

in the printout is the equation of the best-fit line. We can plot this on the scatter plot or get Minitab to plot it for us: we use STAT → REGRESSION → FITTED LINE PLOT and enter again “sales” as response, “scores” as predictor.
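As a cross-check on the Minitab printout for sales versus scores, the main figures can be recomputed by hand from the five data points. The following is a minimal sketch in Python (the notes themselves use Minitab only); the function name fit_line and the dictionary keys are illustrative choices, not anything from the course, and the formulas are the standard least-squares ones for a single predictor.

```python
import math

# Data from Example 7.7: scores is the predictor, sales the response.
scores = [4, 7, 3, 6, 10]
sales = [5, 12, 4, 8, 11]


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    ss_regression = slope * sxy        # "Regression" SS in the ANOVA table
    ss_error = syy - ss_regression     # "Residual Error" SS
    s = math.sqrt(ss_error / (n - 2))  # S in the printout (n - 2 = residual DF)
    se_slope = s / math.sqrt(sxx)      # "SE Coef" for the predictor row

    return {
        "slope": slope,
        "intercept": intercept,
        "ss_total": syy,
        "ss_regression": ss_regression,
        "ss_error": ss_error,
        "r_sq": ss_regression / syy,
        "s": s,
        "se_slope": se_slope,
        "t_slope": slope / se_slope,
    }


fit = fit_line(scores, sales)
for name, value in fit.items():
    print(f"{name:14s} {value:.4f}")
```

The printed values should agree with the Coef, SE Coef, T, S, R-Sq and SS entries of the printout up to the rounding Minitab shows. The P column is not reproduced here, because it needs the t distribution, which the Python standard library does not provide.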
In all these examples we assume that the underlying populations have an approximately normal distribution, and that a relation of the form

    sales = m × scores + c + random error

is reasonable. In general, there could be more than one predictor. For instance, we could think that staff experience was a relevant factor and get a relation of the form

    sales = m_1 × scores + m_2 × experience + c + random error.

Here we have two predictors, “scores” and “experience”. Generally, by a suitable choice of additional predictors we can reduce the random error. In this course, we will always use a single predictor only.

8.3 Testing if the slope is nonzero

For the population of scores and sales there is an underlying (population) regression line:

    sales = m_population × scores + c_population ....

This note was uploaded on 04/17/2008 for the course MN 1025 taught by Professor Schack during the Spring '08 term at Royal Holloway.
