ECON 103, Lecture 5: Simple Regression II
Maria Casanova
April 12 (version 0)

1. Introduction

We consider the linear regression model
Yi = β0 + β1 Xi + ui

The OLS estimators are:

β̂1 = Σi (Xi − X̄)(Yi − Ȳ) / Σi (Xi − X̄)² = sXY / sX²

β̂0 = Ȳ − β̂1 X̄

Outline:
Properties of OLS (just like we did for Ȳ, an estimator of µY)
Goodness of fit of the regression
Discussion of the linearity assumption

2. Properties of OLS

1. The sample regression function obtained through OLS always passes through the sample mean values of X and Y.
2. (1/n) Σi ûi = 0 (the mean value of the residuals is zero)
3. Σi ûi Xi = 0 (ûi and Xi are uncorrelated)

Note: these results hold by construction, without the OLS assumptions.
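As a concrete check, the estimator formulas and the three by-construction properties can be verified numerically. This is a minimal sketch on invented data (the sample and parameter values below are not from the lecture); any (x, y) sample fit by OLS exhibits these properties.

```python
import numpy as np

# Invented sample data for illustration only
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=200)
y = 3.0 + 1.5 * x + rng.normal(0.0, 1.0, size=200)

# OLS estimators from the formulas above
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)  # residuals

# Property 1: the fitted line passes through (X-bar, Y-bar)
assert np.isclose(b0 + b1 * x.mean(), y.mean())
# Property 2: residuals have mean zero (up to floating-point error)
assert abs(u_hat.mean()) < 1e-10
# Property 3: residuals are uncorrelated with X
assert abs(np.sum(u_hat * x)) < 1e-7
```

The assertions hold regardless of how the data were generated, which is exactly the "by construction" point made above.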
Example where the properties in the previous slide hold, but the estimator is not consistent:

[Figure: scatter plot of wage against age]
[Figure: the same wage–age scatter, showing the sample data, the population regression function, an outlier, and the sample regression function]

Under the OLS assumptions (see Lecture 4), we have the following
results for β̂1 (the same results hold for β̂0):

1. E(β̂1) = β1. In words, β̂1 is an unbiased estimator of β1.
2. As the sample size n increases, β̂1 gets closer and closer to β1, i.e. β̂1 is a consistent estimator of β1.
3. If n is large, the distribution of β̂1 is well approximated by a normal (see next slide).
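Unbiasedness and consistency can be illustrated with a small Monte Carlo sketch (all parameter values below are invented for illustration): across many samples the slope estimates center on the true β1, and they spread less as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 1.0, 2.0  # invented "true" parameters

def ols_slope(n, rng):
    """Draw one sample of size n and return the OLS slope estimate."""
    x = rng.uniform(0.0, 10.0, size=n)
    y = beta0 + beta1 * x + rng.normal(0.0, 2.0, size=n)
    return np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

small = np.array([ols_slope(20, rng) for _ in range(2000)])
large = np.array([ols_slope(500, rng) for _ in range(2000)])

# Unbiasedness: the sampling distribution is centered on the true beta1
assert abs(small.mean() - beta1) < 0.05
# Consistency: with more data, estimates concentrate around beta1
assert large.std() < small.std()
```

A histogram of `large` would also look approximately normal, matching result 3 above.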
Distribution of OLS estimators:

β̂1 ∼ N(β1, Var(β̂1)),    β̂0 ∼ N(β0, Var(β̂0)),

where

Var(β̂1) = σ̂u² / Σi (Xi − X̄)²

Var(β̂0) = σ̂u² Σi Xi² / (n Σi (Xi − X̄)²)

The square root of the variance is the standard error of the OLS estimators:

se(β̂0) = √var(β̂0)   and   se(β̂1) = √var(β̂1)
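The variance and standard-error formulas can be sketched directly in code (the data below are invented; σ̂u² is the estimated error variance defined on the next slide):

```python
import numpy as np

# Invented data for illustration
rng = np.random.default_rng(2)
n = 150
x = rng.normal(5.0, 2.0, size=n)
y = 1.0 + 0.8 * x + rng.normal(0.0, 1.5, size=n)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

sigma2_hat = np.sum(u_hat ** 2) / (n - 2)  # estimated error variance
ssx = np.sum((x - x.mean()) ** 2)

var_b1 = sigma2_hat / ssx                        # Var(beta1_hat)
var_b0 = sigma2_hat * np.sum(x ** 2) / (n * ssx) # Var(beta0_hat)
se_b1, se_b0 = np.sqrt(var_b1), np.sqrt(var_b0)
```

Note that with X centered far from zero (as here), se(β̂0) is larger than se(β̂1), since Var(β̂0) scales Var(β̂1) by the mean of Xi².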
σ̂u is known as the Root Mean Squared Error (RMSE) of the regression, or simply the standard error of the regression (SER):

σ̂u = √( Σi ûi² / (n − 2) ) = √( Σi (Yi − Ŷi)² / (n − 2) )

When estimating σu², we divide by n − 2, the degrees of freedom, because we lose 2 for estimating β0 and β1.

The standard error of the regression is a measure of the deviation of the Y values around the regression line. Notice the difference with the standard deviation of Y, which measures the deviation of the Y values around the sample mean:

sY = √( Σi (Yi − Ȳ)² / (n − 1) )

[Figure: scatter plot of y against x with the sample mean of y marked]
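The contrast between the SER and sY can be made concrete on invented data where X explains much of the variation in Y, so the deviation around the fitted line is smaller than the deviation around the mean:

```python
import numpy as np

# Invented data: X explains most of the variation in Y
rng = np.random.default_rng(3)
n = 200
x = rng.normal(0.0, 1.0, size=n)
y = 2.0 + 1.0 * x + rng.normal(0.0, 0.5, size=n)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - b0 - b1 * x

# SER: deviation of Y around the regression line (divide by n - 2)
ser = np.sqrt(np.sum(u_hat ** 2) / (n - 2))
# s_Y: deviation of Y around the sample mean (divide by n - 1)
s_y = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))

# The regression line absorbs the variation due to X, so SER < s_Y here
assert ser < s_y
```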
[Figure: scatter plot of y against x with fitted values and the sample mean of y]

[Stata output]
Homoscedastic errors: The variance of each ui is constant for all i.

[Figure: homoscedastic errors vs. heteroscedastic errors]

4. Given the OLS assumptions and homoscedastic errors, the OLS estimators have minimum variance among all unbiased estimators of the β's that are linear functions of the Y's. They are Best Linear Unbiased Estimators (BLUE). This result is known as the Gauss-Markov Theorem.

This means that we are able to estimate the true β0 and β1 more accurately if we use the OLS method rather than any other method that also gives unbiased linear estimators of the true parameters.

3. Goodness of Fit
How could we measure how well the sample regression function fits the data?

We can decompose each Yi value into the fitted (or predicted) part given Xi and the residual part, which we called ûi:

Yi = Ŷi + ûi

Subtracting Ȳ from both sides we have

Yi − Ȳ = (Ŷi − Ȳ) + ûi

Since ûi = Yi − Ŷi,

(Yi − Ȳ) = (Ŷi − Ȳ) + (Yi − Ŷi)

Squaring both sides and summing over all i, the cross term Σi (Ŷi − Ȳ)ûi drops out by the residual properties from Section 2, and we get

Σi (Yi − Ȳ)² = Σi (Ŷi − Ȳ)² + Σi (Yi − Ŷi)²
We may rewrite this as:

TSS = ESS + RSS,

where

Total Sum of Squares (TSS) is the total variation of the actual Y values around their sample average:

TSS = Σi (Yi − Ȳ)²

Explained Sum of Squares (ESS) is the total variation of the fitted Y values around their average (i.e. the variation that is explained by the regression):

ESS = Σi (Ŷi − Ȳ)²

Residual Sum of Squares (RSS) is the unexplained variation of the Y values around the regression line:

RSS = Σi (Yi − Ŷi)² = Σi ûi²
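The decomposition TSS = ESS + RSS can be verified numerically for any OLS fit; this sketch uses invented data:

```python
import numpy as np

# Invented data; the identity TSS = ESS + RSS holds for any OLS fit
rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, size=120)
y = 0.5 + 2.0 * x + rng.normal(0.0, 1.0, size=120)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)      # total variation
ess = np.sum((y_hat - y.mean()) ** 2)  # explained variation
rss = np.sum((y - y_hat) ** 2)         # residual variation

assert np.isclose(tss, ess + rss)
```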
[Figure: scatter plot of y against x with fitted values and the sample mean of y]

Dividing both sides by TSS we get
1 = ESS/TSS + RSS/TSS

The coefficient of determination, or R², is defined as

R² = ESS/TSS = 1 − RSS/TSS

Properties of R²:

1. 0 ≤ R² ≤ 1
   R² = 1 implies a perfect fit (i.e. all the Yi's lie on a straight line)
   R² = 0 implies no linear relationship between Y and X (i.e. the Yi's are randomly distributed around the horizontal line that passes through Ȳ)
2. For the simple regression, R² = ρ²YX, i.e. the squared sample correlation coefficient between Y and X.
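Both properties can be checked in a few lines; this sketch uses invented data:

```python
import numpy as np

# Invented data to check R^2 = 1 - RSS/TSS and R^2 = (sample corr)^2
rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, size=100)
y = 1.0 + 0.7 * x + rng.normal(0.0, 1.0, size=100)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)
rss = np.sum((y - y_hat) ** 2)
r2 = 1.0 - rss / tss

# Property 2: in simple regression, R^2 equals the squared correlation
rho = np.corrcoef(x, y)[0, 1]
assert np.isclose(r2, rho ** 2)
# Property 1: R^2 lies between 0 and 1
assert 0.0 <= r2 <= 1.0
```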
[Stata output]

4. Linearity Assumption
There are two ways in which a regression function can be linear:

Linear in the variables
Linear in the parameters

Linearity in the variables:

Linear function of the independent variables:

Yi = β0 + β1 Xi + ui

NOT a linear function of the independent variables:

Yi = β0 + β1 Xi² + ui
Yi = β0 + β1 (1/Xi) + ui
Yi = β0 + β1 log(Xi) + ui

Linearity in the parameters:

All examples above are linear in the parameters. The following are examples of models that are not linear in the parameters:

Yi = β0 + β1² Xi
Yi = β0 + β0 β1 Xi

From now on, when we say linear regression we will mean a regression that is linear in the parameters. It may or may not be linear in the variables.

5. Example

The concept of beta is very important in financial economics. Beta identifies the relationship between an individual stock's return and the return of the market as a whole.
Individual stock return: change in the stock price plus any payout
(dividend) over some given period as a percentage of its initial price.
Higher values of beta correspond to greater risk.
E.g. If for a 10% increase in the overall market the value of a stock
rises by more than 10%, then the beta for that stock would be
greater than 1.
How do we estimate beta for an individual stock?

We need time series data on stock market performance and values of the
individual stock.
The regression function would be:
(Rit − Rft) = β (Rmt − Rft) + uit,

where R represents the rate of return for an individual stock (i), the market as a whole (m), and the risk-free investment (f), and the index t refers to the time period.

Examples of firms' betas (as of April 9, 2011):
Firm              Ticker symbol   Beta
Campbell Soup     CPB             0.25
Coca-Cola         KO              0.58
Amazon            AMZN            0.95
Microsoft         MSFT            0.96
Google            GOOG            1.04
Apple             AAPL            1.19
Bank of America   BAC             2.43
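The beta regression described above can be sketched on simulated data. Everything below is invented (no real market data); since the model has no intercept, beta is estimated by OLS through the origin, β̂ = Σt xt yt / Σt xt².

```python
import numpy as np

# Simulated daily excess returns (all numbers invented, not real market data)
rng = np.random.default_rng(6)
t = 250  # roughly one year of trading days
mkt_excess = rng.normal(0.0004, 0.01, size=t)  # R_mt - R_ft
true_beta = 1.2
stock_excess = true_beta * mkt_excess + rng.normal(0.0, 0.015, size=t)  # R_it - R_ft

# No intercept in the model, so regress through the origin:
beta_hat = np.sum(mkt_excess * stock_excess) / np.sum(mkt_excess ** 2)
```

A beta estimate above 1, as in this simulation, means the stock's excess return moves more than one-for-one with the market's, i.e. greater risk.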
This note was uploaded on 09/23/2011 for the course ECON 103 taught by Professor Sandra Black during the Spring '07 term at UCLA.