# ECON 103, Lecture 5: Simple Regression II

Maria Casanova, April 12 (Spring 2011)
## 1. Introduction

We consider the linear regression model

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

The OLS estimators are:

$$\hat{\beta}_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2} = \frac{s_{XY}}{s_X^2}, \qquad \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

Outline:

- Properties of OLS (just as we did for $\bar{Y}$, an estimator of $\mu_Y$)
- Goodness of fit of the regression
- Discussion of the linearity assumption

## 2. Properties of OLS

1. The sample regression function obtained through OLS always passes through the sample mean values of $X$ and $Y$.
2. $\bar{\hat{u}} = \frac{1}{n}\sum_i \hat{u}_i = 0$ (the mean value of the residuals is zero).
3. $\sum_i \hat{u}_i X_i = 0$ ($\hat{u}_i$ and $X_i$ are uncorrelated).

Note: these results hold by construction, without the OLS assumptions.

[Figure: scatter of wage against age showing the sample data, an outlier, the population regression function, and the sample regression function. It illustrates a case where the properties above hold but the estimator is not consistent.]

Under the OLS assumptions (see Lecture 4), we have the following results for $\hat{\beta}_1$ (the same results hold for $\hat{\beta}_0$):

1. $E(\hat{\beta}_1) = \beta_1$. In words, $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$.
2. As the sample size $n$ increases, $\hat{\beta}_1$ gets closer and closer to $\beta_1$, i.e. $\hat{\beta}_1$ is a consistent estimator of $\beta_1$.
3. If $n$ is large, the distribution of $\hat{\beta}_1$ is well approximated by a normal distribution.
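The estimator formulas and the three by-construction properties above can be checked numerically. This is a minimal sketch, not part of the lecture; the simulated wage/age data and all parameter values are made up for illustration.

```python
# Sketch: compute the OLS estimates by hand and verify the three
# by-construction properties of the residuals (no OLS assumptions needed).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(20, 60, size=100)             # e.g. age (assumed range)
y = 800 + 30 * x + rng.normal(0, 200, 100)    # e.g. wage, with noise

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)                     # residuals

# Property 1: the fitted line passes through (x-bar, y-bar)
assert np.isclose(b0 + b1 * x.mean(), y.mean())
# Property 2: residuals have mean zero
assert np.isclose(u_hat.mean(), 0, atol=1e-6)
# Property 3: residuals are uncorrelated with X
assert np.isclose(np.sum(u_hat * x), 0, atol=1e-5)
```

The assertions pass up to floating-point rounding, regardless of the noise draw, because the properties hold algebraically for any sample.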
Distribution of the OLS estimators:

$$\hat{\beta}_1 \sim N\!\left(\beta_1, \mathrm{Var}(\hat{\beta}_1)\right), \qquad \hat{\beta}_0 \sim N\!\left(\beta_0, \mathrm{Var}(\hat{\beta}_0)\right),$$

where

$$\mathrm{Var}(\hat{\beta}_1) = \frac{\hat{\sigma}_u^2}{\sum_i (X_i - \bar{X})^2}, \qquad \mathrm{Var}(\hat{\beta}_0) = \frac{\hat{\sigma}_u^2 \sum_i X_i^2}{n \sum_i (X_i - \bar{X})^2}$$

The square root of the variance is the standard error of the OLS estimator:

$$se(\hat{\beta}_0) = \sqrt{\mathrm{Var}(\hat{\beta}_0)} \quad \text{and} \quad se(\hat{\beta}_1) = \sqrt{\mathrm{Var}(\hat{\beta}_1)}$$

$\hat{\sigma}_u$ is known as the Root Mean Squared Error (RMSE) of the regression, or simply the standard error of the regression (SER):

$$\hat{\sigma}_u = \sqrt{\hat{\sigma}_u^2} = \sqrt{\frac{\sum_i \hat{u}_i^2}{n-2}} = \sqrt{\frac{\sum_i (\hat{Y}_i - Y_i)^2}{n-2}}$$

When estimating $\sigma_u^2$ we divide by $n-2$, the degrees of freedom, because we lose 2 degrees of freedom for estimating $\beta_0$ and $\beta_1$.

The standard error of the regression is a measure of the deviation of the $Y$ values around the regression line. Notice the difference with the standard deviation of $Y$, which measures the deviation of the $Y$ values around the sample mean:

$$s_Y = \sqrt{\frac{\sum_i (Y_i - \bar{Y})^2}{n-1}}$$

[Figure: scatter of $y$ against $x$ showing the sample mean of $y$, then the same scatter with the fitted regression line added.]

[Stata regression output omitted.]

Homoscedastic errors: the variance of each $u_i$ is constant for all $i$.

[Figure: homoscedastic errors vs. heteroscedastic errors.]

4. Given the OLS assumptions and homoscedastic errors, the OLS estimators have minimum variance among all unbiased estimators of the $\beta$'s that are linear functions of the $Y$'s. They are Best Linear Unbiased Estimators (BLUE). This result is known as the Gauss-Markov Theorem.

This means that we are able to estimate the true $\beta_0$ and $\beta_1$ more accurately if we use the OLS method rather than any other method that also gives unbiased linear estimators of the true parameters.
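The variance and SER formulas above translate directly into code. A minimal sketch with simulated data (the sample size, true coefficients, and noise level are assumptions for illustration):

```python
# Sketch: standard error of the regression (SER / RMSE) and the standard
# errors of the OLS estimates, using the formulas from the slides.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(0, 1, n)
y = 2.0 + 3.0 * x + rng.normal(0, 1.5, n)     # true beta0=2, beta1=3 (assumed)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

ser = np.sqrt(np.sum(u_hat ** 2) / (n - 2))   # divide by n-2 degrees of freedom
ssx = np.sum((x - x.mean()) ** 2)
se_b1 = np.sqrt(ser ** 2 / ssx)
se_b0 = np.sqrt(ser ** 2 * np.sum(x ** 2) / (n * ssx))

s_y = np.sqrt(np.sum((y - y.mean()) ** 2) / (n - 1))  # spread around y-bar
print(ser, se_b1, se_b0, s_y)
```

In this sample the SER comes out well below $s_Y$: the regression line explains part of the spread of $Y$, so the deviation around the line is smaller than the deviation around the mean.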
## 3. Goodness of Fit

How could we measure how well the sample regression function fits the data?

We can decompose each $Y_i$ value into the fitted (or predicted) part given $X_i$ and the residual part, which we called $\hat{u}_i$:

$$Y_i = \hat{Y}_i + \hat{u}_i$$

Subtracting $\bar{Y}$ from both sides we have

$$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + \hat{u}_i$$

Since $\hat{u}_i = Y_i - \hat{Y}_i$,

$$(Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$$

Squaring both sides and summing over all $i$ (the cross term $2\sum_i (\hat{Y}_i - \bar{Y})\hat{u}_i$ vanishes by properties 2 and 3 of OLS) we get

$$\sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i (Y_i - \hat{Y}_i)^2$$

We may rewrite this as TSS = ESS + RSS, where:

- Total Sum of Squares (TSS) is the total variation of the actual $Y$ values around their sample average: $TSS = \sum_i (Y_i - \bar{Y})^2$
- Explained Sum of Squares (ESS) is the total variation of the fitted $Y$ values around their average (i.e. the variation that is explained by the regression): $ESS = \sum_i (\hat{Y}_i - \bar{Y})^2$
- Residual Sum of Squares (RSS) is the unexplained variation of the $Y$ values around the regression line: $RSS = \sum_i (Y_i - \hat{Y}_i)^2 = \sum_i \hat{u}_i^2$

[Figure: scatter of $y$ against $x$ with the fitted values and the sample mean of $y$.]

Dividing both sides by TSS we get

$$1 = \frac{ESS}{TSS} + \frac{RSS}{TSS}$$

The coefficient of determination, or $R^2$, is defined as

$$R^2 = 1 - \frac{RSS}{TSS} = \frac{ESS}{TSS}$$

Properties of $R^2$:

1. $0 \leq R^2 \leq 1$. $R^2 = 1$ implies a perfect fit (i.e. all the $Y_i$'s lie on a straight line). $R^2 = 0$ implies no linear relationship between $Y$ and $X$ (i.e. the $Y_i$'s are randomly distributed around the horizontal line that passes through $\bar{Y}$).
2. For the simple regression, $R^2 = \rho^2_{YX}$, i.e. the squared sample correlation coefficient between $Y$ and $X$.

[Stata regression output omitted.]
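The decomposition TSS = ESS + RSS, and the equality of $R^2$ with the squared sample correlation in the simple regression, can both be verified numerically. A minimal sketch on simulated data (all parameter values assumed):

```python
# Sketch: verify TSS = ESS + RSS and R^2 = (sample correlation of X and Y)^2.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(0, 1, 150)
y = 1.0 + 0.5 * x + rng.normal(0, 1.0, 150)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

tss = np.sum((y - y.mean()) ** 2)       # total variation of Y
ess = np.sum((y_hat - y.mean()) ** 2)   # variation explained by the regression
rss = np.sum((y - y_hat) ** 2)          # unexplained (residual) variation

r2 = 1 - rss / tss
assert np.isclose(tss, ess + rss)
assert np.isclose(r2, np.corrcoef(x, y)[0, 1] ** 2)  # simple regression only
```

Both identities hold up to rounding for any sample; the second one is specific to the one-regressor case and fails in multiple regression.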
## 4. Linearity Assumption

There are two ways in which a regression function can be linear:

- Linear in the variables
- Linear in the parameters

Linearity in the variables. A linear function of the independent variables:

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

NOT linear functions of the independent variables:

$$Y_i = \beta_0 + \beta_1 X_i^2 + u_i$$
$$Y_i = \beta_0 + \beta_1 \frac{1}{X_i} + u_i$$
$$Y_i = \beta_0 + \beta_1 \log(X_i) + u_i$$

Linearity in the parameters. All examples above are linear in the parameters. The following are examples of models that are NOT linear in the parameters:

$$Y_i = \beta_0 + \beta_1^2 X_i$$
$$Y_i = \beta_0 + \beta_0 \beta_1 X_i$$

From now on, when we say linear regression we will mean a regression that is linear in the parameters. It may or may not be linear in the variables.

## 5. Example

The concept of beta is very important in financial economics. Beta identifies the relationship between an individual stock's return and the return of the market as a whole.

Individual stock return: the change in the stock price plus any payout (dividend) over some given period, as a percentage of its initial price.

Higher values of beta correspond to greater risk. E.g., if for a 10% increase in the overall market the value of a stock rises by more than 10%, then the beta for that stock is greater than 1.

How do we estimate beta for an individual stock? We need time series data on stock market performance and values of the individual stock. The regression function would be:

$$(R_{it} - R_{ft}) = \beta (R_{mt} - R_{ft}) + u_{it},$$

where $R$ represents the rate of return for an individual stock ($i$), the market as a whole ($m$), and the risk-free investment ($f$), and the index $t$ refers to the time period.

[Figure omitted.]
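The beta regression above has no intercept, so the OLS estimator simplifies to $\hat{\beta} = \sum_t x_t y_t / \sum_t x_t^2$. A hedged sketch on simulated excess returns (the return series, the true beta of 1.2, and all distribution parameters are made up for illustration, not real market data):

```python
# Sketch: estimate a stock's beta by regressing the stock's excess return on
# the market's excess return, with no intercept as in the model above.
import numpy as np

rng = np.random.default_rng(3)
t = 120                                      # 10 years of monthly data (assumed)
mkt_excess = rng.normal(0.006, 0.04, t)      # simulated R_mt - R_ft
true_beta = 1.2                              # assumed for the simulation
stock_excess = true_beta * mkt_excess + rng.normal(0, 0.03, t)  # R_it - R_ft

# OLS without an intercept: beta_hat = sum(x*y) / sum(x^2)
beta_hat = np.sum(mkt_excess * stock_excess) / np.sum(mkt_excess ** 2)
print(round(beta_hat, 2))  # should be near the true beta of 1.2
```

With 120 observations the estimate lands close to the true value; with real data one would replace the simulated series with observed stock, market, and risk-free returns over the same periods.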
Examples of firms' betas (as of April 9, 2011):

| Firm | Ticker symbol | Beta |
|------|---------------|------|
| Campbell Soup | CPB | 0.25 |
| Coca-Cola | KO | 0.58 |
| Amazon | AMZN | 0.95 |
| Microsoft | MSFT | 0.96 |
| Google | GOOG | 1.04 |
| Apple | AAPL | 1.19 |
| Bank of America | BAC | 2.43 |

*This note was uploaded on 09/23/2011 for the course ECON 103, taught by Professor Sandra Black during the Spring '07 term at UCLA.*
