More Details on the Simple Linear Regression Model
Tyler Ransom
Univ of Oklahoma
Jan 24, 2019

Today’s plan
1. Review reading topics
1.1 Units of Measurement
1.2 Functional Form
1.3 Conditions for Unbiasedness/Computation of standard errors
2. In-class activity: More practice running regressions and interpreting estimates

Units of measurement

Background
- The three challenges of statistical inference are:
1. Generalizing from sample to population (statistical inference)
2. Generalizing from control to treatment group (causal inference)
3. Generalizing from observed measurements to underlying constructs of interest (measurement)
(Taken from Andrew Gelman’s blog.)

Units of measurement
- Very important to know how y and x are measured in order to interpret regression functions
- Example: CEO salary and the company’s return on equity (roe):
$\widehat{salary} = 963.191 + 18.501\, roe$
$N = 209$, $R^2 = .0132$
- If salary is in thousands of dollars and roe is in percentage points, what is the interpretation of $\hat{\beta}_1 = 18.501$?
- What is the interpretation of $\hat{\beta}_0 = 963.191$?
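One way to see why the units of y and x matter for reading the coefficients is to refit the same regression under different units. The sketch below does that on simulated stand-in data. It is not the CEO salary sample from the slide, and the helper `ols_fit`, the seed, and the data-generating numbers are all assumptions made for illustration.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope, r_squared) for a simple regression of y on x."""
    x_bar, y_bar = x.mean(), y.mean()
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    resid = y - intercept - slope * x
    r_squared = 1.0 - np.sum(resid ** 2) / np.sum((y - y_bar) ** 2)
    return intercept, slope, r_squared

# Simulated stand-in data (NOT the CEO salary sample from the slide).
rng = np.random.default_rng(0)
roe = rng.uniform(0, 50, size=209)                        # percentage points
salary = 900 + 20 * roe + rng.normal(0, 1000, size=209)   # thousands of dollars

b0, b1, r2 = ols_fit(roe, salary)                      # roe in pp, salary in $1000s
b0_dec, b1_dec, r2_dec = ols_fit(roe / 100, salary)    # roe as a decimal
b0_usd, b1_usd, r2_usd = ols_fit(roe, salary * 1000)   # salary in dollars

print("baseline      :", b0, b1, r2)
print("roe as decimal:", b0_dec, b1_dec, r2_dec)
print("salary in $   :", b0_usd, b1_usd, r2_usd)
```

In this sketch, dividing x by 100 multiplies the slope by 100 and leaves the intercept alone, while multiplying y by 1,000 multiplies both coefficients by 1,000. The R-squared is the same in all three fits.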

Changing the units of measurement
- What if now we decide to measure roe as a decimal instead of a percent?
$\widehat{salary} = 963.191 + 1{,}850.1\, roedec$
$N = 209$, $R^2 = .0132$
where $roedec = roe/100$
- And what if salary is in dollars instead of thousands of dollars?
$\widehat{salary} = 963{,}191 + 18{,}501\, roe$
$N = 209$, $R^2 = .0132$

Units, interpretation, and model performance
- Notice how the $R^2$ didn’t change at all when we changed the units!
- Changing the units only changes the interpretation, not the performance of the model
- Typically should choose units that correspond to plausible changes
- e.g. typical ∆roe = 1 pct. point (pp), not 100 pp

Functional Form
- Sometimes a linear function isn’t very realistic
- e.g. a simple wage-education equation:
$\widehat{wage} = -5.12 + 1.43\, educ$
$N = 759$, $R^2 = .133$
where wage is the hourly wage earned, and educ is years of education
- What’s weird about this?
1. educ = 0 ⇒ $\widehat{wage} = -5.12$, a negative predicted wage
2. Constant dollar return to each year of education. The return should be increasing!
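Both oddities can be checked by plugging a few education levels into the fitted line. The snippet below is a minimal sketch using only the two coefficients reported on the slide. The function name `predict_wage` and the education levels chosen are assumptions made for illustration.

```python
# Fitted level-level equation from the slide: wage_hat = -5.12 + 1.43 * educ
B0, B1 = -5.12, 1.43

def predict_wage(educ):
    """Predicted hourly wage from the fitted linear equation."""
    return B0 + B1 * educ

# Predicted wage at a few (arbitrarily chosen) education levels.
for educ in (0, 8, 12, 16):
    print(educ, round(predict_wage(educ), 2))

# The dollar gain from one more year is the same at every education level.
gain_low = predict_wage(9) - predict_wage(8)
gain_high = predict_wage(17) - predict_wage(16)
print(round(gain_low, 2), round(gain_high, 2))
```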

The log transformation
- Instead, consider using log(wage):
$\widehat{\log(wage)} = 1.142 + 0.099\, educ$
$N = 759$, $R^2 = .165$
where log(·) is the natural logarithm
✓ Now we don’t have negative wage when educ = 0
✓ Model allows for increasing returns to educ (but constant percentage effect)
- Interpretation: one-unit ↑ educ corresponds to ≈9.9% ↑ wage
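The 9.9% figure is the usual $100 \cdot \hat{\beta}_1$ approximation. The sketch below compares it with the exact percentage change implied by the fitted line, $100 \cdot (e^{\hat{\beta}_1} - 1)$, and shows that the implied dollar gain grows with education. The helper `implied_wage` is an assumption made for illustration: it simply exponentiates the fitted log wage and ignores the adjustment needed to predict the wage level without bias.

```python
import math

# Fitted log-level equation from the slide: log(wage)_hat = 1.142 + 0.099 * educ
B0, B1 = 1.142, 0.099

approx_pct = 100 * B1                   # the "100 * beta_1" approximation
exact_pct = 100 * (math.exp(B1) - 1)    # exact % change implied by the fitted line

def implied_wage(educ):
    """exp() of the fitted log wage (illustration only, no retransformation adjustment)."""
    return math.exp(B0 + B1 * educ)

# Dollar gain from one more year of education at a low and a high level.
gain_low = implied_wage(9) - implied_wage(8)
gain_high = implied_wage(17) - implied_wage(16)

print(f"approx: {approx_pct:.1f}%  exact: {exact_pct:.1f}%")
print(f"implied wage at educ = 0: {implied_wage(0):.2f}")
print(f"gain from year 9: {gain_low:.2f}, gain from year 17: {gain_high:.2f}")
```

The percentage effect is the same at every education level, while the dollar effect rises, which is the sense in which the model allows increasing returns.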

Other uses of log
- Can also put the log on the x variable (or both); see Table 2.3:

| Model | Dep. Var. | Indep. Var. | Interpretation of $\beta_1$ |
| Level-level | $y$ | $x$ | $\Delta y = \beta_1 \Delta x$ |
| Level-log | $y$ | $\log(x)$ | $\Delta y \approx (\beta_1/100)\,[1\%\Delta x]$ |
| Log-level | $\log(y)$ | $x$ | $\%\Delta y \approx (100\beta_1)\,\Delta x$ |
| Log-log | $\log(y)$ | $\log(x)$ | $\%\Delta y \approx \beta_1\, \%\Delta x$ |

- Note: putting in a log changes the $R^2$ completely
- Use log to allow a nonlinear relationship between y and x while the model is still linear in parameters
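To illustrate "nonlinear in x but linear in parameters", the sketch below simulates a log-log relationship and recovers the elasticity by running OLS on the logged variables. The data-generating process, the value 0.7, and the seed are all hypothetical. `np.polyfit` with degree 1 is used here as a convenient least-squares line fit.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
ELASTICITY = 0.7                            # hypothetical true beta_1

x = rng.uniform(1.0, 100.0, size=n)
u = rng.normal(0.0, 0.3, size=n)
y = 2.0 * x ** ELASTICITY * np.exp(u)       # y is nonlinear in x ...

# ... but log(y) = log(2) + 0.7 * log(x) + u is linear in the parameters,
# so OLS on the logged variables applies. polyfit returns [slope, intercept].
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)

# Log-log reading: a 1% increase in x goes with roughly `slope` percent more y.
exact_pct = 100 * (1.01 ** slope - 1)
print(f"slope = {slope:.3f}, % change in y for +1% x = {exact_pct:.3f}")
```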

Unbiasedness, standard errors

Gauss-Markov Assumptions
1. Linear in parameters
2. Random sampling
3. $Var(x) > 0$
4. $E(u|x) = 0$
5. $Var(u|x) = \sigma^2$ (homoskedasticity)
With (1)-(4) satisfied: OLS estimates are unbiased
With (5) also satisfied: can easily compute standard errors
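Unbiasedness is a statement about the average of $\hat{\beta}_1$ over many random samples, so it can be illustrated with a small Monte Carlo exercise. The sketch below draws repeated samples from a design that satisfies all five assumptions and averages the slope estimates. The parameter values, sample size, number of replications, and seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
BETA0, BETA1, N, REPS = 1.0, 2.0, 100, 2000   # hypothetical population values

slopes = np.empty(REPS)
for r in range(REPS):
    x = rng.normal(5.0, 2.0, size=N)    # random sample with Var(x) > 0
    u = rng.normal(0.0, 3.0, size=N)    # drawn independently of x: E(u|x) = 0, Var(u|x) = 9
    y = BETA0 + BETA1 * x + u           # linear in parameters
    # OLS slope = sample Cov(x, y) / sample Var(x)
    slopes[r] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

mean_slope = slopes.mean()
print(f"average slope over {REPS} samples: {mean_slope:.3f} (true value {BETA1})")
```

Any single estimate can miss the true slope, but the average across samples should sit very close to it.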

Are these crazy assumptions?
On a scale of “not at all” to “absolutely”:
- Linear in parameters: Not too crazy
- Random sampling: Not crazy if cross-sectional data
- $Var(x) > 0$: Not at all crazy
- $E(u|x) = 0$: Absolutely crazy if observational data!
- $Var(u|x) = \sigma^2$: Can be crazy, especially if time series / panel data
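To see why $E(u|x) = 0$ is the assumption to worry about with observational data, the sketch below simulates one hypothetical way it can fail: an unobserved variable (called `ability` here) that moves both x and the error term. The design, names, and numbers are assumptions made for illustration, not something taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
BETA0, BETA1 = 1.0, 2.0                     # hypothetical population values

ability = rng.normal(size=n)                # unobserved, hypothetical confounder
x = 5.0 + ability + rng.normal(size=n)      # x depends on ability
u = 1.5 * ability + rng.normal(size=n)      # so does the error: E(u|x) != 0
y = BETA0 + BETA1 * x + u

slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
expected_bias = 1.5 / 2.0                   # Cov(x, u) / Var(x) in this design

print(f"OLS slope: {slope:.3f}, true slope: {BETA1}, expected bias: {expected_bias}")
```

Even with a very large sample the slope stays near 2.75 rather than 2, so more data does not fix the problem.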

Why do we need to make these assumptions?
You might wonder why we bother to make these assumptions
- We do econometrics to learn something about a population of interest
- We can’t learn much if we don’t make any assumptions!
- Bothered by these assumptions?
- Think: “tell how to conduct statistical inference on experimental data”

Variance of OLS estimators
- Last time, we introduced the formulas for OLS estimators
- Also interested in their variance
- So we know how far away $\hat{\beta}$ is expected to be from $\beta$
- A big component of these variances is $\sigma^2 = Var(u)$, which we estimate with
$\hat{\sigma}^2 = \frac{SSR}{N-2} = \frac{1}{N-2} \sum_{i=1}^{N} \hat{u}_i^2$
- Once we have $\hat{\sigma}^2$, we can obtain the SE of the $\hat{\beta}$’s
$Var(\hat{\beta}_0) = \frac{\sigma^2 \sum_{i=1}^{N} x_i^2}{N \sum_{i=1}^{N} (x_i - \bar{x})^2}$
$Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} = \frac{\sigma^2}{SST_x}$
- Don’t worry about memorizing these formulas
- Key takeaway: we can write them down in a fairly compact form
- We can do that because of the assumptions we made
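As a check on the formulas, the sketch below computes $\hat{\sigma}^2$ and the two standard errors by hand on simulated data, replacing $\sigma^2$ with $\hat{\sigma}^2$, and compares them with the matrix expression $\hat{\sigma}^2 (X'X)^{-1}$. The data-generating numbers and the seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=n)   # hypothetical population model

# OLS estimates
x_bar = x.mean()
sst_x = np.sum((x - x_bar) ** 2)
b1 = np.sum((x - x_bar) * (y - y.mean())) / sst_x
b0 = y.mean() - b1 * x_bar

# sigma^2 hat = SSR / (N - 2)
resid = y - b0 - b1 * x
sigma2_hat = np.sum(resid ** 2) / (n - 2)

# Estimated variances from the slide's formulas, with sigma^2 replaced by sigma2_hat
var_b1 = sigma2_hat / sst_x
var_b0 = sigma2_hat * np.sum(x ** 2) / (n * sst_x)
se_b0, se_b1 = np.sqrt(var_b0), np.sqrt(var_b1)

# Cross-check: the diagonal of sigma2_hat * (X'X)^{-1} gives the same variances
X = np.column_stack([np.ones(n), x])
cov_matrix = sigma2_hat * np.linalg.inv(X.T @ X)

print(f"b0 = {b0:.3f} (SE {se_b0:.3f}), b1 = {b1:.3f} (SE {se_b1:.3f})")
```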