Measuring the Relationship between 2 Variables
Correlation: quantifying the relationship between 2 variables
• Correlation does not imply causality
• Causation: moving x leads to a change in y
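As an illustration of "quantifying the relationship", here is a minimal sketch of the Pearson correlation coefficient. The notes do not say which correlation measure is meant, so Pearson's r is an assumption, and the data and the function name pearson_r are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y move together around their means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))
```

A value near +1 or -1 indicates a strong linear association, but by itself it says nothing about whether moving x causes y to change.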
Linear Regression
• Predicted: $\hat{Y}_i = a + bX_i$
LHS variable = intercept + coefficient × RHS variable
• Actual/Observed: $Y_i = a + bX_i + \text{error}$
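The two equations can be sketched in code as follows. The notes do not say how a and b are chosen; this sketch assumes ordinary least squares, and the data and the helper name fit_line are hypothetical.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line y = a + b*x, assuming ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # coefficient on the RHS variable
    a = mean_y - b * mean_x    # intercept
    return a, b

# Made-up illustrative data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
predicted = [a + b * xi for xi in x]                      # Y-hat_i = a + b*X_i
errors = [yi - yhat for yi, yhat in zip(y, predicted)]    # Y_i - Y-hat_i

print(round(a, 4), round(b, 4))
```

Each observed Y equals its predicted value plus its error, and with a least-squares fit the errors sum to zero (up to floating-point rounding).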
Estimating Relationships with Randomized Experiments
The p-value of the test tells you the following:
If the null hypothesis is true (there’s no effect), what is the probability you would have seen a difference between conditions at least this extreme, just by chance?
If the p-value is low enough, we can reject the null hypothesis
If the p-value is higher, we cannot reject the null hypothesis
It doesn’t mean the alternative hypothesis is true
If p-value < 5%, the difference is significant at the 5% level
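One way to make "at least this extreme, just by chance" concrete is an exact permutation test: reshuffle the condition labels in every possible way and count how often the difference in means is as large as the one observed. The notes do not name a specific test, so this choice is an assumption, and the outcome values and the function name permutation_p_value are hypothetical.

```python
from itertools import combinations

def permutation_p_value(treatment, control):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(treatment) + list(control)
    k = len(treatment)
    observed = abs(sum(treatment) / k - sum(control) / len(control))
    extreme = 0
    total = 0
    # Enumerate every way the pooled outcomes could have been split into groups
    for idx in combinations(range(len(pooled)), k):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        # Small tolerance guards against floating-point ties
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Made-up outcomes for a tiny randomized experiment
treated = [5, 6, 7, 8]
untreated = [1, 2, 3, 4]
p = permutation_p_value(treated, untreated)
print(round(p, 4))
print("significant at the 5% level" if p < 0.05 else "cannot reject the null")
```

Full enumeration is only practical for very small samples; larger experiments would typically sample random relabelings or use a t-test instead.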
SEE (the standard error of the estimate) is a measure of how much the data varies around the regression line
We can use it to compare regressions with the same dependent variable
The smaller the SEE, the better the fit
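The notes do not give a formula for the SEE. A common definition for a simple regression, assumed here, is the square root of the sum of squared errors divided by n - 2. The data are made up, and a = 2.2, b = 0.6 is the least-squares line for those data.

```python
import math

def standard_error_of_estimate(xs, ys, a, b):
    """SEE for the line y = a + b*x, assuming the n - 2 degrees-of-freedom form."""
    n = len(xs)
    # Sum of squared vertical distances from each point to the line
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (n - 2))

# Made-up illustrative data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

see = standard_error_of_estimate(x, y, a=2.2, b=0.6)
print(round(see, 4))
```

Because the SEE is measured in the units of Y, it is only comparable across regressions that share the same dependent variable.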
Confidence intervals
68% of the time, our prediction for Y at a given X will be within ±1 standard error of the actual value
It means, “We are 68% certain that the actual Y lies within ±1 standard error of the predicted Y”
95% of the time, our prediction will be within ±2 (more precisely, ±1.96) standard errors of the actual value
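The intervals above amount to simple arithmetic around the predicted value. The sketch below uses hypothetical numbers for the prediction and its standard error; the helper name prediction_interval is also made up.

```python
def prediction_interval(y_hat, std_error, multiplier):
    """Return (low, high) = y_hat minus/plus multiplier * std_error."""
    margin = multiplier * std_error
    return y_hat - margin, y_hat + margin

y_hat = 4.0       # hypothetical predicted Y at some X
std_error = 0.9   # hypothetical standard error

low68, high68 = prediction_interval(y_hat, std_error, 1.0)    # roughly 68%
low95, high95 = prediction_interval(y_hat, std_error, 1.96)   # roughly 95%

print(round(low68, 3), round(high68, 3))
print(round(low95, 3), round(high95, 3))
```

The 68% and 95% figures come from the normal distribution, so this rule of thumb is most reasonable when the errors are roughly normal and the sample is not tiny.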
The Standard Error of the Coefficient
68% of the time, the true relationship between Y and X will be within ±1 standard error of the coefficient
95% of the time, the true relationship between Y and X will be within ±1.96 standard errors of the coefficient
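The notes do not show how the standard error of the coefficient is computed. For a simple regression, a standard formula (assumed here) divides the SEE by the square root of the sum of squared deviations of X from its mean. The numbers below are hypothetical: b = 0.6 and SEE = sqrt(0.8) correspond to a least-squares fit of y = [2, 4, 5, 4, 5] on these x values.

```python
import math

def coefficient_std_error(xs, see):
    """S.E.(b) for a simple regression, assuming S.E.(b) = SEE / sqrt(Sxx)."""
    mean_x = sum(xs) / len(xs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return see / math.sqrt(sxx)

# Hypothetical inputs, not from the course
x = [1, 2, 3, 4, 5]
b = 0.6
see = math.sqrt(0.8)

se_b = coefficient_std_error(x, see)
ci68 = (b - se_b, b + se_b)                  # roughly 68% interval
ci95 = (b - 1.96 * se_b, b + 1.96 * se_b)    # roughly 95% interval

print(round(se_b, 4))
print(tuple(round(v, 4) for v in ci95))
```

More spread in X (a larger Sxx) or a tighter fit (a smaller SEE) both shrink the standard error of the coefficient.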
T-Statistic/p-value
$t = (b - b_H) / \text{S.E.}(b)$
t = the number of standard errors the coefficient b is away from the hypothesized value $b_H$ (here 0)
If the t-stat is close to 1, the p-value is about .32
If the t-stat is close to 2, the p-value is about .05
With $b_H = 0$, t-stat = the coefficient / std error of coefficient
If b is 1 std error away from 0, the t-stat is 1, and we can be 68% certain that the true relationship is not 0.
If b is 1.96 std errors away from 0, the t-stat is 1.96, and we can be 95% certain that the true relationship is not 0.
If absolute value of t > 1.96, p-value < 0.05
If absolute value of t < 1.96, p-value > 0.05
p-value < 5%: statistically significant at the 5% level
The p-value is the probability we will be wrong when we reject the hypothesis that the true
value of the coefficient is 0
If the p-value is less than .05, we say that we can reject the hypothesis of no relationship between X and Y at the 5% level of significance, or that the coefficient is statistically significant at the 5% level
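The t-statistic and the matching two-sided p-value can be sketched as below. The p-value uses the normal approximation, which is what the 1.96 cutoff corresponds to; for small samples a t-distribution would give somewhat larger p-values. The coefficient and standard error are hypothetical, as are the function names.

```python
import math

def t_stat(b, se_b, b_hypothesized=0.0):
    """Number of standard errors between b and the hypothesized value."""
    return (b - b_hypothesized) / se_b

def two_sided_p(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2))

print(round(two_sided_p(1.0), 4))    # t-stat of 1
print(round(two_sided_p(1.96), 4))   # t-stat of 1.96

# Hypothetical coefficient and standard error
t = t_stat(b=0.6, se_b=0.3)
p_val = two_sided_p(t)
print(round(t, 2), round(p_val, 4))
print("significant at the 5% level" if p_val < 0.05 else "not significant at the 5% level")
```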
R-squared, Adjusted R-squared, and Choosing the Right Multiple Regression
$\text{SST} = \text{SSR} + \text{SSE}$
$R^2 = \text{SSR} / \text{SST}$ or $R^2 = 1 - \text{SSE} / \text{SST}$
$R^2$ in a simple regression = square of the correlation between x and y
$\text{SSR} = \text{SST} - \text{SSE}$
Total variation = explained variation + unexplained variation
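The decomposition can be checked numerically. This sketch assumes a least-squares fit and reads SSR as the explained (regression) sum of squares and SSE as the unexplained (error) sum of squares, matching the order of the terms in the last line above. The data and the function name are made up.

```python
def sums_of_squares(xs, ys):
    """Return (SST, SSR, SSE) for a least-squares line fitted to xs, ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [a + b * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)                      # total variation
    ssr = sum((p - mean_y) ** 2 for p in predicted)               # explained variation
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))        # unexplained variation
    return sst, ssr, sse

# Made-up illustrative data, not from the course
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

sst, ssr, sse = sums_of_squares(x, y)
r_squared = ssr / sst
print(round(sst, 4), round(ssr, 4), round(sse, 4))
print(round(r_squared, 4), round(1 - sse / sst, 4))
```

Both expressions for R-squared agree because SSR = SST - SSE holds for a least-squares fit that includes an intercept.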
 Spring '08
 KAHN
 Linear Regression, Regression Analysis, Excess Return, average earnings, Randomized Experiments
