7. Suppose you repeat this experiment 99 times, each time using the
same \(\beta_1\), \(\beta_2\), and X values. Of course, the \(u_i\) values will vary from experiment
to experiment. Therefore, in all you have 100 experiments, thus generating
100 values each of \(\hat\beta_1\) and \(\hat\beta_2\). (In practice,
[women] are disposed, as a rule and on average, to increase their consumption as their income
increases, but not by as much as the increase in their
income, that is, the marginal propensity to consume (MPC) is greater than
zero but less than one. Although
\(\sum x_i^2\). That is, given \(\sigma^2\), the larger the variation in the X values, the
smaller the variance of \(\hat\beta_2\) and hence the greater the precision with which \(\beta_2\)
can be estimated. In short, given \(\sigma^2\), if there is substantial variation in the X
values (recall Assumption 8)
\(x_i^2\) = variation of the estimated Y
values about their mean (\(\bar{\hat{Y}} = \bar{Y}\)), which appropriately may be called the
sum of squares due to regression [i.e., due to the explanatory variable(s)], or
explained by regression, or simply the explained sum of squares (ES
estimate of the expected or mean value of Y corresponding to the chosen X
value; that is, \(\hat{Y}_i\) is an estimate of \(E(Y \mid X_i)\). The value of \(\hat\beta_2 = 0.5091\), which
measures the slope of the line, shows that, within the sample range of X
between $80 and $260 per week, as \(X_i\)
Some of the properties of r are as follows (see Figure 3.11):
1. It can be positive or negative, the sign depending on the sign of the
term in the numerator of (3.5.13), which measures the sample covariation of
two variables.
2. It lies between the limits of
or, alternatively, as
(3.5.5a)
The quantity \(r^2\) thus defined is known as the (sample) coefficient of determination and is the most commonly used
measure of the goodness of fit of
a regression line. Verbally, \(r^2\)
measures the proportion or percentage of the
\(\hat\sigma = \sqrt{\dfrac{\sum \hat{u}_i^2}{n-2}}\)   (3.3.8)
is known as the standard error of estimate or the standard error of the
regression (se). It is simply the standard deviation of the Y values about
the estimated regression line and is often used as a summary measure of
the goodness of fit of the es
OLS are based on the assumptions of CLRM already discussed and are
enshrined in the famous Gauss–Markov theorem. But before we turn to
this theorem, which provides the theoretical justification for the popularity
of OLS, we first need to consider the precis
is more diffused or widespread around the mean value than the distribution
of \(\hat\beta_2\). In other words, the variance of \(\beta_2^*\) is larger than the variance of \(\hat\beta_2\).
Now given two estimators that are both linear and unbiased, one would
choose the estimator with the sm
is very large (technically, infinite). A general discussion of finite-sample and
large-sample properties of estimators is given in Appendix A.
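
The repeated-sampling idea used above (hold the X values fixed, redraw the disturbances, and re-estimate each time) can be sketched as a small simulation. This is only an illustrative sketch: the parameter values \(\beta_1 = 20\), \(\beta_2 = 0.6\), \(\sigma = 5\), the evenly spaced X values between 80 and 260, and the rival "endpoint" estimator are all assumptions chosen for the illustration, not values taken from the text. Both estimators are linear in Y and unbiased under fixed X, so the comparison of interest is their spread across experiments.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope: sum of cross-deviations divided by sum of squared X deviations."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def endpoint_slope(x, y):
    """Hypothetical rival estimator: slope of the line through the first and last points.

    It is linear in Y and unbiased when X is fixed, but it ignores the interior points.
    """
    return (y[-1] - y[0]) / (x[-1] - x[0])


def simulate(n_experiments=100, beta1=20.0, beta2=0.6, sigma=5.0, seed=0):
    """Repeat the experiment with the same X values and fresh disturbances each time."""
    rng = random.Random(seed)
    x = list(range(80, 261, 20))  # assumed X values: 80, 100, ..., 260
    ols_estimates, rival_estimates = [], []
    for _ in range(n_experiments):
        # Only the disturbances u_i change from one experiment to the next.
        y = [beta1 + beta2 * xi + rng.gauss(0.0, sigma) for xi in x]
        ols_estimates.append(ols_slope(x, y))
        rival_estimates.append(endpoint_slope(x, y))
    return ols_estimates, rival_estimates


if __name__ == "__main__":
    ols, rival = simulate()
    print("mean of OLS slopes:      ", statistics.fmean(ols))
    print("mean of rival slopes:    ", statistics.fmean(rival))
    print("variance of OLS slopes:  ", statistics.variance(ols))
    print("variance of rival slopes:", statistics.variance(rival))
```

With enough experiments, both averages should settle near the assumed \(\beta_2\), while the rival estimator's values should be more widely scattered, which is the sense in which one would prefer the estimator with the smaller variance.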
3.5 THE COEFFICIENT OF DETERMINATION \(r^2\):
A MEASURE OF GOODNESS OF FIT
Thus far we were concerned with the prob
interpretation of the intercept term does not make economic sense in the present instance because the
zero income value is out of the range of values we are working with and does not represent a likely outcome
(see Table I.1). As we will see on many an oc
From (3.7.2) we see that if total expenditure increases by
1 rupee, on average, expenditure on food goes up by
about 44 paise (1 rupee = 100 paise). If total expenditure were zero, the average expenditure on food
would be about 94 rupees. Again, such a mec
to draw inferences on the population parameters, the coefficients.
5. The overall goodness of fit of the regression model is measured by the
coefficient of determination, \(r^2\). It tells what proportion of the variation in
the dependent variable, or regressan
average hourly earnings (Y) on
education (X), we obtain the following results.
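\(\hat{Y}_i = 0.0144 + 0.7241\,X_i\)   (3.7.3)
\(\operatorname{var}(\hat\beta_1) = 0.7649\)   \(\operatorname{se}(\hat\beta_1) = 0.8746\)
\(\operatorname{var}(\hat\beta_2) = 0.00483\)   \(\operatorname{se}(\hat\beta_2) = 0.0695\)
\(r^2 = 0.9077\)   \(\hat\sigma^2 = 0.8816\)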
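
As a quick arithmetic check on the results reported with (3.7.3), each standard error should be the positive square root of the corresponding variance. The short sketch below only re-derives the two se values from the two var values printed above; the variable names are illustrative.

```python
import math

# Variances of the intercept and slope estimates as reported with (3.7.3).
var_b1 = 0.7649
var_b2 = 0.00483

# A standard error is the positive square root of the variance.
se_b1 = math.sqrt(var_b1)
se_b2 = math.sqrt(var_b2)

print(round(se_b1, 4))  # 0.8746
print(round(se_b2, 4))  # 0.0695
```

Both values agree with the reported standard errors to four decimal places.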
As the regression results show, there is a positive association
The greater the extent of the overlap, the greater the variation in Y is explained by X. The \(r^2\)
is simply a numerical measure of this overlap. In the
figure, as we move from left to right, the area of the overlap increases, that
is, successively a greater
is a more meaningful measure than r, for
the former tells us the proportion of variation in the dependent variable
explained by the explanatory variable(s) and therefore provides an overall
measure of the extent to which the variation in one variable dete
will be presented in Chap. 8.
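
The relation between r and \(r^2\) discussed above can be checked numerically. The sketch below fits a two-variable regression by least squares, computes \(r^2\) as the explained sum of squares divided by the total sum of squares, and computes r from the sample covariation. The function name and the small data set in the usage note are hypothetical, chosen only for illustration.

```python
import math
import statistics


def fit_with_r(x, y):
    """Fit Y on X by OLS and return the estimates together with r and r-squared."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta2 = sxy / sxx
    beta1 = y_bar - beta2 * x_bar

    # Explained sum of squares: variation of the fitted values about the mean of Y.
    y_hat = [beta1 + beta2 * xi for xi in x]
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)

    r_squared = ess / syy
    r = sxy / math.sqrt(sxx * syy)  # takes the sign of the sample covariation
    return {"beta1": beta1, "beta2": beta2, "r": r, "r_squared": r_squared}


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    x = [1, 2, 3, 4, 5]
    y = [2.0, 4.1, 5.9, 8.2, 9.8]
    print(fit_with_r(x, y))
```

In the two-variable case the square of r equals \(r^2\) up to rounding, r carries the sign of the slope, and \(r^2\) is nonnegative. Reversing the order of the y values in the example flips the sign of r but leaves \(r^2\) unchanged.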
EXAMPLE 3.1
CONSUMPTION–INCOME RELATIONSHIP
IN THE UNITED STATES, 1982–1996
Let us return to the consumption–income data given in
Table I.1 of the Introduction. We have already shown the
data in Figure I.3 along with the estima
intercept term may not be always meaningful, although in the present example it can be argued that a
family without any income (because of unemployment, layoff, etc.) might maintain some minimum level
of consumption expenditure either by borrowing or diss
\(\hat\beta_2\) to be distributed symmetrically (but more on this in Chapter 4). As the
figure shows, the mean of the \(\hat\beta_2\) values, \(E(\hat\beta_2)\), is equal to the true \(\beta_2\). In this
situation we say that \(\hat\beta_2\) is an unbiased estimator of \(\beta_2\). In Figure 3.8(b) we
have shown the sampling distrib
As noted earlier, given the assumptions of the classical linear regression
model, the least-squares estimates possess some ideal or optimum properties. These properties are
contained in the well-known Gauss–Markov theorem. To understand this theorem, we