MODULE 6: MULTIPLE REGRESSION
CONTENTS:
1. Introduction
2. Matrix Notation
3. Forecasting
4. Data Problems
4.1 Multicollinearity
4.2 Measurement Errors
4.3 Outliers and Undue Influence
5. The F Test for Linear Restrictions
5.1 The Redundant Variables Test
5.2 The Linear Restrictions or Wald Test
5.3 The Chow Test
6. Dummy Variables
6.1 Introduction
6.2 Generating Dummy Variables and Time Trends in Eviews
6.3 Interpreting the Coefficients of Dummy Variables
6.4 Comparing Regression Models
6.5 An Event Study
7. Financial Applications
7.1 Testing the CAPM
7.1.1 The Fama-MacBeth Approach
7.1.2 The Black-Jensen-Scholes Approach
7.2 The Predictability of Share Returns
7.3 Using the APT Model
7.3.1 Introduction
7.3.2 Estimating and Testing the APT Model
7.4 Testing Market Timing and Stock-Selection Performance
REFERENCES
Gujarati, Ch 7, 8, 9 and 10
Johnston and DiNardo, Ch 3
Thomas, Ch 7 and 9
Wooldridge, Ch 4.4-4.5, 7.2-7.3, 9.4, Appendices D & E
Other Useful References
Z.Y. Bello and V. Janjigian (1997) A Reexamination of the Market-Timing and Security-Selection
Performance of Mutual Funds, Financial Analysts Journal, September-October, p 24-30.
E.R. Berndt (1991) The Practice of Econometrics: Classic and Contemporary, Ch 2
W. Daniel (1990) Applied Nonparametric Statistics, Duxbury, Boston, p 63-66.
M. Grinblatt and S. Titman (1998) Financial Markets and Corporate Strategy, Chs 5-6
M. Kritzman (1994) What Practitioners Need to Know about Serial Dependence,
Financial Analysts Journal, March-April, p 19-22
1. INTRODUCTION
When there are two or more independent variables we have what is called a Multiple
Regression model. Where there are k = 2 independent variables we write this model
either as:
E(Y|X1i, X2i) = β0 + β1X1i + β2X2i
or
Yi = β0 + β1X1i + β2X2i + ui
(i = 1 , ... , n)
The coefficients β1 and β2 in this model are what we call partial derivatives. This
means that a coefficient such as β1 tells us the impact on Y of a unit change in X1,
when the values of all other independent variables are held constant.
Once we have two or more independent variables the formulae used can become very
large and very complicated. It is possible to write these formulae more concisely if we
use the matrix notation discussed in section 2. The formulae needed when
forecasting with this type of model are discussed in section 3.
There are many situations in which the data we have to work with leads to problems
with these estimators so that they no longer possess the desired properties. The key
data problems that we need to take into consideration are discussed in section 4.
When we estimate a multiple regression model the estimates of the coefficients are
not independent if the independent variables are related to each other. Because of
this the t statistics are not always reliable so we will often use F statistics instead.
We will look at using F statistics for testing for groups of variables and more
complicated restrictions on the coefficients in section 5.
There are many situations in which our Y variable is affected by factors for which we
do not possess a set of numerical values. Here we have to create what are called
dummy variables. We will examine the way in which the coefficients are interpreted
and how these variables are used in section 6.
In section 7, we examine some financial applications that use the multiple regression
model. These include various approaches to testing the CAPM, using the APT model and
evaluating fund managers.
2. MATRIX NOTATION
If there are two or more independent variables and a sample of n observations then
we use the following matrices or blocks of values when we write our model and the
corresponding formulae. With k = 2 independent variables we use the matrices:
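$$
y_{(n \times 1)} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\hat{y}_{(n \times 1)} = \begin{bmatrix} \hat{y}_1 \\ \hat{y}_2 \\ \vdots \\ \hat{y}_n \end{bmatrix}, \quad
X_{(n \times 3)} = \begin{bmatrix} 1 & X_{11} & X_{21} \\ 1 & X_{12} & X_{22} \\ \vdots & \vdots & \vdots \\ 1 & X_{1n} & X_{2n} \end{bmatrix}, \quad
u_{(n \times 1)} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
$$
and
$$
\beta_{(3 \times 1)} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}, \quad
\hat{\beta}_{(3 \times 1)} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \hat{\beta}_2 \end{bmatrix}
$$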
This can easily be extended to any number of variables just by adding extra columns
to the X matrix and rows to the $\beta$ and $\hat{\beta}$ vectors.
The PRF can be written in matrix form as
$$ y = X\beta + u $$
and the SRF can be written in matrix form as
$$ \hat{y} = X\hat{\beta} $$
The formula for the OLS estimated coefficients is as follows:
$$ \hat{\beta} = (X'X)^{-1}X'y $$
Details of the derivation are given in the appendix to this module. You can show that
the matrix formula with one independent variable gives the same formulae for $\hat{\beta}_0$ and
$\hat{\beta}_1$ we found in the univariate regression module. You should compare this formula
with the formula for the slope in the simple linear regression model where
$$
\hat{\beta}_1 = \frac{\sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y})}{\sum_{t=1}^{n} (X_t - \bar{X})^2}
= \frac{\sum_{t=1}^{n} x_t y_t}{\sum_{t=1}^{n} x_t^2}
$$
and the lower-case $x_t$ and $y_t$ denote deviations from the sample means.
In the matrix formula the expression X′ is called the transpose of the matrix X. We
obtain the transpose of any matrix by taking the original matrix and turning it on its
side so that all the columns now become rows. The inverse of the product of the
transpose and the X matrix, which we write as $(X'X)^{-1}$, plays the same role as dividing
by the expression
$$ \sum_{t=1}^{n} (X_t - \bar{X})^2 $$
which appears in the denominator. The matrix product X′X contains the sums of squares and
cross-products of all the independent variables. The other matrix product
X′y is similar to the term
$$ \sum_{t=1}^{n} (X_t - \bar{X})(Y_t - \bar{Y}) $$
which appears in the numerator. This matrix product contains the sums of the products of
each independent variable with the dependent variable.
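The equivalence between the matrix formula and the simple regression formulae can be checked numerically. The following is a minimal sketch in Python with NumPy; the data values are hypothetical and are used only for illustration.

```python
import numpy as np

# Hypothetical sample with one independent variable (illustrative values only).
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.1, 12.9])

# Design matrix X: a column of ones for the constant, then the X1 values.
X = np.column_stack([np.ones_like(X1), X1])

# Matrix formula: beta_hat = (X'X)^{-1} X'y.
# np.linalg.solve solves (X'X) b = X'y without forming the inverse explicitly.
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Simple regression formulae written in deviations from the means.
x = X1 - X1.mean()
y = Y - Y.mean()
slope = (x * y).sum() / (x ** 2).sum()
intercept = Y.mean() - slope * X1.mean()

print(beta_hat, (intercept, slope))
```

The two sets of estimates should agree up to rounding error.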
When we wish to test whether any individual coefficient is significantly different from
0 we use the t statistic. In the simple linear regression model this t statistic is equal to
the ratio of the estimated coefficient and the estimated standard deviation of the
possible values of this estimated coefficient. The degrees of freedom of this t statistic
will be n - 2 because we use 2 estimates, $\hat{\beta}_0$ and $\hat{\beta}_1$, when we obtain the estimate of
the variance of the error terms used in this formula. We write this formula in the
following way:
$$
t_{n-2} = \frac{\hat{\beta}_1}{\sqrt{V(\hat{\beta}_1)}}
= \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)}
= \frac{\hat{\beta}_1}{\hat{\sigma} \Big/ \sqrt{\sum_{i=1}^{n} x_i^2}}
$$
With the multiple regression model, where we have to obtain the estimates of the
constant term $\hat{\beta}_0$ and the coefficients $\hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k$ of the k independent
variables, the formula for the t statistic is similar to the above expression. Because we
have (k + 1) estimates the degrees of freedom will be (n - k - 1). The variance of the
possible values of the estimate $\hat{\beta}_i$ of the coefficient of independent variable i is found
from what is called the variance-covariance matrix, which is the product of the
estimated variance of the error terms and the $(X'X)^{-1}$ matrix. We write it in the following
way
$$ V(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1} $$
This is a square matrix with (k + 1) rows and columns. The variances of the possible
values of all the estimated coefficients are found on the main diagonal of this matrix.
For $\hat{\beta}_i$ the variance is the (i + 1)th term down the main diagonal. The off-diagonal or
(i, j)th terms contain the covariances between the estimated values of the different
coefficients $\hat{\beta}_i$ and $\hat{\beta}_j$.
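The variance-covariance matrix and the resulting t statistics can be sketched in a few lines of Python with NumPy. The function name `ols_with_t_stats` and the data are hypothetical, chosen only to illustrate the formulae above with k = 2.

```python
import numpy as np

def ols_with_t_stats(X, y):
    """Return OLS estimates, variance-covariance matrix, standard errors and t statistics.

    X must already contain the column of ones for the constant term.
    """
    n, p = X.shape                            # p = k + 1 estimated coefficients
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = (resid @ resid) / (n - p)    # divide by n - k - 1
    V = sigma2_hat * XtX_inv                  # variances on the diagonal, covariances off it
    se = np.sqrt(np.diag(V))
    return beta_hat, V, se, beta_hat / se

# Hypothetical data with k = 2 independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 6.0, 9.0])
y = np.array([5.2, 5.9, 10.1, 10.8, 17.3, 16.2, 19.1, 24.8])
X = np.column_stack([np.ones(len(y)), X1, X2])

beta_hat, V, se, t_stats = ols_with_t_stats(X, y)
print(t_stats)
```

Each t statistic would then be compared with the critical value of the t distribution with (n - k - 1) = 5 degrees of freedom.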
To obtain the coefficient of determination R2 or the estimated variance of the error
terms $\hat{\sigma}^2$ we use our different measures of variation. Using matrix notation we write
these formulae in the following ways:
$$
\text{Total Variation or TSS} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = y'y - \frac{1}{n}\left(\sum_{i=1}^{n} Y_i\right)^2
$$
$$
\text{Explained Variation or ESS} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}'X'y - \frac{1}{n}\left(\sum_{i=1}^{n} Y_i\right)^2
$$
$$
\text{Residual Variation or RSS} = TSS - ESS = y'y - \hat{\beta}'X'y
$$
The formula for the estimated variance of the error terms is now written as:
$$
\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - k - 1} = \frac{y'y - \hat{\beta}'X'y}{n - k - 1}
$$
The formula for the coefficient of determination R2 will be written in the following way:
$$
R^2 = \frac{\text{Explained Variation}}{\text{Total Variation}}
= 1 - \frac{\text{Residual Variation}}{\text{Total Variation}}
= 1 - \frac{y'y - \hat{\beta}'X'y}{y'y - \frac{1}{n}\left(\sum_{i=1}^{n} Y_i\right)^2}
$$
The formula for the F statistic, which is used in the ANOVA procedure, can also be
written using matrix notation. Here we now have:
$$
F_{k,\,n-k-1} = \frac{\text{(Explained + Residual) Variance}}{\text{Residual Variance}}
= \frac{ESS / k}{RSS / (n - k - 1)}
= \frac{\left[\hat{\beta}'X'y - \frac{1}{n}\left(\sum_{i=1}^{n} Y_i\right)^2\right] / k}{\left[y'y - \hat{\beta}'X'y\right] / (n - k - 1)}
$$
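The matrix expressions for the measures of variation, R2 and the F statistic can be sketched as follows in Python with NumPy. The function name `anova_quantities` and the data are hypothetical; the sketch assumes the X matrix already contains the column of ones.

```python
import numpy as np

def anova_quantities(X, y):
    """TSS, ESS, RSS, R^2 and F from the matrix formulae (X includes the constant column)."""
    n, p = X.shape
    k = p - 1
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    correction = y.sum() ** 2 / n              # (1/n) (sum of Y_i)^2
    tss = y @ y - correction                   # y'y - (1/n)(sum Y_i)^2
    ess = beta_hat @ X.T @ y - correction      # beta_hat'X'y - (1/n)(sum Y_i)^2
    rss = y @ y - beta_hat @ X.T @ y           # y'y - beta_hat'X'y
    r2 = 1.0 - rss / tss
    f_stat = (ess / k) / (rss / (n - k - 1))
    return tss, ess, rss, r2, f_stat

# Hypothetical data with k = 2 independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 6.0, 9.0])
y = np.array([5.2, 5.9, 10.1, 10.8, 17.3, 16.2, 19.1, 24.8])
X = np.column_stack([np.ones(len(y)), X1, X2])

tss, ess, rss, r2, f_stat = anova_quantities(X, y)
print(r2, f_stat)
```

The matrix version of TSS should match the sum of squared deviations of Y from its mean, and ESS + RSS should reproduce TSS.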
3. FORECASTING
If we have a model in which there are two independent variables then our PRF is:
Yi = β0 + β1X1i + β2X2i + ui
(i = 1 , ... , n)
Suppose that we have estimated the coefficients and obtained the following SRF
$$ \hat{Y}_i = 10 + 3X_{1i} + 2X_{2i} \qquad (i = 1, \ldots, n) $$
We want to obtain forecasts of E(Y|X) and Y when X1 and X2 are given the following
values:
X10 = 4 and X20 = 7.
To write the formulae for forecasts in matrix notation we first define X0'. This is the
row vector which contains a 1 for the constant and the values of X1 and X2 for which
we want to produce a forecast. In our example
X0' = [1 4 7]
The point estimate of both the conditional mean E(Y|X1i, X2i) and the actual value Y
is the value of the SRF for these values of the independent variables. The formula we
use when we are working with matrices is:
$$
\hat{Y}_0 = X_0'\hat{\beta}
= \begin{bmatrix} 1 & 4 & 7 \end{bmatrix} \begin{bmatrix} 10 \\ 3 \\ 2 \end{bmatrix}
= (1 \times 10) + (4 \times 3) + (7 \times 2) = 36
$$
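The same point forecast can be reproduced with a short Python (NumPy) sketch; the variable names are illustrative.

```python
import numpy as np

beta_hat = np.array([10.0, 3.0, 2.0])   # estimated constant and coefficients from the SRF
x0 = np.array([1.0, 4.0, 7.0])          # 1 for the constant, then X10 = 4 and X20 = 7

y0_hat = x0 @ beta_hat                  # X0' beta_hat
print(y0_hat)                           # 36.0
```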
To obtain an appropriate expression for the variance of the $\hat{Y}$ values in different
samples, we note that $\hat{Y}$ depends on the values of 3 estimates $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$. The
estimated variance of $\hat{Y}$ is a function of the estimated variances and covariances of
$\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$, and these values are stored in the estimated variance-covariance
matrix:
$$ V(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1} $$
It is possible to show that when we wish to forecast the value of the conditional mean
of the Y values E(Y | X0'), for the set of X values in the vector X0', the variance of the
possible $\hat{Y}$ values is given by the following formula:
$$ V(\hat{Y}_0 \mid X_0') = \hat{\sigma}^2 X_0'(X'X)^{-1}X_0 $$
The formula for the interval estimate is:
$$ \hat{Y}_0 \pm t_{(n-k-1)} \left[\hat{\sigma}^2 X_0'(X'X)^{-1}X_0\right]^{0.5} $$
When we wish to forecast individual Y values rather than the conditional mean at this
set of X values, the formula for the variance of these values is:
$$ V(\hat{Y}_0 \mid X_0') = \hat{\sigma}^2 \left[1 + X_0'(X'X)^{-1}X_0\right] $$
Our interval estimate will be:
$$ \hat{Y}_0 \pm t_{(n-k-1)} \left[\hat{\sigma}^2 \left(1 + X_0'(X'X)^{-1}X_0\right)\right]^{0.5} $$
If we have a reasonably sized sample (say 50 or more) we can use the following
approximate, but still reasonably accurate, 95% prediction interval. The term
$X_0'(X'X)^{-1}X_0$ will be of the size 1/n and so can be neglected. The value t(n-k-1) ≈ 1.96
≈ 2. Using these approximations we have
$$
\hat{Y}_0 \pm t_{(n-k-1)} \left[\hat{\sigma}^2 \left(1 + X_0'(X'X)^{-1}X_0\right)\right]^{0.5} \approx \hat{Y}_0 \pm 2\hat{\sigma}
$$
If you need to use the matrix formulae for a small sample or the confidence interval
for a mean value, it is not particularly easy in Eviews. You would need to use the
underlying programming language. In such a case it is easier to copy and paste the
covariance matrix into Excel and use the matrix multiplication commands to obtain
the required interval.
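As an alternative to a spreadsheet, the two interval formulae can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name `forecast_intervals`, the data and the forecast point are hypothetical, and the critical value 2.571 is the tabulated t value for 5 degrees of freedom with 2.5% in each tail.

```python
import numpy as np

def forecast_intervals(X, y, x0, t_crit):
    """Point forecast plus interval estimates for the conditional mean and an individual Y.

    X includes the column of ones, x0 starts with 1, and t_crit is the
    t(n-k-1) critical value supplied by the caller (e.g. from t tables).
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = (resid @ resid) / (n - p)
    q = x0 @ XtX_inv @ x0                                   # X0'(X'X)^{-1} X0
    y0_hat = x0 @ beta_hat
    half_mean = t_crit * np.sqrt(sigma2_hat * q)            # conditional mean
    half_indiv = t_crit * np.sqrt(sigma2_hat * (1.0 + q))   # individual value
    return (y0_hat,
            (y0_hat - half_mean, y0_hat + half_mean),
            (y0_hat - half_indiv, y0_hat + half_indiv))

# Hypothetical data with k = 2 independent variables and n = 8 observations.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 7.0, 5.0, 6.0, 9.0])
y = np.array([5.2, 5.9, 10.1, 10.8, 17.3, 16.2, 19.1, 24.8])
X = np.column_stack([np.ones(len(y)), X1, X2])

x0 = np.array([1.0, 4.5, 5.0])
y0_hat, ci_mean, pi_indiv = forecast_intervals(X, y, x0, t_crit=2.571)
print(y0_hat, ci_mean, pi_indiv)
```

The interval for an individual Y value is always wider than the interval for the conditional mean, because of the extra 1 inside the bracket.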
4. DATA PROBLEMS
Besides the usual assumptions about the error terms we also make three further
assumptions about the data we are using, namely that:
1. There is no multicollinearity, by which we mean that if we have three
independent variables X1, X2 and X3 then these variables are related to
Y but they are not related to each other.
2. The individual observations or Xij terms in our sample do not exert undue
influence on the estimated values of the βi's. This means that the values of our
estimates will not change dramatically when we change the sample we are using
by adding or taking away a particular case.
3. There are no measurement errors in the values of either the dependent variable
or independent variables.
We will now look at the impact on our estimates when these assumptions do not hold
and how we can deal with these problems.
4.1 Multicollinearity
There are two types of multicollinearity that can occur, namely exact multicollinearity
and close but not exact multicollinearity.
(a) Suppose that there is an exact linear relationship between two or
more variables, for example if one variable is the sum of or the difference between
two of the other variables with
X3 = X2 - X1
then we cannot obtain estimates of the βj’s. Indeed in this situation our usual
approach to interpreting the value of β1 makes little or no sense. We originally said
that β1 shows the impact on Y of a unit change in X1 when all other X variables
are constant. If there is exact multicollinearity, then unit changes in X1 will always
be accompanied by changes in the values of other X variables such as X3. This
means we can never be sure whether the change in Y was caused by the change
in X1 or the change in X3.
For those of you who are familiar with matrix algebra, when there is an exact
relationship between X variables such as X1 and X3 the determinant of (X'X) is 0
and the elements of its inverse $(X'X)^{-1}$ are not defined because they are all equal to
some finite value divided by 0. This means that we cannot find the value of the
variance of the possible values of the $\hat{\beta}$s, which means we cannot find interval estimates
of the βs or conduct t tests of possible values.
Most computer packages will tell you that it is not possible to obtain estimates of
the coefficients because you have a singular matrix, which means we cannot
find $(X'X)^{-1}$.
(b) In most practical situations there is a close but not exact relationship between
the X variables in the model. Our computer package will now obtain the estimates
$\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$ of the constant and the coefficients of the independent
variables in the model. This type of situation is more difficult to deal with than
exact multicollinearity because now our package does not automatically tell us that
we have a problem. We have to find a way of determining whether our X variables
are closely related and whether this will have a significant impact on how useful
our model is.
The reason why we need to find whether there is a close but inexact relationship
between the X variables is as follows. In this situation while the determinant of (X'X)
is no longer 0 it will now be equal to some small value which is close to 0. When this
happens small changes in the data can cause large proportional changes in the
determinant. Here the inverse matrix can be found but the elements of the inverse or
$(X'X)^{-1}$ matrix are very large, as are the variances of $\hat{\beta}_1$, $\hat{\beta}_2$ and $\hat{\beta}_3$. When these
estimates have very large variances it means that their values vary widely from one
sample to another. This means that when a new set of observations becomes
available at the end of the month or the end of the quarter, and we use these new
values to re-estimate our model, we may obtain estimates that differ significantly from
the ones we are now using. Where these estimates are changing from one period to
the next we would be foolish to use the model to analyse market behaviour. The
model may still be used to forecast future values provided that we are using actual
data values for the dependent variables.
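The difference between exact and close but not exact multicollinearity can be illustrated with simulated data. The following Python (NumPy) sketch is an assumption-laden illustration: the variables are randomly generated, and the helper `design` and the noise scale 0.01 are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the illustration is repeatable
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
noise = rng.normal(size=n)

def design(X3):
    """Design matrix with a constant, X1, X2 and the supplied X3."""
    return np.column_stack([np.ones(n), X1, X2, X3])

X_exact = design(X2 - X1)                 # exact relationship X3 = X2 - X1
X_near = design(X2 - X1 + 0.01 * noise)   # close but not exact relationship
X_indep = design(noise)                   # X3 unrelated to X1 and X2

# Exact case: X has 4 columns but only 3 are linearly independent,
# so X'X is singular and (X'X)^{-1} does not exist.
rank_exact = np.linalg.matrix_rank(X_exact)

# Near case: the inverse exists, but its diagonal element for X3 (which is
# proportional to the variance of that coefficient estimate) is far larger
# than when X3 is unrelated to the other variables.
diag_near = np.linalg.inv(X_near.T @ X_near)[3, 3]
diag_indep = np.linalg.inv(X_indep.T @ X_indep)[3, 3]

print(rank_exact, diag_near / diag_indep)
```

In the near-collinear case the package returns estimates without any warning, which is why the diagnostic checks discussed in this section are needed.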
To help understand how we can determine whether there is close but not exact
multicollinearity present in the independent variables that we are using we will use
the multiple regression model with two independent variables
Yi = β0 + β1X1i + β2 X2i + ui
(i = 1 , ... , n)
In the simple linear model where there was one independent variable the variance of
the possible values of $\hat{\beta}_1$, the estimate of the coefficient of X1, is given by the
following formula
$$ V(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2} $$
When there are two independent variables in a multiple regression model it can be
shown that the variance of the possible values of $\hat{\beta}_1$, the estimate of the coefficient of
X1, is given by the following formula
$$ V(\hat{\beta}_1) = \frac{\sigma^2}{(1 - r_{12}^2) \sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2} $$
In this formula the r12 term is the correlation coefficient between the two independent
variables X1 and X2. The square of this value, $r_{12}^2$, is the coefficient of determination
or R2 value when we have a model in which X1 is the dependent variable and X2 is
the independent variable.
From this formula we can see that the variance $V(\hat{\beta}_1)$ will have a very large value
when the value of $(1 - r_{12}^2)$ is very small. This occurs when $r_{12}^2$ is close to 1, which is
the same as saying when there is close but not exact multicollinearity.
If there are more than two independent variables in the model then the variance of
the possible values of $\hat{\beta}_1$, the estimate of the coefficient of X1, is given by the
following formula
$$
V(\hat{\beta}_1) = \frac{\sigma^2}{(1 - R_{1.}^2) \sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2}
= VIF \times \frac{\sigma^2}{\sum_{i=1}^{n} (X_{1i} - \bar{X}_1)^2}
$$
The variance inflation factor or VIF is defined in the following way
$$ VIF = \frac{1}{1 - R_{1.}^2} $$
where the $R_{1.}^2$ term in this expression is the coefficient of determination or R2 value
when we have a model in which X1 is the dependent variable and there are (k - 1)
independent variables namely X2, X3, …, Xk, i.e. it is the R2 value for the model
X1 = α0 + α1X2 + α2X3 + … + αk-1Xk + v
(The corresponding expression for any other variable Xj is $R_{j.}^2$.) From this expression
for $V(\hat{\beta}_1)$ we see that if multicollinearity is a problem and there is one or more $R_{j.}^2$
term which is close to 1 then we will have large values of the VIF and $V(\hat{\beta}_j)$.
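The VIF for each variable can be obtained by running the auxiliary regression of that variable on a constant and the remaining independent variables. The following Python (NumPy) sketch illustrates this; the function names and the data are hypothetical.

```python
import numpy as np

def r_squared(Z, target):
    """R^2 from regressing target on the columns of Z (Z includes a constant)."""
    coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ coef
    return 1.0 - (resid @ resid) / ((target - target.mean()) ** 2).sum()

def variance_inflation_factors(X_vars):
    """VIF_j = 1 / (1 - R_j^2), one per column of X_vars (no constant column)."""
    n, k = X_vars.shape
    vifs = []
    for j in range(k):
        others = np.delete(X_vars, j, axis=1)           # all variables except X_j
        Z = np.column_stack([np.ones(n), others])       # auxiliary regression design
        vifs.append(1.0 / (1.0 - r_squared(Z, X_vars[:, j])))
    return np.array(vifs)

# Hypothetical data: X2 is closely (not exactly) related to X1, X3 is not.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
X2 = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1])
X3 = np.array([5.0, 1.0, 4.0, 2.0, 6.0, 3.0, 8.0, 7.0])

vifs = variance_inflation_factors(np.column_stack([X1, X2, X3]))
print(vifs)
```

With only two independent variables the function reduces to 1 / (1 - r12²), matching the two-variable formula given earlier.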
The above results and the description of the consequences of close but inexact
multicollinearity can be used to develop the following procedures for determining
whether multicollinearity is a problem in any model we estimate.
(a) We look at our estimated coefficients to see if they are unstable. Where the
variance of possible values of the estimated coefficient $V(\hat{\beta}_1)$ is large we will
find that small changes in the existing values of the variables or the addition of
new values to the sample can produce very large changes in the values of these
estimates.
(b) Where there are large $V(\hat{\beta}_1)$ terms we will find that our estimates are not only
unstable but they will take extreme or unusual values and can even have the
opposite signs to what we would normally expect.
(c) With the large $V(\hat{\beta}_1)$ values for different coefficients we may find that the t
statistics indicate that the individual variables in the model do not have a
significant impact. At the same time we may find that the R2 for the complete
model indicates that the combined impact of these supposedly insignificant
independent variables is significant. In other words the presence of a large R2
value along with small values for the t statistics can also indicate that
multicollinearity is a problem.
(d) Large $R_{j.}^2$ values indicate that multicollinearity could be a problem for a model.
We say these values are large when the $R_{j.}^2$ value for any variable Xj exceeds
the R2 value for the complete model.
(e) Look at the correlation matrix between the X variables. Although this only gives
the relationships between pairs of variables, high correlations almost always
mean we will have problems with multicollinearity.
It should be noted that it is very difficult to solve all the problems associated with
multicollinearity. The main reason is that where there are relationships between the k
different X variables then these k variables do not provide k independent sources of
information. To achieve a partial resolution we could omit one or more independent
variables from the model. If we omit important variables however, this could make
our estimates of the coefficients of the remaining variables biased i.e. too high or too
low, because these coefficients now show the impact of the omitted variables.
Another approach is to develop alternative estimators to the OLS estimators which
are biased but which have smaller variances. These estimators however still do not
overcome the fundamental problem of accurately determining the impact of the
individual X variables on Y.
Multicollinearity will make our t-statistics unreliable due to the variance inflation
effect. The F-test and the R2, which look at the total amount of variation explained by
the model, rather than the behaviour of the individual variables, are not affected by
multicollinearity.
4.2 Measurement Errors
In any regression study we assume that it is possible to measure the values of the
variables accurately. The impact of measurement errors depends upon where they
occur.
(a) If there are errors in the Y values then this does not have a major impact on the
quality of the estimates that we obtain with OLS. Consider the model
Yi + εi = β0 + β1Xi + ui ⇒ Yi = β0 + β1Xi + ωi
where εi is the measurement error and ωi = ui - εi is the combined error term. As you
can see, by the way the model has been rewritten, the estimates of the coefficients
should be unbiased. The estimate of the standard deviation of the model errors will be
too high. If the errors in the Y values and the model errors are independent the
estimated variance will be the sum of the two variances. This means that all our test
statistics will be too small, leading us to think the model is not a good fit to the data.
(b) When there are errors in the values of the X variables this has a major impact on
the quality of the OLS estimates.
Yi = β0 + β1(Xi + εi) + ui
The error in the X variable will impact on the estimate of the coefficients. We now
have estimates which are both biased and inconsistent which means that even if
we were to use very large samples the quality of our estimates would not improve.
Typically the estimates of the coefficients will tend to be too small.
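The contrast between the two cases can be illustrated by simulation. The Python (NumPy) sketch below is an illustration only and rests on assumptions not stated in the text: all data are simulated, the true coefficients are hypothetical, and the measurement errors are assumed to be independent of the true X values and of the model errors.

```python
import numpy as np

rng = np.random.default_rng(42)          # fixed seed; all values below are simulated
n = 20000
beta0, beta1 = 1.0, 2.0                  # hypothetical true coefficients

x_true = rng.normal(size=n)              # correctly measured X, variance 1
u = rng.normal(scale=0.5, size=n)        # model error
y_true = beta0 + beta1 * x_true + u

def slope(x, y):
    """OLS slope of y on x in a simple regression."""
    xd = x - x.mean()
    return (xd * (y - y.mean())).sum() / (xd ** 2).sum()

# (a) Measurement error in Y only: the slope estimate stays close to beta1.
y_obs = y_true + rng.normal(size=n)
slope_y_error = slope(x_true, y_obs)

# (b) Measurement error in X with the same variance as X itself:
# the slope estimate shrinks towards zero, here to roughly half of beta1.
x_obs = x_true + rng.normal(size=n)
slope_x_error = slope(x_obs, y_true)

print(slope_y_error, slope_x_error)
```

Increasing n does not remove the shrinkage in case (b), which is what is meant by the estimate being inconsistent.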
Measurement errors can occur when a variable is correctly measured but is used as
a proxy variable. For example, we often use indices such as the All Ordinaries as a
measure, or proxy, of the market, when estimating the market model. Although these
indices are good indicators of what is happening to the market, they do not measure
the whole of the market, and so contain measurement errors in that context.
The effects of measurement errors can sometimes be reduced by the use of
instrumental variables (beyond the scope of this course but details of this approach
can be found in the Eviews manual and a detailed discussion is given in Wooldridge
Chapter 15).
4.3 Outliers and Undue Influence
The second assumption says that the results for any model should not be too heavily
influenced by one or two observations. Unusual observations can quite easily occur
in Finance applications particularly if we are working with returns. Large changes in
share prices can produce very large changes in the returns on those shares. To see
why this is a problem consider the situation where we have a sample of n = 7 values.
Case     X      Y
1       5.0    10
2       5.5    15
3       6.0     5
4       7.0    10
5       7.0    12
6       8.0     9
7      12.0    20
If you examine the diagram shown below you will see that 6 out of the 7 points lie in a
circular area. For these 6 points the SRF has a slope which is negative or zero.
[Figure: The SRF for the 6 Typical Points]
When we include point 7 which is the unusual point, then using OLS we now get a
SRF with a significant positive slope.
[Figure: The SRF when we include the Unusual Data Point]
From this simple numerical example we see that it is possible that our results are
very dependent on a relatively small number of observations. These unusual
observations are said to have undue influence over our results. This is why in any
applied study you should always check whether your results are unduly affected by
observations which may not be typical. In many cases the unusual observations are
found to have been incorrectly entered into the data file with for example a decimal
point in the wrong place.
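The change in the slope can be checked directly. The Python (NumPy) sketch below reads the n = 7 sample as the pairs (5.0, 10), (5.5, 15), (6.0, 5), (7.0, 10), (7.0, 12), (8.0, 9) and (12.0, 20), and fits the SRF with and without case 7.

```python
import numpy as np

# The n = 7 sample; case 7 is the unusual observation.
X = np.array([5.0, 5.5, 6.0, 7.0, 7.0, 8.0, 12.0])
Y = np.array([10.0, 15.0, 5.0, 10.0, 12.0, 9.0, 20.0])

# np.polyfit with degree 1 returns the OLS slope first, then the intercept.
slope_6, intercept_6 = np.polyfit(X[:6], Y[:6], 1)
slope_7, intercept_7 = np.polyfit(X, Y, 1)

print(f"6 typical points: slope = {slope_6:.3f}")
print(f"all 7 points:     slope = {slope_7:.3f}")
```

The slope is negative for the 6 typical points and becomes positive once case 7 is included.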
There are 3 different approaches that are used to determine whether any of our
observations have exercised undue influence.
(a) We can measure whether the X value of an observation has a very large impact
on the value we obtain for the fitted or $\hat{Y}$ value.
(b) We can measure whether the Y value of an observation is very different by
looking at the size of the residual $e = Y - \hat{Y}$ which is the difference between the
actual value of Y and estimated value of Y.
(c) We can measure the size of the change in the estimated values of the
coefficients when an observation such as observation 7 is omitted. This third
approach can be shown to be a combination of the first two approaches.
THE IMPACT OF X VALUES
When we discuss the measure which looks at the impact of the X value we begin by
noting that if we have a simple or multiple regression model then the set of estimates
for all the coefficients in the model can be written in matrix form as
$$ \hat{\beta} = (X'X)^{-1}X'y $$
Using matrix notation the forecast or fitted values of the dependent variable can be
written in matrix form as
$$ \hat{y} = X\hat{\beta} $$
If we substitute the formula for $\hat{\beta}$ into this expression we obtain
$$ \hat{y} = X(X'X)^{-1}X'y = Hy $$
where
$$ H = X(X'X)^{-1}X' $$
is called the hat matrix. This is an n x n matrix where we have one row and one
column for each of the n sets of values for the independent variables in our sample. It
is called the hat matrix because it appears in the expression which shows how the
estimated or fitted values, which we call Y hat or $\hat{y}$, are related to the actual values of
Y.
The n terms down the main diagonal of the H matrix or hii terms are called the
leverage values. They are called leverage values because the size of hii is the main
factor which determines how much influence or leverage the actual value Yi has on
the fitted or estimated value $\hat{Y}_i$. It can be shown that if we have k independent
variables in our model then the values of these leverage or hii terms have the
following properties
$$ 0 \le h_{ii} \le 1 \quad \text{and} \quad \sum_{i=1}^{n} h_{ii} = k + 1 $$
This means that the average value of the hii terms is
$$ \bar{h} = (1/n) \sum_{i=1}^{n} h_{ii} = (k + 1)/n $$
Consider the second leverage term h22. This term is associated with the second
value in our sample. If we have k independent variables X1, X2, X3, …. , Xk in our
model then the set of second values for all the independent variables is written as
follows
[X12, X22, X32, …. , Xk2]
The set of average or mean values for all k independent variables is
$$ [\bar{X}_1, \bar{X}_2, \bar{X}_3, \ldots, \bar{X}_k] $$
The second leverage value h22 can be interpreted as a measure of the distance between the
point associated with the second set of values in our sample and the point
associated with the average values of all the independent variables. This implies
that the further the X values are from their mean values the more impact that set of
sample values has on the estimated or fitted value of Y.
The criterion that is used to determine whether a leverage value such as h22 is
large enough to say that a sample value has X values which are unusual or outliers
is twice the mean value. In other words we would say that the second value in the
sample has an X value which is an outlier if
$$ h_{22} \ge 2\bar{h} = 2(k + 1)/n $$
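The leverage values for the 7-point sample of this section can be computed from the hat matrix. The following Python (NumPy) sketch is illustrative; the variable names are hypothetical and the X values are those of the n = 7 example.

```python
import numpy as np

# X values of the n = 7 sample from this section, with k = 1 independent variable.
x = np.array([5.0, 5.5, 6.0, 7.0, 7.0, 8.0, 12.0])
n, k = len(x), 1
X = np.column_stack([np.ones(n), x])

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverage values h_ii.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)

threshold = 2 * (k + 1) / n              # twice the mean leverage
flagged_cases = [i + 1 for i in range(n) if leverage[i] >= threshold]

print(leverage.round(3), flagged_cases)
```

Only case 7, whose X value of 12.0 lies far from the mean of the X values, exceeds the threshold of 2(k + 1)/n = 4/7.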
THE IMPACT OF Y VALUES
To test whether the Y value in a period such as period 2 is an outlier we can look at
the residual in this period or
$$ e_2 = Y_2 - \hat{Y}_2 $$
Because this value depends upon the units of measurement we look at its estimated
z score which we call the studentized residual. (This could also be called Student’s t
statistic for the residual.) As the mean of these values is 0 we define the studentized
residual in period 2 in the following way
$$ r_2 = \frac{e_2}{\hat{\sigma}\sqrt{1 - h_{22}}} $$
(In this expression $\hat{\sigma}$ is the standard deviation of the residuals and h22 is the
leverage of the second value in our sample.) If the studentized residual exceeds 2
in absolute value this is seen as evidence that the Y value for this sample value is an outlier.
THE COMBINED IMPACT OF X AND Y VALUES
If a sample value is an outlier either because of its X or its Y value then we can
determine whether this sample value has an undue influence on our estimates by
using a measure called Cook’s distance measure.
Before we define this measure we must note that in a multiple regression model the
vector $\hat{\beta}$ contains the estimated values of the constant and coefficients of the k
independent variables which are found using all n values in the sample. The vector
$\hat{\beta}_{(i)}$ contains the estimated values of the constant and coefficients of the k
independent variables which are obtained after we have omitted observation i from the
sample. This set of estimates is found using the remaining n - 1 values in the sample.
We write the formula for Cook's distance for value i in the sample in the following way
$$
D_i = \frac{(\hat{\beta} - \hat{\beta}_{(i)})'(X'X)(\hat{\beta} - \hat{\beta}_{(i)})}{(k + 1)\hat{\sigma}^2}
$$
This measure can be interpreted as the squared standardized distance between our
estimates of all the coefficients with the complete sample of n values and the
estimates we obtain when we omit the i th value in the sample and use only n - 1
values.
Cook's distance can also be written as a function of the other two terms used to
measure whether the separate X or Y value was an outlier. For value 2 in the sample
we have
$$
D_2 = \frac{1}{k + 1} \cdot \frac{h_{22}}{1 - h_{22}} \cdot r_2^2
$$
Here k+1 is the number of estimated coefficients or β terms in the equation, and h22 and
$r_2^2$ are the leverage and the square of the studentized residual for value 2 in the
sample.
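The two expressions for Cook's distance can be compared numerically. The Python (NumPy) sketch below uses the 7-point sample of this section; the helper name `ols` is hypothetical, and $\hat{\sigma}^2$ is taken as the residual sum of squares divided by (n - k - 1).

```python
import numpy as np

x = np.array([5.0, 5.5, 6.0, 7.0, 7.0, 8.0, 12.0])
y = np.array([10.0, 15.0, 5.0, 10.0, 12.0, 9.0, 20.0])
n, k = len(x), 1
X = np.column_stack([np.ones(n), x])

def ols(X, y):
    """OLS estimates from the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

beta_hat = ols(X, y)
resid = y - X @ beta_hat
sigma2_hat = (resid @ resid) / (n - k - 1)
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)       # leverage values
r = resid / np.sqrt(sigma2_hat * (1.0 - h))         # studentized residuals

# Definition: re-estimate without observation i and measure the shift in the estimates.
cooks_direct = np.empty(n)
for i in range(n):
    beta_i = ols(np.delete(X, i, axis=0), np.delete(y, i))
    d = beta_hat - beta_i
    cooks_direct[i] = d @ (X.T @ X) @ d / ((k + 1) * sigma2_hat)

# Shortcut: uses only the leverage and the studentized residual.
cooks_shortcut = h / (1.0 - h) * r ** 2 / (k + 1)

print(cooks_direct.round(3))
```

The two versions agree, and case 7 has by far the largest value of Di.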
The values of Cook's distance do not have exactly the same probability distribution as
the values of the F statistic. We can however use the critical values from the F
statistic tables to find values of Di which seem to show that value i does exert undue
influence. The rules we use are based on an F statistic whose first degree of freedom
is (k + 1), the number of coefficients we estimate, and whose second degree of
freedom is (n - k - 1), the sample size less the number of estimates we use when
finding Di. The key points to remember are:
(a) If Di is less than the value of $F_{k+1,\,n-k-1}$ which gives 0.20 in the left tail and 0.80
in the right tail then value i is not an influential observation.
(b) If Di is greater than the value of $F_{k+1,\,n-k-1}$ which gives 0.50 in the left tail and
0.50 in the right tail then value i is an influential observation.
(c) If Di lies between the value of $F_{k+1,\,n-k-1}$ which gives 0.20 in the left tail and 0.80
in the right tail and the value of $F_{k+1,\,n-k-1}$ which gives 0.50 in the left tail and 0.50
in the right tail then it is uncertain whether value i is an influential observation.
The closer Di is to the value of $F_{k+1,\,n-k-1}$ which gives 0.50 in the left tail and 0.50
in the right tail, the more likely it is that value i does exert undue influence on
the estimates we obtain.
RECURSIVE ESTIMATION
The three techniques given above were designed to avoid the large amount of
computation involved in re-estimating the model while leaving out one data point at a
time, then checking what happened to the coefficients. With today’s high speed and
cheap computers, it is often easier to write a small program to estimate the
coefficients, deleting each data point in turn, and obtain a graph of the coefficients.
This allows us to directly observe the effect of each data point in the values of the
coefficients.
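A small program of this kind can be sketched in Python with NumPy. The example below uses the 7-point sample from the start of this section; the function name `fit` is hypothetical, and the results are printed rather than graphed to keep the sketch short.

```python
import numpy as np

x = np.array([5.0, 5.5, 6.0, 7.0, 7.0, 8.0, 12.0])
y = np.array([10.0, 15.0, 5.0, 10.0, 12.0, 9.0, 20.0])

def fit(x, y):
    """Return (intercept, slope) from OLS with a constant."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.solve(X.T @ X, X.T @ y)

full = fit(x, y)

# Row i holds the estimates obtained after deleting observation i + 1.
deleted = np.array([fit(np.delete(x, i), np.delete(y, i)) for i in range(len(x))])
slope_change = deleted[:, 1] - full[1]

for case, (b0, b1) in enumerate(deleted, start=1):
    print(f"without case {case}: intercept = {b0:7.3f}, slope = {b1:6.3f}")
```

Plotting the columns of `deleted` against the case number would give the graph of the coefficients described above; deleting case 7 produces the largest change in the slope.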
A way of obtaining similar information to this is to use the recursive estimation option
in Eviews. On your estimated equation select
View → Stability tests → Recursive estimation
and in the dialogue box that appears select “Recursive Coefficients”. This will
start with a small sample then introduce one more data point at a time. You would
expect the estimated coefficient to be slightly unstable to start with then settle down
to a steady value. Here is an example using the data file guj5i9.
[Figure: two panels, "Recursive C(1) Estimates" and "Recursive C(2) Estimates", each plotted with ± 2 S.E. bands against the observation number]
It is clear that there are two highly influential data points here, points number 17 and
22.
5. THE F TEST FOR LINEAR RESTRICTIONS
5.1 The Redundant Variables Test
In the simple linear regression model the ANOVA or F test can be used to choose
between the hypotheses:
H0 : β1 = 0
and
H1 : β1 ≠ 0
If you look at these two hypotheses you can see that in H0 we are imposing a linear
restriction, as we are saying that β1 = 0. In H1 the value that β1 can take is not
restricted in any way.
Suppose we have a multiple regression model such as the example given on pp 276-277 of Gujarati (3e) where we wish to model the Annual Sales or Demand for
Telephone Cable (Y). The independent variables that we use in our demand function
are
X1 = Gross National Product
X2 = Housing Starts
X3 = Unemployment Rate
X4 = Prime Rate lagged 6 months
X5 = Customer Line Gains
The yearly data from 1968 to 1983 is stored in the file GUJ832.XLS in the following
order
X1, X2, X3, X4, X5, Y
When we estimate the model which contains all 5 independent variables we obtain
the following SRF.
$$\hat{Y}_t = 5962.656 + 4.883663X_{1t} + 2.363956X_{2t} - 819.1287X_{3t} + 12.01048X_{4t} - 851.3927X_{5t}$$
If we want to test whether the complete set of independent variables has a significant
impact on the value of Y then our hypotheses will be
H0 : β1 = β2 = β3 = β4 = β5 = 0
H1 : Not all βi are equal to 0
The null hypothesis can be thought of as a set of 5 linear restrictions namely β1 = 0,
β2 = 0, β3 = 0, β4 = 0 and β5 = 0. To choose between these two hypotheses we
use the F statistic with (k, n-k-1) degrees of freedom and its p value, which are
given at the bottom left of the EViews output. In this example we have
F5,10 = 9.283507
and
p = 0.001615
This p value indicates that even for a very small level of significance such as α = 0.01
we would reject H0 and conclude that this set of independent variables does have a
significant impact on the level of demand (Y).
In this situation we are testing whether the combined impact of the complete set of
independent variables has a significant impact on the dependent variable. We want
to develop a procedure which lets us test whether the combined impact of a
subset of the independent variables, here X4 and X5, is significant.
(a) When we do this we will refer to the model which contains all 5 independent
variables as the Unrestricted model because in this model there are no
restrictions on the values that the coefficients of the 5 independent variables can
take. The Residual Sum of Squares (RSS) for this model is called the URSS to
indicate that it is the RSS in the unrestricted model. The value of URSS shows
what the model with all 5 variables cannot explain.
(b) The model we obtain when we impose the restrictions implied by the null
hypothesis is called the restricted model and the Residual Sum of Squares (RSS)
for this model is called the RRSS to indicate that it is the RSS in the restricted
model. The value of RRSS shows what the model from which variables have
been excluded cannot explain.
If we want to test whether the combined impact of the independent variables X4 and
X5 is significant we will now use the following hypotheses
H0 : β4 = β5 = 0
H1 : Either or both of β4 and β5 are not equal to 0
The model associated with this H0 is called the restricted model because in this
model the values of the coefficients of X4 and X5 are restricted to a value of 0. If this
model is correct then Y is affected by 3 independent variables with:
Yi = β0 + β1X1i + β2X2i + β3X3i + ui
When we estimate the SRF of this restricted model the sum of squared residuals
RSS is now written as RRSS to indicate that it is the RSS for the Restricted model.
Here the value of RRSS shows what the model with only 3 variables cannot explain.
The testing statistic that we use to choose between these hypotheses is
$$F_{m,\,n-k-1} = \frac{(RRSS - URSS)/m}{URSS/(n-k-1)}$$
To see why we use this statistic we first note that because we have an extra two
independent variables in the Unrestricted model we would expect that less variation
will be left unexplained than in the Restricted model. This implies that
URSS ≤ RRSS
and their difference
(RRSS - URSS) ≥ 0
In the numerator for this F statistic we have:
(RRSS – URSS)/m
To see how we interpret this expression we note that
1. The term (RRSS – URSS) shows how much more variation is left unexplained
when we leave out the two variables. This is equivalent to saying it shows how
much more variation we can explain when we include X4 and X5.
2. The term m represents the number of restrictions we impose in H0.
When we divide the total amount of variation explained by these two variables by the
number of variables we obtain the average amount of variation explained by each
variable.
The F statistic can be thought of as the ratio of two different measures of variance
1. The numerator is said to measure explained variation because it shows the
variation which is explained by the two excluded variables X4 and X5. It also
measures unexplained or chance variation because it is based on a sample of
values. Our choice of a sample and hence the value of the numerator can also
be said to be affected by chance.
2. The denominator only measures chance variance as it is simply the estimated
variance of the error terms based on the Unrestricted model with all 5 independent
variables.
We could write the denominator in the following way
$$\hat{\sigma}^2 = \frac{\sum_{t=1}^{n} e_t^2}{n-k-1} = \frac{URSS}{n-k-1}$$
If you did not have EViews to work with then you would have to estimate the model
with all 5 independent variables and note the sum of squared residuals URSS then
estimate the model with only 3 independent variables and note the sum of squared
residuals RRSS. You would then substitute these values into the formula for F. In this
example there are m = 2 restrictions, n = 16 sample values and k = 5 independent
variables in the unrestricted model.
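The hand calculation can be sketched as below. The notes do not report the raw URSS and RRSS, so this sketch makes two assumptions: it replaces each RSS by (1 - R²), which leaves F unchanged because both models share the same total sum of squares, and it backs out the unrestricted R² from the overall F statistic of 9.283507. The restricted R² of 0.601254 is the value EViews reports for the 3-variable model. The function name is hypothetical.

```python
def f_linear_restrictions(rrss, urss, m, n, k):
    """F statistic for m linear restrictions; k regressors in the unrestricted model."""
    return ((rrss - urss) / m) / (urss / (n - k - 1))

n, k, m = 16, 5, 2

# The overall F for the unrestricted model implies its R-squared, because
# F = (R2 / k) / ((1 - R2) / (n - k - 1)).
f_overall = 9.283507
r2_unrestricted = k * f_overall / (k * f_overall + (n - k - 1))

# R-squared of the restricted model (X4 and X5 dropped), as reported by EViews.
r2_restricted = 0.601254

# RSS / TSS = 1 - R2 for each model; the common TSS cancels in the F ratio.
urss_share = 1.0 - r2_unrestricted
rrss_share = 1.0 - r2_restricted

f_stat = f_linear_restrictions(rrss_share, urss_share, m, n, k)
print(round(f_stat, 3))
```

The result agrees, up to rounding in the reported inputs, with the value of 6.248127 that EViews gives for the redundant variables test on X4 and X5.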
F tests for redundant variables are particularly useful when we have multicollinearity.
In this case the t-statistics are not reliable. However, because the F test measures
the difference in the amount of variation explained by the model with and without the
variable(s) it will give a reliable test of whether the variable contributes to the model.
Doing the test in Eviews
The F Statistic and its p value which are given in the bottom of the EViews output are
used to choose between the following hypotheses
H0 : β1 = β2 = β3 = … = βk = 0
H1 : Not all βi are equal to 0
In this situation we are testing whether the combined impact of the complete set of
independent variables has a significant impact on the dependent variable.
Suppose you wish to test whether the combined impact of X4 and X5 is significant.
This is equivalent to choosing between the following hypotheses
H0 : β4 = β5 = 0
H1 : Either or both of β4 and β5 are not equal to 0
The model associated with H0 is called the restricted model because in it the values of the
coefficients of X4 and X5 are restricted to a value of 0. The residual sum of squares
(RSS) in this model is called the RRSS. If 2 variables are excluded then we say that
there are m = 2 restrictions.
The testing statistic that we use to choose between these hypotheses is
$$F_{m,\,n-k-1} = \frac{(RRSS - URSS)/m}{URSS/(n-k-1)}$$
In this expression the denominator is the estimate of the variance of the error terms
based on the unrestricted model with all 5 independent variables. The numerator is
the average amount of variation in the Y values that each of the two excluded
variables X4 and X5 explains.
Without EViews you would have to compute F by hand from URSS and RRSS as described
above, with m = 2 restrictions, n = 16 sample values and k = 5 independent
variables in the unrestricted model.
If you do have EViews to work with you would simply estimate the unrestricted
model with all 5 independent variables then click
View → Coefficient Tests → Redundant Variables - Likelihood Ratio
When the following Omitted-Redundant Variable Test dialog box appears you will
enter the names of the variables whose joint impact you wish to test namely X4 and
X5. After you click [OK] the EViews output shown after the dialog box on the next
page now appears.
This output consists of 4 parts. The final three parts are the standard output for the
estimated restricted model which contains 3 independent variables. The information
that we require is given at the top of the EViews output. It shows that our F statistic is
6.248127 and it has a p value of 0.017356. In our example where we wish to choose
between the hypotheses
H0 : β4 = β5 = 0
H1 : Either or both of β4 and β5 are not equal to 0
if we have α = 0.05 then p = 0.017356 < α and we would reject H0. Should we use a
smaller level of significance such as α = 0.01 then we will have p > α and we will
not reject H0.
Information needed to test whether the joint impact of X4 and X5 is significant:

Redundant Variables: X4 X5
F-statistic            6.248127    Probability    0.017356
Log likelihood ratio   12.97222    Probability    0.001524

Test Equation:
LS // Dependent Variable is Y
Sample: 1968 1983
Included observations: 16

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           195.6043       2308.317      0.084739       0.9339
X1          6.208393       1.905791      3.257646       0.0069
X2          1.495421       0.625402      2.391134       0.0341
X3          -469.7181      198.5697      -2.365507      0.0357

R-squared             0.601254     Mean dependent var      7543.125
Adjusted R-squared    0.501568     S.D. dependent var      1217.152
S.E. of regression    859.3058     Akaike info criterion   13.72457
Log likelihood        -128.4996    F-statistic             6.031455
Durbin-Watson stat    1.581264     Prob(F-statistic)       0.009557
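The two probabilities at the top of this output can be checked by hand. With exactly 2 restrictions both distributions have simple closed-form right tails, so no statistical tables are needed. The sketch below relies on those closed forms, which hold only for m = 2; the function names are hypothetical.

```python
import math

def f_pvalue_two_restrictions(f_stat, df2):
    """Right-tail probability of an F(2, df2) variate (closed form valid only for m = 2)."""
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)

def chi2_pvalue_two_df(lr_stat):
    """Right-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-lr_stat / 2.0)

n, k = 16, 5  # sample size and regressors in the unrestricted model

p_f = f_pvalue_two_restrictions(6.248127, n - k - 1)
p_lr = chi2_pvalue_two_df(12.97222)
print(round(p_f, 6), round(p_lr, 6))
```

Both values match the probabilities reported by EViews, which confirms that the F statistic is referred to the F distribution with (2, 10) degrees of freedom and the likelihood ratio statistic to the chi-square distribution with 2 degrees of freedom.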
5.2 The Linear Restrictions or Wald Test
When we test whether we should exclude one or more independent variables in our
model we are testing linear restrictions such as β4 = 0 and β5 = 0. Our procedure
can also be used to test other more complicated types of linear restrictions. We shall
consider two common applications of this procedure which is also called the Wald
Test.
In the first application we have a cubic total cost function:
TC or Yi = β0 + β1Xi + β2Xi² + β3Xi³ + ui
In this model the dependent variable is the total cost, while the independent variables
are the level of output along with the squared and the cubed values of the level of
output. If we differentiate the cubic total cost function, we obtain as our marginal cost
function the U-Shaped quadratic function which appears in Microeconomic textbooks:
MC = β1 + 2β2Xi + 3β3Xi²
Past experience with cubic Total Cost functions has led many analysts to believe
that the coefficients of Xi² and Xi³ are equal. If we wish to test whether this is the
case we will now have the following hypotheses:
H0 : β2 = β3
and
H1 : β2 ≠ β3
We once again think of the model which is associated with the null hypothesis as the
restricted model. If H0 is correct and β2 = β3, then our cubic total cost function can be
written in the following way:
Yi = β0 + β1Xi + β2Xi² + β3Xi³ + ui
   = β0 + β1Xi + β2Xi² + β2Xi³ + ui
   = β0 + β1Xi + β2(Xi² + Xi³) + ui
So, if β2 = β3, our cubic function can be rewritten as a function which contains 2
rather than 3 independent variables, namely the level of output Xi and the sum of the
squared and cubed levels of output (Xi² + Xi³). This function, which contains only
these two independent variables, is called the restricted model.
We now estimate this model and obtain the residual variation RRSS. We then
estimate the original cubic function or the unrestricted model and obtain the Residual
variation URSS. These values are then substituted into our formula:
$$F_{m,\,n-k-1} = \frac{(RRSS - URSS)/m}{URSS/(n-k-1)}$$
The number of restrictions imposed is m = 1, namely that β2 = β3, and the number of
independent variables k is 3, the number in the unrestricted model.
A second common application arises when we are working with the Cobb–Douglas
function. Although this function was originally used in production studies, it is now
used in a wide range of finance applications particularly in studies of the demand for
money. The formula for this function is:
$$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2} u_i$$
This function is nonlinear in the values of the variables. If, however, we find the
natural logs of both sides of this equation we obtain as our model:
ln Y = ln β0 + β1 ln X1 + β2 ln X2 + ln ui
This model is said to be linear in the logs of the values. The coefficients in this type of
model now represent elasticities. For example, β1 shows the percentage change in Y
when X1 changes by 1% of its current value.
Depending on the type of application, you may be asked to test various different
restrictions upon the values of these coefficients, such as whether they are equal in
magnitude but opposite in sign. This can be expressed in two ways, either as β1 = -β2 or as β1 + β2 = 0.
We can also be asked to test whether these two elasticities sum to 1. In this situation
the hypotheses that we must choose between are:
H0 : β1 + β2 = 1
H1 : This restriction is not justified and β1 + β2 ≠ 1
Here in H0 we have m = 1 restriction, which we can write either as β1 + β2 = 1 or as
β2 = 1 - β1. Once again we say that the model which is associated with H0 is the
Restricted model. We can write the log linear version of this model in the following
way:
ln Yi = ln β0 + β1 ln X1i + β2 ln X2i + ln ui
= ln β0 + β1 ln X1i + (1 - β1) ln X2i + ln ui
= ln β0 + β1 ( ln X1i - ln X2i ) + ln X2i + ln ui
Subtracting ln X2i from both sides we obtain:
ln Yi - ln X2i = ln β0 + β1 ( ln X1i - ln X2i ) + ln ui
The difference between any two logs is equal to the log of their ratio. This means that
the expression on the LHS can be written as
ln Yi - ln X2i
or as
ln ( Yi / X2i )
Similarly the expression on the RHS can be written as
ln X1i - ln X2i
or as
ln (X1i / X2i )
This means that when we impose the restriction that β1 + β2 = 1 our model can be
written as a simple linear model in which the dependent variable is ln (Yi / X2i) and
the independent variable is ln (X1i / X2i) with:
ln ( Yi / X2i ) = ln β0 + β1 ln ( X1i / X2i ) + ln ui
To test this restriction we estimate the original Unrestricted log linear model with 2
independent variables and obtain the residual variation URSS. We then estimate this
Restricted model with just the 1 independent variable and obtain the residual
variation RRSS. The number of restrictions is m = 1 and k = 2 is just the number of
independent variables in the Unrestricted model. These values are now used in our
general formula for F.
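The two regressions can be sketched in plain Python as follows. Everything here is illustrative: the output, labour and capital figures are hypothetical, and ols_rss is a small helper written for this sketch that solves the normal equations directly, not a library routine.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares from OLS of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X) b = X'y written as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

# Hypothetical output (y), labour (x1) and capital (x2) figures.
y = [22.0, 25.1, 29.8, 33.5, 36.9, 41.8, 45.6, 50.3, 54.1, 59.4]
x1 = [10.0, 12.0, 15.0, 18.0, 20.0, 24.0, 27.0, 30.0, 34.0, 38.0]
x2 = [50.0, 54.0, 61.0, 63.0, 70.0, 72.0, 80.0, 85.0, 88.0, 95.0]

ln_y = [math.log(v) for v in y]
ln_x1 = [math.log(v) for v in x1]
ln_x2 = [math.log(v) for v in x2]

# Unrestricted model: ln Y on ln X1 and ln X2 (k = 2 regressors).
urss = ols_rss([ln_x1, ln_x2], ln_y)

# Restricted model (beta1 + beta2 = 1): ln(Y / X2) on ln(X1 / X2).
ln_y_over_x2 = [a - b for a, b in zip(ln_y, ln_x2)]
ln_x1_over_x2 = [a - b for a, b in zip(ln_x1, ln_x2)]
rrss = ols_rss([ln_x1_over_x2], ln_y_over_x2)

n, k, m = len(y), 2, 1
f_stat = ((rrss - urss) / m) / (urss / (n - k - 1))
print("URSS =", urss, "RRSS =", rrss, "F =", f_stat)
```

The restricted model is a special case of the unrestricted one, so RRSS can never fall below URSS and the F statistic is never negative.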
Doing the test in Eviews
When discussing this procedure we will use Example 7.3 on pp 214-217 of Gujarati
(3e). Here we wish to model the level of real GNP (Y). The independent variables
that we use in our production function are
X1 = Labour Input
X2 = Capital Input
The yearly data from 1958 to 1972 is stored in the file GUJT73.XLS in the following
order
Y, X1, X2
The model we use is called the Cobb-Douglas function which is a power function with
$$Y = \beta_0 X_1^{\beta_1} X_2^{\beta_2}$$
(Here to simplify our treatment we omit the error term.) If there are constant returns
to scale in an economy the powers or exponents of the inputs X1 and X2 should
sum to 1. When we test whether there are constant returns to scale we have the
following hypotheses
H0 : β1 + β2 = 1
H1 : This restriction is not justified
While the Cobb-Douglas function is a nonlinear function if we take the natural logs of
both sides of this model we obtain a model which is linear in the logs of the values.
ln Y = ln β0 + β1 ln X1 + β2 ln X2
Before we estimate this model in EViews we must first obtain the sets of log values
for all 3 variables. In the case of the Y variable whose log we call LY we click [Genr]
then enter
LY = LOG(Y)
We do the same for the logs of X1 and X2 which we call LX1 and LX2. We now
estimate the linear relationship between the logs of the values of the variables and
obtain as our SRF
$$\ln \hat{Y} = -3.338455 + 1.498767 \ln X_1 + 0.489858 \ln X_2$$
or
$$\hat{Y} = 0.035492\, X_1^{1.498767} X_2^{0.489858}$$
If we did not have EViews to work with we would call the sum of squared residuals
from this model URSS as in this model the values of the coefficients are unrestricted.
We would then estimate a second model in which the two coefficients were restricted
in the sense that they were assumed to sum to 1 so that
β1 + β2 = 1
and
β2 = 1 - β1
The sum of squared residuals from this restricted model is called RRSS.
To test this hypothesis of constant returns to scale with EViews after we estimate the
equation we now click
View → Coefficient Tests → Wald - Coefficient Restrictions
In the Wald Test dialog box shown below we enter the restriction that we wish to test.
In EViews the coefficients are not written as β0, β1, β2, etc. Instead we write the
estimated intercept or constant as C(1), the coefficient of X1 as C(2) and the
coefficient of X2 as C(3). In this case we write the constraint as C(2) + C(3) = 1.
When we click [OK] we obtain the following output
Wald Test:
Equation: Untitled

Test Statistic    Value       df         Probability
F-statistic       4.344966    (1, 12)    0.0592
Chi-square        4.344966    1          0.0371

Null Hypothesis Summary:
Normalized Restriction (= 0)    Value       Std. Err.
-1 + C(2) + C(3)                0.988625    0.474284

Restrictions are linear in coefficients.
From the EViews output we see that the F statistic has a value of 4.344966 with a p
value of 0.059154. This means that if H0 is correct and β1 + β2 = 1 then the
probability of obtaining an F statistic at least as large as 4.344966 is 0.059154.
If we are using α = 0.05 then we would not reject H0
because we have p > α. When α is set at 0.10 however we would now reject H0 as
we now have p < α.
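The figures in this output are easy to cross-check. With a single linear restriction the Wald F statistic is the square of the t ratio formed from the normalized restriction and its standard error, and the constant in the power form of the SRF is the exponential of the estimated intercept. A short sketch using only the numbers reported above (the variable names are hypothetical):

```python
import math

# Normalized restriction C(2) + C(3) - 1 evaluated at the estimates.
restriction_value = 1.498767 + 0.489858 - 1.0
restriction_se = 0.474284  # "Std. Err." reported in the Wald output

# With one restriction, F equals the squared t ratio of the restriction.
t_ratio = restriction_value / restriction_se
f_stat = t_ratio ** 2

# Constant of the power-function form: exp of the intercept in the log model.
scale = math.exp(-3.338455)

print(round(restriction_value, 6), round(f_stat, 4), round(scale, 6))
```

The restriction value reproduces the 0.988625 shown in the output, the squared t ratio reproduces the F statistic up to rounding, and the exponential of the intercept reproduces the constant 0.035492.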
5.3 The Chow Test
Another important application of this procedure is the Chow test. This is used to test
the structural stability of a model. We use this test when our sample can be divided
into meaningful components such as the values before and after financial
deregulation or the returns from retail and manufacturing firms. We will use the
simple linear regression model to explain how this test is used.
Suppose we are working with time series data and the sample of n observations
can be divided into the n1 observations before some event and the n2 observations
after this event. For example we might be using the n = 20 yearly returns from 1976
to 1995 to estimate the market model for a firm. There are n1 = 11 returns for the
period before, and n2 = 9 returns for the period since, the crash of ‘87.
One analyst thinks that the market model for this firm has not changed since the
crash. We write the model for the complete 20 year period as:
Yi = β0 + β1 Xi + ui        i = 1 ... 20
We call it the restricted model because the coefficient values are restricted in the
sense that they are the same in the two sub-periods before and after the crash.
A second analyst thinks that the market model for this firm changed after the Crash.
His model is written in the following way. For the first n1 = 11 years it is
Yi = a0 + a1 Xi + ui        i = 1 ... 11
and for the final n2 = 9 years it will be
Yi = b0 + b1 Xi + ui        i = 12 ... 20
This is called the Unrestricted model because the intercept and slope are free to take
different values a0, b0, a1 and b1 before and after the crash.
The hypotheses that we choose between are:
H0 : a0 = b0 and a1 = b1
H1 : Either the intercepts, the slopes or both are not equal.
To obtain the URSS we use the Unrestricted model. Here we estimate the separate
regression lines for each sub-period and note the two residual variations RSS1 and
RSS2 for these two equations. Our URSS is just the sum of these separate residual
sums of squares i.e.
URSS = RSS1 + RSS2
The Restricted model is the one in which we say that the intercept and slope are
equal in both periods. To obtain the RRSS we simply estimate the regression line
using all n = 20 observations, and note the sum of squared residuals which are equal
to the RRSS.
The number of constraints in this situation is equal to the number of coefficients that
we need to be equal for the model to be the same in both periods:
m = k+1 = 2
In the denominator the degrees of freedom are now given by the following formula:
df2 = n - 2(k + 1)
whereas in our original formula they were equal to n - (k + 1). This change is
necessary because in the Unrestricted model we are now estimating 2(k + 1)
parameter values for two different equations.
When we are using the Chow test our formula for the F statistic changes from
$$F_{m,\,n-(k+1)} = \frac{(RRSS - URSS)/m}{URSS/(n-k-1)}$$
to
$$F_{k+1,\,n-2(k+1)} = \frac{(RRSS - URSS)/(k+1)}{URSS/(n-2(k+1))}$$
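A minimal sketch of this calculation for the simple regression case (k = 1) is given below. The returns are hypothetical: they are generated so that the intercept and slope change after the first 11 observations, mirroring the n1 = 11, n2 = 9 split used above. The function names are also hypothetical.

```python
def rss_line(x, y):
    """Residual sum of squares from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x, y, n1):
    """Chow F statistic for a break after the first n1 observations (k = 1)."""
    k = 1
    n = len(x)
    rrss = rss_line(x, y)                                     # pooled fit
    urss = rss_line(x[:n1], y[:n1]) + rss_line(x[n1:], y[n1:])  # RSS1 + RSS2
    df1, df2 = k + 1, n - 2 * (k + 1)
    return ((rrss - urss) / df1) / (urss / df2), df1, df2

# Hypothetical market returns (x) and small disturbances.
x = [0.05, -0.02, 0.10, 0.07, -0.04, 0.12, 0.03, -0.06, 0.08, 0.01, 0.09,
     0.04, -0.03, 0.11, 0.06, -0.05, 0.02, 0.13, -0.01, 0.07]
noise = [0.004, -0.003, 0.002, -0.005, 0.001, 0.003, -0.002, 0.004, -0.001,
         -0.003, 0.002, 0.003, -0.004, 0.001, -0.002, 0.005, -0.001, 0.002,
         -0.003, 0.001]
n1 = 11
# Firm returns: one market model before the break, a different one after it.
y = [0.01 + 0.8 * xi + e for xi, e in zip(x[:n1], noise[:n1])] + \
    [0.03 + 1.5 * xi + e for xi, e in zip(x[n1:], noise[n1:])]

f_stat, df1, df2 = chow_f(x, y, n1)
print("Chow F =", f_stat, "with df", (df1, df2))
```

Because the data were built with a genuine break, the resulting F statistic is large; with stable coefficients it would be close to what chance alone produces.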
In order to carry out this test the model is fitted to both sub-periods separately. This
means that we must have enough data in each period to be able to estimate the model
reasonably accurately. If this is not the case, we can use the Chow Forecast test that
will be discussed in Model 6.
You should not try doing the test at every possible date as this will mean you no longer
know what your level of significance is. Rather, pick one or two dates of events that you
think may cause the coefficient values to change, or divide the data set into two or
three equal parts.
Doing the test in Eviews
When discussing this procedure we will use the data in the file DHSY.XLS which
contains quarterly data on gross consumption, income, measured by GDP, and a
price index for the UK economy from 1957 to 1975. First we estimate the equation
Consumption_t = β0 + β1 Income_t + μ_t
over the whole period. Then we click
View → Stability tests → Chow Breakpoint test.
In the dialogue box which follows we enter the date of the start of the second period.
In this case we will use the first quarter of 1971, which we write as 1971Q1 or 1971:1
We then get the following output
Chow Breakpoint Test: 1971Q1
F-statistic            3.870280    Probability    0.025323
Log likelihood ratio   7.760536    Probability    0.020645

So our test is written as
H0: The model does not change over the period
H1: The model is different after 1971
Level of Significance: α = 0.05
Test Statistic: Breakpoint Chow Test
p-value: p = 0.025323
Conclusion: Reject H0. The model is different after 1971.
6. DUMMY VARIABLES
6.1 Introduction
In many situations the variable whose behaviour we wish to model is influenced by
factors that are difficult to quantify. An obvious example is where we wish to model
the impact of some major change such as an alteration in the way the government
influences the interest rate. We call studies of the impact of these changes event
studies. When we are attempting to model 'retail sales' we know that they will be
higher during the quarter in which the major religious festival occurs. If we want to
include factors such as the time of year, the sex of the buyer or the socio-economic
group a person belongs to in our model, we must first create appropriate dummy
variables.
If we are looking at quarterly retail sales in economies which celebrate Christmas the
dummy variable we would construct would consist of a column of values in which we
have a 1 in every December quarter and a 0 in each of the other three quarters. For
economies which celebrate Chinese New Year the dummy variable will have a 1 in
the first quarter and 0’s in the other three quarters. When the sex of the buyer is a
relevant factor then we will have a column of values in which we have a 1 for female
buyers and 0 for male buyers.
When we wish to model the impact of the time of the year, if we have quarterly data
there are four different dummy variables we could have:
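$$Q_1 = \begin{cases} 1 & \text{March quarter} \\ 0 & \text{otherwise} \end{cases}$$
$$Q_2 = \begin{cases} 1 & \text{June quarter} \\ 0 & \text{otherwise} \end{cases}$$
$$Q_3 = \begin{cases} 1 & \text{September quarter} \\ 0 & \text{otherwise} \end{cases}$$
$$Q_4 = \begin{cases} 1 & \text{December quarter} \\ 0 & \text{otherwise} \end{cases}$$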
In our model, however, we never include all four dummy variables. If we did we would
have a problem with exact multicollinearity as the sum of all four quarterly dummy
variables is a column with 1’s in every position. When we estimate a model which
contains a constant or intercept term the column of values for the intercept is just a
column of 1’s. In this situation we would be unable to obtain any OLS estimates of
the parameters. This is why we can at most use 3 out of 4 quarterly dummy variables
or 11 out of 12 monthly dummy variables. (Using all four dummy variables for the four
quarters is called falling into the dummy variable trap.)
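The dummy variable trap is easy to see numerically. The sketch below builds the four quarterly dummies for three hypothetical years of data and shows that together they add up to the intercept column, while any three of them do not.

```python
# Quarter number for three years of hypothetical quarterly observations.
quarters = [1, 2, 3, 4] * 3

# One additive dummy per quarter: 1 in that quarter, 0 otherwise.
dummies = {q: [1 if obs == q else 0 for obs in quarters] for q in (1, 2, 3, 4)}

# The column of values associated with the intercept.
intercept_column = [1] * len(quarters)

# Summing all four dummies reproduces the intercept column exactly.
sum_all_four = [sum(dummies[q][i] for q in (1, 2, 3, 4))
                for i in range(len(quarters))]
print(sum_all_four == intercept_column)

# Dropping one dummy (here Q4) removes the exact linear dependence.
sum_three = [sum(dummies[q][i] for q in (1, 2, 3))
             for i in range(len(quarters))]
print(sum_three == intercept_column)
```

The first comparison is true and the second is false, which is why a model with an intercept can hold at most three of the four quarterly dummies.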
6.2 Creating Dummy Variables and Time Trends in Eviews
We can also use EViews to generate both Dummy Variables and Time Trends. To
see how this is done suppose we have set up a workfile in which we have entered
quarterly data for the period from 1978:1 to 1994:4 i.e. from the first quarter of 1978
to the last quarter of 1994. To generate a dummy variable D3 with 1's in the third
quarter of each year and 0's in other quarters we click the [Genr] button then enter
the equation
D3 = @SEAS(3)
If we are working with the monthly data that we have used to estimate the Market
Model then when testing for the ‘January Effect’ we might want to use a Dummy
Variable DJAN which has a value of 1 in January i.e. month 1 of each year and 0’s
elsewhere. To generate this variable we click the [Genr] button and then enter
DJAN = @SEAS(1)
If we want a dummy variable D87 to model the impact of the October ’87 Crash with
zeros before October of 1987 and 1’s from October 1987 onwards, we click [Genr]
and enter
D87 = 0
The column D87 contains 0’s in every month. To make these 1 from October 1987
onwards we click [Genr], change the sample from
1984:01 1993:12
to
1987:10 1993:12
and enter
D87 = 1
The column D87 will now contain values of 0 up to September 1987 and values of 1
from October 1987 onwards.
If we want a dummy variable DE87 to model the impact of the October ’87 Crash with
a 1 in October of 1987 and 0’s elsewhere we click [Genr] and enter
DE87 = 0
The column DE87 contains 0’s in every month. To place a 1 in October 1987 we
could change the sample as we did before to 1987:10 1987:10 and enter DE87 = 1.
I find it easier to open the series, click on Edit+/-, scroll through to October 1987 and
type in a 1. Make sure you hit enter, then click on Edit+/- again.
The column DE87 will now contain a value of 1 in October 1987 and 0 values
elsewhere.
These variables that we have generated are examples of what are called additive
dummy variables. The estimated coefficients of additive dummy variables show the
changes in the intercept when the quality which the dummy variable represents is
present.
A multiplicative dummy variable is equal to the product of the additive dummy
variable and the relevant independent variable. The coefficient of the multiplicative
dummy variable shows the impact on the slope when the quality which the dummy
variable represents is present. To obtain the multiplicative dummy MD87 associated
with the October ’87 crash and the independent variable in the Market Model AGSMV
we click [Genr] and enter
MD87 = D87*AGSMV
A Trend Variable is one which contains the values 1, 2, 3, …. n associated with the
periods in the sample. Suppose we want to obtain a trend variable TR which consists
of the set of values 1, 2, 3, …. 15 for the 15 year period from 1958 to 1972. We
would click [Genr] and then enter
TR = @TREND(1957)
You will note that in the TREND function we have the previous year 1957 rather than
the year 1958 in which our data starts. The reason we do that is that EViews gives
the year we enter a value of 0. If we want 1958 to have a value of 1 then we must
use 1957 as the starting date.
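For readers working outside EViews, the same series can be built in a few lines of Python. This is only a sketch of the logic, not a reproduction of the EViews functions: the series names follow the ones used above, and the values standing in for the market return AGSMV are placeholders.

```python
# Monthly sample 1984:01 to 1993:12 as (year, month) pairs.
periods = [(year, month) for year in range(1984, 1994) for month in range(1, 13)]

# DJAN: 1 in January, 0 elsewhere (the role played by @SEAS(1) with monthly data).
djan = [1 if month == 1 else 0 for _, month in periods]

# D87: 0 before October 1987, 1 from October 1987 onwards.
d87 = [1 if (year, month) >= (1987, 10) else 0 for year, month in periods]

# DE87: 1 in October 1987 only, 0 elsewhere.
de87 = [1 if (year, month) == (1987, 10) else 0 for year, month in periods]

# Placeholder values standing in for the market return series AGSMV.
agsmv = [0.01 * ((i % 5) - 2) for i in range(len(periods))]

# MD87: multiplicative dummy, the product of D87 and the independent variable.
md87 = [d * r for d, r in zip(d87, agsmv)]

# TR: trend 1, 2, ..., 15 for the annual sample 1958 to 1972, so that
# 1957 would correspond to a value of 0.
tr = [year - 1957 for year in range(1958, 1973)]

print(sum(djan), sum(d87), sum(de87), tr[0], tr[-1])
```

Counting the 1's is a quick check that each dummy switches on where intended: 10 Januaries, 75 months from October 1987 to December 1993, and a single month for the impulse dummy.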
6.3 Interpreting the Coefficients of Dummy Variables
Consider the problem faced by a firm that manufactures a product whose sales rise
significantly in the December quarter. The accountant thinks sales depend on income
and this December or Christmas effect. The PRF which she plans to use when
forecasting sales (Y) has two independent variables namely income (X) and what we
call an additive dummy variable (D1). We write this PRF in the following way:
E(Y| Xi) = β0 + β1 Xi + β2 D1i
This dummy variable is our dummy variable for the fourth quarter or Q4. When she
uses this model she is really saying that we need two different models to explain the
sales of the product. In the December quarter when Q4 or D1 = 1, the model we use
is
E(Y| Xi) = β0 + β1 Xi + β2 (1)
= (β0 + β2) + β1 Xi
In the other quarters D1 = 0, so our model will be:
E(Y| Xi) = β0 + β1 Xi + β2 (0)
= β0 + β1 Xi
From these two models we see that including an additive dummy variable
changes the intercept in the relevant quarter by an amount equal to the coefficient
of the dummy variable.
The additive dummy variable in this model could be used if we think that sales
increase by β2 in the December quarter regardless of the level of income.
An alternative approach is to think of Christmas as having an effect on β1, the slope
or coefficient of income, rather than on the intercept β0. That is, we think that
Christmas affects the proportion of income spent rather than the amount of income
spent. To model this type of effect we define a variable (D1i Xi) which is the product
of our dummy variable and the independent or income variable. We call this a
multiplicative dummy variable and we write our model as:
E(Y| Xi) = β0 + β1 Xi + β2 (D1i Xi)
During the December quarter we have D1i = 1 so that we now write our model as:
E(Y| Xi) = β0 + β1 Xi + β2 (1 Xi)
= β0 + (β1 + β2) Xi
In the other quarters D1i = 0 and now our model will be:
E(Y| Xi) = β0 + β1 Xi + β2 (0 Xi)
= β0 + β1 Xi
From this we see that the coefficient of the multiplicative dummy (D1i Xi) shows the
impact of the December quarter on the slope β1, that is, on the proportion of income
we spend.
If you think that Christmas affects spending in both ways (that is, it increases the
basic level of sales β0 and the proportion of income spent β1) then we would include
an additive and a multiplicative dummy variable, in which case our model is:
E(Y| Xi) = β0 + β1 Xi + β2 D1i + β3 (D1i Xi)
In this model the coefficient of the additive dummy β2 would show the impact on the
intercept or general level of spending and the coefficient of the multiplicative dummy
β3 would show the impact on the slope or proportion of income spent in the
December quarter.
6.4 Comparing Regression Models
Earlier, we saw that the Chow test could be used to determine whether there is a
stable regression model for the whole sample. We can also use dummy variables to
carry out this same procedure more effectively. Suppose we have a sample of n
quarterly observations on the sales of car air conditioners. This sample can be
divided into two groups. The first n1 values are the sales before the government
banned the use of harmful gases, and the second n2 values are the sales after the
ban. To represent the impact of the government legislation we use the following
dummy variable:
$$D_2 = \begin{cases} 0 & \text{for the } n_1 \text{ values before the ban} \\ 1 & \text{for the } n_2 \text{ values after the ban} \end{cases}$$
In our standard model of car air conditioner sales we use 'per capita income' as our X
or independent variable. If we want to test whether the government legislation has
changed the nature of our model,
E(Y| Xi) = β0 + β1 Xi
we now estimate the model which contains both types of dummy variable:
E(Y| Xi) = β0 + β1 Xi + β2 D2i + β3 (D2i Xi)
When we used the Chow test we were able to choose between the following
hypotheses:
H0 : a0 = b0 and a1 = b1
H1 : Either the intercepts, the slopes or both are not equal.
If our sample evidence supported H1, this told us that the parameter values were
different but this test did not make it possible to determine just where the differences
lay. If we use the model with the additive and multiplicative dummy variables we can
now determine where the differences lie by performing three different tests.
TEST 1
Suppose we want to test whether the model is different because the intercept has
changed; that is, because the impact of other variables on sales is different. We
estimate the SRF:
$$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + \hat{\beta}_2 D_{2i} + \hat{\beta}_3 (D_{2i} X_i)$$
and use the 't' statistic for β̂2 to choose between:
H0 : β2 = 0
and
H1 : β2 ≠ 0
If we choose H1 : β2 ≠ 0 this implies that the legislation has changed the intercept
from β0 to β0 + β2.
TEST 2
To test whether the model is different because the slope has changed; that is,
because the impact of income on sales is different, we use the t statistic for β̂3 to
choose between:
H0 : β3 = 0
and
H1 : β3 ≠ 0
If we choose H1 : β3 ≠ 0 this implies that the legislation has changed the slope from
β1 to β1 + β3.
We can if we want to, test whether there is a specific change in the size of a
parameter. For example, to test whether the change in the intercept is less than 2,
we would use the following hypotheses:
H0 : β2 = 2
and
H1 : β2 < 2
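When choosing between these hypotheses the ‘t’ statistic we use is:
t(n-k-1) = (β̂2 - β2) / se(β̂2) = (β̂2 - 2) / se(β̂2)
where se(β̂2) is the estimated standard error of β̂2 reported in the regression output
and, with k = 3 independent variables, there are n - 4 degrees of freedom. If we do
not reject H0 : β2 = 2 the sample evidence is consistent with the legislation having
increased the intercept by 2.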
TEST 3
The third test we can perform is similar to the Chow test. If we wish to determine
whether the model has changed because of changes in both the intercept and the
slope, we now use the F test for linear restrictions.
Our restricted model when we use the Chow test or dummy variables is the model in
which the parameter values are the same over the whole sample. The unrestricted
model is the model which contains both types of dummy variable:
E(Y| Xi) = β0 + β1 Xi + β2 D2i + β3 (D2i Xi)
In this model there are k = 3 independent variables Xi, D2i and (D2i Xi).
Our restricted model is the model in which the coefficients β2 and β3 for the two types
of dummy variables D2i and (D2i Xi) are both set to 0 so:
E(Y| Xi) = β0 + β1 Xi
and hence there are m = 2 restrictions. We obtain the SRFs for both models and use
them to obtain the RRSS and the URSS. These values are used in our formula for F:
F(m, n-(k+1)) = [(RRSS - URSS) / m] / [URSS / (n - k - 1)]
where in this example we will have:
F(2, n-4) = [(RRSS - URSS) / 2] / [URSS / (n - 4)]
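The three tests can be sketched in a few lines of Python. This is only an illustration
under stated assumptions: the data are simulated (they are not car air conditioner
sales), the helper name ols is hypothetical, and NumPy is used in place of EViews.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, their standard errors and the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    rss = float(resid @ resid)
    dof = X.shape[0] - X.shape[1]                 # n - (k + 1)
    cov = rss / dof * np.linalg.inv(X.T @ X)
    return coef, np.sqrt(np.diag(cov)), rss

# Simulated data (an assumption for illustration): 40 quarters before the ban, 40 after.
rng = np.random.default_rng(0)
n1, n2 = 40, 40
n = n1 + n2
x = rng.uniform(10, 20, size=n)                   # stands in for per capita income
d2 = np.r_[np.zeros(n1), np.ones(n2)]             # additive dummy D2
y = 5 + 2 * x + 3 * d2 + 1.5 * d2 * x + rng.normal(0, 1, size=n)

ones = np.ones(n)
X_u = np.column_stack([ones, x, d2, d2 * x])      # unrestricted model, k = 3
X_r = np.column_stack([ones, x])                  # restricted model, beta2 = beta3 = 0

coef_u, se_u, urss = ols(X_u, y)
_, _, rrss = ols(X_r, y)

t_stats = coef_u / se_u                           # TEST 1 uses t_stats[2], TEST 2 uses t_stats[3]
m, k = 2, 3
f_stat = ((rrss - urss) / m) / (urss / (n - k - 1))   # TEST 3
print("t(beta2) =", t_stats[2], " t(beta3) =", t_stats[3], " F =", f_stat)
```

The t statistics would be compared with critical values from the t distribution with
n - 4 degrees of freedom, and the F statistic with the F(2, n - 4) distribution.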
6.5 An Event Study
In the previous section we have looked at using dummy variables to examine the
stability of a model. In this section we apply these techniques to examine the effect
of the World Trade Center disaster on the market model for Boeing, the aircraft
manufacturing company. In addition we look at a type of dummy variable, called an
event dummy, which can be used to measure the effect of a single data point.
For this study I have used daily closing prices sourced from Datastream. The data
starts on the 1st June 2001 and ends on the 23rd November 2001. The S&P 500
index has been used to proxy for the market.
[Figure: scatter plot of RBOEING (vertical axis) against RMKT (horizontal axis)]
You should notice one very large negative return in the bottom left hand corner of the
graph. This occurred on the day the New York Stock Exchange re-opened following
the disaster. A further examination of the data set shows zero returns from the 11th
September to the 14th September inclusive. This is the period the exchange was
closed. I have therefore estimated the model without these zero return days. The
Eviews output is given below.
Dependent Variable: RBOEING
Method: Least Squares
Sample(adjusted): 6/01/2001 9/11/2001 9/15/2001 11/23/2001
Included observations: 124 after adjusting endpoints
Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                    -0.003381     0.002220     -1.523200     0.1303
RMKT                  1.697307     0.180181      9.419988     0.0000

R-squared             0.421077     Mean dependent var      -0.004582
Adjusted R-squared    0.416332     S.D. dependent var       0.032302
S.E. of regression    0.024678     Akaike info criterion   -4.549776
Sum squared resid     0.074301     Schwarz criterion       -4.504288
Log likelihood        284.0861     F-statistic              88.73618
Durbin-Watson stat    1.834838     Prob(F-statistic)        0.000000
An examination of this output shows the following points:
• The model explains a significant amount of variation, as evidenced by the p-value
  of the F-statistic,
• The R2 is approximately 42%, which is quite good for this type of model,
• The Durbin-Watson statistic shows no significant first-order serial correlation and
• The mean return for Boeing over the period was negative.
If we carry out a t-test on the slope we see it is significantly greater than one: the
test statistic is (1.697307 - 1) / 0.180181 ≈ 3.87.
Now re-estimate the model with dummy variables. The first, DA, contains zeros up
until the 14th September and ones from the 15th September onwards. The other,
DDAY, contains zeros everywhere except the 15th September, where it has a one. I
have also included a multiplicative dummy variable for DA. I have not included a
multiplicative dummy for DDAY as this would give exact multicollinearity: DDAY*RMKT
is simply DDAY multiplied by the market return on that one day.
The output is given below.
Dependent Variable: RBOEING
Method: Least Squares
Sample(adjusted): 6/01/2001 9/11/2001 9/15/2001 11/23/2001
Included observations: 124 after adjusting endpoints
Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                    -0.003432     0.002653     -1.293637     0.1983
RMKT                  0.878987     0.258876      3.395404     0.0009
DA                   -0.000268     0.004105     -0.065342     0.9480
DA*RMKT               0.924153     0.352788      2.619571     0.0100
DDAY                 -0.099192     0.025621     -3.871479     0.0002

R-squared             0.546852     Mean dependent var      -0.004582
Adjusted R-squared    0.531620     S.D. dependent var       0.032302
S.E. of regression    0.022107     Akaike info criterion   -4.746340
Sum squared resid     0.058159     Schwarz criterion       -4.632619
Log likelihood        299.2731     F-statistic              35.90186
Durbin-Watson stat    1.949414     Prob(F-statistic)        0.000000
As before the model explains a significant amount of variation and the adjusted R2
has improved. Let us now consider each of the dummy variables.
The additive dummy variable, DA, tells us the change in the intercept. The p-value of
0.95 tells us this change is not significantly different from zero, that is, it could
easily be caused by sampling errors. As the intercept in the market model measures the
average return on Boeing that is not explained by movements in the market, this study
has not detected any change in this component of returns after the disaster.
The multiplicative dummy tells us the change in the slope. The p-value of 0.0100
tells us this change is statistically significant. The beta, or the systematic risk of
Boeing, has increased dramatically from 0.88 before the disaster to 1.80 afterwards.
The event dummy, DDAY, tells us if there was an immediate exceptional one day
return, in addition to the change in the systematic risk captured by the multiplicative
dummy variable. It is obvious from the scatter plot we looked at earlier that the return
for Boeing on that day was exceptional, but so was the return on the market as a
whole on the day the market re-opened. The coefficient of this dummy variable
measures how well, or in this case badly, Boeing does after the effects of the general
downturn have been removed. In this case the p-value shows this is significantly
different from zero. On the day the market re-opened Boeing had an abnormal
negative return compared to the market as a whole of about 10%.
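The figures quoted in this discussion can be checked directly from the two outputs.
The short Python sketch below only re-does that arithmetic with the reported
coefficients and standard errors; the variable names are my own.

```python
# Coefficients and standard errors copied from the two EViews outputs above.
beta_full, se_full = 1.697307, 0.180181       # RMKT in the model without dummies
beta_pre = 0.878987                           # RMKT in the model with dummies
beta_shift, se_shift = 0.924153, 0.352788     # DA*RMKT
dday, se_dday = -0.099192, 0.025621           # DDAY

t_slope_vs_one = (beta_full - 1.0) / se_full  # H0: slope = 1 against H1: slope > 1
beta_post = beta_pre + beta_shift             # slope after the event
t_shift = beta_shift / se_shift               # H0: no change in the slope
t_dday = dday / se_dday                       # H0: no abnormal return on the event day

print(round(t_slope_vs_one, 2), round(beta_post, 2), round(t_shift, 2), round(t_dday, 2))
```

The last two ratios reproduce the t-Statistic column of the second output, which is
a useful check that the table has been read correctly.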
A second, less rigorous, approach that is often used is to calculate the cumulative
abnormal returns, CARs. First we estimate a market model, like the first in this
section, without any dummy variables. The estimated beta tells us the expected
stock return given a market return on any given day. The difference between the
expected and the actual return is the abnormal return. An easy way to estimate these
abnormal returns is to take the residuals from the previously estimated model.
Sometimes even this level of sophistication is not used and the abnormal return is
just taken to be the return on the share minus the market return.
If we add these abnormal returns up over time we get the CARs, which can be
plotted and our interest lies in finding any informative patterns. Normally, we would
expect the CARs to fluctuate randomly about the value zero. An increasing CAR
suggests the stock is outperforming the market and vice versa. I have done this for
the first model I estimated in this section and produced the graph given below.
[Figure: CAR for Boeing plotted over time, from 01:06 (June 2001) to 01:11 (November 2001)]
We can see the CARs follow a rising pattern through July and August, then fall
rapidly in September. This suggests that the market did not anticipate the events of
September, which we know to be true. By the end of September the CARs seem to
have settled down into fairly steady behaviour, suggesting the impact of the attack
was quickly priced into Boeing shares and its returns have been consistent with the
market model since that time. Thus, the effect of the September events can be
clearly seen using these CARs, although a disadvantage of this method is that it does
not allow for the same sort of test procedures we were able to apply with the dummy
variables.
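The CAR calculation described above can be sketched as follows. This is a minimal
Python illustration, assuming simulated daily returns rather than the Boeing data; the
function name cumulative_abnormal_returns is hypothetical.

```python
import numpy as np

def cumulative_abnormal_returns(r_stock, r_mkt):
    """Fit the market model by OLS and return (alpha, beta, CAR series)."""
    X = np.column_stack([np.ones_like(r_mkt), r_mkt])
    (alpha, beta), *_ = np.linalg.lstsq(X, r_stock, rcond=None)
    abnormal = r_stock - (alpha + beta * r_mkt)   # residuals = abnormal returns
    return alpha, beta, np.cumsum(abnormal)

# Simulated daily returns, for illustration only (not the Boeing data).
rng = np.random.default_rng(1)
r_mkt = rng.normal(0.0, 0.01, size=120)
r_stock = 0.0005 + 1.2 * r_mkt + rng.normal(0.0, 0.015, size=120)

alpha, beta, car = cumulative_abnormal_returns(r_stock, r_mkt)
print(f"beta = {beta:.3f}, final CAR = {car[-1]:.2e}")
```

Because OLS residuals from a model with an intercept sum to zero, a CAR built from
residuals estimated over the same sample always returns to zero at the last
observation. It is the path in between that is informative.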
7. FINANCIAL APPLICATIONS
In this section we will look at some further applications of Financial Econometrics. In
most of these applications we will use the Multiple Regression model.
The first type of application that we will look at is the set of procedures used to test
the CAPM. We will look at the general theory behind these tests and two of the basic
techniques that have been used to test this theory. We will then look at other
variables which have been included in regression models which seem to give better
forecasts of returns than the CAPM does. While using these other variables, such as
size and the price to book ratio, gives superior forecasts, it has not been possible to
find a theoretical justification for using these variables.
The Arbitrage Pricing Theory (APT) model is a multifactor model which is based on
simpler assumptions than the CAPM, namely that there are no arbitrage
opportunities available in financial markets. In other words, assets or portfolios of
assets with the same level of risk must have the same level of returns. We will look at
this model and one of the procedures which is used to estimate and test the APT
model.
Our final application of the Multiple Regression model looks at how we can use the
Treynor-Mazuy (TM) model to examine the market-timing and the stock-selection
performance of Portfolio managers.
7.1 Testing the CAPM
If we assume that investors are only concerned with Risk and Expected Returns and
the risk associated with any asset can be measured by the standard deviation of the
returns from that asset then we can use mean-variance analysis to select portfolios
of assets which have the maximum returns for a given level of risk. If we also assume
that the financial markets are frictionless or very competitive then it can be shown
that for any set of assets the optimal combinations of risk and returns for different
portfolios of these assets are shown on the efficient frontier.
If we make a further assumption that there is some riskfree asset for which the rate of
return is the riskfree rate of return Rf then it can now be shown that the optimal
combinations of risk and returns for different portfolios of these assets are shown on
the linear function which intercepts the vertical axis at the riskfree rate of return Rf
and which just touches or is tangential to the efficient frontier. In the capital asset
pricing model (CAPM) it is also assumed that investors have the same beliefs about
the risks and returns for these assets. If the tangency portfolio is the portfolio
associated with the point at which this linear function touches the efficient frontier
then the CAPM says that this tangency portfolio is the Market Portfolio i.e. the
portfolio in which the asset shares for the portfolio are the same as the asset shares
for the whole market.
In the CAPM this linear function is called the Capital Market Line (CML) and we
write it in the following way
E(RP) = β0 + β1 SD(RP)
      = Rf + {[E(RM) - Rf] / SD(RM)} SD(RP)
You should note that the terms in this expression can be interpreted as follows
1. The dependent variable is the Expected or Equilibrium returns on the portfolio
E(RP).
2. The independent variable is the measure of total risk, the standard deviation of
portfolio returns SD(RP). The P subscript indicates that this is a measure for any
Portfolio.
3. The [E(RM) - Rf] term is the market risk premium. The M subscript indicates that
this is a measure for the Market Portfolio.
The parameters in this model are as follows:
4. The intercept or β0 term is equal to the Risk-free rate of return Rf.
5. The slope or β1 term, [E(RM) - Rf] / SD(RM), shows the impact on Equilibrium
returns on the portfolio E(RP) of a unit change in the standard deviation of portfolio
returns SD(RP). This slope also can be interpreted as the Market Price of Risk as it
shows the amount of Market Risk Premium for each unit of Market risk.
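As a small numerical illustration of the CML, the Python sketch below evaluates the
line for a few levels of portfolio risk. Only Rf = 11% comes from the text; the values
E(RM) = 17% and SD(RM) = 20% are hypothetical, as is the function name.

```python
def cml_expected_return(rf, exp_rm, sd_rm, sd_rp):
    """Expected return on an efficient portfolio with total risk sd_rp on the CML."""
    market_price_of_risk = (exp_rm - rf) / sd_rm   # slope of the CML
    return rf + market_price_of_risk * sd_rp

# Hypothetical inputs: Rf = 11% (as in the text), E(RM) = 17%, SD(RM) = 20%.
rf, exp_rm, sd_rm = 0.11, 0.17, 0.20
for sd_rp in (0.0, 0.10, 0.20):
    print(sd_rp, round(cml_expected_return(rf, exp_rm, sd_rm, sd_rp), 4))
```

With zero risk the portfolio earns Rf, and with the same risk as the market it earns
E(RM), which are the two points that pin down the line.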
When we use the returns for the 14 large Australian firms in the file BFODAT then
the efficient frontier and capital market line when Rf = 0.11 or 11% is as shown
below.
[Figure: The CML and Efficient frontier for the firms in BFODAT]
The CML shows the theoretical relationship between the risk and the returns for a
portfolio. At the tangency point it is possible to show that the returns on each asset
are a linear function of the systematic or market risk for that asset. This linear model
is called the security market line (SML) and we write it in the following way
E(Ri) = Rf + [E(RM) - Rf] βi
You should note that the terms in this expression can be interpreted as follows
1. The dependent variable is the Expected or Equilibrium returns on that asset E(Ri).
2. The independent variable is the measure of systematic risk we call beta βi for this
asset and the market portfolio.
3. The intercept or β0 term is equal to the Risk-free rate of return Rf.
4. The slope or β1 term is [E(RM) - Rf] or the market risk premium. This now shows
the impact on Equilibrium returns on the asset E(Ri) of a unit change in the
systematic risk.
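The SML can be illustrated in the same way. In the Python sketch below beta is
computed as the covariance of the asset with the market divided by the variance of
the market, which is the OLS slope of the market model. The return series and the
17% expected market return are made-up values, and the function names are my own.

```python
import numpy as np

def beta(r_asset, r_mkt):
    """Systematic risk: cov(Ri, RM) / var(RM), the OLS slope of the market model."""
    return np.cov(r_asset, r_mkt, ddof=1)[0, 1] / np.var(r_mkt, ddof=1)

def sml_expected_return(rf, exp_rm, beta_i):
    """E(Ri) = Rf + [E(RM) - Rf] * beta_i."""
    return rf + (exp_rm - rf) * beta_i

# Made-up returns, for illustration only.
r_mkt = np.array([0.010, -0.020, 0.030, 0.000, 0.015])
r_asset = 0.001 + 2.0 * r_mkt                  # an asset that moves twice as much as the market

b = beta(r_asset, r_mkt)
print(b, sml_expected_return(0.11, 0.17, b))
```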
It is obvious that the assumptions upon which the CAPM is based do not
completely agree with what happens in actual financial markets. We need to note
that
1. Investors do not look at just risk and returns and they do not all use the standard
deviation of returns as the only measure of risk.
2. Markets do have some imperfections, i.e. they are not frictionless.
3. Not all investors have the same expectations about future risks and returns for all
assets.
While these assumptions do not always hold the results provided by the CAPM may
be accurate enough to use in various applications. This is why we need to test
whether the predictions provided by the CAPM are accurate enough for investors
who need to model the returns on assets. Before we look at two of the basic
procedures that are used to test the CAPM we need to briefly consider the critique of
these testing procedures which was put forward by Roll.
In a 1977 paper Roll argued that it was not possible to test the CAPM. The
underlying reason why we could not do so was because we do not know what the
returns on the Market Portfolio are actually equal to. The best that we can do is
to find the returns on some proxy for the market such as the All Ordinaries Index in
Australia, the Hang Seng Index in Hong Kong or the S & P 500 Index in the US. The
argument that tests which are based on these proxies can lead to incorrect
conclusions can be summarized in the following way.
1. For any set of assets it is always the case that there must exist some portfolio for
which the expected returns are a linear function of beta (i.e. a measure of
systematic risk with respect to this portfolio). The portfolio must be mean-variance
efficient or lie on the efficient frontier.
2. Even if the CAPM is incorrect, if the proxy portfolio we have chosen lies on the
efficient frontier we will find that the expected returns are linear functions of the
betas. Here the test will incorrectly support the CAPM.
3. If the CAPM is correct but we happen to choose a proxy portfolio which does not
lie on the efficient frontier then the expected returns will not be linear functions of
the betas. Now the test will incorrectly reject the CAPM.
7.1.1 The Fama-MacBeth Approach
To explain how the Fama-MacBeth or two-pass approach to testing the CAPM
works we shall use the following example. Consider the set of returns for 14 large
firms that are stored in the file BFORET.XLS in columns C to P in rows 2 to 121. We
want to use the AGSM Value weighted index as the proxy for the Market portfolio.
The returns on this index are stored in column R. The procedure that we use is as
follows
1. At the first step or pass we calculate the systematic risk or betas for all 14 firms
with respect to the AGSM Value weighted index. We do this by estimating the market
model
Rit = β0 + β1 RMt
for all 14 firms and noting the estimated values of the slopes. While it is usually a
good idea to use EViews to perform most regression calculations, in this case it is
easier to use Excel. To find all the betas we go to cell C126 and enter
=LINEST(C2:C121,$R2:$R121)
We now highlight C126 and drag it to cell P126.
2. At the second step or pass we estimate the relationship between the expected or
average returns for all firms and their betas or measures of systematic risk. While
we could estimate the simple linear regression model
R̄j = γ0 + γ1 β̂j + uj
usually we estimate a multiple regression model in which beta, beta squared and
some other key factors or characteristics (CHAR) are included as independent
variables. We write this model in the following way
R̄j = γ0 + γ1 β̂j + γ2 β̂j² + γ3 CHAR + uj
The CHAR variable might be the variance of the error terms in the market models
for the different firms. This variance can be seen as a measure of the nonsystematic variance or risk that is present. With this model we can conduct several
tests
a. If γ3 is significantly different from 0 this means that nonsystematic risk has
a significant impact on expected returns.
b. If γ2 is significantly different from 0 this means that there is a nonlinear
relationship between the expected returns and systematic risk or betas.
c. Once we establish that γ3 and γ2 are not significantly different from 0 then
we can test whether γ1 is different from 0. If it is, then we can conclude that
there is a significant linear relationship between expected returns and the
measure of systematic risk that we call beta.
3. To find the average returns for each firm R̄j, the squared beta values and the
variances of the error terms for each market model we use the following Excel
procedures.
a. To find all 14 average returns go to cell C125 and enter
=AVERAGE(C2:C121)
We then click this cell and drag it to P125.
b. To find the squares of the beta values go to cell C127 and enter
=C126^2
We then click this cell and drag it to P127.
c. To find the variances of the error terms in all 14 Market Models we use the
STEYX function to find their standard deviation and then we square this
value. When we use the STEYX function we enter the Y values and then
the X values. To do this go to cell C128 and enter the following
=STEYX(C2:C121,$R2:$R121)^2
We then click this cell and drag it to P128.
Once you have generated these values you can estimate the multiple regression by
reading these values into EViews and using Eviews to estimate the model.
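For readers who prefer a single script to the Excel and EViews steps, the two passes
can be sketched in Python as below. This is an illustration under stated assumptions:
the returns are simulated stand-ins for BFORET.XLS, the function names first_pass and
second_pass are hypothetical, and only coefficients (no standard errors) are returned.

```python
import numpy as np

def first_pass(returns, r_mkt):
    """Per-firm average return, beta, beta squared and residual variance.

    returns: (T, N) array of firm returns; r_mkt: (T,) array of index returns.
    Mirrors AVERAGE, LINEST (slope) and STEYX^2 from the spreadsheet steps.
    """
    T = returns.shape[0]
    X = np.column_stack([np.ones(T), r_mkt])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)   # shape (2, N)
    resid = returns - X @ coef
    resid_var = (resid ** 2).sum(axis=0) / (T - 2)       # STEYX squared
    betas = coef[1]
    return returns.mean(axis=0), betas, betas ** 2, resid_var

def second_pass(avg_ret, betas, betas_sq, resid_var):
    """Cross-sectional regression of average returns on beta, beta^2 and CHAR."""
    Z = np.column_stack([np.ones_like(betas), betas, betas_sq, resid_var])
    gamma, *_ = np.linalg.lstsq(Z, avg_ret, rcond=None)
    return gamma                                         # gamma0, gamma1, gamma2, gamma3

# Simulated stand-in for the spreadsheet: 120 months, 14 firms (not the real data).
rng = np.random.default_rng(2)
r_mkt = rng.normal(0.01, 0.05, size=120)
true_betas = np.linspace(0.5, 1.5, 14)
returns = 0.002 + np.outer(r_mkt, true_betas) + rng.normal(0, 0.04, size=(120, 14))

avg_ret, betas, betas_sq, resid_var = first_pass(returns, r_mkt)
gamma = second_pass(avg_ret, betas, betas_sq, resid_var)
print(gamma)
```

With only 14 cross-sectional observations the second-pass estimates are imprecise,
which is worth keeping in mind when reading the output that follows.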
USING EVIEWS
To estimate the multiple regression model using Eviews we use the following
procedure
1. We open a new workfile by clicking
File → New → Workfile
When the Workfile Create dialog box shown below appears we note that the
values for the 14 firms are Unstructured/Undated and that there are 14 of them.
4. To copy data into the Eviews workfile from the Excel file on the a: drive click
Procs → Import → Read Text-Lotus-Excel
When the Open dialog box appears choose a:drive and then the file BFORET.
Now click the [Open] button.
5. When the Excel Spreadsheet Import dialog box shown below appears we now
a. Click By Series - series in rows
b. Enter the first cell which contains the values of the variables which is C125. You
can if you wish specify the worksheet in the workbook which contains the data.
c. Enter the names of the variables in the same order that they appear in the Excel
worksheet namely RET, BETA, BETA2 and VAR. Now click [OK].
Once you have imported the data from the Excel worksheet into the Eviews workfile
you will estimate the equation in the usual way. The output that you obtain is shown
on the following page. In this case the values have been rounded to 4 decimal
places.
Dependent Variable: RET
Method: Least Squares
Sample: 1 14
Included observations: 14
Variable             Coefficient   Std. Error   t-Statistic   Prob.
C                     0.0201       0.0062        3.2638       0.0085
BETA                 -0.0151       0.0140       -1.0765       0.3070
BETA2                 0.0116       0.0070        1.6450       0.1310
VAR                  -0.7614       0.2886       -2.6380       0.0248

R-squared             0.4746       Mean dependent var       0.0130
Adjusted R-squared    0.3170       S.D. dependent var       0.0063
S.E. of regression    0.0053       Akaike info criterion   -7.4352
Sum squared resid     0.0003       Schwarz criterion       -7.2526
Log likelihood        56.0465      F-statistic              3.0110
Durbin-Watson stat    2.5091       Prob(F-statistic)        0.0811
7.1.2 The Black-Jensen-Scholes Approach
Another approach to testing the CAPM was developed by Black, Jensen and
Scholes. With this approach we estimate the following model
Rjt - Rf = α0 + βj (RMt - Rf) + ujt
Here the dependent variable and the independent variable are the excess returns on
the asset itself (Rjt - Rf) and on the Market portfolio (RMt - Rf). To generate these
excess returns variables we subtract the returns from the riskfree asset from the
original returns. In the file BFORET these returns from the riskfree asset are stored in
column Q.
Once we have estimated this model we then test whether the value of α0 is
consistent with the value implied by the CAPM. Since the Security market Line is
E(Rj) = Rf + βj [E(RM) - Rf]
if we subtract Rf from both sides we have
E(Rj) - Rf = [E(RM) - Rf] βj
This implies that in our model we should have α0 = 0.
Most researchers have used this approach to see whether firms with some particular
characteristic have returns which are above those implied by the CAPM. For example
to test whether firm size affects returns instead of using the returns for just 1 share
we would use the returns from a portfolio of 100 shares of small firms. If the CAPM
accurately describes returns the size of the firms is irrelevant and we should have
α0 = 0. If on the other hand there is a size effect, with smaller firms giving higher
returns than the CAPM predicts, we will find that α0 will have a positive value.
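A minimal Python sketch of this test is given below. It assumes simulated returns
generated so that the CAPM holds (so α0 should be close to 0); the function name
bjs_alpha_test and all the numbers are hypothetical.

```python
import numpy as np

def bjs_alpha_test(r_asset, r_mkt, r_free):
    """Regress excess asset returns on excess market returns.

    Returns (alpha, beta, t statistic for H0: alpha = 0).
    """
    y = r_asset - r_free
    x = r_mkt - r_free
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef[0], coef[1], coef[0] / se[0]

# Simulated monthly data in which the CAPM holds by construction (illustration only).
rng = np.random.default_rng(3)
r_free = np.full(120, 0.004)
r_mkt = rng.normal(0.01, 0.05, size=120)
r_asset = r_free + 0.9 * (r_mkt - r_free) + rng.normal(0, 0.03, size=120)

alpha_hat, beta_hat, t_alpha = bjs_alpha_test(r_asset, r_mkt, r_free)
print(alpha_hat, beta_hat, t_alpha)
```

To look for a size effect, r_asset would be replaced by the returns on a portfolio of
small firms and a significantly positive t statistic for α0 would count against the
CAPM.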
7.2 The Predictability of Share Returns
One way of testing the validity of the CAPM is to estimate multiple regression models
in which returns are the dependent variable and there are other independent
variables besides the measure of systematic risk or beta. If the CAPM is correct the
impact of these other variables should be negligible. If they are found to have a
significant impact then this can be interpreted as evidence that the CAPM does not
provide a general explanation of how returns on assets are determined. Studies have
shown that including the following variables as independent variables seems to give
better forecasts but it should be noted that it has not been possible to develop
theoretical models which explain why these variables affect returns in these ways.
1. When the Market size or capitalization is used as an independent variable it is
often found that it has a coefficient with a significant negative value i.e. smaller
firms have higher returns. This relationship cannot be explained by the fact that
smaller firms have larger betas.
2. If we use the ratio of the market price (P) and the book value (B) for each share we
may find that the coefficient of the ratio (P/B) also has a significant negative value
i.e. firms with a low price to book ratio tend to have higher returns. It is possible
that this variable has a significant impact when beta does not for the following
reason. When there is a large fall in the price of a firm and its debt levels are
stable investors see the firm as being more highly leveraged and so more risky.
The OLS procedure for estimating beta gives an equal weight to more distant and
more recent values of returns and so may give a beta value which is too low. The
(P/B) ratio may have a significant impact because it is measuring the systematic
risk more accurately than beta is.
3. A third variable is the earnings rate (i.e. the ratio of the earnings and price (E/P)) or
its inverse the price earnings ratio (P/E). If the (P/E) ratio is used as an
independent variable we now find that the coefficient of the (P/E) ratio also has a
significant negative value i.e. firms with a low price to earnings ratio tend to have
higher returns. Some studies have used the Cash Flow (CF) rather than the
earnings and looked at the impact of the (CF/P) ratio.
4. A fourth variable is the momentum which is a measure of how well a stock has
performed in some recent period such as the last 6 months. This coefficient is
positive indicating that stocks which have performed well in the last 6 months will
perform well in the next 6 months.
The different variables namely the size, the E/P or CF/P ratios and the P/B ratio are
also related to each other as all are affected by the share price P. Also values of
these ratios are highly correlated over time. If we attempt to include all of these
variables in the one model we may have a problem with Multicollinearity. This means
that our estimates of the coefficients for these variables will be quite inaccurate as
the OLS procedure is unable to accurately estimate the impact of individual variables.
The recent studies show that the most reliable forecasts are obtained when we use
size and the P/B ratio as the independent variables.
An important feature of these studies is that the results often change when
different time periods are used. There are 2 key results which should be noted.
1. In most studies they found that there was a very strong relationship between
returns and these variables in the month of January but little or no relationship in
other months.
2. We can also use long term contrarian strategies to make excess returns as shares
that have achieved poor returns in the last 3 or 5 years will often achieve high
returns in the next 3 or 5 years. This means that long term performance or
momentum has a negative relationship with future performance while short term
momentum has a positive relationship with future performance.
7.3 Using the APT Model
7.3.1 Introduction
In the CAPM, returns are affected by a single factor namely the measure of
systematic risk. Many studies showed that when we look at the returns from different
types of firms the forecasts from the CAPM consistently underestimate the returns
from small firms or firms with a low market-to-book ratio. The Arbitrage Pricing
Theory model can be thought of as an attempt to construct a sound theoretical model
which takes into account further sources of risk over and above market risk which
make it possible to explain these so called anomalies i.e. why small firms and firms
with low market-to-book ratios have higher returns than the CAPM says they should
have.
The APT model also uses fewer assumptions than does the CAPM. The 3 key
assumptions are as follows.
1. As in any multifactor model the APT model assumes that the returns on any
asset (Ri) are dependent upon a limited number of factors. If there are K
factors then for asset i we can write this relationship in the following way
Ri = αi + βi1F1 + βi2F2 + …. + βiKFK + ui
In this model the returns depend upon a number of factors F1, F2, …. , FK.
These factors are usually interpreted as proxies for unexpected changes in
macroeconomic variables such as GDP, inflation, interest rates, oil prices and
the volatility of share prices. The values of these factors are usually scaled in such
a way that they all have 0 expected values. This means that the expected returns
E(Ri) should be the intercept term αi. For each asset i the impact of each factor j
or sensitivity to that factor is βij. The error term ui represents the risk which is
specific to this asset.
2. The second assumption is that in financial markets there are no arbitrage
opportunities. This means that if any assets or portfolios of assets have the same
risks they should also have the same returns as each other.
3. The third assumption is that there are a very large number of financial assets
and we can use these to construct portfolios for which the portfolio risk is not
affected by the risk which is specific to each asset.
Using these 3 assumptions it is possible to show that the expected return on the
financial asset can be written in the following way
E(Ri) = Rf + βi1λ1 + βi2λ2 + …. + βiKλK
The λj terms are the risk premiums for the separate factors. The equivalent term in
the SML is the single market risk premium [E(RM) - Rf].
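The APT pricing equation is easy to evaluate once the factor betas and risk premiums
are known. The Python sketch below uses hypothetical values for K = 3 factors; the
function name is my own.

```python
def apt_expected_return(rf, betas, lambdas):
    """E(Ri) = Rf + sum over j of beta_ij * lambda_j, for one asset."""
    if len(betas) != len(lambdas):
        raise ValueError("need one risk premium per factor beta")
    return rf + sum(b * lam for b, lam in zip(betas, lambdas))

# Hypothetical numbers: K = 3 factors, monthly riskfree rate of 0.5%.
rf = 0.005
betas = [1.2, -0.4, 0.8]
lambdas = [0.004, 0.002, 0.001]
print(apt_expected_return(rf, betas, lambdas))
```

With a single factor whose risk premium is [E(RM) - Rf] the same function reproduces
the SML.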
When we use the APT model we have to estimate the different factors that
are used in the model. There are 3 procedures which are used to estimate the factors
1. The statistical procedure uses factor analysis to obtain sets of values which have
similar covariances to the covariances between the returns on different assets.
These factors are similar to index numbers which summarize the values of several
variables into a single value. Unfortunately in most cases we are not able to
determine which economic variables are summarized by this single set of values.
2. The second approach is to calculate the unexpected changes in the
macroeconomic variables which affect the level of systematic or market risk.
The unexpected changes in macroeconomic variables such as GDP, inflation,
interest rates, oil prices and the volatility of share prices are found by calculating
the differences between the actual values and the forecast values of these
variables. There are many different ways of forecasting these values. In the
application discussed in the next section we will use a very simple forecast namely
the average of the past values. The 5 factors which have been shown to be most
closely related to returns on assets in the US are
a. Changes in the monthly growth rate of the GDP. This variable affects investors'
expectations about the future growth rate in GDP and corporate earnings.
b. Changes in the risk premium associated with a firm defaulting on bond
repayments. This is calculated by finding the difference between the returns on
bonds with a BAA rating and bonds with an AAA rating. As this increases
investors worry more about firms defaulting.
c. Changes in the difference between the yields on long term and short term
government bonds. These changes affect the discount rate which is used to
find the present value of future returns from financial assets.
d. Unexpected changes in the rate of inflation alter the future returns for many
firms.
e. Changes in the expected rate of inflation as measured by changes in the yield
on short term Treasury bills. These changes affect interest rates, consumer
confidence and government policy.
3. The third approach is to identify the characteristics of firms which have very high or
very low returns. These are then used to form portfolios of shares where the firms
possess this characteristic, e.g. the firms with a small size or the firms with a low P/E
ratio. The returns from these portfolios are treated as proxy factors. The reason for
using this approach is as follows. If there is a risk premium or extra return
associated with some characteristic such as the small size of a firm then the
portfolio of shares with this characteristic is usually very sensitive to the risk
associated with this characteristic.
7.3.2 Estimating and Testing the APT Model
In general when we are testing the APT model we test whether the data is consistent
with certain implications of the APT model. The key implications of this model are
1. The expected rate of return of any portfolio with zero betas is the riskfree rate of
return.
2. The expected returns on securities increase linearly with increases in a given
factor beta.
3. No characteristics of stocks other than the factor betas affect expected
returns.
The application that we will examine is based on question 9 in Chapter 2 of
Berndt. The data can be found in the file APT which contains 121 monthly
observations for the period from 1977:12 to 1987:12. There are values for the
following six variables in columns B to G.
B. CPI : The consumer price index.
C. POIL : The price of domestic crude oil.
D. FRBIND : The Federal Reserve Board index of industrial production.
E. DEC : Monthly returns on Digital Corporation shares.
F. MARKET : Monthly returns on the Market Portfolio.
G. RKFREE : The riskfree rate of return.
We will use the following procedure to estimate the APT model in this situation.
1. Our first task is to use the [Genr] function in EViews to calculate the discrete
growth rates for 3 of these variables. The new variables we must generate
are as follows
a. RINF or the rate of inflation is the rate of change in the price level
$$RINF_t = \frac{CPI_t - CPI_{t-1}}{CPI_{t-1}}$$
b. ROIL is the rate of increase in the real price of oil where the real price is the
ratio of the price of oil POIL over the general price level or CPI. We write this in
the following way
$$ROIL_t = \frac{\left(\dfrac{POIL_t}{CPI_t}\right) - \left(\dfrac{POIL_{t-1}}{CPI_{t-1}}\right)}{\left(\dfrac{POIL_{t-1}}{CPI_{t-1}}\right)}$$
c. GIND or the discrete growth rate in industrial production is given by
$$GIND_t = \frac{FRBIND_t - FRBIND_{t-1}}{FRBIND_{t-1}}$$
When you calculate discrete growth rates you lose an observation from the start of
the sample. This means we have 120 values of the growth rates from 1978:1 to
1987:12.
2. After calculating the growth rates we then calculate the forecasts of these growth
rates which we will use to find the unexpected changes in the growth rates. A
simple forecast is just the mean of the set of values. The difference between the
values of these growth rates and the mean of each set of values is called a
surprise variable. To find the unexpected change in the rate of inflation or
SURINF we click [Genr] and enter the following function in EViews
SURINF=RINF-@MEAN(RINF)
A similar procedure is used to estimate the other surprise variables SUGIND and
SUROIL.
3. Once we have generated these surprise variables we now estimate the APT model
for the particular share which in this case is Digital Corporation or DEC. In this
model we have
a. The risk premium for DEC shares as the dependent variable. To obtain this risk
premium we find the difference between the returns for DEC and the riskfree
rate of return.
b. We include four independent variables namely the market risk premium MRP or
difference between the return on the market index and the risk free rate of
return and the three surprise variables.
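The three steps above can also be sketched outside EViews. The following is only a minimal illustration in Python with NumPy, under stated assumptions: the series are randomly generated stand-ins (they are not the Berndt data in the APT file), and the helper names growth_rate and surprise are hypothetical names introduced here.

```python
import numpy as np

def growth_rate(x):
    """Discrete growth rate (x_t - x_{t-1}) / x_{t-1}; the first observation is lost."""
    x = np.asarray(x, dtype=float)
    return (x[1:] - x[:-1]) / x[:-1]

def surprise(g):
    """Surprise variable: the growth rate minus its sample mean (the simple forecast)."""
    return g - g.mean()

# Synthetic stand-ins for the 121 monthly observations (NOT the Berndt data).
rng = np.random.default_rng(0)
n = 121
cpi = 100 * np.cumprod(1 + rng.normal(0.005, 0.002, n))
poil = 10 * np.cumprod(1 + rng.normal(0.004, 0.03, n))
frbind = 80 * np.cumprod(1 + rng.normal(0.002, 0.01, n))
rkfree = np.full(n, 0.006)
market = rkfree + rng.normal(0.006, 0.05, n)
dec = rkfree + 0.8 * (market - rkfree) + rng.normal(0, 0.08, n)

# Step 1: growth rates (120 values each).
rinf = growth_rate(cpi)
roil = growth_rate(poil / cpi)   # growth in the real price of oil
gind = growth_rate(frbind)

# Step 2: surprise variables.
surinf, suroil, sugind = surprise(rinf), surprise(roil), surprise(gind)

# Step 3: regress the DEC risk premium on MRP and the three surprise variables.
decrp = (dec - rkfree)[1:]       # drop the first observation to align the samples
mrp = (market - rkfree)[1:]
X = np.column_stack([np.ones(n - 1), mrp, surinf, suroil, sugind])
coef, *_ = np.linalg.lstsq(X, decrp, rcond=None)
print(dict(zip(["C", "MRP", "SURINF", "SUROIL", "SUGIND"], coef.round(4))))
```

Because the data are simulated, the printed coefficients have no economic meaning; the sketch only shows how the 120 usable observations and the five regressors line up.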
The EViews output we obtain is shown below. This shows that as far as the DEC shares
are concerned the CAPM rather than the APT model is the appropriate way to model the
returns. The reasons are as follows
1. The p value for the intercept term shows that the intercept is not significantly
different from 0.
2. The p values for the 3 surprise variables show that none of these coefficients
is significantly different from 0, which is the same as saying that the unexpected
changes in these variables do not help us to explain the returns on DEC shares.
3. The p value for the Market Risk Premium shows that this coefficient is significantly
different from 0 so that the returns on DEC shares are mainly affected by Market
risk.
Dependent Variable: DECRP
Method: Least Squares
Sample(adjusted): 1978M01 1987M12
Included observations: 120 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             0.006914     0.007517      0.919743     0.3596
MRP           0.838438     0.111875      7.494414     0.0000
SURINF        0.659194     0.866928      0.760380     0.4486
SUROIL        0.059071     0.211734      0.278985     0.7808
SUGIND       -0.018290     0.074656     -0.244985     0.8069

R-squared             0.345182    Mean dependent var       0.012911
Adjusted R-squared    0.322405    S.D. dependent var       0.099470
S.E. of regression    0.081880    Akaike info criterion   -2.126347
Sum squared resid     0.771001    Schwarz criterion       -2.010202
Log likelihood        132.5808    F-statistic              15.15531
Durbin-Watson stat    2.144707    Prob(F-statistic)        0.000000
4. This evidence does not however provide very strong support for the CAPM as the
R² value of 0.345182 indicates that about 34.5% of the variation in returns is
systematic which means that about 65.5% is unsystematic or specific to DEC.
The estimated model that we would use to forecast the future excess returns of DEC
shares over and above the riskfree rate of return based on this output would be
Rit - Rf = 0.829871(RMt - Rf)
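As a rough check on how to read the DECRP regression output, the t-statistics and the adjusted R-squared can be recomputed from the reported coefficients and standard errors. The short Python sketch below copies the figures from that output; the variable names are illustrative only.

```python
# Figures copied from the regression output for DECRP.
coef = {"C": 0.006914, "MRP": 0.838438, "SURINF": 0.659194,
        "SUROIL": 0.059071, "SUGIND": -0.018290}
se = {"C": 0.007517, "MRP": 0.111875, "SURINF": 0.866928,
      "SUROIL": 0.211734, "SUGIND": 0.074656}

# The t-statistic for H0: coefficient = 0 is the coefficient over its standard error.
t_stats = {name: coef[name] / se[name] for name in coef}

# Adjusted R-squared from R-squared, n observations and k regressors
# (k excludes the intercept).
r2, n, k = 0.345182, 120, 4
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

for name, t in t_stats.items():
    print(f"{name:7s} t = {t:9.4f}")
print(f"adjusted R-squared = {adj_r2:.6f}")
```

Only MRP has a t-statistic that is large in absolute value, which is the basis for preferring the CAPM specification for DEC.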
7.4 Testing Market-Timing and Stock-Selection Performance
An important issue that many investors are concerned with is whether they should
invest in actively managed mutual funds. In these funds the portfolio manager
attempts to choose assets that will outperform the market. The alternative approach
is to choose a passive fund where the portfolio manager chooses assets which will
replicate the performance of the market. One way of deciding whether we should
choose an actively managed fund is to look at how well the portfolio managers in
these funds select shares and time when to buy and sell shares.
One model which is used to evaluate the performance of the portfolio managers in
actively managed funds is the Treynor-Mazuy model in which the excess returns on
the portfolio are treated as a quadratic function of the excess returns on the market.
(Excess returns are the difference between the actual returns and the riskfree rate of
return.) We write this model in the following way
$$(R_{Pt} - R_f) = \alpha_P + \beta_1 (R_{Mt} - R_f) + \beta_2 (R_{Mt} - R_f)^2 + u_t$$
where
(RPt - Rf) = the excess returns on portfolio P
(RMt - Rf) = the excess returns on the market portfolio
αP = estimated measure of selectivity performance
β1 = the traditional measure of systematic risk or beta
β2 = estimated measure of market-timing performance
The coefficient of the squared excess return on the market portfolio β2 is seen as a
measure of market-timing performance for the following reason. The derivative or
slope of this function w.r.t. the excess returns on the market portfolio shows how unit
changes in the excess returns on the market portfolio affect the excess returns on
portfolio P. From the formula for the derivative
$$\frac{d(R_{Pt} - R_f)}{d(R_{Mt} - R_f)} = \beta_1 + 2\beta_2 (R_{Mt} - R_f)$$
we see the following
1. If β2 is 0 then the impact of a unit change in the excess returns on the market
portfolio is just the traditional measure of systematic risk β1.
2. If however β2 has a positive value then the way in which returns from the market
portfolio are moving affects the size of impact on the returns from portfolio P from
a unit change in the excess returns from the market portfolio. We now find that
a. When the market is moving up so there are positive changes in the excess
returns on the market portfolio then the size of the impact on the excess returns
from portfolio P will be larger than it would have been under the CAPM where it
is β1. In other words as long as β2 has a positive value we can conclude that
portfolio managers are timing their trading in stocks in such a way that the
greater the excess returns from the market portfolio the greater the impact on
the returns from portfolio P.
b. It also means that when the market is performing poorly and there is a negative
excess return from the market portfolio then, because of the portfolio managers'
ability to time the trading of shares, the size of the negative impact on the
excess returns from portfolio P will be smaller than it would have been
according to the CAPM.
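To make the role of β2 concrete, the derivative can be evaluated at a few values of the market excess return. The sketch below is in Python; the values of β1 and β2 are hypothetical and chosen only for illustration, and the function name market_sensitivity is introduced here.

```python
def market_sensitivity(beta1, beta2, market_excess):
    """Slope of the Treynor-Mazuy function: beta1 + 2 * beta2 * (RM - Rf)."""
    return beta1 + 2 * beta2 * market_excess

beta1, beta2 = 0.9, 1.5   # hypothetical values, not estimates from any data set
for rm_excess in (-0.05, 0.0, 0.05):
    slope = market_sensitivity(beta1, beta2, rm_excess)
    print(f"market excess return {rm_excess:+.2f} -> sensitivity {slope:.2f}")
```

With these values the sensitivity is 0.75 when the market excess return is -5%, 0.90 when it is zero and 1.05 when it is +5%, so a positive β2 raises exposure in rising markets and lowers it in falling markets.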
Because of the importance of this issue there have been many studies which
looked at the performances of different types of actively managed portfolios. If you
want to perform this type of study using the basic Treynor-Mazuy model you will use
the following procedure
1. Obtain a set of values for the market returns, the riskfree rate of return and the
returns on a particular portfolio P. Use these values to calculate the excess
returns for the market portfolio and the actively managed portfolio P.
2. Estimate the quadratic equation
$$(R_{Pt} - R_f) = \alpha_P + \beta_1 (R_{Mt} - R_f) + \beta_2 (R_{Mt} - R_f)^2 + u_t$$
and use the estimated value of β2 to choose between the following hypotheses
H0 : β2 = 0    There is no evidence of being able to time the market
H1 : β2 > 0    There is evidence of being able to time the market
3. Finally you will use the estimated value of αP to choose between the following
hypotheses
H0 : αP = 0    There is no evidence of being able to select superior shares
H1 : αP ≠ 0    There is evidence that portfolio managers selected superior or inferior shares
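The procedure above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions: the returns are simulated with a known timing coefficient, the function name treynor_mazuy is hypothetical, and large-sample normal critical values (1.645 one-sided, 1.96 two-sided) stand in for exact t critical values.

```python
import numpy as np

def treynor_mazuy(port_excess, mkt_excess):
    """OLS of portfolio excess returns on a constant, the market excess
    return and its square. Returns coefficients, standard errors, t-stats."""
    y = np.asarray(port_excess, dtype=float)
    x = np.asarray(mkt_excess, dtype=float)
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    coef = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, se, coef / se

# Simulated weekly excess returns with a known timing coefficient (illustrative only).
rng = np.random.default_rng(1)
mkt = rng.normal(0.002, 0.02, 520)
port = 0.0 + 0.9 * mkt + 2.0 * mkt ** 2 + rng.normal(0, 0.005, 520)

coef, se, t = treynor_mazuy(port, mkt)
alpha_p, beta1, beta2 = coef

# H0: beta2 = 0 against H1: beta2 > 0 (one-sided, roughly the 5% level).
timing_evidence = t[2] > 1.645
# H0: alpha_P = 0 against H1: alpha_P != 0 (two-sided, roughly the 5% level).
selection_evidence = abs(t[0]) > 1.96
print(coef.round(4), t.round(2), timing_evidence, selection_evidence)
```

The one-sided rule for β2 and the two-sided rule for αP mirror the two pairs of hypotheses in steps 2 and 3.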
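For example consider the file managed_fund.wf1 which contains the weekly exit price
for an Australian Equity Fund. The estimation results are presented below. The
intercept αP is negative and significantly different from 0 (p value 0.0003), suggesting
that this manager's trading activities do not add to portfolio wealth, while the
market-timing coefficient β2 is positive but not significantly different from 0
(p value 0.3517).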
Dependent Variable: RFUND-IRATE
Method: Least Squares
Date: 04/23/07 Time: 16:12
Sample(adjusted): 3/03/1995 11/02/2005
Included observations: 520 after adjusting endpoints

Variable          Coefficient   Std. Error   t-Statistic   Prob.
C                  -0.010721     0.002970     -3.609755     0.0003
RMKT-IRATE          0.868749     0.108945      7.974170     0.0000
(RMKT-IRATE)^2      0.892953     0.957923      0.932176     0.3517

R-squared             0.645532    Mean dependent var      -0.054572
Adjusted R-squared    0.644160    S.D. dependent var       0.017727
S.E. of regression    0.010575    Akaike info criterion   -6.254990
Sum squared resid     0.057811    Schwarz criterion       -6.230448
Log likelihood        1629.297    F-statistic              470.7611
Durbin-Watson stat    2.446525    Prob(F-statistic)        0.000000
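The Prob. column in the output reports two-sided p values, whereas the market-timing hypothesis H1 : β2 > 0 is one-sided. The Python sketch below recomputes the test for the squared term from the reported coefficient and standard error. It assumes that with 520 observations the t distribution can be approximated by the standard normal, so the p values differ slightly from the EViews figures.

```python
import math

def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Figures copied from the output for (RMKT-IRATE)^2.
beta2_hat, se_beta2 = 0.892953, 0.957923
t_beta2 = beta2_hat / se_beta2

# Normal approximation to the t distribution (an assumption, see the lead-in).
p_two_sided = 2 * normal_sf(abs(t_beta2))
p_one_sided = normal_sf(t_beta2)   # H1: beta2 > 0

print(f"t = {t_beta2:.4f}, two-sided p = {p_two_sided:.4f}, "
      f"one-sided p = {p_one_sided:.4f}")
```

Even on the one-sided test the p value is well above 0.05, so H0 : β2 = 0 is not rejected for this fund.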
More generally, various studies have produced quite different results. One reason
why we can obtain different results with different data is that our model is not
correctly specified.
1. A possible reason why it may not be correctly specified is that many studies use
the Standard and Poor's 500 Index as the proxy for the market portfolio but
portfolio managers often include shares of companies that are not among these
500 firms.
2. Another reason is that portfolio managers can also include other financial assets
such as government bonds and corporate bonds in a portfolio.
The way to allow for this is to include two extra independent variables in our model.
The first of these is the excess returns on a second share index called the Wilshire 4500
Index which uses shares from 4500 firms other than those in the Standard and
Poor's 500 Index. The second of these is the excess returns on a bond index which
summarizes the prices of government and corporate bonds. In the US one index
which does this is the Shearson Lehman Government Corporate Index. If we call the
excess returns on these two indices rW and rB then the model we will now estimate is
the following one.
$$(R_{Pt} - R_f) = \alpha_P + \beta_1 (R_{Mt} - R_f) + \beta_2 (R_{Mt} - R_f)^2 + \beta_3 r_W + \beta_4 r_B + u_t$$
This should give us more accurate estimates of β2 and αP and this will give us a
better idea of whether the portfolio managers have been able to time the market and
to pick stocks with higher returns.
APPENDIX – DERIVATION OF THE MATRIX LEAST SQUARES EQUATION
As for the simple linear regression case, we calculate the residuals then square and
sum them to get the sum of squared residuals. Let the vector of estimated residuals
be denoted by e. This is an n×1 matrix similar to the y and u vectors defined earlier.
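$$
\begin{aligned}
SSE = \sum_{i=1}^{n} e_i^2 = e'e &= (y - \hat{y})'(y - \hat{y}) \\
&= (y - X\hat{\beta})'(y - X\hat{\beta}) \\
&= (y' - \hat{\beta}'X')(y - X\hat{\beta}) \\
&= y'y - y'X\hat{\beta} - \hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta}.
\end{aligned}
$$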
Now similarly to the simple linear regression case we take the derivative with respect
to β and set this equal to zero to find the minimum of the sum of squared error terms.
$$\frac{\partial SSE}{\partial \hat{\beta}} = 0 - X'y - X'y + (X'X + (X'X)')\hat{\beta} = 0 \;\Rightarrow\; 2X'X\hat{\beta} = 2X'y$$
Dividing both sides by 2, then premultiplying by $(X'X)^{-1}$ gives,
$$\hat{\beta} = (X'X)^{-1}X'y.$$
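The result can be checked numerically. The Python sketch below uses NumPy with made-up data (the sample size, regressors and coefficients are all hypothetical): it solves the normal equations, confirms that the gradient of the sum of squared residuals is zero at the solution, and compares the answer with NumPy's own least squares routine.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 50, 3                                   # hypothetical sample size and regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 0.5, -2.0, 0.25])   # made-up coefficients
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Solve the normal equations X'X b = X'y. This avoids forming the inverse
# explicitly but is algebraically the same estimator as (X'X)^{-1} X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# First-order condition: the gradient -2X'y + 2X'Xb is zero at beta_hat,
# which is equivalent to X'e = 0 for the residual vector e.
e = y - X @ beta_hat
gradient = -2 * X.T @ y + 2 * X.T @ X @ beta_hat

# Cross-check against NumPy's least squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat.round(4))
```

Solving the linear system rather than inverting X'X is a numerical design choice; the estimator is the one derived above.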
For those interested in the details of the matrix differentiation, rules can be found in
Lütkepohl H. [1991] Introduction to Multiple Time Series Analysis, Springer-Verlag, Germany, Appendix A and
Magnus J.R. and Neudecker H. [1998] Matrix Differential Calculus with
Applications in Statistics and Econometrics, Wiley, New York.