Statistical Techniques II
Matrix Algebra Handout (Part 1) Appendix 4 Supplemental
Page 165

A. MATRIX STRUCTURE AND NOTATION
1) A matrix is a rectangular arrangement of numbers. The matrix is usually denoted by a capital letter.

   A = | 1  3 |        D = | 4  2  4 |
       | 7  9 |            | 1  6  0 |
                           | 3  0  5 |
                           | 2  3  0 |
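Matrices like these can be entered directly in a matrix package. The sketch below uses Python with the NumPy library (the software choice is an assumption here; the handout itself uses none):

```python
import numpy as np

# the matrices A and D from the handout
A = np.array([[1, 3],
              [7, 9]])
D = np.array([[4, 2, 4],
              [1, 6, 0],
              [3, 0, 5],
              [2, 3, 0]])

print(A.shape)   # (2, 2): A is 2 by 2
print(D.shape)   # (4, 3): D is 4 by 3

# NumPy indexes from 0, so the handout's a21 (row 2, column 1) is A[1, 0]
print(A[1, 0])   # 7
print(D[1, 1])   # 6, the handout's d22
```

Note the shift between the handout's 1-based subscripts and NumPy's 0-based indexing.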
2) The dimensions of a matrix are given by the number of rows and columns in the matrix (i.e. the dimensions are r by c). For the matrices above,
   A is 2 by 2
   D is 4 by 3
3) The individual elements of a matrix can be referred to by specifying the row and column in which each occurs. Lower case letters are used to represent individual elements, and should match the upper case letter used to denote the matrix. For example, individual elements from matrices A and D above can be referred to as,
   a11 = 1
   a21 = 7
   d22 = 6
   d12 = 2

B. TYPES OF MATRICES
1) Square matrix - the number of rows and columns are equal. Matrix A above is a square matrix (2 by 2); matrix D is not (4 by 3). A symmetric matrix is an important variation of the square matrix. In a symmetric matrix, the value in position "ij" equals the value in position "ji" (where i ≠ j). For example, if c31 = 5 then c13 is also 5.
2) Scalar - a single number can be thought of as a 1 by 1 matrix and is called a scalar.
3) Vector - a single column or single row of numbers is called a vector. The dimensions of a row vector are (1 by c), where "c" is the number of columns, and the dimensions of a column vector are (r by 1), where "r" is the number of rows.
4) Identity matrix - this special square matrix consists of all ones on the main diagonal, or principal diagonal, and zeros in all the off diagonal positions. The following are examples of identity matrices,

       | 1  0  0 |        | 1  0  0  0 |
   E = | 0  1  0 |    F = | 0  1  0  0 |
       | 0  0  1 |        | 0  0  1  0 |
                          | 0  0  0  1 |

The diagonal matrix is a generalization of the identity matrix. A diagonal matrix can have any value on the main diagonal, but also has zeros in the off diagonal positions.

James P. Geaghan - Copyright 2011
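The identity and diagonal matrices above can be generated directly; a short NumPy sketch (the diagonal values chosen below are hypothetical, for illustration only):

```python
import numpy as np

E = np.eye(3)                 # 3 by 3 identity matrix, like E above
F = np.eye(4)                 # 4 by 4 identity matrix, like F above
G = np.diag([4.0, 2.0, 7.0])  # a diagonal matrix: arbitrary values on the
                              # main diagonal, zeros in all off-diagonal positions

# multiplying by a conformable identity matrix leaves a matrix unchanged
A = np.array([[1.0, 3.0],
              [7.0, 9.0]])
print(np.allclose(np.eye(2) @ A, A))   # True
```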
C. MATRIX TRANSPOSE
The transpose of a matrix consists of a new matrix such that the rows of the original matrix become the columns of the transpose matrix. The transpose matrix is denoted with the same letter as the original matrix followed by a prime (e.g. the transpose of X is X′).

       | 4  2  4 |              | 4  1  3  2 |
   D = | 1  6  0 |         D′ = | 2  6  0  3 |
       | 3  0  5 |              | 4  0  5  0 |
       | 2  3  0 |
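The transpose is a single operation in NumPy; a sketch under the same software assumption as the earlier examples:

```python
import numpy as np

D = np.array([[4, 2, 4],
              [1, 6, 0],
              [3, 0, 5],
              [2, 3, 0]])

Dt = D.T            # rows of D become the columns of D'
print(Dt.shape)     # (3, 4): the 4 by 3 matrix becomes 3 by 4
print(Dt)
```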
D. MATRIX ADDITION AND SUBTRACTION
Matrices to be added or subtracted must be of the same dimensions. Each element of the first matrix (aij) is added to (or subtracted from) the corresponding element of the second matrix (bij).

       | 1  2 |        |  1  2 |            | 1+1  2+2 |   | 2  4 |
   A = | 3  4 |    B = |  1  4 |    A + B = | 3+1  4+4 | = | 4  8 |
       | 9  0 |        | -4  4 |            | 9-4  0+4 |   | 5  4 |
E. MATRIX MULTIPLICATION
Multiplication by a scalar - in this type of multiplication each element of the matrix is simply multiplied, element by element, by the scalar value.

       | 1  2 |                          | 1  2 |   |  7  14 |
   A = | 3  4 |    B = [7]    A*B = 7 * | 3  4 | = | 21  28 |
       | 9  0 |                          | 9  0 |   | 63   0 |
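Both operations above are elementwise in NumPy, so the addition and scalar multiplication examples can be checked numerically:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [9, 0]])
B = np.array([[ 1, 2],
              [ 1, 4],
              [-4, 4]])

print(A + B)   # element-by-element sum; the matrices must have the same dimensions
print(7 * A)   # scalar multiplication: every element is multiplied by 7
```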
Element by element multiplication - matrix multiplication is not usually done by matching each ijth element of one matrix with the corresponding ijth element of the second matrix. This is called elementwise multiplication; it is not the normal mode of matrix multiplication and should not be used unless specifically requested.

The standard method of matrix multiplication requires that the number of columns in the first matrix equal the number of rows in the second matrix. If the first matrix is (r1 by c1) and the second is (r2 by c2), then in order to multiply the matrices, c1 must equal r2. The resulting matrix will have the dimensions (r1 by c2). Multiplication is accomplished by summing the cross products of each row of the first matrix with each
column of the second matrix.

        | 1  2 |
   A =  | 3  4 |        X = | 1  2 |
        | 9  0 |            | 3  4 |

Since A is 3 rows by 2 columns, and X is 2 by 2, the number of columns of the first matrix equals the number of rows of the second matrix, and the matrices may be multiplied.

          | 1  2 |               | (1*1)+(2*3)  (1*2)+(2*4) |   |  7  10 |
   A*X =  | 3  4 | * | 1  2 |  = | (3*1)+(4*3)  (3*2)+(4*4) | = | 15  22 |
          | 9  0 |   | 3  4 |    | (9*1)+(0*3)  (9*2)+(0*4) |   |  9  18 |
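The row-by-column multiplication can be verified with NumPy's `@` (matrix product) operator:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [9, 0]])    # 3 by 2
X = np.array([[1, 2],
              [3, 4]])    # 2 by 2

# columns of A (2) equal rows of X (2), so the product exists and is 3 by 2
P = A @ X
print(P)       # rows: [7 10], [15 22], [9 18]

# X @ A would raise an error: X has 2 columns but A has 3 rows
```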
The new dimensions for the product A * X are given by the outer dimensions:

   (3 by 2) * (2 by 2) = (3 by 2)

where the inner dimensions (here 2 and 2) must be equal. Note that though we can multiply A * X, we could not have done the multiplication the other way (i.e. X * A), since the dimensions would not have matched (2 by 2 times 3 by 2). That is, we could premultiply by A, but could not premultiply by X.

F. SIMPLE MATRIX INVERSION (2 by 2 matrix only)
Matrices are not "divided", but may be inverted. Instead of "dividing" A by B, one would multiply A by the inverse of B. The inverse of a (2 by 2) matrix is given by,

   A = | a  b |        A⁻¹ =       1        |  d  -b |
       | c  d |               -----------   | -c   a |
                              a*d - b*c

The scalar value resulting from the calculation (a*d) - (b*c) is called the determinant. The matrix cannot be inverted unless the inverse of the determinant exists (is defined). It will not exist in a case such as the one below, since 1/0 is not defined.

   A = | 1  4 |        Determinant of A = (1*8) - (2*4) = 0
       | 2  8 |

This occurs in regression when two variables are linearly related.
An example of the inversion of a 2 by 2 matrix is given below.

   B = | 2  3 |    B⁻¹ =       1        |  4  -3 |  =  1 |  4  -3 |  =  |  0.8  -0.6 |
       | 1  4 |           (2*4)-(1*3)   | -1   2 |     5 | -1   2 |     | -0.2   0.4 |
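The inversion above can be checked numerically; this sketch applies the 2 by 2 formula directly and also shows the determinant test for the singular matrix A:

```python
import numpy as np

B = np.array([[2.0, 3.0],
              [1.0, 4.0]])

det = B[0, 0] * B[1, 1] - B[0, 1] * B[1, 0]   # (a*d) - (b*c) = 5
Binv = (1 / det) * np.array([[ B[1, 1], -B[0, 1]],
                             [-B[1, 0],  B[0, 0]]])
print(Binv)        # [[ 0.8 -0.6], [-0.2  0.4]]
print(B @ Binv)    # the 2 by 2 identity matrix (to rounding error)

# a matrix with determinant 0 has no inverse
A = np.array([[1.0, 4.0],
              [2.0, 8.0]])
print(np.linalg.det(A))   # approximately 0: A cannot be inverted
```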
Note that a matrix times its inverse (i.e. B*B⁻¹) results in an identity matrix. By definition, the inverse of a matrix G is a matrix which when multiplied by G produces an identity matrix, or G*G⁻¹ = I.

G. SIMPLE LINEAR REGRESSION
Solving a simple linear regression with matrices requires the same values used for an algebraic solution from summation notation formulas. These are;

   n,  ΣXi,  ΣYi,  ΣXi²,  ΣYi²,  ΣXiYi      (sums over i = 1 to n)

where n is the size of the sample of data. To obtain these values in the matrix form we start with the matrix equivalent of the individual values of X and Y, the raw data matrices.

       | 1  X1 |         | Y1 |
       | 1  X2 |         | Y2 |
       | 1  X3 |         | Y3 |
   X = | 1  X4 |     Y = | Y4 |
       | 1  X5 |         | Y5 |
       | 1  X6 |         | Y6 |
       | 1  X7 |         | Y7 |
The column of ones is necessary, and represents the intercept. Omitting this column would force the regression through the origin. The next step in the calculations is to obtain the X′X, X′Y and Y′Y matrices. These calculations provide the sums of squares and cross products.
   X′X = | 1   1   1   1   1   1   1  | * | 1  X1 |  =  |  n     ΣXi  |
         | X1  X2  X3  X4  X5  X6  X7 |   | 1  X2 |     | ΣXi    ΣXi² |
                                          | .   .  |
                                          | 1  X7 |

   X′Y = | 1   1   1   1   1   1   1  | * | Y1 |  =  |  ΣYi   |
         | X1  X2  X3  X4  X5  X6  X7 |   | Y2 |     | ΣXiYi  |
                                          | .  |
                                          | Y7 |

   Y′Y = | Y1  Y2  Y3  Y4  Y5  Y6  Y7 | * | Y1 |  =  [ ΣYi² ]
                                          | Y2 |
                                          | .  |
                                          | Y7 |
The regression coefficients, b0 and b1, are then given by B = (X′X)⁻¹X′Y, where

   (X′X)⁻¹ =          1           |  ΣXi²  -ΣXi |
             ------------------   | -ΣXi     n  |
             n*ΣXi² - (ΣXi)²

and since

   Determinant of X′X = n*ΣXi² - (ΣXi)² = n*Sxx

where Sxx is the corrected sum of squares of X. Then

   (X′X)⁻¹ = |  ΣXi²/(n*Sxx)   -ΣXi/(n*Sxx) |
             | -ΣXi/(n*Sxx)      n/(n*Sxx)  |
and the regression coefficients can be obtained by,

   B = (X′X)⁻¹X′Y = |  ΣXi²/(n*Sxx)   -ΣXi/(n*Sxx) | * |  ΣYi  |
                    | -ΣXi/(n*Sxx)      n/(n*Sxx)  |   | ΣXiYi |

     = | (ΣXi²*ΣYi - ΣXi*ΣXiYi)/(n*Sxx) |  =  | Ybar - b1*Xbar |  =  | b0 |
       | (n*ΣXiYi - ΣXi*ΣYi)/(n*Sxx)    |     |    Sxy/Sxx     |     | b1 |
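The matrix solution and the familiar summation formulas can be compared on a small data set. The x and y values below are hypothetical, invented only for illustration:

```python
import numpy as np

# hypothetical data, not taken from the handout
x = np.array([1.0, 2, 3, 4, 5, 6, 7])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.2])

X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
XtX = X.T @ X          # [[n, sum x], [sum x, sum x^2]]
XtY = X.T @ y          # [sum y, sum xy]
b = np.linalg.solve(XtX, XtY)   # [b0, b1] from the normal equations

# cross-check against the summation notation formulas
n = len(x)
Sxx = (x**2).sum() - x.sum()**2 / n
Sxy = (x * y).sum() - x.sum() * y.sum() / n
b1 = Sxy / Sxx
b0 = y.mean() - b1 * x.mean()
print(b, b0, b1)   # both approaches give the same coefficients
```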
The remaining calculation usually needed to complete the set of calculations for the simple linear regression is the sum of squared deviations, or error term. The matrix formula is

   SSE = Y′Y - B′X′Y = ΣYi² - [b0  b1] |  ΣYi  |  = ΣYi² - (b0*ΣYi + b1*ΣXiYi) = UCSS Total - UCSS Reg
                                       | ΣXiYi |

These calculations produce the same algebraic equations for b0, b1, and SSE that are given in most statistics texts. The advantage of using the matrix version of the formulas is that the matrix equations given above will work equally well for multiple regression with two or more independent variables.
The ANOVA table calculated with matrix formulas is

                   Uncorrected                     Corrected
   Source          d.f.   Sum of Squares           d.f.   Sum of Squares
   Regression       2     B′X′Y                     1     B′X′Y - CF
   Error           n-2    Y′Y - B′X′Y              n-2    Y′Y - B′X′Y
   Total            n     Y′Y                      n-1    Y′Y - CF

where the correction factor is calculated as usual, CF = (ΣYi)²/n = n*Ybar².

The value for R² is calculated as

   R² = SSRegression/SSTotal = (B′X′Y - CF)/(Y′Y - CF)

and is often expressed as a percent. Note that this calculation employs corrected sums of squares for both SSRegression and SSTotal.

The Mean Squares (MS) for the SSRegression and SSError are calculated by dividing the SS
(corrected sums of squares) by their d.f. (degrees of freedom). The test of the hypothesis [H0: β1 = 0] is then calculated as;

   F = MSRegression/MSError = [(B′X′Y - CF)/dfReg] / [(Y′Y - B′X′Y)/dfError]

or
   t = (b1 - 0)/Sb1 = √F

where Sb1 is obtained from the VARIANCE-COVARIANCE matrix.
The VARIANCE-COVARIANCE matrix is calculated from the (X′X)⁻¹ matrix,

   (X′X)⁻¹ = | c00  c01 |
             | c10  c11 |

where the cij values are called Gaussian multipliers. The VARIANCE-COVARIANCE matrix is then calculated from this matrix by multiplying by the MSError.

   MSE*(X′X)⁻¹ = | MSE*c00  MSE*c01 |
                 | MSE*c10  MSE*c11 |

The individual values then provide the variances and covariances such that
   MSE*c00 = Variance of b0 = VAR(b0)
   MSE*c11 = Variance of b1 = VAR(b1), so Sb1 = √(MSE*c11)
   MSE*c01 = MSE*c10 = Covariance of b0 and b1 = COV(b0,b1)
It is important to note that the variances and covariances calculated from the (X′X)⁻¹ are for the bi (the parameter estimates), not for the Xi values. Also, COV(b0,b1) ≠ COV(X0,X1).

REFERENCE: Goodnight, J. H. 1978. The Sweep Operator: Its Importance in Statistical Computing. In: Proc. Eleventh Annual Symposium on the INTERFACE, Gallant, A. R. and Gerig, T. M. (eds.), Inst. of Statistics, N.C. State University, Raleigh, N.C.
Application of matrix procedures to multiple regression first requires calculation of the X′X, X′Y and Y′Y matrices for the dependent variable Y and the independent variables (here X1, X2 and X3). For a 3 factor multiple regression, these matrices are;
   X′X = |  n      ΣX1i      ΣX2i      ΣX3i    |      X′Y = |  ΣYi    |      Y′Y = [ ΣYi² ]
         | ΣX1i    ΣX1i²     ΣX1iX2i   ΣX1iX3i |            | ΣX1iYi |
         | ΣX2i    ΣX1iX2i   ΣX2i²     ΣX2iX3i |            | ΣX2iYi |
         | ΣX3i    ΣX1iX3i   ΣX2iX3i   ΣX3i²   |            | ΣX3iYi |

(all sums taken over i = 1 to n)

As with the simple linear regression, these sums, sums of squares and cross products are required by
i=1 i=1 i=1 As with the simple linear regression, these sums, sums of squares and cross products are required by
any method of fitting multiple regression. Once these values are obtained, application of formulas for
an algebraic solution is relatively easy for a twofactor model. However, matrix procedures are more
easily expanded to more than two independent variables than are summation notation formulas.
The inversion technique we will use is called the sweepout technique, and it requires the application of
“row operations”. Row operations consist of (1) multiplying any row by a scalar value, and (2)
adding or subtracting any row from any other row. These are the only operations required to
complete the sweepout technique after the matrices have been obtained and augmented.
Obtaining a maximum of information from the technique requires reducing the XX matrix one column
at a time to an identity matrix. However, values of the regression coefficients, error sum of squares and
inverse matrix will be correct even of the row operations are not applied in a column by column
reduction.
By "sweeping" out each column of the X′X matrix one by one to obtain an identity matrix, the sequentially adjusted sums of squares error can also be obtained. This requires augmenting the X′X matrix with the X′Y matrix and an identity matrix prior to applying the row operations. The complete augmented matrix is given below. The matrix has separate sections that are recognizable as matrices seen earlier. This type of sectioned matrix is called a partitioned matrix.

   | X′X   X′Y   I |     row operations     | I    B     (X′X)⁻¹ |
   | X′Y′  Y′Y   0 |    --------------->    | 0   SSE     -B′    |
Sections of the matrix may be left off if less information is required. For example, if only the regression coefficients are needed, then the sweepout technique need be applied only to the matrix,

   | X′X   X′Y |     row operations     | I   B |

and if only the inverse is required, the only matrix needed is

   | X′X   I |     row operations     | I   (X′X)⁻¹ |

The regression coefficients and sum of squares error can be obtained by sweeping out the matrix,

   | X′X   X′Y |     row operations     | I    B  |
   | X′Y′  Y′Y |    --------------->    | 0   SSE |
If the above matrix is swept out column by column, then it will also provide the sequentially adjusted sums of squares. Only the use of the complete augmented matrix provides the inverted X′X matrix necessary to obtain the variance-covariance matrix, confidence limits and other types of sums of squares.
The technique will be illustrated with an example using data from Snedecor and Cochran (1981; ex. 17.2.1). The example will employ the complete augmented matrix. The original data matrices are;
   X′X = |  17      188.2     700   |      X′Y = |  1295   |      Y′Y = [ 103075 ]
         | 188.2   3602.78   8585.1 |            | 16203.8 |
         |  700    8585.1   31712   |            | 54081   |

The augmented matrix to be swept is then,

   |   17      188.2     700      1295     1  0  0 |
   |  188.2   3602.78   8585.1   16203.8   0  1  0 |
   |  700     8585.1   31712     54081     0  0  1 |
   | 1295    16203.8   54081    103075     0  0  0 |

The first step in the sweepout technique is to multiply through the first row by the inverse of 17. This
will result in a value of 1 in the first row, first column. A multiple of this new first row is then subtracted from each of the other rows (2, 3 and 4). The multiplier should be such that value(i,1) - [value(1,1)*multiplier] = 0 for i ≠ 1.
The multiplier which accomplishes this is simply value(i,1), since the new value(1,1) is unity (1). Therefore, every value(i,j) will be processed in the same way. The calculations would be,
   for row 2: value(2,j) - (value(1,j) * 188.2)
   for row 3: value(3,j) - (value(1,j) * 700)
   for row 4: value(4,j) - (value(1,j) * 1295)
After applying these transformations we obtain the following matrix,

COLUMN 1 SWEEP

   | 1     11.0706    41.1765    76.1765     0.05882  0  0 |
   | 0   1519.30     835.688   1867.39     -11.0706   1  0 |
   | 0    835.688   2888.47     757.471    -41.1765   0  1 |
   | 0   1867.39     757.471   4426.47     -76.1765   0  0 |

At this point the effect of X0 (the intercept) has been removed from the model. The value replacing Y′Y is 4426.47. This is the corrected sum of squares of Y (i.e. ΣY² was 103075, and has now been corrected for the mean, yielding 4426.47).
The sweepout now proceeds to the second column. A value of 1 is needed in the second column, second row, to proceed with the development of the identity matrix. This is obtained by multiplying through the second row by the inverse of the value presently in that position (i.e. 1519.30). Then, the appropriate multiple of the new row 2 is subtracted from each of the other rows. Note that the first column remains unchanged since the value subtracted is always a multiple of zero.

COLUMN 2 SWEEP

   | 1  0    35.08709    62.5694     0.13949  -0.00729  0 |
   | 0  1     0.550050    1.22911   -0.00729   0.00066  0 |
   | 0  0  2428.800    -269.686    -35.0871   -0.55005  1 |
   | 0  0  -269.686    2131.236    -62.5694   -1.22911  0 |

The sweep then proceeds with the third column. Once again a value of 1 is required in row 3, column 3, and all rows other than row 3 will have a multiple of row 3 subtracted from them.
COLUMN 3 SWEEP

   | 1  0  0    66.4654      0.646369   0.000660  -0.014446 |
   | 0  1  0     1.29019     0.000660   0.000783  -0.000226 |
   | 0  0  1    -0.11104    -0.014446  -0.000226   0.000412 |
   | 0  0  0  2101.291     -66.46541  -1.290191    0.11104  |

Once this swept out matrix has been obtained, the most commonly desired calculations follow easily. Some of these results are discussed below.
There are also several checks which can be done on the calculations. As the matrix is swept out, the null matrix (the matrix of zeroes in the original augmented matrix) is replaced by the negative values of the regression coefficients if the calculations have been done correctly. As a second check, the product of the original X′X matrix and its inverse should produce an identity matrix (i.e. X′X * (X′X)⁻¹ = I).

REGRESSION COEFFICIENTS
The regression coefficients are produced during the sweepout, replacing the X′Y matrix. The model for the analysis above is,

   Ŷ = b0 + b1X1i + b2X2i
   Ŷ = 66.4654 + 1.2902X1i - 0.1110X2i

SEQUENTIALLY ADJUSTED SUMS OF SQUARES
As each column is swept out, the sums of squares are "adjusted" for the factor removed. The first sweep adjusts for the intercept (the column of ones, whose sum of squares is n) on the diagonal of X′X, so the reduction in the Y′Y term is the correction factor, or the adjustment for the mean.
   e.g. C.F. = 103075 - 4426.470 = 98648.530
The second sweep adjusts for the second term in the X matrix, usually X1, and the reduction in the error term is the sum of squares attributable to X1 (given that X0 is already in the model).
   e.g. SS(X1|X0) = 4426.470 - 2131.236 = 2295.234
The third sweep adjusts for X2, and the reduction in the sum of squares is attributable to X2 (given that X0 and X1 are already in the model).
   e.g. SS(X2|X0, X1) = 2131.236 - 2101.291 = 29.945
Finally, the remaining sum of squares is the error sum of squares,
   SSE = 2101.291
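The coefficients and SSE above can be reproduced by applying the two row operations to the augmented matrix from this example; a NumPy sketch of the column-by-column sweep:

```python
import numpy as np

# augmented matrix from the handout: [X'X  X'Y  I ; (X'Y)'  Y'Y  0]
M = np.array([
    [  17.0,   188.2,    700.0,    1295.0, 1, 0, 0],
    [ 188.2,  3602.78,  8585.1,   16203.8, 0, 1, 0],
    [ 700.0,  8585.1,  31712.0,   54081.0, 0, 0, 1],
    [1295.0, 16203.8,  54081.0,  103075.0, 0, 0, 0],
])

# sweep out columns 0..2 using only row operations:
# (1) scale the pivot row so the pivot becomes 1,
# (2) subtract a multiple of it from every other row.
for k in range(3):
    M[k] = M[k] / M[k, k]
    for i in range(4):
        if i != k:
            M[i] = M[i] - M[i, k] * M[k]

b = M[:3, 3]           # regression coefficients replace X'Y
sse = M[3, 3]          # Y'Y is reduced to the error sum of squares
xtx_inv = M[:3, 4:7]   # the identity block is replaced by (X'X)^-1
print(b)     # approx [66.4654, 1.2902, -0.1110]
print(sse)   # approx 2101.29
```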
Note that since the variables are adjusted sequentially, the sums of squares obtained are dependent on the order in which the variables are entered. That is, if we had entered X2 first and X1 second, the sums of squares attributable to these two variables would not be the same as the results obtained above. Only the correction factor would be the same (since the intercept would have been entered first in both models).
Each adjustment of the sum of squares takes one degree of freedom. The residual sum of squares has (n-k) degrees of freedom, where n is the number of observations and k is the number of sweeps, or the number of columns in the X′X matrix. The mean square error is then,

   MSE = SSE/(n-k) = 2101.291/(17-3) = 150.092

PARTIAL SUMS OF SQUARES
Since the sequentially adjusted sums of squares are dependent on the order in which the variables are entered, another value of interest is the partial sum of squares, or the uniquely attributable sum of squares. This is simply the sum of squares that would be accounted for by each variable if it had been entered into the model in last place. This value could be obtained by reversing the sweep operation and observing the change in the sum of squares as each variable was swept back into the model.
The only change in the sum of squares when a variable is swept back into the model is b²/c, so this calculation will give the partial SS due to a variable without actually doing all the calculations necessary to reverse the sweepout technique. The elements (c) are obtained from the (X′X)⁻¹ matrix and are called Gaussian multipliers.
The partial SS due to X2 above does not change, since it was the variable in the last position. The partial SS due to X1 would be calculated as,

   SS(X1|X0, X2) = (1.29019)²/(0.000783) = 2125.913

VARIANCE-COVARIANCE MATRIX
Another major result of the sweepout technique is the inverse of the X′X matrix. Multiplying this matrix by the mean square error (MSE) gives the variance-covariance matrix of the regression coefficients.

   e.g.
   Var-Cov = MSE * (X′X)⁻¹ = 150.092 * |  0.64637   0.00066  -0.01445 |   |  97.0149   0.0990  -2.1683 |
                                       |  0.00066   0.00078  -0.00023 | = |   0.0990   0.1175  -0.0340 |
                                       | -0.01445  -0.00023   0.00041 |   |  -2.1683  -0.0340   0.0618 |

so, Var(b0) = 97.0149, Var(b1) = 0.1175, Var(b2) = 0.0618, Cov(b1,b2) = -0.0340, etc.
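The same variance-covariance matrix can be computed directly from the original X′X matrix, without the sweep, as a numerical check:

```python
import numpy as np

XtX = np.array([[ 17.0,  188.2,    700.0],
                [188.2, 3602.78,  8585.1],
                [700.0, 8585.1,  31712.0]])
MSE = 150.092

vcov = MSE * np.linalg.inv(XtX)   # variance-covariance matrix of b0, b1, b2
print(np.diag(vcov))   # variances of b0, b1, b2: approx 97.01, 0.117, 0.062
```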
The variance-covariance matrix can also be used to obtain confidence intervals about estimates of Ŷ for particular values of X1 and X2. The most versatile approach is to use matrix algebra in these calculations. The equation is

   S²Ŷ = MSE * (L(X′X)⁻¹L′)

where L is a vector of values of the X variables corresponding to Ŷ. It may also be a vector of hypothesized X values for which a variance is needed.
For example, if we wish to predict the response (Ŷ) and its variance when X1 = 4 and X2 = 24, first we would calculate the response,

   Ŷ = 66.4654 + 1.2902X1i - 0.1110X2i = 66.4654 + 1.2902(4) - 0.1110(24) = 68.9622

Using L = [ 1  4  24 ] (note that a 1 is included for the intercept), the variance of the estimate is then,

   S²Ŷ = 150.092 * [1  4  24] |  0.64637   0.00066  -0.01445 | |  1 |
                              |  0.00066   0.00078  -0.00023 | |  4 |  = 24.6782
                              | -0.01445  -0.00023   0.00041 | | 24 |

and the standard error is √24.6782 = 4.9677.
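This prediction and its standard error can be checked as follows (small differences from the handout's 24.6782 arise because the displayed inverse is rounded):

```python
import numpy as np

XtX = np.array([[ 17.0,  188.2,    700.0],
                [188.2, 3602.78,  8585.1],
                [700.0, 8585.1,  31712.0]])
MSE = 150.092
b = np.array([66.4654, 1.2902, -0.1110])

L = np.array([1.0, 4.0, 24.0])   # 1 for the intercept, then X1 = 4, X2 = 24
yhat = L @ b                     # predicted response, approx 68.96
var_yhat = MSE * (L @ np.linalg.inv(XtX) @ L)
se = np.sqrt(var_yhat)           # approx 4.97
print(yhat, var_yhat, se)
```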
The sweepout technique is not the only method of matrix inversion. However, its application to the augmented matrix described above is a relatively simple and versatile method of obtaining most of the results commonly desired from a multiple regression analysis.
Matrix Applications (Appendix 5 Supplemental): Sweepout Example
Page 175

Three factor multiple regression from Snedecor and Cochran (1967), table 13.10.1, page 405.
Y = estimated plant available phosphorus in the soil (at 20° C)
X1 = inorganic phosphorus
X2 = organic phosphorus soluble in K2CO3 and hydrolyzed by hypobromite
X3 = organic phosphorus soluble in K2CO3 and NOT hydrolyzed by hypobromite
All least squares regression analyses start with the same three matrices.
        | 1   0.4   53   158 |        |  64 |
        | 1   0.4   23   163 |        |  60 |
        | 1   3.1   19    37 |        |  71 |
        | 1   0.6   34   157 |        |  61 |
        | 1   4.7   24    59 |        |  54 |
        | 1   1.7   65   123 |        |  77 |
        | 1   9.4   44    46 |        |  81 |
        | 1  10.1   31   117 |        |  93 |
   X =  | 1  11.6   29   173 |    Y = |  93 |
        | 1  12.6   58   112 |        |  51 |
        | 1  10.9   37   111 |        |  76 |
        | 1  23.1   46   114 |        |  96 |
        | 1  23.1   50   134 |        |  77 |
        | 1  21.6   44    73 |        |  93 |
        | 1  23.1   56   168 |        |  95 |
        | 1   1.9   36   143 |        |  54 |
        | 1  26.8   58   202 |        | 168 |
        | 1  29.9   51   124 |        |  99 |

   X′X = |   18      215      758     2214  |        X′Y = |   1463  |
         |  215     4321.02 10139.5  27645  |              | 20706.2 |
         |  758    10139.5  35076    96598  |              | 63825   |
         | 2214    27645    96598   307894  |              | 187542  |

   Y′Y = [ 131299 ]

Create a fully augmented matrix of the form;
   |  X′X     X′Y   I |
   | (X′Y)′   Y′Y   0 |

The resulting matrix contains;
    X0     X1       X2       X3     |  X′Y   | c0  c1  c2  c3
    n      ΣX1      ΣX2      ΣX3    |  ΣY    |  1   0   0   0
    ΣX1    ΣX1²     ΣX1X2    ΣX1X3  |  ΣX1Y  |  0   1   0   0
    ΣX2    ΣX1X2    ΣX2²     ΣX2X3  |  ΣX2Y  |  0   0   1   0
    ΣX3    ΣX1X3    ΣX2X3    ΣX3²   |  ΣX3Y  |  0   0   0   1
    ΣY     ΣX1Y     ΣX2Y     ΣX3Y   |  ΣY²   |  0   0   0   0

Numerically, for the problem given previously, the matrix is;
     X0       X1        X2       X3     |   X′Y    | c0  c1  c2  c3
     18      215       758     2214     |   1463   |  1   0   0   0
    215     4321.02  10139.5  27645     |  20706.2 |  0   1   0   0
    758    10139.5   35076    96598     |  63825   |  0   0   1   0
   2214    27645     96598   307894     | 187542   |  0   0   0   1
   1463    20706.2   63825   187542     | 131299   |  0   0   0   0

The first step (divide row 1 by value(1,1)) in the sweepout technique produces,

      1      11.944444  42.111111   123 |   81.277778 | 0.055556  0  0  0
    215    4321.02    10139.5     27645 |  20706.2    | 0         1  0  0
    758   10139.5     35076      96598  |  63825      | 0         0  1  0
   2214   27645       96598     307894  | 187542      | 0         0  0  1
   1463   20706.2     63825     187542  | 131299      | 0         0  0  0

And after sweeping out the first column (subtracting a multiple of row 1 from all other rows) we have;

   1    11.944444     42.111111    123        |   81.277778  |    0.055556  0  0  0
   0  1752.964444   1085.611111   1200        | 3231.477778  |  -11.944444  1  0  0
   0  1085.611111   3155.777778   3364        | 2216.444444  |  -42.111111  0  1  0
   0  1200          3364         35572        | 7593         | -123         0  0  1
   0  3231.477778   2216.444444   7593        | 12389.61111  |  -81.277778  0  0  0

We start the second column sweep by dividing row 2 by value(2,2),

   1    11.944444     42.111111    123        |   81.277778  |    0.055556   0        0  0
   0     1             0.6193        0.684555 |    1.843436  |   -0.006814   0.00057  0  0
   0  1085.611111   3155.777778   3364        | 2216.444444  |  -42.111111   0        1  0
   0  1200          3364         35572        | 7593         | -123          0        0  1
   0  3231.477778   2216.444444   7593        | 12389.61111  |  -81.277778   0        0  0

and finish sweeping the second column to obtain;

   1  0    34.713915    114.823375  |   59.258959  |    0.136943  -0.006814  0  0
   0  1     0.6193        0.684555  |    1.843436  |   -0.006814   0.00057   0  0
   0  0  2483.458674   2620.839842  |  215.189831  |  -34.713915  -0.6193    1  0
   0  0  2620.839842  34750.53439   | 5380.87679   | -114.823375  -0.684555  0  1
   0  0   215.189831   5380.87679   | 6432.588616  |  -59.258959  -1.843436  0  0

The third column starts with,

   1  0    34.713915    114.823375  |   59.258959  |    0.136943  -0.006814   0         0
   0  1     0.6193        0.684555  |    1.843436  |   -0.006814   0.00057    0         0
   0  0     1              1.055318 |    0.086649  |   -0.013978  -0.000249   0.000403  0
   0  0  2620.839842  34750.53439   | 5380.87679   | -114.823375  -0.684555   0         1
   0  0   215.189831   5380.87679   | 6432.588616  |  -59.258959  -1.843436   0         0

and after being swept out produces,

   1  0  0     78.189139   |   56.251024  |   0.622176   0.001843  -0.013978  0
   0  1  0      0.030996   |    1.789774  |   0.001843   0.000725  -0.000249  0
   0  0  1      1.055318   |    0.086649  |  -0.013978  -0.000249   0.000403  0
   0  0  0  31984.71367    | 5153.782984  | -78.189139  -0.030996  -1.055318  1
   0  0  0   5153.782984   | 6413.942579  | -56.251024  -1.789774  -0.086649  0

Finally the fourth column in the X′X matrix is started and swept out,

   1  0  0  78.189139     |   56.251024  |   0.622176    0.0018428  -0.0139781   0
   0  1  0   0.030996     |    1.789774  |   0.0018428   0.0007249  -0.0002494   0
   0  0  1   1.055318     |    0.086649  |  -0.0139781  -0.0002494   0.0004027   0
   0  0  0   1            |    0.161133  |  -0.0024446  -0.0000010  -0.0000330   0.0000313
   0  0  0  5153.782984   | 6413.942579  | -56.251024   -1.7897741  -0.0866492   0

and the final result is;

   1  0  0  0  |   43.652198   |   0.813316    0.0019185  -0.0113982  -0.0024450
   0  1  0  0  |    1.7847797  |   0.0019185   0.0007249  -0.0002483  -0.0000010
   0  0  1  0  |   -0.0833971  |  -0.0113982  -0.0002483   0.0004375  -0.0000330
   0  0  0  1  |    0.161133   |  -0.0024450  -0.0000010  -0.0000330   0.0000313
   0  0  0  0  | 5583.499658   | -43.652198  -1.7847797    0.0833971  -0.161133
The resulting matrix is of the form;

   | I    B    (X′X)⁻¹ |
   | 0   SSE    -B′    |

and contains the values
   X0 X1 X2 X3 |  X′Y  | c0   c1   c2   c3
    1  0  0  0 |  b0   | c00  c01  c02  c03
    0  1  0  0 |  b1   | c10  c11  c12  c13
    0  0  1  0 |  b2   | c20  c21  c22  c23
    0  0  0  1 |  b3   | c30  c31  c32  c33
    0  0  0  0 |  SSE  | -b0  -b1  -b2  -b3

The solution to the regression equation is then,
   Yi = 43.652 + 1.785X1i - 0.083X2i + 0.161X3i + ei
The sums of squares are given by the sequential reduction in the Y′Y matrix:

   MATRIX         Y′Y VALUE     INTERPRETATION of the       DIFFERENCE from    INTERPRETATION of
                                REPLACEMENT VALUE           PREVIOUS VALUE     the DIFFERENCE
   Original       131299        ΣY² (uncorrected)
   Col 1 sweep    12389.6111    ΣY² - (ΣY)²/n = SSY|X0      118909.3840        (ΣY)²/n = C.F.
   Col 2 sweep     6432.5886    SSY|X0,X1                     5957.0225        SeqSS X1
   Col 3 sweep     6413.9426    SSY|X0,X1,X2                    18.6460        SeqSS X2
   Col 4 sweep     5583.4997    SSY|X0,X1,X2,X3 = SSE          830.4429        SeqSS X3

Partial sums of squares, or fully adjusted sums of squares, are given by
   PARTIAL SS = bk²/ckk

   Partial SSX1 = b1²/c11 = 1.7848²/0.0007249 = 4394.1523
   Partial SSX2 = b2²/c22 = (-0.08340)²/0.0004375 = 15.8979
   Partial SSX3 = b3²/c33 = 0.1611²/0.00003127 = 830.4429
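The partial sums of squares can be computed from the inverse without actually running the sweep backwards; a sketch:

```python
import numpy as np

XtX = np.array([[  18.0,   215.0,    758.0,   2214.0],
                [ 215.0,  4321.02, 10139.5,  27645.0],
                [ 758.0, 10139.5,  35076.0,  96598.0],
                [2214.0, 27645.0,  96598.0, 307894.0]])
XtY = np.array([1463.0, 20706.2, 63825.0, 187542.0])

C = np.linalg.inv(XtX)   # the Gaussian multipliers ckk are the diagonal elements
b = C @ XtY              # regression coefficients

# partial (fully adjusted) SS for each slope: bk^2 / ckk
for k in (1, 2, 3):
    print(k, b[k]**2 / C[k, k])   # approx 4394.3, 15.9, 830.4
```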
Recall that I number the positions in the X′X matrix differently, from k = 0, 1, ..., p (where p is the number of parameters excluding the intercept) instead of starting at 1 as for other matrices. This is done in order to be able to associate the matrix position with the regression coefficient subscript.