CHAPTER 3

Determinants
Mathematics is the gate and key to the sciences.
— Roger Bacon

In this chapter, we introduce a basic tool in applied mathematics, namely the determinant of a square matrix. The determinant is a number, associated with an n × n matrix A, whose value characterizes when the linear system Ax = b has a unique solution (or, equivalently, when A−1 exists). Determinants enjoy a wide range of applications, including coordinate geometry and function theory.
Sections 3.1–3.3 give a detailed introduction to determinants, their properties, and
their applications. Alternatively, Section 3.4, “Summary of Determinants,” can provide a
nonrigorous and much more abbreviated introduction to the fundamental results required
in the remainder of the text. We will see in later chapters that determinants are invaluable
in the theory of eigenvalues and eigenvectors of a matrix, as well as in solution techniques
for linear systems of differential equations.

3.1 The Definition of the Determinant
We will give a criterion shortly (Theorem 3.2.4) for the invertibility of a square matrix A
in terms of the determinant of A, written det(A), which is a number determined directly
from the elements of A. This criterion will provide a ﬁrst extension of the Invertible
Matrix Theorem introduced in Section 2.8.
To motivate the deﬁnition of the determinant of an n × n matrix A, we begin with
the special cases n = 1, n = 2, and n = 3.
Case 1: n = 1. According to Theorem 2.6.5, the 1 × 1 matrix A = [a11 ] is invertible
if and only if rank(A) = 1, if and only if the 1 × 1 determinant, det(A), deﬁned by
det (A) = a11
is nonzero.
Case 2: n = 2. According to Theorem 2.6.5, the 2 × 2 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

is invertible if and only if rank(A) = 2, if and only if the row-echelon form of A has two nonzero rows. Provided that a11 ≠ 0, we can reduce A to row-echelon form as follows:

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \sim \begin{bmatrix} 1 & \dfrac{a_{12}}{a_{11}} \\[4pt] 0 & a_{22} - \dfrac{a_{12}a_{21}}{a_{11}} \end{bmatrix}$$

(apply M1(1/a11) and then A12(−a21)). For A to be invertible, it is necessary that a22 − a12a21/a11 ≠ 0, or that a11a22 − a12a21 ≠ 0. Thus, for A to be invertible, it is necessary that the 2 × 2 determinant, det(A), defined by

$$\det(A) = a_{11}a_{22} - a_{12}a_{21} \qquad (3.1.1)$$

be nonzero. We will see in the next section that this condition is also sufficient for the 2 × 2 matrix A to be invertible.
Case 3: n = 3. According to Theorem 2.6.5, the 3 × 3 matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

is invertible if and only if rank(A) = 3, if and only if the row-echelon form of A has three nonzero rows. Reducing A to row-echelon form as in Case 2, we find that it is necessary that the 3 × 3 determinant defined by

$$\det(A) = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31} \qquad (3.1.2)$$

be nonzero. Again, in the next section we will prove that this condition on det(A) is also sufficient for the 3 × 3 matrix A to be invertible.
To generalize the foregoing formulas for the determinant of an n × n matrix A,
we take a closer look at their structure. Each determinant above consists of a sum of n!
products, where each product term contains precisely one element from each row and
each column of A. Furthermore, each possible choice of one element from each row and
each column of A does in fact occur as a term of the summation. Finally, each term is
assigned a plus or a minus sign. Based on these observations, the appropriate way in
which to deﬁne det(A) for an n × n matrix would seem to be to add up all possible
products consisting of one element from each row and each column of A, with some
condition on which products are taken with a plus sign and which with a minus sign. To
describe this condition, we digress to discuss permutations.

Permutations
Consider the ﬁrst n positive integers 1, 2, 3, . . . , n. Any arrangement of these integers
in a speciﬁc order, say, (p1 , p2 , . . . , pn ), is called a permutation.
Example 3.1.1  There are precisely six distinct permutations of the integers 1, 2, and 3:

(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1).

More generally, we have the following result:

Theorem 3.1.2  There are precisely n! distinct permutations of the integers 1, 2, . . . , n.
The proof of this result is left as an exercise.
The elements in the permutation (1, 2, . . . , n) are said to be in their natural increasing order. We now introduce a number that describes how far a given permutation is
from its natural order. For i = j , the pair of elements pi and pj in the permutation
(p1 , p2 , . . . , pn ) are said to be inverted if they are out of their natural order; that is,
if pi > pj with i < j . If this is the case, we say that (pi , pj ) is an inversion. For
example, in the permutation (4, 2, 3, 1), the pairs (4, 2), (4, 3), (4, 1), (2, 1), and (3, 1)
are all out of their natural order. Consequently, there are a total of ﬁve inversions in this
permutation. In general we let N(p1, p2, . . . , pn) denote the total number of inversions in the permutation (p1, p2, . . . , pn).

Example 3.1.3  Find the number of inversions in the permutations (1, 3, 2, 4, 5) and (2, 4, 5, 3, 1).

Solution:  The only pair of elements in the permutation (1, 3, 2, 4, 5) that is out of
natural order is (3, 2), so N(1, 3, 2, 4, 5) = 1.
The permutation (2, 4, 5, 3, 1) has the following pairs of elements out of natural
order: (2, 1), (4, 3), (4, 1), (5, 3), (5, 1), and (3, 1). Thus, N(2, 4, 5, 3, 1) = 6.
It can be shown that the number of inversions gives the minimum number of adjacent
interchanges of elements in the permutation that are required to restore the permutation to
its natural increasing order. This justiﬁes the claim that the number of inversions describes
how far from natural order a given permutation is. For example, N(3, 2, 1) = 3, and the
permutation (3, 2, 1) can be restored to its natural order by the following sequence of
adjacent interchanges:
(3, 2, 1) → (3, 1, 2) → (1, 3, 2) → (1, 2, 3).
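The inversion count and the adjacent-interchange claim above are easy to check directly. The following Python sketch (the function names are our own illustration, not from the text) counts inversions and counts the adjacent swaps a bubble sort uses to restore natural order; the two numbers agree.

```python
def inversions(p):
    """N(p): number of pairs (i, j) with i < j but p[i] > p[j]."""
    return sum(1 for i in range(len(p))
                 for j in range(i + 1, len(p))
                 if p[i] > p[j])

def adjacent_swaps_to_sort(p):
    """Bubble-sort a copy of p, counting the adjacent interchanges used."""
    p, swaps = list(p), 0
    for _ in range(len(p)):
        for i in range(len(p) - 1):
            if p[i] > p[i + 1]:
                p[i], p[i + 1] = p[i + 1], p[i]
                swaps += 1
    return swaps

print(inversions((3, 2, 1)), adjacent_swaps_to_sort((3, 2, 1)))  # 3 3
```

Running the same check on (4, 2, 3, 1) reproduces the five inversions counted earlier.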
The number of inversions enables us to distinguish two different types of permutations as follows.

DEFINITION 3.1.4

1. If N(p1, p2, . . . , pn) is an even integer (or zero), we say (p1, p2, . . . , pn) is an even permutation. We also say that (p1, p2, . . . , pn) has even parity.

2. If N(p1, p2, . . . , pn) is an odd integer, we say (p1, p2, . . . , pn) is an odd permutation. We also say that (p1, p2, . . . , pn) has odd parity.

Example 3.1.5  The permutation (4, 1, 3, 2) has even parity, since we have N(4, 1, 3, 2) = 4, whereas
(3, 2, 1, 4) is an odd permutation since N(3, 2, 1, 4) = 3.
We associate a plus or a minus sign with a permutation, depending on whether it has
even or odd parity, respectively. The sign associated with the permutation (p1 , p2 , . . . , pn )
can be specified by the indicator σ(p1, p2, . . . , pn), defined in terms of the number of inversions as follows:

$$\sigma(p_1, p_2, \ldots, p_n) = \begin{cases} +1, & \text{if } (p_1, p_2, \ldots, p_n) \text{ has even parity,} \\ -1, & \text{if } (p_1, p_2, \ldots, p_n) \text{ has odd parity.} \end{cases}$$

Hence,

$$\sigma(p_1, p_2, \ldots, p_n) = (-1)^{N(p_1, p_2, \ldots, p_n)}.$$

Example 3.1.6  It follows from Example 3.1.3 that
σ (1, 3, 2, 4, 5) = (−1)1 = −1,
whereas
σ (2, 4, 5, 3, 1) = (−1)6 = 1.
The proofs of some of our later results will depend upon the next theorem.

Theorem 3.1.7  If any two elements in a permutation are interchanged, then the parity of the resulting permutation is opposite to that of the original permutation.

Proof  We first show that interchanging two adjacent terms in a permutation changes
its parity. Consider an arbitrary permutation (p1, . . . , pk, pk+1, . . . , pn), and suppose we interchange the adjacent elements pk and pk+1. Then

• If pk > pk+1, then N(p1, p2, . . . , pk+1, pk, . . . , pn) = N(p1, p2, . . . , pk, pk+1, . . . , pn) − 1;

• If pk < pk+1, then N(p1, p2, . . . , pk+1, pk, . . . , pn) = N(p1, p2, . . . , pk, pk+1, . . . , pn) + 1;

so that the parity is changed in both cases.

Now suppose we interchange the elements pi and pk in the permutation (p1, p2, . . . , pi, . . . , pk, . . . , pn). Note that k − i > 0. We can accomplish this by successively interchanging adjacent elements. In moving pk to the ith position, we perform k − i interchanges involving adjacent terms, and the resulting permutation is

(p1, p2, . . . , pk, pi, . . . , pk−1, pk+1, . . . , pn).
Next we move pi to the kth position. A moment's thought shows that this requires (k − i) − 1 interchanges of adjacent terms. Thus, the total number of adjacent interchanges involved in interchanging the elements pi and pk is 2(k − i) − 1, which is always an odd integer. Since each adjacent interchange changes the parity, the permutation resulting from an odd number of adjacent interchanges has opposite parity to the original permutation.
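Theorem 3.1.7 can also be verified exhaustively for small n. The Python sketch below (our own illustration, not from the text) checks that every transposition flips the parity for all permutations of 1, 2, 3, 4.

```python
from itertools import permutations

def parity(p):
    """0 for even, 1 for odd, via the inversion count N(p)."""
    return sum(p[i] > p[j]
               for i in range(len(p))
               for j in range(i + 1, len(p))) % 2

# Interchange every pair of positions in every permutation of 1..4
# and confirm the parity is always reversed.
for p in permutations(range(1, 5)):
    for i in range(4):
        for k in range(i + 1, 4):
            q = list(p)
            q[i], q[k] = q[k], q[i]
            assert parity(q) != parity(p)
print("parity flip verified for n = 4")
```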
At this point, we are ready to see how permutations can facilitate the deﬁnition
of the determinant. From the expression (3.1.2) for the 3 × 3 determinant, we see that
the row indices of each term have been arranged in their natural increasing order and
that the column indices are each a permutation (p1 , p2 , p3 ) of 1, 2, 3. Further, the sign
attached to each term coincides with the sign of the permutation of the corresponding
column indices; that is, σ (p1 , p2 , p3 ). These observations motivate the following general
definition of the determinant of an n × n matrix:

DEFINITION 3.1.8

Let A = [aij] be an n × n matrix. The determinant of A, denoted det(A), is defined as follows:

$$\det(A) = \sum \sigma(p_1, p_2, \ldots, p_n)\, a_{1p_1} a_{2p_2} a_{3p_3} \cdots a_{np_n}, \qquad (3.1.3)$$

where the summation is over the n! distinct permutations (p1, p2, . . . , pn) of the integers 1, 2, 3, . . . , n. The determinant of an n × n matrix is said to have order n.
We sometimes denote det(A) by

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}.$$

Thus, for example, from (3.1.1), we have

$$\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
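Definition 3.1.8 translates directly into code. The Python sketch below (the helper names are ours, not the text's) evaluates det(A) as the signed sum over all n! permutations; this is impractical for large n but mirrors the definition exactly.

```python
from itertools import permutations
from math import prod

def sign(p):
    """sigma(p) = (-1)^N(p), with N(p) the inversion count."""
    inv = sum(p[i] > p[j] for i in range(len(p))
                          for j in range(i + 1, len(p)))
    return -1 if inv % 2 else 1

def det(A):
    """Determinant via Definition 3.1.8: sum of sigma(p) * a_{1,p1} ... a_{n,pn}."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

print(det([[3, -2], [1, 4]]))  # 14
```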
Example 3.1.9  Use Definition 3.1.8 to derive the expression for the determinant of order 3.

Solution:  When n = 3, (3.1.3) reduces to

$$\det(A) = \sum \sigma(p_1, p_2, p_3)\, a_{1p_1} a_{2p_2} a_{3p_3},$$

where the summation is over the 3! = 6 permutations of 1, 2, 3. It follows that the six
terms in this summation are
a11 a22 a33 , a11 a23 a32 , a12 a21 a33 , a12 a23 a31 , a13 a21 a32 , a13 a22 a31 , so that
det (A) = σ (1, 2, 3)a11 a22 a33 + σ (1, 3, 2)a11 a23 a32 + σ (2, 1, 3)a12 a21 a33
+ σ (2, 3, 1)a12 a23 a31 + σ (3, 1, 2)a13 a21 a32 + σ (3, 2, 1)a13 a22 a31 .
To obtain the values of each σ (p1 , p2 , p3 ), we determine the parity for each permutation
(p1 , p2 , p3 ). We ﬁnd that
σ(1, 2, 3) = +1,  σ(1, 3, 2) = −1,  σ(2, 1, 3) = −1,
σ(2, 3, 1) = +1,  σ(3, 1, 2) = +1,  σ(3, 2, 1) = −1.

Hence,

$$\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} - a_{13}a_{22}a_{31}.$$

[Figure 3.1.1: A schematic for obtaining the determinant of a 3 × 3 matrix A = [aij].]

A simple schematic for obtaining the terms in the determinant of order 3 is given in Figure 3.1.1. By taking the product of the elements joined by each arrow and attaching the indicated sign to the result, we obtain the six terms in the determinant of the 3 × 3 matrix A = [aij]. Note that this technique for obtaining the terms in a determinant does not generalize to determinants of n × n matrices with n > 3.
Example 3.1.10  Evaluate

(a) |−3|.

(b) $\begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix}$.

(c) $\begin{vmatrix} 1 & 2 & -3 \\ 4 & -1 & 2 \\ 0 & 3 & 1 \end{vmatrix}$.

Solution:

(a) |−3| = −3. In the case of a 1 × 1 matrix, the reader is cautioned not to confuse the vertical-bar notation for the determinant with absolute-value bars.

(b) $\begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix} = (3)(4) - (-2)(1) = 14.$

(c) Applying the schematic in Figure 3.1.1,

$$\begin{vmatrix} 1 & 2 & -3 \\ 4 & -1 & 2 \\ 0 & 3 & 1 \end{vmatrix} = (1)(-1)(1) + (2)(2)(0) + (-3)(4)(3) - (0)(-1)(-3) - (3)(2)(1) - (1)(4)(2) = -51.$$

We now turn to some geometric applications of the determinant.

Geometric Interpretation of the Determinants of Orders Two and Three
If a and b are two vectors in space, we recall that their dot product is the scalar

a · b = ||a|| ||b|| cos θ,   (3.1.4)

where θ is the angle between a and b, and ||a|| and ||b|| denote the lengths of a and b, respectively. On the other hand, the cross product of a and b is the vector

a × b = ||a|| ||b|| sin θ n,   (3.1.5)

where n denotes a unit vector¹ that is perpendicular to the plane of a and b and chosen in such a way that {a, b, n} is a right-handed set of vectors. If i, j, k denote the unit vectors pointing along the positive x-, y-, and z-axes, respectively, of a rectangular Cartesian coordinate system and a = a1i + a2j + a3k, b = b1i + b2j + b3k, then Equation (3.1.5) can be expressed in component form as

a × b = (a2b3 − a3b2)i + (a3b1 − a1b3)j + (a1b2 − a2b1)k.   (3.1.6)

This can be remembered most easily in the compact form

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix},$$

whose validity is readily checked by using the schematic in Figure 3.1.1. We will use the equations above to establish the following theorem.

¹A unit vector is a vector of length 1.
Theorem 3.1.11

1. The area of a parallelogram with sides determined by the vectors a = a1i + a2j and b = b1i + b2j is

Area = |det(A)|,  where  $A = \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}$.

2. The volume of a parallelepiped determined by the vectors a = a1i + a2j + a3k, b = b1i + b2j + b3k, c = c1i + c2j + c3k is

Volume = |det(A)|,  where  $A = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}$.

Before presenting the proof of this theorem, we make some remarks and give two examples.

Remarks

1. The vertical bars appearing in the formulas in Theorem 3.1.11 denote the absolute value of the number det(A).

2. We see from the expression for the volume of a parallelepiped that the condition for three vectors to lie in the same plane (i.e., the parallelepiped has zero volume) is that det(A) = 0. This will be a useful result in the next chapter.

Example 3.1.12  Find the area of the parallelogram containing the points (0, 0), (1, 2), (3, 4), and (4, 6).

Solution:  The sides of the parallelogram are determined by the vectors a = i + 2j and b = 3i + 4j. According to part 1 of Theorem 3.1.11, the area of the parallelogram is

$$\left| \det \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \right| = |(1)(4) - (2)(3)| = |-2| = 2.$$
Example 3.1.13  Determine whether or not the vectors a = i + 2j + 3k, b = 4i + 5j + 6k, and c = −5i − 7j − 9k lie in a single plane in 3-space.

Solution:  By Remark 2 above, it suffices to determine whether or not the volume of the parallelepiped determined by the three vectors is zero. To do this, we use part 2 of Theorem 3.1.11:

$$\text{Volume} = \left| \det \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ -5 & -7 & -9 \end{bmatrix} \right| = |(1)(5)(-9) + (2)(6)(-5) + (3)(4)(-7) - (-5)(5)(3) - (-7)(6)(1) - (-9)(4)(2)| = 0,$$

which shows that the three vectors do lie in a single plane.
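Both parts of Theorem 3.1.11 are easy to check numerically. The short Python sketch below (the helper names are our own) computes the 2 × 2 and 3 × 3 determinants directly and reproduces Examples 3.1.12 and 3.1.13.

```python
def det2(a, b):
    """Determinant of the 2x2 matrix with rows a and b."""
    return a[0] * b[1] - a[1] * b[0]

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c (the scalar triple product)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
          - a[1] * (b[0] * c[2] - b[2] * c[0])
          + a[2] * (b[0] * c[1] - b[1] * c[0]))

# Example 3.1.12: area of the parallelogram with sides a = (1, 2), b = (3, 4).
print(abs(det2((1, 2), (3, 4))))                       # 2

# Example 3.1.13: zero volume means the three vectors are coplanar.
print(abs(det3((1, 2, 3), (4, 5, 6), (-5, -7, -9))))   # 0
```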
Now we turn to the

Proof of Theorem 3.1.11:

1. The area of the parallelogram is

Area = (length of base) × (perpendicular height).

From Figure 3.1.2, this can be written as

Area = ||a|| h = ||a|| ||b|| |sin θ| = ||a × b||.   (3.1.7)

[Figure 3.1.2: Determining the area of a parallelogram.]

Since the k components of a and b, a3 and b3, are both zero (since the vectors lie in the xy-plane), substitution from Equation (3.1.6) yields

Area = ||(a1b2 − a2b1)k|| = |a1b2 − a2b1| = |det(A)|.
2. The volume of the parallelepiped is

Volume = (area of base) × (perpendicular height).

The base is determined by the vectors b and c (see Figure 3.1.3), and its area can be written as ||b × c||, in similar fashion to what was done in (3.1.7). From Figure 3.1.3 and Equation (3.1.4), we therefore have

Volume = ||b × c|| h = ||b × c|| ||a|| |cos ψ| = ||b × c|| |a · n|,

where n is a unit vector that is perpendicular to the plane containing b and c. We can now use Equations (3.1.5) and (3.1.6) to obtain

Volume = ||b × c|| ||a|| |cos ψ| = |a · (b × c)|
  = |(a1i + a2j + a3k) · [(b2c3 − b3c2)i + (b3c1 − b1c3)j + (b1c2 − b2c1)k]|
  = |a1(b2c3 − b3c2) + a2(b3c1 − b1c3) + a3(b1c2 − b2c1)|
  = |det(A)|,

as required.

[Figure 3.1.3: Determining the volume of a parallelepiped.]

Exercises for 3.1

Key Terms

Permutation, Inversion, Parity, Determinant, Order, Dot
product, Cross product.

Skills

• Be able to compute determinants by using Definition 3.1.8.
• Be able to list permutations of 1, 2, . . . , n.
• Be able to find the number of inversions of a given permutation and thus determine its parity.
• Be able to compute the area of a parallelogram with sides determined by vectors in R².
• Be able to compute the volume of a parallelepiped with sides determined by vectors in R³.

True-False Review

For Questions 1–8, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.

1. If A is a 2 × 2 lower triangular matrix, then det(A) is the product of the elements on the main diagonal of A.

2. If A is a 3 × 3 upper triangular matrix, then det(A) is the product of the elements on the main diagonal of A.

3. The volume of the parallelepiped whose sides are determined by the vectors a, b, and c is given by det(A), where A = [a, b, c].

4. There are the same number of permutations of {1, 2, 3, 4} of even parity as there are of odd parity.

5. If A and B are 2 × 2 matrices, then det(A + B) = det(A) + det(B).

6. The determinant of a matrix whose elements are all positive must be positive.

7. A matrix containing a row of zeros must have zero determinant.

8. Three vectors v1, v2, and v3 in R³ are coplanar if and only if the determinant of the 3 × 3 matrix [v1, v2, v3] is zero.

Problems
For Problems 1–6, determine the parity of the given permutation.

1. (2, 1, 3, 4).
2. (1, 3, 2, 4).
3. (1, 4, 3, 5, 2).
4. (5, 4, 3, 2, 1).
5. (1, 5, 2, 4, 3).
6. (2, 4, 6, 1, 3, 5).

7. Use the definition of a determinant to derive the general expression for the determinant of $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$.

For Problems 8–15, evaluate the determinant of the given matrix.

8. $A = \begin{bmatrix} 1 & -1 \\ 2 & 3 \end{bmatrix}$.

9. $A = \begin{bmatrix} 2 & -1 \\ 6 & -3 \end{bmatrix}$.

10. $A = \begin{bmatrix} -4 & 10 \\ -1 & 8 \end{bmatrix}$.

11. $A = \begin{bmatrix} 1 & -1 & 0 \\ 2 & 3 & 6 \\ 0 & 2 & -1 \end{bmatrix}$.

12. $A = \begin{bmatrix} 2 & 1 & 5 \\ 4 & 2 & 3 \\ 9 & 5 & 1 \end{bmatrix}$.

13. $A = \begin{bmatrix} 0 & 0 & 2 \\ 0 & -4 & 1 \\ -1 & 5 & -7 \end{bmatrix}$.

14. $A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 0 & 5 & 6 & 7 \\ 0 & 0 & 8 & 9 \\ 0 & 0 & 0 & 10 \end{bmatrix}$.

15. $A = \begin{bmatrix} 0 & 0 & 2 & 0 \\ 5 & 0 & 0 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 2 & 0 & 0 \end{bmatrix}$.

For Problems 16–21, evaluate the given determinant.

16. $\begin{vmatrix} \pi & \pi^2 \\ \sqrt{2} & 2\pi \end{vmatrix}$.

17. $\begin{vmatrix} 2 & 3 & -1 \\ 1 & 4 & 1 \\ 3 & 1 & 6 \end{vmatrix}$.

18. $\begin{vmatrix} 3 & 2 & 6 \\ 2 & 1 & -1 \\ -1 & 1 & 4 \end{vmatrix}$.

19. $\begin{vmatrix} 2 & 3 & 6 \\ 0 & 1 & 2 \\ 1 & 5 & 0 \end{vmatrix}$.

20. $\begin{vmatrix} \sqrt{2} & e^{-1} & \sqrt{\pi}\,e \\ 67 & 1/30 & 2001 \\ \pi & \pi^2 & \pi^3 \end{vmatrix}$.

21. $\begin{vmatrix} e^{2t} & e^{3t} & e^{-4t} \\ 2e^{2t} & 3e^{3t} & -4e^{-4t} \\ 4e^{2t} & 9e^{3t} & 16e^{-4t} \end{vmatrix}$.
In Problems 22–23, we explore a relationship between determinants and solutions to a differential equation. The 3 × 3
matrix consisting of solutions to a differential equation and
their derivatives is called the Wronskian and, as we will
see in later chapters, plays a pivotal role in the theory of
differential equations.
22. Verify that y1(x) = cos 2x, y2(x) = sin 2x, and y3(x) = eˣ are solutions to the differential equation

y''' − y'' + 4y' − 4y = 0,

and show that the Wronskian

$\begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}$

is nonzero on any interval.

23. (a) Verify that y1(x) = eˣ, y2(x) = cosh x, and y3(x) = sinh x are solutions to the differential equation

y''' − y'' − y' + y = 0,

and show that

$\begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}$

is identically zero.

(b) Determine nonzero constants d1, d2, and d3 such that

d1y1 + d2y2 + d3y3 = 0.
24. (a) Write all 24 distinct permutations of the integers
1, 2, 3, 4.
(b) Determine the parity of each permutation in part
(a).
(c) Use parts (a) and (b) to derive the expression for
a determinant of order 4.
For Problems 25–27, use the previous problem to compute the determinant of A.

25. $A = \begin{bmatrix} 1 & -1 & 0 & 1 \\ 3 & 0 & 2 & 5 \\ 2 & 1 & 0 & 3 \\ 9 & -1 & 2 & 1 \end{bmatrix}$.

26. $A = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 3 & 1 & -2 & 3 \\ 2 & 3 & 1 & 2 \\ -2 & 3 & 5 & -2 \end{bmatrix}$.

27. $A = \begin{bmatrix} 0 & 1 & 2 & 3 \\ 2 & 0 & 3 & 4 \\ 3 & 4 & 0 & 5 \\ 4 & 5 & 6 & 0 \end{bmatrix}$.

28. Use Problem 27 to find the determinant of A, where

$A = \begin{bmatrix} 0 & 1 & 2 & 3 & 0 \\ 2 & 0 & 3 & 4 & 0 \\ 3 & 4 & 0 & 5 & 0 \\ 4 & 5 & 6 & 0 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix}$.

29. (a) If $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ and c is a constant, verify that det(cA) = c² det(A).

(b) Use the definition of a determinant to prove that if A is an n × n matrix and c is a constant, then det(cA) = cⁿ det(A).

For Problems 30–33, determine whether the given expression is a term in the determinant of order 5. If it is, determine whether the permutation of the column indices has even or odd parity and hence find whether the term has a plus or a minus sign attached to it.

30. a11 a25 a33 a42 a54.
31. a11 a23 a34 a43 a52.
32. a13 a25 a31 a44 a42.
33. a11 a32 a24 a43 a55.

For Problems 34–37, determine the values of the indices p and q such that the following are terms in a determinant of order 4. In each case, determine the number of inversions in the permutation of the column indices and hence find the appropriate sign that should be attached to each term.

34. a13 ap4 a32 a2q.
35. a21 a3q ap2 a43.
36. a3q ap4 a13 a42.
37. apq a34 a13 a42.

38. The alternating symbol εijk is defined by

$$\varepsilon_{ijk} = \begin{cases} 1, & \text{if } (ijk) \text{ is an even permutation of } 1, 2, 3, \\ -1, & \text{if } (ijk) \text{ is an odd permutation of } 1, 2, 3, \\ 0, & \text{otherwise.} \end{cases}$$

(a) Write all nonzero εijk, for 1 ≤ i ≤ 3, 1 ≤ j ≤ 3, 1 ≤ k ≤ 3.

(b) If A = [aij] is a 3 × 3 matrix, verify that

$$\det(A) = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} \varepsilon_{ijk}\, a_{1i} a_{2j} a_{3k}.$$

39. If A is the general n × n matrix, determine the sign attached to the term

a1n a2,n−1 a3,n−2 · · · an1,

which arises in det(A).
40. Use some form of technology to evaluate the determinants in Problems 16–21.

41. Let A be an arbitrary 4 × 4 matrix. By experimenting with various elementary row operations, conjecture how elementary row operations applied to A affect the value of det(A).

42. Verify that y1(x) = e⁻²ˣ cos 3x, y2(x) = e⁻²ˣ sin 3x, and y3(x) = e⁻⁴ˣ are solutions to the differential equation

y''' + 8y'' + 29y' + 52y = 0,

and show that the Wronskian

$\begin{vmatrix} y_1 & y_2 & y_3 \\ y_1' & y_2' & y_3' \\ y_1'' & y_2'' & y_3'' \end{vmatrix}$

is nonzero on any interval.

3.2 Properties of Determinants
For large values of n, evaluating a determinant of order n using the deﬁnition given in
the previous section is not very practical, since the number of terms is n! (for example, a
determinant of order 10 contains 3,628,800 terms). In the next two sections, we develop
better techniques for evaluating determinants. The following theorem suggests one way
to proceed.

Theorem 3.2.1  If A is an n × n upper or lower triangular matrix, then

$$\det(A) = a_{11}a_{22}a_{33} \cdots a_{nn} = \prod_{i=1}^{n} a_{ii}.$$

Proof  We use the definition of the determinant to prove the result in the upper triangular case. From Equation (3.1.3),

$$\det(A) = \sum \sigma(p_1, p_2, \ldots, p_n)\, a_{1p_1} a_{2p_2} a_{3p_3} \cdots a_{np_n}. \qquad (3.2.1)$$

If A is upper triangular, then aij = 0 whenever i > j, and therefore the only nonzero
terms in the preceding summation are those with pi ≥ i for all i . Since all the pi must
be distinct, the only possibility is (by applying pi ≥ i to i = n, n − 1, . . . , 2, 1 in turn)
pi = i, i = 1, 2, . . . , n, and so Equation (3.2.1) reduces to the single term
det (A) = σ (1, 2, . . . , n)a11 a22 · · · ann .
Since σ (1, 2, . . . , n) = 1, it follows that
det (A) = a11 a22 · · · ann .
The proof in the lower triangular case is left as an exercise (Problem 47).
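Theorem 3.2.1 is easy to confirm numerically: for a triangular matrix, only the identity permutation contributes a nonzero term. Below is a small Python check (our own code, not the text's) comparing the diagonal product against the full permutation-sum definition.

```python
from itertools import permutations
from math import prod

def det(A):
    """Determinant via the permutation-sum definition (3.1.3)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inv * prod(A[i][p[i]] for i in range(n))
    return total

U = [[2, 5, -1, 3],
     [0, -1, 0, 4],
     [0, 0, 7, 8],
     [0, 0, 0, 5]]

diagonal_product = prod(U[i][i] for i in range(4))
print(det(U), diagonal_product)  # -70 -70
```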
Example 3.2.2  According to the previous theorem,

$$\begin{vmatrix} 2 & 5 & -1 & 3 \\ 0 & -1 & 0 & 4 \\ 0 & 0 & 7 & 8 \\ 0 & 0 & 0 & 5 \end{vmatrix} = (2)(-1)(7)(5) = -70.$$

Theorem 3.2.1 shows that it is easy to compute the determinant of an upper or lower
triangular matrix. Recall from Chapter 2 that any matrix can be reduced to row-echelon
form by a sequence of elementary row operations. In the case of an n × n matrix, any
row-echelon form will be upper triangular. Theorem 3.2.1 suggests, therefore, that we
should consider how elementary row operations performed on a matrix A alter the value
of det(A).

Elementary Row Operations and Determinants
Let A be an n × n matrix.
P1. If B is the matrix obtained by permuting two rows of A, then
det (B) = − det (A).
P2. If B is the matrix obtained by multiplying one row of A by any scalar k (this statement is even true if k = 0), then
det(B) = k det(A).
P3. If B is the matrix obtained by adding a multiple of any row of A to a different row
of A, then
det (B) = det (A).
The proofs of these properties are given at the end of this section.

Remark  The main use of P2 is that it enables us to factor a common multiple of the entries of a particular row out of the determinant. For example, if

$$A = \begin{bmatrix} -1 & 4 \\ 3 & -2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} -5 & 20 \\ 3 & -2 \end{bmatrix},$$

where B is obtained from A by multiplying the first row of A by 5, then we have

det(B) = 5 det(A) = 5[(−1)(−2) − (3)(4)] = 5(−10) = −50.

We now illustrate how the foregoing properties P1–P3, together with Theorem 3.2.1,
can be used to evaluate a determinant. The basic idea is the same as that for Gaussian
elimination. We use elementary row operations to reduce the determinant to upper triangular form and then use Theorem 3.2.1 to evaluate the resulting determinant.
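This reduction strategy can be automated: track a factor of −1 for each row swap (P1), note that row additions (P3) change nothing, and finish with Theorem 3.2.1 on the triangular result. A Python sketch under those rules (our own implementation, not the text's):

```python
def det_by_elimination(A):
    """Reduce A to upper triangular form, tracking how P1 and P3 affect det."""
    A = [row[:] for row in A]      # work on a copy
    n, factor = len(A), 1.0
    for col in range(n):
        # P1: swap in a nonzero pivot; each swap flips the sign.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return 0.0             # a pivot-free column forces det = 0 (see P6)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            factor = -factor
        # P3: adding multiples of the pivot row leaves det unchanged.
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            A[r] = [x - m * y for x, y in zip(A[r], A[col])]
    # Theorem 3.2.1: determinant of the triangular result is the diagonal product.
    for i in range(n):
        factor *= A[i][i]
    return factor

# The matrix of Example 3.2.3 below; the reduction gives 80.
print(round(det_by_elimination([[2, -1, 3, 7],
                                [1, -2, 4, 3],
                                [3, 4, 2, -1],
                                [2, -2, 8, -4]]), 6))  # 80.0
```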
Warning: When using the properties P1–P3 to simplify a determinant, one must remember to take account of any change that arises in the value of the determinant from the
operations that have been performed on it.

Example 3.2.3  Evaluate

$$\begin{vmatrix} 2 & -1 & 3 & 7 \\ 1 & -2 & 4 & 3 \\ 3 & 4 & 2 & -1 \\ 2 & -2 & 8 & -4 \end{vmatrix}.$$

Solution:  We have

$$\begin{vmatrix} 2 & -1 & 3 & 7 \\ 1 & -2 & 4 & 3 \\ 3 & 4 & 2 & -1 \\ 2 & -2 & 8 & -4 \end{vmatrix}
\stackrel{1}{=} 2 \begin{vmatrix} 2 & -1 & 3 & 7 \\ 1 & -2 & 4 & 3 \\ 3 & 4 & 2 & -1 \\ 1 & -1 & 4 & -2 \end{vmatrix}
\stackrel{2}{=} -2 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 2 & -1 & 3 & 7 \\ 3 & 4 & 2 & -1 \\ 1 & -1 & 4 & -2 \end{vmatrix}
\stackrel{3}{=} -2 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 0 & 3 & -5 & 1 \\ 0 & 10 & -10 & -10 \\ 0 & 1 & 0 & -5 \end{vmatrix}$$

$$\stackrel{4}{=} 2 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 0 & 1 & 0 & -5 \\ 0 & 10 & -10 & -10 \\ 0 & 3 & -5 & 1 \end{vmatrix}
\stackrel{5}{=} 20 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 0 & 1 & 0 & -5 \\ 0 & 1 & -1 & -1 \\ 0 & 3 & -5 & 1 \end{vmatrix}
\stackrel{6}{=} 20 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 0 & 1 & 0 & -5 \\ 0 & 0 & -1 & 4 \\ 0 & 0 & -5 & 16 \end{vmatrix}
\stackrel{7}{=} 20 \begin{vmatrix} 1 & -2 & 4 & 3 \\ 0 & 1 & 0 & -5 \\ 0 & 0 & -1 & 4 \\ 0 & 0 & 0 & -4 \end{vmatrix} = 80.$$

1. M4(1/2)  2. P12  3. A12(−2), A13(−3), A14(−1)  4. P24  5. M3(1/10)  6. A23(−1), A24(−3)  7. A34(−5)

Theoretical Results for n × n Matrices and n × n Linear Systems
In Section 2.8, we established several conditions on an n × n matrix A that are equivalent to saying that A is invertible. At this point, we are ready to give one additional
characterization of invertible matrices in terms of determinants.
Theorem 3.2.4  Let A be an n × n matrix with real elements. The following conditions on A are equivalent.

(a) A is invertible.

(g) det(A) ≠ 0.

Proof  Let A∗ denote the reduced row-echelon form of A. Recall from Chapter 2 that A
is invertible if and only if A∗ = In . Since A∗ is obtained from A by performing a sequence
of elementary row operations, properties P1–P3 of determinants imply that det(A) is just
a nonzero multiple of det(A∗ ). If A is invertible, then det(A∗ ) = det(In ) = 1, so that
det(A) is nonzero.
Conversely, if det(A) ≠ 0, then det(A∗) ≠ 0. This implies that A∗ = In, and hence A is invertible.
According to Theorem 2.5.9 in the previous chapter, any linear system Ax = b has
either no solution, exactly one solution, or inﬁnitely many solutions. Recall from the
Invertible Matrix Theorem that the linear system Ax = b has a unique solution for every
b in Rn if and only if A is invertible. Thus, for an n × n linear system, Theorem 3.2.4
tells us that, for each b in Rn , the system Ax = b has a unique solution x if and only if
det(A) = 0.
Next, we consider the homogeneous n × n linear system Ax = 0.
Corollary 3.2.5  The homogeneous n × n linear system Ax = 0 has an infinite number of solutions if and only if det(A) = 0, and has only the trivial solution if and only if det(A) ≠ 0.
Proof  The system Ax = 0 clearly has the trivial solution x = 0 under any circumstances. By our remarks above, this must be the unique solution if and only if det(A) ≠ 0. The only other possibility, which occurs if and only if det(A) = 0, is that the system has infinitely many solutions.

Remark  The preceding corollary is very important, since we are often interested only
in determining the solution properties of a homogeneous linear system and not actually in
ﬁnding the solutions themselves. We will refer back to this corollary on many occasions
throughout the remainder of the text.

Example 3.2.6  Verify that the matrix

$$A = \begin{bmatrix} 1 & -1 & 3 \\ 2 & 4 & -2 \\ 3 & 5 & 7 \end{bmatrix}$$

is invertible. What can be concluded about the solution to Ax = 0?

Solution:  It is easily shown that det(A) = 52 ≠ 0. Consequently, A is invertible. It follows from Corollary 3.2.5 that the homogeneous system Ax = 0 has only the trivial solution (0, 0, 0).

Example 3.2.7  Verify that the matrix

$$A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ -3 & 0 & -3 \end{bmatrix}$$

is not invertible and determine a set of real solutions to the system Ax = 0.

Solution:  By the row operation A13(3), we see that A is row equivalent to the upper triangular matrix

$$B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
By Theorem 3.2.1, det(B) = 0, and hence B and A are not invertible. We illustrate
Corollary 3.2.5 by ﬁnding an inﬁnite number of solutions (x1 , x2 , x3 ) to Ax = 0. Working
with the upper triangular matrix B , we may set x3 = t , a free parameter. The second row
of the matrix system requires that x2 = 0 and the ﬁrst row requires that x1 + x3 = 0, so
x1 = −x3 = −t. Hence, the set of solutions is {(−t, 0, t) : t ∈ R}.

Further Properties of Determinants
In addition to elementary row operations, the following properties can also be useful in
evaluating determinants.
Let A and B be n × n matrices.
P4. det(Aᵀ) = det(A).

P5. Let a1, a2, . . . , an denote the row vectors of A. If the ith row vector of A is the
sum of two row vectors, say ai = bi + ci, then det(A) = det(B) + det(C), where

$$B = \begin{bmatrix} \mathbf{a}_1 \\ \vdots \\ \mathbf{a}_{i-1} \\ \mathbf{b}_i \\ \mathbf{a}_{i+1} \\ \vdots \\ \mathbf{a}_n \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} \mathbf{a}_1 \\ \vdots \\ \mathbf{a}_{i-1} \\ \mathbf{c}_i \\ \mathbf{a}_{i+1} \\ \vdots \\ \mathbf{a}_n \end{bmatrix}.$$
The corresponding property is also true for columns.
P6. If A has a row (or column) of zeros, then det(A) = 0.
P7. If two rows (or columns) of A are the same, then det(A) = 0.
P8. det(AB) = det(A)det(B).
The proofs of these properties are given at the end of the section. The main importance of P4 is the implication that any results regarding determinants that hold for
the rows of a matrix also hold for the columns of a matrix. In particular, the properties
P1–P3 regarding the effects that elementary row operations have on the determinant
can be translated to corresponding statements on the effects that “elementary column
operations” have on the determinant. We will use the notations
CPij , CMi (k), and CAij (k) to denote the three types of elementary column operations.
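Properties P4 and P8 can be spot-checked numerically with the permutation-sum definition of the determinant. The Python sketch below (our own code, with matrices chosen only for illustration) does so for a pair of 3 × 3 matrices.

```python
from itertools import permutations
from math import prod

def det(A):
    """Determinant via the permutation-sum definition (3.1.3)."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        total += (-1) ** inv * prod(A[i][p[i]] for i in range(n))
    return total

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, -1, 3], [2, 4, -2], [3, 5, 7]]
B = [[2, 0, 1], [1, 3, -1], [0, 2, 5]]

print(det(transpose(A)) == det(A))           # P4: True
print(det(matmul(A, B)) == det(A) * det(B))  # P8: True
```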
Example 3.2.8  Use only column operations to evaluate

$$\begin{vmatrix} 3 & 6 & -1 & 2 \\ 6 & 10 & 3 & 4 \\ 9 & 20 & 5 & 4 \\ 15 & 34 & 3 & 8 \end{vmatrix}.$$

Solution:  We have

$$\begin{vmatrix} 3 & 6 & -1 & 2 \\ 6 & 10 & 3 & 4 \\ 9 & 20 & 5 & 4 \\ 15 & 34 & 3 & 8 \end{vmatrix}
\stackrel{1}{=} 3 \cdot 2^2 \begin{vmatrix} 1 & 3 & -1 & 1 \\ 2 & 5 & 3 & 2 \\ 3 & 10 & 5 & 2 \\ 5 & 17 & 3 & 4 \end{vmatrix}
\stackrel{2}{=} 12 \begin{vmatrix} 1 & 0 & 0 & 0 \\ 2 & -1 & 5 & 0 \\ 3 & 1 & 8 & -1 \\ 5 & 2 & 8 & -1 \end{vmatrix}
\stackrel{3}{=} 12 \begin{vmatrix} 1 & 0 & 0 & 0 \\ 2 & -1 & 0 & 0 \\ 3 & 1 & 13 & -1 \\ 5 & 2 & 18 & -1 \end{vmatrix}
\stackrel{4}{=} 12 \begin{vmatrix} 1 & 0 & 0 & 0 \\ 2 & -1 & 0 & 0 \\ 3 & 1 & 13 & 0 \\ 5 & 2 & 18 & \frac{5}{13} \end{vmatrix} = 12(-5) = -60,$$

where we have once more used Theorem 3.2.1.

1. CM1(1/3), CM2(1/2), CM4(1/2)  2. CA12(−3), CA13(1), CA14(−1)  3. CA23(5)  4. CA34(1/13)

The property that often gives the most difficulty is P5. We explicitly illustrate its
Example 3.2.9 Use property P5 to express
a1 + b1 c1 + d1
a2 + b2 c2 + d2
as a sum of four determinants. Solution: Applying P5 to row 1 yields:
a1 + b1 c1 + d1
a1
c1
b1
d1
=
+
.
a2 + b2 c2 + d2
a2 + b2 c2 + d2
a2 + b2 c2 + d2 Now we apply P5 to row 2 of both of the determinants on the right-hand side to obtain
a1 + b1 c1 + d1
ac
ac
bd
bd
= 1 1 + 1 1 + 1 1 + 1 1.
a2 + b2 c2 + d2
a2 c2
b2 d2
a2 c2
b2 d2
Notice that we could also have applied P5 to the columns of the given determinant.
Warning In view of P5, it may be tempting to believe that if A, B , and C are n × n
matrices such that A = B + C , then det (A) = det (B) + det (C). This is not true!
Examples abound to show the failure of this equation. For instance, if we take B = I2
and C = −I2 , then det (A) = det (02 ) = 0, while det (B) = det (C) = 1. Thus,
det (B) + det (C) = 1 + 1 = 2 = 0.
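The distinction drawn by this Warning is that det is linear in one row at a time (P5), not additive over whole matrices; a tiny 2 × 2 check (not from the text) makes both halves concrete:

```python
def det2(M):
    # determinant of a 2 x 2 matrix
    (a, b), (c, d) = M
    return a * d - b * c

# P5: splitting a single row does give a sum of determinants.
assert det2([[1 + 5, 2 + 7], [3, 4]]) == det2([[1, 2], [3, 4]]) + det2([[5, 7], [3, 4]])

# But det is NOT additive over whole matrices: I2 + (-I2) is the zero matrix.
I2, negI2, zero2 = [[1, 0], [0, 1]], [[-1, 0], [0, -1]], [[0, 0], [0, 0]]
assert det2(zero2) == 0
assert det2(I2) + det2(negI2) == 2   # != det2(I2 + (-I2)) = 0
```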
Next, we supply some examples of the last two properties, P7 and P8.

Example 3.2.10 Evaluate

      |  1   2  −3  1 |
      | −2   4   6  2 |          | 2 − 4x  −4   2 |
(a)   | −3  −6   9  3 |    (b)   | 5 + 3x   3  −3 |.
      |  2  11  −6  4 |          | 1 − 2x  −2   1 |

Solution:
(a) We have

|  1   2  −3  1 |        |  1   2   1  1 |
| −2   4   6  2 |  (1)   | −2   4  −2  2 |
| −3  −6   9  3 |  = −3  | −3  −6  −3  3 |  = 0,
|  2  11  −6  4 |        |  2  11   2  4 |

since the first and third columns of the latter matrix are identical (see P7).

(1) CM3(−1/3)

(b) Applying P5 to the first column, we have

| 2 − 4x  −4   2 |   | 2  −4   2 |   | −4x  −4   2 |
| 5 + 3x   3  −3 | = | 5   3  −3 | + |  3x   3  −3 |
| 1 − 2x  −2   1 |   | 1  −2   1 |   | −2x  −2   1 |

      | 1  −2   1 |       | −4  −4   2 |
  = 2 | 5   3  −3 |  + x  |  3   3  −3 |  = 0 + 0 = 0,
      | 1  −2   1 |       | −2  −2   1 |

since the first and third rows of the first matrix agree, and the first and second columns of the second matrix agree.

Example 3.2.11 If

A = |  sin φ   cos φ |      and      B = | cos θ  − sin θ |
    | − cos φ  sin φ |                   | sin θ    cos θ |,

show that det(AB) = 1.

Solution: Using P8, we have

det(AB) = det(A) det(B) = (sin²φ + cos²φ)(cos²θ + sin²θ) = 1 · 1 = 1.

Example 3.2.12 Find all x satisfying

| x²  x  1 |
| 1   1  1 | = 0.
| 4   2  1 |

Solution: If we expanded this determinant according to Definition 3.1.8 (or using the schematic in Figure 3.1.1), then we would have a quadratic equation in x. Thus, there are at most two distinct values of x that satisfy the equation. By inspection, the determinant vanishes when x = 1 (since the first two rows of the matrix coincide in this case), and it vanishes when x = 2 (since the first and third rows of the matrix coincide in this case). Consequently, the two values of x satisfying the given equation are x = 1 and x = 2.

Proofs of the Properties of Determinants
We now prove the properties P1–P8.

Proof of P1: Let B be the matrix obtained by interchanging row r with row s in A. Then the elements of B are related to those of A as follows:

          { aij  if i ≠ r, s,
    bij = { asj  if i = r,
          { arj  if i = s.

Thus, from Definition 3.1.8,

det(B) = Σ σ(p1, . . . , pr, . . . , ps, . . . , pn) b1p1 b2p2 ··· brpr ··· bsps ··· bnpn
       = Σ σ(p1, . . . , pr, . . . , ps, . . . , pn) a1p1 a2p2 ··· aspr ··· arps ··· anpn.

Interchanging pr and ps in σ(p1, . . . , pr, . . . , ps, . . . , pn) and recalling from Theorem 3.1.7 that such an interchange has the effect of changing the parity of the permutation, we obtain

det(B) = −Σ σ(p1, . . . , ps, . . . , pr, . . . , pn) a1p1 a2p2 ··· arps ··· aspr ··· anpn,

where we have also rearranged the terms so that the row indices are in their natural increasing order. The sum on the right-hand side of this equation is just det(A), so that det(B) = −det(A).

Proof of P2: Let B be the matrix obtained by multiplying the ith row of A through by any scalar k. Then bij = kaij for each j. Thus,

det(B) = Σ σ(p1, . . . , pn) b1p1 b2p2 ··· bnpn = Σ σ(p1, . . . , pn) a1p1 a2p2 ··· (k aipi) ··· anpn = k det(A).

We prove properties P5 and P7 next, since they simplify the proof of P3.

Proof of P5: The elements of A are

          { akj        if k ≠ i,
    akj = { bkj + ckj  if k = i.

Thus, from Definition 3.1.8,

det(A) = Σ σ(p1, . . . , pn) a1p1 a2p2 ··· anpn
       = Σ σ(p1, . . . , pn) a1p1 ··· a(i−1)p(i−1) (bipi + cipi) a(i+1)p(i+1) ··· anpn
       = Σ σ(p1, . . . , pn) a1p1 ··· a(i−1)p(i−1) bipi a(i+1)p(i+1) ··· anpn
         + Σ σ(p1, . . . , pn) a1p1 ··· a(i−1)p(i−1) cipi a(i+1)p(i+1) ··· anpn
       = det(B) + det(C).

Proof of P7: Suppose rows i and j in A are the same. Then if we interchange these rows, the matrix, and hence its determinant, are unaltered. However, according to P1, the determinant of the resulting matrix is −det(A). Therefore,

det(A) = −det(A),

which implies that det(A) = 0.

Proof of P3: Let A = [a1, a2, . . . , an]^T, and let B be the matrix obtained from A when k times row j of A is added to row i of A. Then

B = [a1, a2, . . . , ai + k aj, . . . , an]^T,

so that, using P5,

det(B) = det([a1, a2, . . . , ai + k aj, . . . , an]^T)
       = det([a1, a2, . . . , an]^T) + det([a1, a2, . . . , k aj, . . . , an]^T).

By P2, we can factor out k from row i of the second determinant on the right-hand side. If we do this, it follows that row i and row j of the resulting determinant are the same, and so, from P7, the value of the second determinant is zero. Thus,

det(B) = det([a1, a2, . . . , an]^T) = det(A),

as required.

Proof of P4: Using Definition 3.1.8, we have

det(A^T) = Σ σ(p1, p2, . . . , pn) a_{p1 1} a_{p2 2} a_{p3 3} ··· a_{pn n}.   (3.2.2)

Since (p1, p2, . . . , pn) is a permutation of 1, 2, . . . , n, it follows that, by rearranging terms,

a_{p1 1} a_{p2 2} a_{p3 3} ··· a_{pn n} = a_{1 q1} a_{2 q2} a_{3 q3} ··· a_{n qn},   (3.2.3)

for appropriate values of q1, q2, . . . , qn. Furthermore,

N(p1, . . . , pn) = number of interchanges in changing (1, 2, . . . , n) to (p1, p2, . . . , pn)
                 = number of interchanges in changing (p1, p2, . . . , pn) to (1, 2, . . . , n),

and by (3.2.3), this number is

                 = number of interchanges in changing (1, 2, . . . , n) to (q1, q2, . . . , qn)
                 = N(q1, . . . , qn).

Thus,

σ(p1, p2, . . . , pn) = σ(q1, q2, . . . , qn).   (3.2.4)

Substituting Equations (3.2.3) and (3.2.4) into Equation (3.2.2), we have

det(A^T) = Σ σ(q1, q2, . . . , qn) a_{1 q1} a_{2 q2} a_{3 q3} ··· a_{n qn} = det(A).

Proof of P6: Since each term σ(p1, p2, . . . , pn) a1p1 a2p2 ··· anpn in the formula for det(A) contains a factor from the row (or column) of zeros, each such term is zero. Thus, det(A) = 0.

Proof of P8: Let E denote an elementary matrix. We leave it as an exercise (Problem 51) to verify that

           { −1, if E permutes rows,
  det(E) = { +1, if E adds a multiple of one row to another row,
           {  k, if E scales a row by k.

It then follows from properties P1–P3 that in each case

det(EA) = det(E) det(A).   (3.2.5)

Now consider a general product AB. We need to distinguish two cases.

Case 1: If A is not invertible, then from Corollary 2.6.12, neither is AB. Consequently, applying Theorem 3.2.4,

det(AB) = 0 = det(A) det(B).

Case 2: If A is invertible, then from Section 2.7, we know that it can be expressed as the product of elementary matrices, say A = E1 E2 ··· Er. Hence, repeatedly applying (3.2.5) gives

det(AB) = det(E1 E2 ··· Er B) = det(E1) det(E2 ··· Er B)
        = det(E1) det(E2) ··· det(Er) det(B)
= det(E1 E2 ··· Er) det(B) = det(A) det(B).

Exercises for 3.2

Skills
• Be able to compute the determinant of an upper or lower triangular matrix "at a glance" (Theorem 3.2.1).
• Know the effects that elementary row operations have on the determinant of a matrix.
• Likewise, be comfortable with the effects that column operations have on the determinant of a matrix.
• Be able to use the determinant to decide if a matrix is invertible (Theorem 3.2.4).
• Know how the determinant is affected by matrix multiplication and by matrix transpose.

True-False Review
For Questions 1–6, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.
1. If each element of an n × n matrix is doubled, then the determinant of the matrix also doubles.
2. Multiplying a row of an n × n matrix through by a scalar c has the same effect on the determinant as multiplying a column of the matrix through by c.
3. If A is an n × n matrix, then det(A^5) = (det A)^5.
4. If A is a real n × n matrix, then det(A^2) cannot be negative.
5. The matrix [x^2 x; y^2 y] is not invertible if and only if x = 0 or y = 0.
6. If A and B are n × n matrices, then det(AB) = det(BA).

Problems
For Problems 1–12, reduce the given determinant to upper triangular form and then evaluate.
1. |1 2 3; 2 6 4; 3 −5 2|.
2. |2 −1 4; 3 2 1; −2 1 4|.
3. |2 1 3; −1 2 6; 4 1 12|.
4. |0 1 −2; −1 0 3; 2 −3 0|.
5. |3 7 1; 5 9 −6; 2 1 3|.
6. |1 −1 2 4; 3 1 2 4; −1 1 3 2; 2 1 4 2|.
7. |2 32 1 4; 26 104 26 −13; 2 56 2 7; 1 40 1 5|.
8. |0 1 −1 1; −1 0 1 1; 1 −1 0 1; −1 −1 −1 0|.
9. |2 1 3 5; 3 0 1 2; 4 1 4 3; 5 2 5 3|.
10. |2 −1 3 4; 7 1 2 3; −2 4 8 6; 6 −6 18 −24|.
11. |7 −1 3 4; 14 2 4 6; 21 1 3 4; −7 4 5 8|.
12. |3 7 1 2 3; 1 1 −1 0 1; 4 8 −1 6 6; 3 7 0 9 4; 8 16 −1 8 12|.

For Problems 13–19, use Theorem 3.2.4 to determine whether the given matrix is invertible or not.
13. [2 1; 3 2].
14. [−1 1; 1 −1].
15. [2 6 −1; 3 5 1; 2 0 1].
16. [−1 2 3; 5 −2 1; 8 −2 5].
17. [1 0 2 −1; 3 −2 1 4; 2 1 6 2; 1 −3 4 0].
18. [1 1 1 1; −1 1 −1 1; 1 1 −1 −1; −1 1 1 −1].
19. [1 2 −3 5; −1 2 −3 6; 2 3 −1 4; 1 −2 3 −6].

20. Determine all values of the constant k for which the given system has a unique solution:
x1 + kx2 = b1,
kx1 + 4x2 = b2.
21. Determine all values of the constant k for which the given system has an infinite number of solutions:
x1 + 2x2 + kx3 = 0,
2x1 − kx2 + x3 = 0,
3x1 + 6x2 + x3 = 0.
22. Determine all values of k for which the given system has an infinite number of solutions:
x1 + 2x2 + x3 = kx1,
2x1 + x2 + x3 = kx2,
x1 + x2 + 2x3 = kx3.
23. Determine all values of k for which the given system has a unique solution:
x1 + kx2 = 2,
kx1 + x2 + x3 = 1,
x1 + x2 + x3 = 1.
24. If A = [1 −1 2; 3 1 4; 0 1 3], find det(A), and use properties of determinants to find det(A^T) and det(−2A).
25. If A = [1 −1; 2 3] and B = [1 2; −2 4], evaluate det(AB) and verify P8.
26. If A = [cosh x  sinh x; sinh x  cosh x] and B = [cosh y  sinh y; sinh y  cosh y], evaluate det(AB).

For Problems 27–29, use properties of determinants to show that det(A) = 0 for the given matrix A.
27. A = [3 2 1; 6 4 −1; 9 6 2].
28. A = [1 −3 1; 2 −1 7; 3 1 13].
29. A = [1+3a 1 3; 1+2a 1 2; 2 2 0].

For Problems 30–32, let A = [a b; c d] and assume det(A) = 1. Find det(B).
30. B = [3c 3d; 4a 4b].
31. B = [−2a −2c; 3a+b 3c+d].
32. B = [−b −a; d−4b c−4a].

For Problems 33–35, let A = [a b c; d e f; g h i] and assume det(A) = −6. Find det(B).
33. B = [−4d −4e −4f; g+5a h+5b i+5c; a b c].
34. B = [d e f; −3a −3b −3c; g−4d h−4e i−4f].
35. B = [2a 2d 2g; b−c e−f h−i; c−a f−d i−g].

For Problems 36–40, let A and B be 4 × 4 matrices such that det(A) = 5 and det(B) = 3. Compute the determinant of the given matrix.
36. AB^T.
37. A^2 B^5.
38. (A^{−1} B^2)^3.
39. ((2B)^{−1} (AB)^T).
40. (5A)(2B).

41. Let A = [1 2 4; 3 1 6; k 3 2].
(a) In terms of k, find the volume of the parallelepiped determined by the row vectors of the matrix A.
(b) Does your answer to (a) change if we instead consider the volume of the parallelepiped determined by the column vectors of the matrix A? Why or why not?
(c) For what value(s) of k, if any, is A invertible?
42. Without expanding the determinant, determine all values of x for which det(A) = 0 if A = [1 −1 x; 2 1 x^2; 4 −1 x^3].
43. Use only properties P5, P1, and P2 to show that
|αx−βy  βx−αy; βx+αy  αx+βy| = (x^2 + y^2) |α β; β α|.
44. Use only properties P5, P1, and P2 to find the value of αβγ such that
|a1+βb1  b1+γc1  c1+αa1; a2+βb2  b2+γc2  c2+αa2; a3+βb3  b3+γc3  c3+αa3| = 0
for all values of ai, bi, ci.
45. Use only properties P3 and P7 to prove property P6.
46. An n × n matrix A that satisfies A^T = A^{−1} is called an orthogonal matrix. Show that if A is an orthogonal matrix, then det(A) = ±1.
47. (a) Use the definition of a determinant to prove that if A is an n × n lower triangular matrix, then
det(A) = a11 a22 a33 ··· ann = ∏_{i=1}^{n} aii.
(b) Evaluate the following determinant by first reducing it to lower triangular form and then using the result from (a):
|2 −1 3 5; 1 2 2 1; 3 0 1 4; 1 2 0 1|.
48. Use determinants to prove that if A is invertible and B and C are matrices with AB = AC, then B = C.
49. If A and S are n × n matrices with S invertible, show that det(S^{−1}AS) = det(A). [Hint: Since S^{−1}S = In, how are det(S^{−1}) and det(S) related?]
50. If det(A^3) = 0, is it possible for A to be invertible? Justify your answer.
51. Let E be an elementary matrix. Verify the formula for det(E) given in the text at the beginning of the proof of P8.
52. Show that
|x y 1; x1 y1 1; x2 y2 1| = 0
represents the equation of the straight line through the distinct points (x1, y1) and (x2, y2).
53. Without expanding the determinant, show that
|1 x x^2; 1 y y^2; 1 z z^2| = (y − z)(z − x)(x − y).
54. If A is an n × n skew-symmetric matrix and n is odd, prove that det(A) = 0.
55. Let A = [a1, a2, . . . , an] be an n × n matrix, and let b = c1 a1 + c2 a2 + ··· + cn an, where c1, c2, . . . , cn are constants. If Bk denotes the matrix obtained from A by replacing the kth column vector by b, prove that
det(Bk) = ck det(A),   k = 1, 2, . . . , n.
56. Let A be the general 4 × 4 matrix.
(a) Verify property P1 of determinants in the case when the first two rows of A are permuted.
(b) Verify property P2 of determinants in the case when row 1 of A is divided by k.
(c) Verify property P3 of determinants in the case when k times row 2 is added to row 1.
57. Determine all values of a for which
|1 2 3 4 a; 2 1 2 3 4; 3 2 1 2 3; 4 3 2 1 2; a 4 3 2 1|
is invertible.
58. For a randomly generated 5 × 5 matrix, verify that det(A^T) = det(A).
59. If A = [1 4 1; 3 2 1; 3 4 −1], determine all values of the constant k for which the linear system (A − kI3)x = 0 has an infinite number of solutions, and find the corresponding solutions.
60. Use the determinant to show that
A = [1 2 3 4; 2 1 2 3; 3 2 1 2; 4 3 2 1]
is invertible, and use A^{−1} to solve Ax = b if b = [3, 7, 1, −4]^T.

3.3 Cofactor Expansions
we can reduce a determinant of order n to a sum of determinants of order n−1. Continuing
in this manner, it is possible to express any determinant as a sum of determinants of
order 2. This method is the one most frequently used to evaluate a determinant by hand,
although the procedure introduced in the previous section whereby we use elementary
row operations to reduce the matrix to upper triangular form involves less work in general.
When A is invertible, the technique we derive leads to formulas for both A−1 and the
unique solution to Ax = b. We ﬁrst require two preliminary deﬁnitions. DEFINITION 3.3.1
Let A be an n × n matrix. The minor, Mij , of the element aij , is the determinant of
the matrix obtained by deleting the i th row vector and j th column vector of A. i i i i i i i “main”
2007/2/16
page 213
i 3.3 Cofactor Expansions 213 Remark Notice that if A is an n × n matrix, then Mij is a determinant of order n − 1.
By convention, if n = 1, we deﬁne the “empty” determinant M11 to be 1.
Example 3.3.2 If a11 a12 a13
A = a21 a22 a23 ,
a31 a32 a33 then, for example,
M23 = Example 3.3.3 a11 a12
a31 a32 and M31 = a12 a13
.
a22 a23 Determine the minors M11 , M23 , and M31 for 21 3
A = −1 4 −2 .
31 5 Solution: Using Deﬁnition 3.3.1, we have M11 = 4 −2
= 22,
15 M23 = 21
= −1,
31 M31 = 13
= −14.
4 −2 DEFINITION 3.3.4
Let A be an n × n matrix. The cofactor, Cij , of the element aij , is deﬁned by
Cij = (−1)i +j Mij ,
where Mij is the minor of aij .
From Deﬁnition 3.3.4, we see that the cofactor of aij and the minor of aij are the
same if i + j is even, and they differ by a minus sign if i + j is odd. The appropriate
sign in the cofactor Cij is easy to remember, since it alternates in the following manner:
+
−
+
.
.
. Example 3.3.5 −
+
−
.
.
. +
−
+
.
.
. −
+
−
.
.
. + ···
− ···
+ ··· .
.
.
. Determine the cofactors C11 , C23 , and C31 for the matrix in Example 3.3.3. Solution: We have already obtained the minors M11 , M23 , and M31 in Example 3.3.3,
so it follows that
C11 = +M11 = 22, C23 = −M23 = 1, C31 = +M31 = −14. i i i i i i i “main”
2007/2/16
page 214
i 214 CHAPTER 3 Determinants Example 3.3.6 If A = a11 a12
, verify that det (A) = a11 C11 + a12 C12 .
a21 a22 Solution: In this case,
C11 = + det [a22 ] = a22 , C12 = − det [a12 ] = −a12 , so that
a11 C11 + a12 C12 = a11 a22 + a12 (−a21 ) = det (A).
The preceding example is a special case of the following important theorem. Theorem 3.3.7 (Cofactor Expansion Theorem)
Let A be an n × n matrix. If we multiply the elements in any row (or column) of A by
their cofactors, then the sum of the resulting products is det (A). Thus,
1. If we expand along row i ,
n det (A) = ai 1 Ci 1 + ai 2 Ci 2 + · · · + ain Cin = aik Cik .
k =1 2. If we expand along column j ,
n det (A) = a1j C1j + a2j C2j + · · · + anj Cnj = akj Ckj .
k =1 The expressions for det(A) appearing in this theorem are known as cofactor expansions. Notice that a cofactor expansion can be formed along any row or column of A.
Regardless of the chosen row or column, the cofactor expansion will always yield the
determinant of A. However, sometimes the calculation is simpler if the row or column
of expansion is wisely chosen. We will illustrate this in the examples below. The proof
of the Cofactor Expansion Theorem will be presented after some examples. Example 3.3.8 Use the Cofactor Expansion Theorem along (a) row 1, (b) column 3 to ﬁnd
2 34
1 −1 1 .
6 30 i i i i i i i “main”
2007/2/16
page 215
i 3.3 Cofactor Expansions 215 Solution:
(a) We have
2 34
−1 1
11
1 −1
1 −1 1 = 2
−3
+4
= −6 + 18 + 36 = 48.
30
60
63
6 30
(b) We have
2 34
1 −1
23
1 −1 1 = 4
−1
+ 0 = 36 + 12 + 0 = 48.
63
63
6 30
Notice that (b) was easier than (a) in the previous example, because of the zero in
column 3. Whenever one uses the cofactor expansion method to evaluate a determinant,
it is usually best to select a row or column containing as many zeros as possible in order
to minimize the amount of computation required.
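Definitions 3.3.1 and 3.3.4 and the expansion of Theorem 3.3.7 transcribe almost line for line into code. The sketch below (not from the text; Python is used only for illustration) keeps the text's 1-based indices for minors and cofactors and checks itself against Examples 3.3.3, 3.3.5, and 3.3.8:

```python
def det(A):
    # Determinant via cofactor expansion along row 1 (Theorem 3.3.7),
    # skipping zero entries as the text advises.
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        if A[0][j] == 0:
            continue  # a zero entry contributes nothing to the expansion
        sub = [r[:j] + r[j + 1:] for r in A[1:]]
        total += (-1) ** j * A[0][j] * det(sub)
    return total

def minor(A, i, j):
    # M_ij (Definition 3.3.1): delete row i and column j (1-based, as in the text).
    sub = [r[:j - 1] + r[j:] for k, r in enumerate(A, start=1) if k != i]
    return det(sub)

def cofactor(A, i, j):
    # C_ij = (-1)^(i+j) * M_ij (Definition 3.3.4).
    return (-1) ** (i + j) * minor(A, i, j)

A = [[2, 1, 3], [-1, 4, -2], [3, 1, 5]]                 # Example 3.3.3
assert (minor(A, 1, 1), minor(A, 2, 3), minor(A, 3, 1)) == (22, -1, -14)
assert (cofactor(A, 1, 1), cofactor(A, 2, 3), cofactor(A, 3, 1)) == (22, 1, -14)
assert det([[2, 3, 4], [1, -1, 1], [6, 3, 0]]) == 48    # Example 3.3.8
```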
Example 3.3.9 Evaluate

| 0  3  −1  0 |
| 5  0   8  2 |
| 7  2   5  4 |.
| 6  1   7  0 |

Solution: In this case, it is easiest to use either row 1 or column 4. Choosing row 1, we have

| 0  3  −1  0 |
| 5  0   8  2 |       | 5  8  2 |         | 5  0  2 |
| 7  2   5  4 | = −3  | 7  5  4 |  + (−1) | 7  2  4 |
| 6  1   7  0 |       | 6  7  0 |         | 6  1  0 |

= −3[2(49 − 30) − 4(35 − 48) + 0] − [5(0 − 4) − 0 + 2(7 − 12)] = −240.

In evaluating the determinants of order 3 on the right side of the first equality, we have used cofactor expansion along column 3 and row 1, respectively. For additional practice, the reader may wish to verify our result here by cofactor expansion along a different row or column.

Now we turn to the

Proof of the Cofactor Expansion Theorem: It follows from the definition of the determinant that det(A) can be written in the form

det(A) = ai1 Ĉi1 + ai2 Ĉi2 + ··· + ain Ĉin,   (3.3.1)

where the coefficients Ĉij contain no elements from row i or column j. We must show that

Ĉij = Cij,

where Cij is the cofactor of aij.
Consider first a11. From Definition 3.1.8, the terms of det(A) that contain a11 are given by

a11 Σ σ(1, p2, p3, . . . , pn) a2p2 a3p3 ··· anpn,

where the summation is over the (n − 1)! distinct permutations of 2, 3, . . . , n. Thus,

Ĉ11 = Σ σ(1, p2, p3, . . . , pn) a2p2 a3p3 ··· anpn.

However, this summation is just the minor M11, and since C11 = M11, we have shown that the coefficient of a11 in det(A) is indeed the cofactor C11.
Now consider the element aij. By successively interchanging adjacent rows and columns of A, we can move aij into the (1, 1) position without altering the relative positions of the other rows and columns of A. We let A′ denote the resulting matrix. Obtaining A′ from A requires i − 1 row interchanges and j − 1 column interchanges. Therefore, the total number of interchanges required to obtain A′ from A is i + j − 2. Consequently,

det(A) = (−1)^(i+j−2) det(A′) = (−1)^(i+j) det(A′).

Now for the key point. The coefficient of aij in det(A) must be (−1)^(i+j) times the coefficient of aij in det(A′). But aij occurs in the (1, 1) position of A′, and so, as we have previously shown, its coefficient in det(A′) is M′11. Since the relative positions of the remaining rows and columns have not altered, it follows that M′11 = Mij, and therefore the coefficient of aij in det(A′) is Mij. Consequently, the coefficient of aij in det(A) is (−1)^(i+j) Mij = Cij. Applying this result to the elements ai1, ai2, . . . , ain and comparing with (3.3.1) yields

Ĉij = Cij,   j = 1, 2, . . . , n,

which establishes the theorem for expansion along a row. The result for expansion along a column follows directly, since det(A^T) = det(A).

We now have two computational methods for evaluating determinants: the use of elementary row operations given in the previous section to reduce the matrix in question to upper triangular form, and the Cofactor Expansion Theorem. In evaluating a given determinant by hand, it is usually most efficient (and least error prone) to use a combination of the two techniques. More specifically, we use elementary row operations to set all except one element in a row or column equal to zero and then use the Cofactor Expansion Theorem on that row or column. We illustrate with an example.
Example 3.3.10 Evaluate

|  2  1   8  6 |
|  1  4   1  3 |
| −1  2   1  4 |.
|  1  3  −1  2 |

Solution: We have

|  2  1   8  6 |       | 0 −7   6  0 |
|  1  4   1  3 |  (1)  | 1  4   1  3 |  (2)   | −7  6  0 |  (3)   | −7   6   0 |  (4)
| −1  2   1  4 |  =    | 0  6   2  7 |  = −   |  6  2  7 |  = −   | −1 −12   0 |  =  90.
|  1  3  −1  2 |       | 0 −1  −2 −1 |        | −1 −2 −1 |        | −1  −2  −1 |

(1) A21(−2), A23(1), A24(−1)        (2) Cofactor expansion along column 1
(3) A32(7)                          (4) Cofactor expansion along column 3
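Theorem 3.3.7's claim that every row and every column of expansion yields the same value can also be checked numerically. A small sketch (not from the text), run on the matrix of Example 3.3.9:

```python
def det(A):
    # reference determinant via cofactor expansion along row 1
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def expand(A, i):
    # cofactor expansion of det(A) along row i (0-based here)
    return sum((-1) ** (i + j) * A[i][j]
               * det([r[:j] + r[j + 1:] for k, r in enumerate(A) if k != i])
               for j in range(len(A)))

A = [[0, 3, -1, 0], [5, 0, 8, 2], [7, 2, 5, 4], [6, 1, 7, 0]]  # Example 3.3.9
AT = [list(col) for col in zip(*A)]  # expanding column j of A = expanding row j of A^T
assert {expand(A, i) for i in range(4)} == {-240}
assert {expand(AT, j) for j in range(4)} == {-240}
```

All eight expansions collapse to the single value −240, as the theorem promises.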
Example 3.3.11 Determine all values of k for which the system

10x1 + kx2 − x3 = 0,
 kx1 +  x2 − x3 = 0,
 2x1 +  x2 − 3x3 = 0,

has nontrivial solutions.

Solution: We will apply Corollary 3.2.5. The determinant of the matrix of coefficients of the system is

         | 10  k  −1 |       | 10      k       −1 |
det(A) = |  k  1  −1 |  (1)  | k − 10  1 − k    0 |  (2)    | k − 10  1 − k  |
         |  2  1  −3 |  =    | −28     1 − 3k   0 |  =  −   | −28     1 − 3k |

= −[(k − 10)(1 − 3k) − (−28)(1 − k)] = 3k² − 3k − 18 = 3(k² − k − 6) = 3(k − 3)(k + 2).

(1) A12(−1), A13(−3)        (2) Cofactor expansion along column 3

From Corollary 3.2.5, the system has nontrivial solutions if and only if det(A) = 0; that is, if and only if k = 3 or k = −2.

The Adjoint Method for A^{−1}
We next establish two corollaries to the Cofactor Expansion Theorem that, in the case
of an invertible matrix A, lead to a method for expressing the elements of A−1 in terms
of determinants.
Corollary 3.3.12 If the elements in the ith row (or column) of an n × n matrix A are multiplied by the cofactors of a different row (or column), then the sum of the resulting products is zero. That is,

1. If we use the elements of row i and the cofactors of row j,

   Σ_{k=1}^{n} aik Cjk = 0,   i ≠ j.   (3.3.2)

2. If we use the elements of column i and the cofactors of column j,

   Σ_{k=1}^{n} aki Ckj = 0,   i ≠ j.   (3.3.3)

Proof: We prove (3.3.2). Let B be the matrix obtained from A by adding row i to row j (i ≠ j) in the matrix A. By P3, det(B) = det(A). Cofactor expansion of B along row j gives

det(A) = det(B) = Σ_{k=1}^{n} (ajk + aik) Cjk = Σ_{k=1}^{n} ajk Cjk + Σ_{k=1}^{n} aik Cjk.

That is,

det(A) = det(A) + Σ_{k=1}^{n} aik Cjk,

since by the Cofactor Expansion Theorem the first summation on the right-hand side is simply det(A). It follows immediately that

Σ_{k=1}^{n} aik Cjk = 0,   i ≠ j.

Equation (3.3.3) can be proved similarly (Problem 47).
following corollary.
Corollary 3.3.13 Let A be an n × n matrix. If δij is the Kronecker delta symbol (see Deﬁnition 2.2.19),
then
Σ_{k=1}^{n} aik Cjk = δij det(A),    Σ_{k=1}^{n} aki Ckj = δij det(A).   (3.3.4)

The formulas in (3.3.4) should be reminiscent of the index form of the matrix
product. Combining this with the fact that the Kronecker delta gives the elements of the
identity matrix, we might suspect that (3.3.4) is telling us something about the inverse
of A. Before establishing that this suspicion is indeed correct, we need a deﬁnition. DEFINITION 3.3.14
If every element in an n × n matrix A is replaced by its cofactor, the resulting matrix
is called the matrix of cofactors and is denoted MC. The transpose of the matrix of cofactors, MC^T, is called the adjoint of A and is denoted adj(A). Thus, the elements of adj(A) are

adj(A)ij = Cji.

Example 3.3.15 Determine adj(A) if
A = |  2  0  −3 |
    | −1  5   4 |.
    |  3 −2   0 |

Solution: We first determine the cofactors of A:

C11 = 8,   C12 = 12,  C13 = −13,
C21 = 6,   C22 = 9,   C23 = 4,
C31 = 15,  C32 = −5,  C33 = 10.

Thus,

MC = |  8  12  −13 |
     |  6   9    4 |,
     | 15  −5   10 |

so that

adj(A) = MC^T = |  8   6  15 |
                | 12   9  −5 |.
                | −13  4  10 |

We can now prove the next theorem.
Theorem 3.3.16 (The Adjoint Method for Computing A^{−1})
If det(A) ≠ 0, then

A^{−1} = (1/det(A)) adj(A).

Proof: Let B = (1/det(A)) adj(A). Then we must establish that AB = In = BA. But, using the index form of the matrix product,

(AB)ij = Σ_{k=1}^{n} aik bkj = Σ_{k=1}^{n} aik · (1/det(A)) · adj(A)kj = (1/det(A)) Σ_{k=1}^{n} aik Cjk = δij,

where we have used Equation (3.3.4) in the last step. Consequently, AB = In. We leave it as an exercise (Problem 53) to verify that BA = In also.
det (A) = 55,
so that
A− 1 8 6 15
1
12 9 −5 .
=
55 −13 4 10 For square matrices of relatively small size, the adjoint method for computing A−1
is often easier than using elementary row operations to reduce A to upper triangular
form.
In Chapter 7, we will ﬁnd that the solution of a system of differential equations can
be expressed naturally in terms of matrix functions. Certain problems will require us to
ﬁnd the inverse of such matrix functions. For 2 × 2 systems, the adjoint method is very
quick.
Example 3.3.18 Find A−1 if A = Solution: e 2 t e −t
.
3e2t 6e−t In this case,
det (A) = (e2t )(6e−t ) − (3e2t )(e−t ) = 3et , and
adj(A) = 6e−t −e−t
−3e2t
e 2t , 2 e −2 t − 1 e −2 t
3
1t
−et
3e . so that
A−1 = i i i i i i i “main”
2007/2/16
page 220
i 220 CHAPTER 3 Determinants Cramer’s Rule
We now derive a technique that enables us, in the case when det(A) ≠ 0, to express the unique solution of an n × n linear system

Ax = b

directly in terms of determinants. Let Bk denote the matrix obtained by replacing the kth column vector of A with b. Thus,

     | a11  a12  ...  b1  ...  a1n |
     | a21  a22  ...  b2  ...  a2n |
Bk = |  .    .         .        .  |,
     | an1  an2  ...  bn  ...  ann |

where b occupies column k. The key point to notice is that the cofactors of the elements in the kth column of Bk coincide with the corresponding cofactors of A. Thus, expanding det(Bk) along the kth column using the Cofactor Expansion Theorem yields

det(Bk) = b1 C1k + b2 C2k + ··· + bn Cnk = Σ_{i=1}^{n} bi Cik,   k = 1, 2, . . . , n,   (3.3.5)

where the Cij are the cofactors of A. We can now prove Cramer's rule.
Theorem 3.3.19 (Cramer's Rule)
If det(A) ≠ 0, the unique solution to the n × n system Ax = b is (x1, x2, . . . , xn), where

xk = det(Bk)/det(A),   k = 1, 2, . . . , n.   (3.3.6)

Proof: If det(A) ≠ 0, then the system Ax = b has the unique solution

x = A^{−1} b,   (3.3.7)

where, from Theorem 3.3.16, we can write

A^{−1} = (1/det(A)) adj(A).   (3.3.8)

If we let x = [x1, x2, . . . , xn]^T and b = [b1, b2, . . . , bn]^T and recall that adj(A)ij = Cji, then substitution from (3.3.8) into (3.3.7) and use of the index form of the matrix product yields

xk = Σ_{i=1}^{n} (A^{−1})ki bi = Σ_{i=1}^{n} (1/det(A)) adj(A)ki bi = (1/det(A)) Σ_{i=1}^{n} Cik bi,   k = 1, 2, . . . , n.

Using (3.3.5), we can write this as

xk = det(Bk)/det(A),   k = 1, 2, . . . , n,

as required.

Remark: In general, Cramer's rule requires more work than the Gaussian elimination method, and it is restricted to n × n systems whose coefficient matrix is invertible. However, it is a powerful theoretical tool, since it gives us a formula for the solution of an n × n system, provided det(A) ≠ 0.
Example 3.3.20 Solve
3x1 + 2x2 − x3 = 4,
x1 + x2 − 5x3 = −3,
−2x1 − x2 + 4x3 = 0.

Solution: The following determinants are easily evaluated:

det(A)  = |  3  2  −1 |            det(B1) = |  4  2  −1 |
          |  1  1  −5 |  =  8,               | −3  1  −5 |  =  17,
          | −2 −1   4 |                      |  0 −1   4 |

det(B2) = |  3  4  −1 |            det(B3) = |  3  2   4 |
          |  1 −3  −5 |  =  −6,              |  1  1  −3 |  =  7.
          | −2  0   4 |                      | −2 −1   0 |

Inserting these results into (3.3.6) yields x1 = 17/8, x2 = −6/8 = −3/4, and x3 = 7/8, so that the solution to the system is (17/8, −3/4, 7/8).

Exercises for 3.3

Key Terms
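Cramer's rule as stated in Theorem 3.3.19 is a one-line formula once a determinant routine is in hand. A sketch (not from the text), using exact Fractions and checked against Example 3.3.20:

```python
from fractions import Fraction

def det(A):
    # cofactor expansion along row 1
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([r[:j] + r[j + 1:] for r in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    # x_k = det(B_k) / det(A), where B_k has column k replaced by b (3.3.6)
    d = det(A)
    n = len(A)
    return [Fraction(det([r[:k] + [b[i]] + r[k + 1:] for i, r in enumerate(A)]), d)
            for k in range(n)]

A = [[3, 2, -1], [1, 1, -5], [-2, -1, 4]]   # the system of Example 3.3.20
b = [4, -3, 0]
assert cramer(A, b) == [Fraction(17, 8), Fraction(-3, 4), Fraction(7, 8)]
```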
Minor, Cofactor, Cofactor expansion, Matrix of cofactors, Adjoint, Cramer's rule.

Skills
• Be able to compute the minors and cofactors of a matrix.
• Understand the difference between Mij and Cij.
• Be able to compute the determinant of a matrix via cofactor expansion.
• Be able to compute the matrix of cofactors and the adjoint of a matrix.
• Be able to use the adjoint of an invertible matrix A to compute A^{−1}.
• Be able to use Cramer's rule to solve a linear system of equations.

True-False Review
For Questions 1–7, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.
1. The (2, 3)-minor of a matrix is the same as the (2, 3)-cofactor of the matrix.
2. We have A · adj(A) = det(A) · In for all n × n matrices A.
3. Cofactor expansion of a matrix along any row or column will yield the same result, although the individual terms in the expansion along different rows or columns can vary.
4. If A is an n × n matrix and c is a scalar, then adj(cA) = c · adj(A).
5. If A and B are 2 × 2 matrices, then adj(A + B) = adj(A) + adj(B).
6. If A and B are 2 × 2 matrices, then adj(AB) = adj(A) · adj(B).
7. For every n, adj(In) = In.

Problems
For Problems 1–3, determine all minors and cofactors of the given matrix.
1. A = [1 −3; 2 4].
2. A = [1 −1 2; 3 −1 4; 2 1 5].
3. A = [2 10 3; 0 −1 0; 4 1 5].
4. If A = [1 3 −1 2; 3 4 1 2; 7 1 4 6; 5 0 1 2], determine the minors M12, M31, M23, M42, and the corresponding cofactors.

For Problems 5–10, use the Cofactor Expansion Theorem to evaluate the given determinant along the specified row or column.
5. |1 −2; 1 3|, row 1.
6. |−1 2 3; 1 4 −2; 3 1 4|, column 3.
7. |2 1 −4; 7 1 3; 1 5 −2|, row 2.
8. |3 1 4; 7 1 2; 2 3 −5|, column 1.
9. |0 2 −3; −2 0 5; 3 −5 0|, row 3.
10. |1 −2 3 0; 4 0 7 −2; 0 1 3 4; 1 5 −2 0|, column 4.

For Problems 11–19, evaluate the given determinant using the techniques of this section.
11. |1 0 −2; 3 1 −1; 7 2 5|.
12. |1 −3; 2 4|.
13. |2 −1 3; 5 2 1; 3 −3 7|.
14. |0 −2 1; 2 0 −3; −1 3 0|.
15. |1 0 −1 0; 0 1 0 −1; −1 0 −1 0; 0 1 0 1|.
16. |2 −1 3 1; 1 4 −2 3; 0 2 −1 0; 1 3 −2 4|.
17. |3 5 2 6; 2 3 5 −5; 7 5 −3 −16; 9 −6 27 −12|.
18. |2 −7 4 3; 5 5 −3 7; 6 2 6 3; 4 2 −4 5|.
19. |2 0 −1 3 0; 0 3 0 1 2; 0 1 3 0 4; 1 0 1 −1 0; 3 0 2 0 5|.

20. If A = [0 x y z; −x 0 1 −1; −y −1 0 1; −z 1 −1 0], show that det(A) = (x + y + z)^2.
21. (a) Consider the 3 × 3 Vandermonde determinant V(r1, r2, r3) defined by
V(r1, r2, r3) = |1 1 1; r1 r2 r3; r1^2 r2^2 r3^2|.
Show that V(r1, r2, r3) = (r2 − r1)(r3 − r1)(r3 − r2).
(b) More generally, show that the n × n Vandermonde determinant
V(r1, r2, . . . , rn) = |1 1 ... 1; r1 r2 ... rn; r1^2 r2^2 ... rn^2; . . . ; r1^(n−1) r2^(n−1) ... rn^(n−1)|
has value V(r1, r2, . . . , rn) = ∏_{1≤i<m≤n} (rm − ri).

For Problems 22–31, find (a) det(A), (b) the matrix of cofactors MC, (c) adj(A), and, if possible, (d) A^{−1}.
22. A = [3 1; 4 5].
23. A = [−1 −2; 4 1].
24. A = [5 2; −15 −6].
25. A = [2 −3 0; 2 1 5; 0 −1 2].
26. A = [−2 3 −1; 2 1 5; 0 2 3].
27. A = [1 −1 2; 3 −1 4; 5 1 7].
28. A = [0 1 2; −1 −1 3; 1 −2 1].
29. A = [2 −3 5; 1 2 1; 0 7 −1].
30. A = [1 1 1 1; −1 1 −1 1; 1 1 −1 −1; −1 1 1 −1].
31. A = [1 0 3 5; −2 1 1 3; 3 9 0 2; 2 0 3 −1].
32. Let A = [1 −2x 2x^2; 2x 1−2x^2 −2x; 2x^2 2x 1].
(a) Show that det(A) = (1 + 2x^2)^3.
(b) Use the adjoint method to find A^{−1}.

In Problems 33–35, find the specified element in the inverse of the given matrix. Do not use elementary row operations.
33. A = [1 1 1; 1 2 2; 1 2 3]; (3, 2)-element.
34. A = [2 0 −1; 2 1 1; 3 −1 0]; (3, 1)-element.
35. A = [1 0 1 0; 2 −1 1 3; 0 1 −1 2; −1 1 2 0]; (2, 3)-element.

In Problems 36–38, find A^{−1}.
36. A = [3e^t  e^(2t); 2e^t  2e^(2t)].
37. A = [e^t sin 2t  −e^(−t) cos 2t; e^t cos 2t  e^(−t) sin 2t].
38. A = [e^t  te^t  e^(−2t); e^t  2te^t  e^(−2t); e^t  te^t  2e^(−2t)].
39. If A = [1 2 3; 3 4 5; 4 5 6], compute the matrix product A · adj(A). What can you conclude about det(A)?

For Problems 40–43, use Cramer's rule to solve the given linear system.
40. 2x1 − 3x2 = 2, x1 + 2x2 = 4.
41. 3x1 − 2x2 + x3 = 4, x1 + x2 − x3 = 2, x1 + x3 = 1.
42. x1 − 3x2 + x3 = 0, x1 + 4x2 − x3 = 0, 2x1 + x2 − 3x3 = 0.
43. x1 − 2x2 + 3x3 − x4 = 1, 2x1 + x3 = 2, x1 + x2 − x4 = 0, x2 − 2x3 + x4 = 3.
44. Use Cramer's rule to determine x1 and x2 if
e^t x1 + e^(−2t) x2 = 3 sin t,
e^t x1 − 2e^(−2t) x2 = 4 cos t.
45. Determine the value of x2 such that
x1 + 4x2 − 2x3 + x4
2x1 + 9x2 − 3x3 − x4
x1 + 5x2 + x3 − x4
3x1 + 14x2 + 7x3 − x4
46. Find all solutions to the system
(b + c)x1 + a(x2 + x3) = a,
(c + a)x1 + b(x3 + x1) = b,
(a + b)x1 + c(x1 + x2) = c,
where a, b, c are constants. Make sure you consider all cases (that is, those when there is a unique solution, an infinite number of solutions, and no solutions).
47. Prove Equation (3.3.3).
48. Let A be a randomly generated invertible 4 × 4 matrix. Verify the Cofactor Expansion Theorem for expansion along row 1.
49. Let A be a randomly generated 4 × 4 matrix. Verify Equation (3.3.3) when i = 2 and j = 4.
50. Let A be a randomly generated 5 × 5 matrix. Determine adj(A) and compute A · adj(A). Use your result to determine det(A).
51. Solve the system of equations
1.21x1 + 3.42x2 + 2.15x3 = 3.25,
5.41x1 + 2.32x2 + 7.15x3 = 4.61,
21.63x1 + 3.51x2 + 9.22x3 = 9.93.
Round answers to two decimal places.
52. Solve the system of equations

3.4
7x3 +
−
−
− Solve the system of equations x4
2x4
x4
2x4 =
=
=
= 2,
5,
3,
6. Use Cramer’s rule to solve the system Ax = b if 12344
68
2 1 2 3 4 −72 A = 3 2 1 2 3,
and
b = −87 . 4 3 2 1 2 79 44321
43 53. Verify that BA = In in the proof of Theorem 3.3.16. Summary of Determinants
The primary aim of this section is to serve as a stand-alone introduction to determinants
for readers who desire only a cursory review of the major facts pertaining to determinants.
It may also be used as a review of the results derived in Sections 3.1–3.3.

Formulas for the Determinant
The determinant of an n × n matrix A, denoted det(A), is a scalar whose value can be
obtained in the following manner.
1. If A = [a11], then det(A) = a11.
2. If A = [a11 a12; a21 a22], then det(A) = a11 a22 − a12 a21.
3. For n > 2, the determinant of A can be computed using either of the following formulas:
det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin,   (3.4.1)
det(A) = a1j C1j + a2j C2j + · · · + anj Cnj,   (3.4.2)
where Cij = (−1)^(i+j) Mij, and Mij is the determinant of the matrix obtained by deleting
the i th row and j th column of A. The formulas (3.4.1) and (3.4.2) are referred to as
cofactor expansion along the i th row and cofactor expansion along the j th column,
respectively. The determinants Mij and Cij are called the minors and cofactors of A,
respectively. We also denote det(A) by

| a11 a12 . . . a1n |
| a21 a22 . . . a2n |
|  .    .         . |
|  .    .         . |
| an1 an2 . . . ann |

As an example, consider the general 3 × 3 matrix
A = [a11 a12 a13; a21 a22 a23; a31 a32 a33].
Using cofactor expansion along row 1, we have
det (A) = a11 C11 + a12 C12 + a13 C13 . (3.4.3) We next compute the required cofactors:
C11 = +M11 = |a22 a23; a32 a33| = a22 a33 − a23 a32,
C12 = −M12 = −|a21 a23; a31 a33| = −(a21 a33 − a23 a31),
C13 = +M13 = |a21 a22; a31 a32| = a21 a32 − a22 a31.
Inserting these expressions for the cofactors into Equation (3.4.3) yields
det (A) = a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a22 a31 ),
which can be written as
det (A) = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a11 a23 a32 − a12 a21 a33 − a13 a22 a31 .
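The cofactor expansion just carried out is easy to mechanize. The following sketch (illustrative code of our own, not part of the text) computes a determinant by expansion along row 1 and checks it against the six-term formula:

```python
# Sketch: determinant by cofactor expansion along row 1 (formula (3.4.1)),
# checked against the six-term formula for the 3 x 3 case.

def det(A):
    """Determinant of a square matrix (list of lists) via cofactor expansion."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Minor M_{1j}: delete row 1 and column j.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        cofactor = (-1) ** j * det(minor)  # C_{1j} = (-1)^{1+j} M_{1j}
        total += A[0][j] * cofactor
    return total

def det3_six_terms(a):
    """The expanded six-term formula for a 3 x 3 matrix [aij]."""
    return (a[0][0]*a[1][1]*a[2][2] + a[0][1]*a[1][2]*a[2][0] + a[0][2]*a[1][0]*a[2][1]
            - a[0][0]*a[1][2]*a[2][1] - a[0][1]*a[1][0]*a[2][2] - a[0][2]*a[1][1]*a[2][0])

A = [[2, -1, 1], [3, 4, 2], [7, 5, 8]]
print(det(A), det3_six_terms(A))  # both give 41
```

Both functions agree on every square matrix of size 3, as (3.4.1) guarantees.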
a11 a12 a13 | a11 a12
a21 a22 a23 | a21 a22
a31 a32 a33 | a31 a32

Figure 3.4.1: A schematic for obtaining the determinant of a 3 × 3 matrix A = [aij]: the first two columns are rewritten to the right of the matrix, and signed products are formed along the six diagonals.

Although we chose to use cofactor expansion along the first row to obtain the preceding
formula, according to (3.4.1) and (3.4.2), the same result would have been obtained if we
had chosen to expand along any row or column of A. A simple schematic for obtaining
the terms in the determinant of a 3 × 3 matrix is given in Figure 3.4.1. By taking the
product of the elements joined by each arrow and attaching the indicated sign to the
result, we obtain the six terms in the determinant of the 3 × 3 matrix A = [aij ]. Note
that this technique for obtaining the terms in a 3 × 3 determinant does not generalize to
determinants of larger matrices.

Example 3.4.1  Evaluate
|2 −1 1; 3 4 2; 7 5 8|.

Solution: In this case, the schematic given in Figure 3.4.1 is
2 −1 1 | 2 −1
3  4 2 | 3  4
7  5 8 | 7  5
so that
|2 −1 1; 3 4 2; 7 5 8| = (2)(4)(8) + (−1)(2)(7) + (1)(3)(5) − (7)(4)(1) − (5)(2)(2) − (8)(3)(−1) = 41.

Properties of Determinants
Let A and B be n × n matrices. The determinant has the following properties:
P1. If B is obtained by permuting two rows (or columns) of A, then
det (B) = − det (A).
P2. If B is obtained by multiplying any row (or column) of A by a scalar k , then
det (B) = k det (A).
P3. If B is obtained by adding a multiple of any row (or column) of A to another row
(or column) of A, then
det (B) = det (A).
P4. det(AT ) = det(A).
P5. Let a1 , a2 , . . . , an denote the row vectors of A. If the i th row vector of A is the
sum of two row vectors, say ai = bi + ci , then
det (A) = det (B) + det (C),
where
B = [a1 , a2 , . . . , ai −1 , bi , ai +1 , . . . , an ]T
and
C = [a1 , a2 , . . . , ai −1 , ci , ai +1 , . . . , an ]T .
The corresponding property for columns is also true.
P6. If A has a row (or column) of zeros, then det(A) = 0.
P7. If two rows (or columns) of A are the same, then det(A) = 0.
P8. det(AB) = det(A) det(B).

The first three properties tell us how elementary row operations and elementary
column operations performed on a matrix A alter the value of det(A). They can be very
helpful in reducing the amount of work required to evaluate a determinant, since we can
use elementary row operations to put several zeros in a row or column of A and then use
cofactor expansion along that row or column. We illustrate with an example.
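The properties P1–P8 also lend themselves to quick numerical spot-checks. A sketch (illustrative code using randomly generated matrices, not from the text):

```python
# Sketch: spot-checking determinant properties P1-P4 and P8 with NumPy.
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (4, 4)).astype(float)
B = rng.integers(-5, 5, (4, 4)).astype(float)

# P1: permuting two rows flips the sign of the determinant.
P = A[[1, 0, 2, 3], :]
assert np.isclose(np.linalg.det(P), -np.linalg.det(A))

# P2: multiplying one row by k multiplies the determinant by k.
M = A.copy(); M[2] *= 3.0
assert np.isclose(np.linalg.det(M), 3.0 * np.linalg.det(A))

# P3: adding a multiple of one row to another leaves the determinant unchanged.
N = A.copy(); N[0] += 2.0 * N[1]
assert np.isclose(np.linalg.det(N), np.linalg.det(A))

# P4 and P8.
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))
print("P1, P2, P3, P4, P8 verified for one random example")
```

Such checks do not prove the properties, of course, but they are a useful sanity test when working by hand.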
Example 3.4.2  Evaluate
|2 1 3 2; −1 1 −2 2; 5 1 −2 1; −2 3 1 1|.

Solution: Before performing a cofactor expansion, we first use elementary row operations to simplify the determinant:
|2 1 3 2; −1 1 −2 2; 5 1 −2 1; −2 3 1 1| ∼(1) |0 3 −1 6; −1 1 −2 2; 0 6 −12 11; 0 1 5 −3|.
According to P3, the determinants of the two matrices above are the same. To evaluate the determinant of the matrix on the right, we use cofactor expansion along the first column:
|0 3 −1 6; −1 1 −2 2; 0 6 −12 11; 0 1 5 −3| = −(−1) |3 −1 6; 6 −12 11; 1 5 −3|.
To evaluate the determinant of the 3 × 3 matrix on the right, we can use the schematic given in Figure 3.4.1, or we can continue to use elementary row operations to introduce zeros into the matrix:
|3 −1 6; 6 −12 11; 1 5 −3| ∼(2) |0 −16 15; 0 −42 29; 1 5 −3| = |−16 15; −42 29| = 166.
Here, we have reduced the 3 × 3 determinant to a 2 × 2 determinant by using cofactor expansion along the first column of the 3 × 3 matrix.
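The computation in Example 3.4.2 can be checked numerically; a sketch (illustrative, using NumPy):

```python
# Sketch: checking Example 3.4.2 numerically.
import numpy as np

A = np.array([[ 2, 1,  3, 2],
              [-1, 1, -2, 2],
              [ 5, 1, -2, 1],
              [-2, 3,  1, 1]], dtype=float)
print(round(np.linalg.det(A)))  # 166

# The first elimination step A21(2), A23(5), A24(-2) leaves det unchanged (P3).
B = A.copy()
B[0] += 2 * A[1]
B[2] += 5 * A[1]
B[3] += -2 * A[1]
assert np.isclose(np.linalg.det(B), np.linalg.det(A))
```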
1. A21(2), A23(5), A24(−2)   2. A31(−3), A32(−6)

Basic Theoretical Results
The determinant is a useful theoretical tool in linear algebra. We list next the major
results that will be needed in the remainder of the text.
1. The volume of the parallelepiped determined by the vectors
a = a1 i + a2 j + a3 k,  b = b1 i + b2 j + b3 k,  c = c1 i + c2 j + c3 k
is Volume = |det(A)|, where A = [a1 a2 a3; b1 b2 b3; c1 c2 c3].
2. An n × n matrix A is invertible if and only if det(A) ≠ 0.
3. An n × n linear system Ax = b has a unique solution if and only if det(A) ≠ 0.
4. An n × n homogeneous linear system Ax = 0 has an infinite number of solutions if and only if det(A) = 0.
We see, for example, that according to (2), the matrices in Examples 3.4.1 and 3.4.2 are
both invertible.
If A is an n × n matrix with det(A) ≠ 0, then the following two methods can be
derived for obtaining the inverse of A and for ﬁnding the unique solution to the linear
system Ax = b, respectively.
1. Adjoint Method for A−1: If A is invertible, then
A−1 = (1/det(A)) adj(A),
where adj(A) denotes the transpose of the matrix obtained by replacing each element in A by its cofactor.
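As an illustration, the adjoint method can be sketched in code (the helper `adjoint` is our own, not from the text):

```python
# Sketch: the adjoint method for the inverse, A^{-1} = adj(A) / det(A).
import numpy as np

def adjoint(A):
    """Transpose of the matrix of cofactors of A."""
    n = A.shape[0]
    C = np.empty_like(A, dtype=float)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[2, -1, 1], [3, 4, 2], [7, 5, 8]], dtype=float)
A_inv = adjoint(A) / np.linalg.det(A)
assert np.allclose(A_inv, np.linalg.inv(A))
```

Note that for large matrices this is far more expensive than row reduction; it is valuable mainly as a theoretical formula and for small hand computations.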
2. Cramer's Rule: If det(A) ≠ 0, then the unique solution to Ax = b is x = (x1, x2, . . . , xn), where
xk = det(Bk)/det(A),   k = 1, 2, . . . , n,
and Bk denotes the matrix obtained when the kth column vector of A is replaced by b.

Example 3.4.3  Use the adjoint method to find A−1 if A = [2 −1 1; 3 4 2; 7 5 8].

Solution: We have already shown in Example 3.4.1 that det(A) = 41, so that A is
invertible. Replacing each element in A with its cofactor yields the matrix of cofactors
MC = [22 −10 −13; 13 9 −17; −6 −1 11],
so that
adj(A) = MC^T = [22 13 −6; −10 9 −1; −13 −17 11].
Consequently,
A−1 = (1/det(A)) adj(A) = [22/41 13/41 −6/41; −10/41 9/41 −1/41; −13/41 −17/41 11/41].

Example 3.4.4  Use Cramer's rule to solve the linear system
2x1 − x2 + x3 = 2,
3x1 + 4x2 + 2x3 = 5,
7x1 + 5x2 + 8x3 = 3.

Solution: The matrix of coefficients is A = [2 −1 1; 3 4 2; 7 5 8]. We have already shown in Example 3.4.1 that det(A) = 41. Consequently, Cramer's
rule can indeed be applied. In this problem, we have
det(B1) = |2 −1 1; 5 4 2; 3 5 8| = 91,
det(B2) = |2 2 1; 3 5 2; 7 3 8| = 22,
det(B3) = |2 −1 2; 3 4 5; 7 5 3| = −78.
It therefore follows from Cramer's rule that
x1 = det(B1)/det(A) = 91/41,   x2 = det(B2)/det(A) = 22/41,   x3 = det(B3)/det(A) = −78/41.

Exercises for 3.4

Skills
• Be able to compute the determinant of an n × n matrix.
• Know the effects that elementary row operations and
elementary column operations have on the determinant of a matrix.
• Be able to use the determinant to decide if a matrix is
invertible.
• Know how the determinant is affected by matrix multiplication and by matrix transpose.
• Be able to compute the adjoint of a matrix and use it
to find A−1 for an invertible matrix A.

Problems
For Problems 1–7, evaluate the given determinant.
1. |5 −1; 3 7|.

2. |3 5 7; −1 2 4; 6 3 −2|.

3. |5 1 4; 6 1 3; 14 2 7|.

4. |2.3 1.5 7.9; 4.2 3.3 5.1; 6.8 3.6 5.7|.

5. |a b c; b c a; c a b|.

6. |3 5 −1 2; 2 1 5 2; 3 2 5 7; 1 −1 2 1|.

7. |7 1 2 3; 2 −2 4 6; 3 −1 5 4; 18 9 27 54|.

For Problems 8–12, find det(A). If A is invertible, use the adjoint method to find A−1.

8. A = [cos t  sin t; sin t  −cos t].

9. A = [1 2 3; 2 3 1; 3 1 2].

10. A = [3 4 7; 2 6 1; 3 1 4].

11. A = [2 5 7; 4 −3 2; 6 9 11].

12. A = [5 −1 2 1; 3 −1 4 5; 1 −1 2 1; 5 9 −3 2].

For Problems 13–17, use Cramer's rule to determine the unique solution to the system Ax = b for the given matrix and vector.

13. A = [3 5; 6 2], b = [4; 9].

14. A = [3 5; 2 7], b = [e^(−t); 3e^(−t)].

15. A = [4 1 3; 2 −1 5; 2 3 1], b = [5; 7; 2].

16. A = [5 3 6; 2 4 −7; 2 5 9], b = [3; −1; 4].

17. A = [3.1 3.5 7.1; 2.2 5.2 6.3; 1.4 8.1 0.9], b = [3.6; 2.5; 9.3].

18. If A is an invertible n × n matrix, prove that det(A−1) = 1/det(A).

19. Let A and B be 3 × 3 matrices with det(A) = 3 and det(B) = −4. Determine
det(2A), det(A−1), det(B^5), det(A^T B), det(B−1 AB).

3.5 Chapter Review
This chapter has laid out a basic introduction to the theory of determinants.

Determinants and Elementary Row Operations
For a square matrix A, one approach for computing the determinant of A, det(A), is
to use elementary row operations to reduce A to row-echelon form. The effects of the
various types of elementary row operations on det(A) are as follows:
• Pij : permuting two rows of A alters the determinant by a factor of −1.
• Mi (k): multiplying the i th row of A by k multiplies the determinant of the matrix
by a factor of k .
• Aij (k): adding a multiple of one row of A to another has no effect whatsoever on
det(A).
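The three effects above, combined with the triangular-matrix fact stated next (Theorem 3.5.1), give a complete algorithm for det(A); a sketch (illustrative code, the function name is our own):

```python
# Sketch: computing det(A) by reducing to upper triangular form while
# tracking the effects of the elementary row operations.
import numpy as np

def det_by_row_reduction(A):
    A = A.astype(float).copy()
    n = A.shape[0]
    sign = 1.0
    for k in range(n):
        # P_ij: swap in a nonzero pivot if necessary (factor of -1).
        # (An exact-zero test suffices for this illustration.)
        p = next((r for r in range(k, n) if A[r, k] != 0), None)
        if p is None:
            return 0.0  # a zero pivot column forces det(A) = 0
        if p != k:
            A[[k, p]] = A[[p, k]]
            sign = -sign
        # A_ij(c): adding a multiple of the pivot row changes nothing.
        for r in range(k + 1, n):
            A[r] -= (A[r, k] / A[k, k]) * A[k]
    # Theorem 3.5.1: det of a triangular matrix is the product of its diagonal.
    return sign * np.prod(np.diag(A))

A = np.array([[2, -1, 1], [3, 4, 2], [7, 5, 8]])
print(det_by_row_reduction(A))  # approximately 41
```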
A crucial fact in this approach is the following:
Theorem 3.5.1 If A is an n × n upper (or lower) triangular matrix, its determinant is
det (A) = a11 a22 · · · ann .
Therefore, since the row-echelon form of A is upper triangular, we can compute
det(A) by using Theorem 3.5.1 and by keeping track of the elementary row operations
involved in the row-reduction process.

Cofactor Expansion
Another way to compute det(A) is via the Cofactor Expansion Theorem: For n ≥ 2, the
determinant of A can be computed using either of the following formulas:
det(A) = ai1 Ci1 + ai2 Ci2 + · · · + ain Cin,   (3.5.1)
det(A) = a1j C1j + a2j C2j + · · · + anj Cnj,   (3.5.2)
where Cij = (−1)^(i+j) Mij, and Mij is the determinant of the matrix obtained by deleting
the i th row and j th column of A. The formulas (3.5.1) and (3.5.2) are referred to as
cofactor expansion along the i th row and cofactor expansion along the j th column,
respectively. The determinants Mij and Cij are called the minors and cofactors of A,
respectively.

Adjoint Method and Cramer's Rule

If A is an n × n matrix with det(A) ≠ 0, then the following two methods can be derived for obtaining the inverse of A and for finding the unique solution to the linear system Ax = b, respectively.

1. Adjoint Method for A−1: If A is invertible, then
A−1 = (1/det(A)) adj(A),
where adj(A) denotes the transpose of the matrix obtained by replacing each element in A by its cofactor.
2. Cramer's Rule: If det(A) ≠ 0, then the unique solution to Ax = b is x = (x1, x2, . . . , xn), where
xk = det(Bk)/det(A),   k = 1, 2, . . . , n,
and Bk denotes the matrix obtained when the kth column vector of A is replaced by b.

Additional Problems
For Problems 1–6, evaluate the determinant of the given matrix A by using (a) the deﬁnition, (b) elementary row operations to reduce A to an upper triangular matrix, and (c) the
Cofactor Expansion Theorem.
1. A = [−7 −2; 1 −5].

2. A = [6 6; −2 1].

3. A = [−1 4 1; 0 2 2; 2 2 −3].

4. A = [2 3 −5; −4 0 2; 6 −3 3].

5. A = [3 −1 −2 1; 0 0 1 4; 0 2 1 −1; 0 0 0 −4].

6. A = [0 0 0 −2; 0 0 −5 1; 0 1 −4 1; −3 −3 −3 −3].

For Problems 7–10, suppose that
A = [a b c; d e f; g h i], and det(A) = 4.
Compute the determinant of each matrix below.

7. |g h i; −4a −4b −4c; 2d 2e 2f|.

8. |a − 5d  b − 5e  c − 5f; 3g 3h 3i; −d + 3g  −e + 3h  −f + 3i|.

9. |3b 3e 3h; c − 2a  f − 2d  i − 2g; −a −d −g|.

10. 3 |a − d  b − e  c − f; 2g 2h 2i; −d −e −f|.

For Problems 11–14, suppose that A and B are 4 × 4 invertible matrices. If det(A) = −2 and det(B) = 3, compute each determinant below.

11. det(AB).

12. det(B^2 A−1).

13. det(((A−1 B)^T)(2B−1)).

14. det((−A)^3 (2B^2)).

15. Let
A = [2 1 1; 2 1 4],  B = [0 5; 5 −2; 4 7],  C = [1 2 −1; 3 −1 4; 2 −2 6].
Determine, if possible,
det(A), det(B), det(C), det(AB), det(BA), det(BAC), det(ACB), det(B^T A^T), det(C^T).

16. Let
A = [1 2; 3 4] and B = [5 4; 1 1].
Use the adjoint method to find B−1 and then determine (A−1 B^T)−1.

For Problems 17–21, use the adjoint method to determine A−1 for the given matrix A.

17. A = [2 −1 1; 0 5 −1; 1 1 3].

18. A = [0 −3 2 2; 0 1 1 1; 1 2 3 −4; 1 0 0 5].

19. A = [0 0 0 1; 0 1 3 −3; −2 −3 −5 2; 4 −4 4 6].

20. A = [5 8 16; 4 1 8; −4 −4 −11].

21. A = [2 6 6; 2 7 6; 2 7 7].

22. Add one row to the matrix
A = [4 −1 0; 5 1 4]
so as to create a 3 × 3 matrix B with det(B) = 10.

23. True or False: Given any real number r and any 3 × 3 matrix A whose entries are all nonzero, it is always possible to change at most one entry of A to get a matrix B with det(B) = r.

24. Let A = [1 2 4; 3 1 6; k 3 2].
(a) Find all value(s) of k for which the matrix A fails to be invertible.
(b) In terms of k, determine the volume of the parallelepiped determined by the row vectors of the matrix A. Is that the same as the volume of the parallelepiped determined by the column vectors of the matrix A? Explain how you know this without any calculation.

25. Repeat the preceding problem for the matrix A = [k + 1  2  1; 0 3 k; 1 1 1].

26. Repeat the preceding problem for the matrix A = [2  k − 3  k^2; 2 1 4; 1 k 0].

27. Let A and B be n × n matrices such that AB = −BA. Use determinants to prove that if n is odd, then A and B cannot both be invertible.

28. A real n × n matrix A is called orthogonal if AA^T = A^T A = In. If A is an orthogonal matrix, prove that det(A) = ±1.

For Problems 29–31, use Cramer's rule to solve the given linear system.

29. −3x1 + x2 = 3,
    x1 + 2x2 = 1.

30. 2x1 − x2 + x3 = 2,
    4x1 + 5x2 + 3x3 = 0,
    4x1 − 3x2 + 3x3 = 2.

31. 3x1 + x2 + 2x3 = −1,
    2x1 − x2 + x3 = −1,
    5x2 + 5x3 = −5.

Project: Volume of a Tetrahedron
In this project, we use determinants and vectors to derive the formula for the volume
of a tetrahedron with vertices A = (x1 , y1 , z1 ), B = (x2 , y2 , z2 ), C = (x3 , y3 , z3 ), and
D = (x4 , y4 , z4 ).
Let h denote the distance from A to the plane determined by B, C , and D . From
geometry, the volume of the tetrahedron is given by
Volume = (1/3) h (area of triangle BCD).   (3.5.3)

(a) Express the area of triangle BCD in terms of a cross product of vectors.
(b) Use trigonometry to express h in terms of the distance from A to B and the angle between the vector AB and the segment connecting A to the base BCD at a right angle.
(c) Combining (a) and (b) with the volume of the tetrahedron given above, express the
volume of the tetrahedron in terms of dot products and cross products of vectors.
(d) Following the proof of part 2 of Theorem 3.1.11, express the volume of the tetrahedron in terms of a determinant with entries in terms of the xi , yi , and zi for
1 ≤ i ≤ 4.
(e) Show that the expression in part (d) is the same as
Volume = (1/6) | det [x1 y1 z1 1; x2 y2 z2 1; x3 y3 z3 1; x4 y4 z4 1] |.   (3.5.4)

(f) For each set of four points below, determine the volume of the tetrahedron with
those points as vertices by using (3.5.3) and by using (3.5.4). Both formulas should
yield the same answer.
(i) (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1).
(ii) (−1, 1, 2), (0, 3, 3), (1, −1, 2), (0, 0, 1).
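Formula (3.5.4) from part (e) can be evaluated directly; a sketch (illustrative code) for the two vertex sets in part (f):

```python
# Sketch: evaluating formula (3.5.4) for the vertex sets in part (f).
import numpy as np

def tet_volume(P):
    """Volume of the tetrahedron with vertices P (4 points in R^3), via (3.5.4)."""
    M = np.hstack([np.asarray(P, dtype=float), np.ones((4, 1))])
    return abs(np.linalg.det(M)) / 6.0

print(tet_volume([(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]))    # approximately 1/6
print(tet_volume([(-1, 1, 2), (0, 3, 3), (1, -1, 2), (0, 0, 1)]))  # approximately 1
```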
CHAPTER 4

Vector Spaces

To criticize mathematics for its abstraction is to miss the point entirely. Abstraction is what makes mathematics work. — Ian Stewart

The main aim of this text is to study linear mathematics. In Chapter 2 we studied
systems of linear equations, and the theory underlying the solution of a system of linear
equations can be considered as a special case of a general mathematical framework for
linear problems. To illustrate this framework, we discuss an example.
Consider the homogeneous linear system Ax = 0, where
A = [1 −1 2; 2 −2 4; 3 −3 6].
It is straightforward to show that this system has solution set
S = {(r − 2s, r, s) : r, s ∈ R}.
Geometrically we can interpret each solution as deﬁning the coordinates of a point in
space or, equivalently, as the geometric vector with components
v = (r − 2s, r, s).
Using the standard operations of vector addition and multiplication of a vector by a real
number, it follows that v can be written in the form
v = r(1, 1, 0) + s(−2, 0, 1).
We see that every solution to the given linear problem can be expressed as a linear
combination of the two basic solutions (see Figure 4.0.1):
v1 = (1, 1, 0) and v2 = (−2, 0, 1).
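This claim is easy to verify numerically; a sketch (illustrative code):

```python
# Sketch: checking that every v = r*v1 + s*v2 solves Ax = 0.
import numpy as np

A = np.array([[1, -1, 2], [2, -2, 4], [3, -3, 6]], dtype=float)
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([-2.0, 0.0, 1.0])

for r, s in [(1.0, 0.0), (0.0, 1.0), (2.5, -3.0)]:
    v = r * v1 + s * v2  # componentwise, v = (r - 2s, r, s)
    assert np.allclose(A @ v, 0)
print("A @ (r*v1 + s*v2) = 0 for each sampled (r, s)")
```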
[Figure 4.0.1: Two basic solutions to Ax = 0 and an example of an arbitrary solution rv1 + sv2 to the system.]

We will observe a similar phenomenon in Chapter 6, when we establish that every
solution to the homogeneous second-order linear differential equation
y'' + a1 y' + a2 y = 0
can be written in the form
y(x) = c1 y1 (x) + c2 y2 (x),
where y1 (x) and y2 (x) are two nonproportional solutions to the differential equation on
the interval of interest.
In each of these problems, we have a set of “vectors” V (in the ﬁrst problem the
vectors are ordered triples of numbers, whereas in the second, they are functions that
are at least twice differentiable on an interval I ) and a linear vector equation. Further, in
both cases, all solutions to the given equation can be expressed as a linear combination
of two particular solutions.
In the next two chapters we develop this way of formulating linear problems in terms
of an abstract set of vectors, V , and a linear vector equation with solutions in V . We will
ﬁnd that many problems ﬁt into this framework and that the solutions to these problems
can be expressed as linear combinations of a certain number (not necessarily two) of basic
solutions. The importance of this result cannot be overemphasized. It reduces the search
for all solutions to a given problem to that of ﬁnding a ﬁnite number of solutions. As
speciﬁc applications, we will derive the theory underlying linear differential equations
and linear systems of differential equations as special cases of the general framework.
Before proceeding further, we give a word of encouragement to the more application-oriented reader. It will probably seem at times that the ideas we are introducing are rather
esoteric and that the formalism is pure mathematical abstraction. However, in addition
to its inherent mathematical beauty, the formalism incorporates ideas that pervade many
areas of applied mathematics, particularly engineering mathematics and mathematical
physics, where the problems under investigation are very often linear in nature. Indeed,
the linear algebra introduced in the next two chapters should be considered an extremely
important addition to one’s mathematical repertoire, certainly on a par with the ideas of
elementary calculus. 4.1 Vectors in Rn
In this section, we use some familiar ideas about geometric vectors to motivate the more
general and abstract idea of a vector space, which will be introduced in the next section. We begin by recalling that a geometric vector can be considered mathematically
as a directed line segment (or arrow) that has both a magnitude (length) and a direction
attached to it. In calculus courses, we deﬁne vector addition according to the parallelogram law (see Figure 4.1.1); namely, the sum of the vectors x and y is the diagonal of i i i i i i i “main”
2007/2/16
page 236
i 236 CHAPTER 4 Vector Spaces the parallelogram formed by x and y. We denote the sum by x + y. It can then be shown
geometrically that for all vectors x, y, z,
x+y = y+x
y (4.1.1) x + (y + z) = (x + y) + z. x (4.1.2) and y
x These are the statements that the vector addition operation is commutative and associaFigure 4.1.1: Parallelogram law tive. The zero vector, denoted 0, is deﬁned as the vector satisfying
of vector addition. x + 0 = x, (4.1.3) for all vectors x. We consider the zero vector as having zero magnitude and arbitrary
direction. Geometrically, we picture the zero vector as corresponding to a point in space.
Let −x denote the vector that has the same magnitude as x, but the opposite direction.
Then according to the parallelogram law of addition,
x + (−x) = 0. kx, k 0 x
kx, k Figure 4.1.2: Scalar
multiplication of x by k . 0 (4.1.4) The vector −x is called the additive inverse of x. Properties (4.1.1)–(4.1.4) are the
fundamental properties of vector addition.
The basic algebra of vectors is completed when we also deﬁne the operation of
multiplication of a vector by a real number. Geometrically, if x is a vector and k is
a real number, then k x is deﬁned to be the vector whose magnitude is |k | times the
magnitude of x and whose direction is the same as x if k > 0, and opposite to x if
k < 0. (See Figure 4.1.2.) If k = 0, then k x = 0. This scalar multiplication operation
has several important properties that we now list. Once more, each of these can be
established geometrically using only the foregoing deﬁnitions of vector addition and
scalar multiplication.
For all vectors x and y, and all real numbers r, s and t ,
1x = x,   (4.1.5)
(st)x = s(tx),   (4.1.6)
r(x + y) = rx + ry,   (4.1.7)
(s + t)x = sx + tx.   (4.1.8)
It is important to realize that, in the foregoing development, we have not defined a
“multiplication of vectors.” In Chapter 3 we discussed the idea of a dot product and cross
product of two vectors in space (see Equations (3.1.4) and (3.1.5)), but for the purposes
of discussing abstract vector spaces we will essentially ignore the dot product and cross
product. We will revisit the dot product in Section 4.11, when we develop inner product
spaces.
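Since vectors in Rn are just arrays of numbers, properties (4.1.1)–(4.1.8) can be machine-checked for sample vectors; a sketch (illustrative code):

```python
# Sketch: verifying properties (4.1.1)-(4.1.8) for random vectors in R^5.
import numpy as np

rng = np.random.default_rng(1)
x, y, z = rng.standard_normal((3, 5))
r, s, t = 2.0, -1.5, 0.25

assert np.allclose(x + y, y + x)                # (4.1.1) commutativity
assert np.allclose(x + (y + z), (x + y) + z)    # (4.1.2) associativity
assert np.allclose(x + np.zeros(5), x)          # (4.1.3) zero vector
assert np.allclose(x + (-x), np.zeros(5))       # (4.1.4) additive inverse
assert np.allclose(1.0 * x, x)                  # (4.1.5)
assert np.allclose((s * t) * x, s * (t * x))    # (4.1.6)
assert np.allclose(r * (x + y), r * x + r * y)  # (4.1.7)
assert np.allclose((s + t) * x, s * x + t * x)  # (4.1.8)
print("(4.1.1)-(4.1.8) hold for the sampled vectors")
```

A check on sampled vectors is not a proof, but it mirrors the formal verification requested in the exercises below.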
We will see in the next section how the concept of a vector space arises as a direct
generalization of the ideas associated with geometric vectors. Before performing this
abstraction, we want to recall some further features of geometric vectors and give one
speciﬁc and important extension.
We begin by considering vectors in the plane. Recall that R2 denotes the set of all
ordered pairs of real numbers; thus,
R2 = {(x, y) : x ∈ R, y ∈ R}.
The elements of this set are called vectors in R2 , and we use the usual vector notation
to denote these elements. Geometrically we identify the vector v = (x, y) in R2 with
[Figure 4.1.3: Identifying vectors in R2 with geometric vectors in the plane.]
the geometric vector v directed from the origin of a Cartesian coordinate system to
the point with coordinates (x, y). This identiﬁcation is illustrated in Figure 4.1.3. The
numbers x and y are called the components of the geometric vector v. The geometric
vector addition and scalar multiplication operations are consistent with the addition and
scalar multiplication operations deﬁned in Chapter 2 via the correspondence with row
(or column) vectors for R2 :
If v = (x1 , y1 ) and w = (x2 , y2 ), and k is an arbitrary real number, then
v + w = (x1, y1) + (x2, y2) = (x1 + x2, y1 + y2),   (4.1.9)
kv = k(x1, y1) = (kx1, ky1).   (4.1.10)
These are the algebraic statements of the parallelogram law of vector addition and
the scalar multiplication law, respectively. (See Figure 4.1.4.) Using the parallelogram
law of vector addition and Equations (4.1.9) and (4.1.10), it follows that any vector
v = (x, y) can be written as
v = x i + y j = x(1, 0) + y(0, 1),
where i = (1, 0) and j = (0, 1) are the unit vectors pointing along the positive x - and
y-coordinate axes, respectively.
[Figure 4.1.4: Vector addition and scalar multiplication in R2.]
The properties (4.1.1)–(4.1.8) are now easily verified for vectors in R2. In particular,
the zero vector in R2 is the vector
0 = (0, 0).
Furthermore, Equation (4.1.9) implies that
(x, y) + (−x, −y) = (0, 0) = 0,
so that the additive inverse of the general vector v = (x, y) is −v = (−x, −y).
It is straightforward to extend these ideas to vectors in 3-space. We recall that
R3 = {(x, y, z) : x ∈ R, y ∈ R, z ∈ R}.
As illustrated in Figure 4.1.5, each vector v = (x, y, z) in R3 can be identiﬁed with the
geometric vector v that joins the origin of a Cartesian coordinate system to the point
with coordinates (x, y, z). We call x, y, and z the components of v.
[Figure 4.1.5: Identifying vectors in R3 with geometric vectors in space.]
Recall that if v = (x1, y1, z1), w = (x2, y2, z2), and k is an arbitrary real number,
then addition and scalar multiplication were given in Chapter 2 by
v + w = (x1, y1, z1) + (x2, y2, z2) = (x1 + x2, y1 + y2, z1 + z2),   (4.1.11)
kv = k(x1, y1, z1) = (kx1, ky1, kz1).   (4.1.12)
Once more, these are, respectively, the component forms of the laws of vector
addition and scalar multiplication for geometric vectors. It follows that an arbitrary
vector v = (x, y, z) can be written as
v = x i + y j + zk = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1),
where i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1) denote the unit vectors which point
along the positive x -, y -, and z-coordinate axes, respectively.
We leave it as an exercise to check that the properties (4.1.1)–(4.1.8) are satisﬁed
by vectors in R3 , where
0 = (0, 0, 0),
and the additive inverse of v = (x, y, z) is −v = (−x, −y, −z).
We now come to our ﬁrst major abstraction. Whereas the sets R2 and R3 and their
associated algebraic operations arise naturally from our experience with Cartesian geometry, the motivation behind the algebraic operations in Rn for larger values of n does
not come from geometry. Rather, we can view the addition and scalar multiplication
operations in Rn for n > 3 as the natural extension of the component forms of addition
and scalar multiplication in R2 and R3 in (4.1.9)–(4.1.12). Therefore, in Rn we have that
if v = (x1 , x2 , . . . , xn ), w = (y1 , y2 , . . . , yn ), and k is an arbitrary real number, then
v + w = (x1 + y1, x2 + y2, . . . , xn + yn),   (4.1.13)
kv = (kx1, kx2, . . . , kxn).   (4.1.14)
Again, these definitions are direct generalizations of the algebraic operations defined
in R2 and R3 , but there is no geometric analogy when n > 3. It is easily established that
these operations satisfy properties (4.1.1)–(4.1.8), where the zero vector in Rn is
0 = (0, 0, . . . , 0),
and the additive inverse of the vector v = (x1 , x2 , . . . , xn ) is
−v = (−x1 , −x2 , . . . , −xn ).
The verification of this is left as an exercise.

Example 4.1.1  If v = (1.2, 3.5, 2, 0) and w = (12.23, 19.65, 23.22, 9.76), then
v + w = (1.2, 3.5, 2, 0) + (12.23, 19.65, 23.22, 9.76) = (13.43, 23.15, 25.22, 9.76)
and
2.35v = (2.82, 8.225, 4.7, 0).

Exercises for 4.1

Key Terms
Vectors in Rn , Vector addition, Scalar multiplication, Zero
vector, Additive inverse, Components of a vector.

Skills
• Be able to perform vector addition and scalar multiplication for vectors in Rn given in component form.
• Understand the geometric perspective on vector addition and scalar multiplication in the cases of R2 and
R3 .
• Be able to formally verify the axioms (4.1.1)–(4.1.8)
for vectors in Rn.

True-False Review
For Questions 1–12, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false.
1. The vector (x, y) in R2 is the same as the vector
(x, y, 0) in R3 .
2. Each vector (x, y, z) in R3 has exactly one additive inverse.

3. The solution set to a linear system of 4 equations and 6 unknowns consists of a collection of vectors in R6.

4. For every vector (x1, x2, . . . , xn) in Rn, the vector (−1) · (x1, x2, . . . , xn) is an additive inverse.

5. A vector whose components are all positive is called a "positive vector."

6. If s and t are scalars and x and y are vectors in Rn, then (s + t)(x + y) = sx + ty.

7. For every vector x in Rn, the vector 0x is the zero vector of Rn.

8. The parallelogram whose sides are determined by vectors x and y in R2 has diagonals determined by the vectors x + y and x − y.

9. If x is a vector in the first quadrant of R2, then any scalar multiple kx of x is still a vector in the first quadrant of R2.

10. The vector 5i − 6j + √2 k in R3 is the same as (5, −6, √2).

11. Three vectors x, y, and z in R3 always determine a 3-dimensional solid region in R3.

12. If x and y are vectors in R2 whose components are even integers and k is a scalar, then x + y and kx are also vectors in R2 whose components are even integers.

Problems

1. If x = (3, 1), y = (−1, 2), determine the vectors v1 = 2x, v2 = 3y, v3 = 2x + 3y. Sketch the corresponding points in the xy-plane and the equivalent geometric vectors.

2. If x = (−1, −4) and y = (−5, 1), determine the vectors v1 = 3x, v2 = −4y, v3 = 3x + (−4)y. Sketch the corresponding points in the xy-plane and the equivalent geometric vectors.

3. If x = (3, −1, 2, 5), y = (−1, 2, 9, −2), determine v = 5x + (−7)y and its additive inverse.

4. If x = (1, 2, 3, 4, 5) and z = (−1, 0, −4, 1, 2), find y in R5 such that 2x + (−3)y = −z.

5. Verify the commutative law of addition for vectors in R4.

6. Verify the associative law of addition for vectors in R4.

7. Verify properties (4.1.5)–(4.1.8) for vectors in R3.

8. Show with examples that if x is a vector in the first quadrant of R2 (i.e., both coordinates of x are positive) and y is a vector in the third quadrant of R2 (i.e., both coordinates of y are negative), then the sum x + y could occur in any of the four quadrants.

4.2 Definition of a Vector Space
In the previous section, we showed how the set Rn of all ordered n-tuples of real numbers, together with the addition and scalar multiplication operations deﬁned on it, has
the same algebraic properties as the familiar algebra of geometric vectors. We now push
this abstraction one step further and introduce the idea of a vector space. Such an abstraction will enable us to develop a mathematical framework for studying a broad class
of linear problems, such as systems of linear equations, linear differential equations, and
systems of linear differential equations, which have far-reaching applications in all areas
of applied mathematics, science, and engineering.
Let V be a nonempty set. For our purposes, it is useful to call the elements of V
vectors and use the usual vector notation u, v, . . . , to denote these elements. For example,
if V is the set of all 2 × 2 matrices, then the vectors in V are 2 × 2 matrices, whereas
if V is the set of all positive integers, then the vectors in V are positive integers. We
will be interested only in the case when the set V has an addition operation and a scalar
multiplication operation deﬁned on its elements in the following senses:
Vector Addition: A rule for combining any two vectors in V . We will use the usual
+ sign to denote an addition operation, and the result of adding the vectors u and v will
be denoted u + v.
Real (or Complex) Scalar Multiplication: A rule for combining each vector in V
with any real (or complex) number. We will use the usual notation k v to denote the result
of scalar multiplying the vector v by the real (or complex) number k .
To combine the two types of scalar multiplication, we let F denote the set of scalars
for which the operation is deﬁned. Thus, for us, F is either the set of all real numbers or
the set of all complex numbers. For example, if V is the set of all 2 × 2 matrices with
complex elements and F denotes the set of all complex numbers, then the usual operation
of matrix addition is an addition operation on V , and the usual method of multiplying
a matrix by a scalar is a scalar multiplication operation on V . Notice that the result of
applying either of these operations is always another vector (2 × 2 matrix) in V .
As a further example, let V be the set of positive integers, and let F be the set of all
real numbers. Then the usual operations of addition and multiplication within the real
numbers deﬁne addition and scalar multiplication operations on V . Note in this case,
however, that the scalar multiplication operation, in general, will not yield another vector
in V , since when we multiply a positive integer by a real number, the result is not, in
general, a positive integer.
We are now in a position to give a precise definition of a vector space.

DEFINITION 4.2.1
Let V be a nonempty set (whose elements are called vectors) on which are deﬁned
an addition operation and a scalar multiplication operation with scalars in F . We call
V a vector space over F , provided the following ten conditions are satisﬁed:
A1. Closure under addition: For each pair of vectors u and v in V , the sum u + v
is also in V . We say that V is closed under addition.
A2. Closure under scalar multiplication: For each vector v in V and each scalar k
in F , the scalar multiple k v is also in V . We say that V is closed under scalar
multiplication.
A3. Commutativity of addition: For all u, v ∈ V , we have
u + v = v + u.
A4. Associativity of addition: For all u, v, w ∈ V , we have
(u + v) + w = u + (v + w).
A5. Existence of a zero vector in V : In V there is a vector, denoted 0, satisfying
v + 0 = v, for all v ∈ V .
A6. Existence of additive inverses in V : For each vector v in V , there is a vector,
denoted −v, in V such that
v + (−v) = 0.
A7. Unit property: For all v ∈ V , 1v = v.
A8. Associativity of scalar multiplication: For all v ∈ V and all scalars r, s ∈ F ,
(rs)v = r(s v).
A9. Distributive property of scalar multiplication over vector addition: For all u,
v ∈ V and all scalars r ∈ F ,
r(u + v) = r u + r v.
A10. Distributive property of scalar multiplication over scalar addition: For all
v ∈ V and all scalars r, s ∈ F ,
(r + s)v = r v + s v.

Remarks
1. A key point to note is that in order to deﬁne a vector space, we must start with all
of the following:
(a) A nonempty set of vectors V .
(b) A set of scalars F (either R or C).
(c) An addition operation defined on V .
(d) A scalar multiplication operation deﬁned on V .
Then we must check that the axioms A1–A10 are satisﬁed.
2. Terminology: A vector space over the real numbers will be referred to as a real
vector space, whereas a vector space over the complex numbers will be called a
complex vector space.
3. As indicated in Deﬁnition 4.2.1, we will use boldface to denote vectors in a general
vector space. In handwriting, it is strongly advised that vectors be denoted either as v with an arrow above it or as v with a tilde beneath it. This will avoid any confusion between vectors in V and scalars in F .
4. When we deal with a familiar vector space, we will use the usual notation for
vectors in the space. For example, as seen below, the set Rn of ordered n-tuples is
a vector space, and we will denote vectors here in the form (x1 , x2 , . . . , xn ), as in
the previous section. As another illustration, it is shown below that the set of all
real-valued functions deﬁned on an interval is a vector space, and we will denote
the vectors in this vector space by f, g, . . . .

Examples of Vector Spaces
1. The set of all real numbers, together with the usual operations of addition and
multiplication, is a real vector space.
2. The set of all complex numbers is a complex vector space when we use the usual
operations of addition and multiplication by a complex number. It is also possible
to restrict the set of scalars to R, in which case the set of complex numbers becomes
a real vector space.
3. The set Rn , together with the operations of addition and scalar multiplication
deﬁned in (4.1.13) and (4.1.14), is a real vector space. As we saw in the previous
section, the zero vector in Rn is the n-tuple of zeros (0, 0, . . . , 0), and the additive
inverse of the vector v = (x1 , x2 , . . . , xn ) is −v = (−x1 , −x2 , . . . , −xn ).
Strictly speaking, for each of the examples above it is necessary to verify all of
the axioms A1–A10 of a vector space. However, in these examples, the axioms hold
immediately as well-known properties of real and complex numbers and n-tuples.
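Before verifying the axioms by hand for matrices and functions, it may help to see them checked mechanically for Rn. The following Python sketch (our own illustration; the helper names `add` and `scale` are not from the text) spot-checks several axioms for componentwise operations on 3-tuples:

```python
# Sketch: spot-checking vector space axioms for R^n, with vectors
# represented as Python tuples and componentwise operations.

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(k, u):
    return tuple(k * a for a in u)

u, v, w = (1.0, -2.0, 3.0), (4.0, 0.5, -1.0), (2.0, 2.0, 2.0)
r, s = 3.0, -0.5
zero = (0.0, 0.0, 0.0)

assert add(u, v) == add(v, u)                                # A3: commutativity
assert add(add(u, v), w) == add(u, add(v, w))                # A4: associativity
assert add(u, zero) == u                                     # A5: zero vector
assert add(u, scale(-1, u)) == zero                          # A6: additive inverse
assert scale(r * s, u) == scale(r, scale(s, u))              # A8
assert scale(r, add(u, v)) == add(scale(r, u), scale(r, v))  # A9
```

Such numerical checks illustrate the axioms on particular vectors only; they do not replace the general proofs, which must hold for all vectors and scalars.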
Example 4.2.2 Let V be the set of all 2 × 2 matrices with real elements. Show that V , together with the
usual operations of matrix addition and multiplication of a matrix by a real number, is a
real vector space.

Solution: We must verify the axioms A1–A10. If A and B are in V (that is, A and B
are 2 × 2 matrices with real entries), then A + B and kA are in V for all real numbers
k . Consequently, V is closed under addition and scalar multiplication, and therefore
Axioms A1 and A2 of the vector space deﬁnition hold.
A3. Given two 2 × 2 matrices
A = [a1, a2; a3, a4] and B = [b1, b2; b3, b4]
(rows separated by semicolons), we have
A + B = [a1, a2; a3, a4] + [b1, b2; b3, b4] = [a1 + b1, a2 + b2; a3 + b3, a4 + b4]
= [b1 + a1, b2 + a2; b3 + a3, b4 + a4] = [b1, b2; b3, b4] + [a1, a2; a3, a4] = B + A.
A4. Given three 2 × 2 matrices A = [a1, a2; a3, a4], B = [b1, b2; b3, b4], C = [c1, c2; c3, c4], we have
(A + B) + C = [a1 + b1, a2 + b2; a3 + b3, a4 + b4] + [c1, c2; c3, c4]
= [(a1 + b1) + c1, (a2 + b2) + c2; (a3 + b3) + c3, (a4 + b4) + c4]
= [a1 + (b1 + c1), a2 + (b2 + c2); a3 + (b3 + c3), a4 + (b4 + c4)]
= [a1, a2; a3, a4] + [b1 + c1, b2 + c2; b3 + c3, b4 + c4] = A + (B + C).
A5. If A is any matrix in V , then A + [0, 0; 0, 0] = A. Thus, 02 is the zero vector in V .
A6. The additive inverse of A = [a, b; c, d] is −A = [−a, −b; −c, −d], since
A + (−A) = [a + (−a), b + (−b); c + (−c), d + (−d)] = [0, 0; 0, 0] = 02 .
A7. If A is any matrix in V , then 1A = A, thus verifying the unit property.
A8. Given a matrix A = [a, b; c, d] and scalars r and s , we have
(rs)A = [(rs)a, (rs)b; (rs)c, (rs)d] = [r(sa), r(sb); r(sc), r(sd)] = r[sa, sb; sc, sd] = r(sA),
as required.
A9. Given matrices A = [a1, a2; a3, a4] and B = [b1, b2; b3, b4] and a scalar r , we have
r(A + B) = r[a1 + b1, a2 + b2; a3 + b3, a4 + b4] = [r(a1 + b1), r(a2 + b2); r(a3 + b3), r(a4 + b4)]
= [ra1 + rb1, ra2 + rb2; ra3 + rb3, ra4 + rb4] = [ra1, ra2; ra3, ra4] + [rb1, rb2; rb3, rb4] = rA + rB.
A10. Given A, r , and s as in A8 above, we have
(r + s)A = [(r + s)a, (r + s)b; (r + s)c, (r + s)d] = [ra + sa, rb + sb; rc + sc, rd + sd]
= [ra, rb; rc, rd] + [sa, sb; sc, sd] = rA + sA,
as required.
Thus V , together with the given operations, is a real vector space.

Remark In a manner similar to the previous example, it is easily established that the
set of all m × n matrices with real entries is a real vector space when we use the usual
operations of addition of matrices and multiplication of matrices by a real number. We
will denote the vector space of all m × n matrices with real elements by Mm×n (R), and
we denote the vector space of all n × n matrices with real elements by Mn (R).
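The matrix computations in Example 4.2.2 can be mirrored numerically. In this sketch (our own illustration; `madd` and `mscale` are hypothetical helper names), 2 × 2 matrices are nested tuples with integer entries so that all equalities are exact:

```python
# Spot-check of axioms A3, A5, A6, A8, A9, A10 for 2x2 real matrices
# (Example 4.2.2). Matrices are 2x2 nested tuples; operations are entrywise.

def madd(A, B):
    return tuple(tuple(a + b for a, b in zip(ra, rb)) for ra, rb in zip(A, B))

def mscale(k, A):
    return tuple(tuple(k * a for a in row) for row in A)

A = ((1, 2), (3, 4))
B = ((5, -1), (0, 2))
r, s = 3, -2
Z = ((0, 0), (0, 0))

assert madd(A, B) == madd(B, A)                                   # A3
assert madd(A, Z) == A and madd(A, mscale(-1, A)) == Z            # A5, A6
assert mscale(r * s, A) == mscale(r, mscale(s, A))                # A8
assert mscale(r, madd(A, B)) == madd(mscale(r, A), mscale(r, B))  # A9
assert mscale(r + s, A) == madd(mscale(r, A), mscale(s, A))       # A10
```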
Example 4.2.3 Let V be the set of all real-valued functions deﬁned on an interval I . Deﬁne addition and
scalar multiplication in V as follows. If f and g are in V and k is any real number, then
f + g and kf are deﬁned by
(f + g)(x) = f (x) + g(x)
(kf )(x) = kf (x) for all x ∈ I,
for all x ∈ I. Show that V , together with the given operations of addition and scalar multiplication, is
a real vector space.

Solution: It follows from the given definitions of addition and scalar multiplication
that if f and g are in V , and k is any real number, then f + g and kf are both real-valued
functions on I and are therefore in V . Consequently, the closure axioms A1 and A2 hold.
We now check the remaining axioms.
A3. Let f and g be arbitrary functions in V . From the deﬁnition of function addition,
we have
(f + g)(x) = f (x) + g(x) = g(x) + f (x) = (g + f )(x), for all x ∈ I .
(The middle step here follows from the fact that f (x) and g(x) are real numbers associated with evaluating f and g at the input x , and real number addition commutes.) Consequently, f + g = g + f (since the values of f + g and g + f agree for every x ∈ I ), and so addition in V is commutative.

[Figure 4.2.1: In the vector space of all functions defined on an interval I , the additive inverse of a function f is obtained by reflecting the graph of f about the x -axis. The zero vector is the zero function O(x).]

A4. Let f, g, h ∈ V . Then for all x ∈ I , we have
[(f + g) + h](x) = (f + g)(x) + h(x) = [f (x) + g(x)] + h(x)
= f (x) + [g(x) + h(x)] = f (x) + (g + h)(x)
= [f + (g + h)](x).
Consequently, (f +g)+h = f +(g +h), so that addition in V is indeed associative.
A5. If we deﬁne the zero function, O , by O(x) = 0, for all x ∈ I , then
(f + O)(x) = f (x) + O(x) = f (x) + 0 = f (x),
for all f ∈ V and all x ∈ I , which implies that f + O = f . Hence, O is the zero
vector in V . (See Figure 4.2.1.)
A6. If f ∈ V , then the function −f defined by (−f )(x) = −f (x) for all x ∈ I is an additive inverse of f , since
[f + (−f )](x) = f (x) + (−f )(x) = f (x) − f (x) = 0
for all x ∈ I . This implies that f + (−f ) = O .
A7. Let f ∈ V . Then, by deﬁnition of the scalar multiplication operation, for all x ∈ I ,
we have
(1f )(x) = 1f (x) = f (x).
Consequently, 1f = f .
A8. Let f ∈ V , and let r, s ∈ R. Then, for all x ∈ I ,
[(rs)f ](x) = (rs)f (x) = r [sf (x)] = r [(sf )(x)].
Hence, the functions (rs)f and r(sf ) agree on every x ∈ I , and hence (rs)f =
r(sf ), as required.
A9. Let f, g ∈ V and let r ∈ R. Then, for all x ∈ I ,
[r(f + g)] (x) = r [(f + g)(x)] = r [f (x) + g(x)] = rf (x) + rg(x)
= (rf )(x) + (rg)(x) = (rf + rg)(x).
Hence, r(f + g) = rf + rg .
A10. Let f ∈ V , and let r, s ∈ R. Then for all x ∈ I ,
[(r +s)f ](x) = (r +s)f (x) = rf (x)+sf (x) = (rf )(x)+(sf )(x) = (rf +sf )(x),
which proves that (r + s)f = rf + sf .
Since all parts of Deﬁnition 4.2.1 are satisﬁed, it follows that V , together with the
given operations of addition and scalar multiplication, is a real vector space.

Remark As the previous two examples indicate, a full verification of the vector space
deﬁnition can be somewhat tedious and lengthy, although it is usually straightforward.
Be careful not to leave out any important steps in such a verification.

The Vector Space Cn
We now introduce the most important complex vector space. Let Cn denote the set of all
ordered n-tuples of complex numbers. Thus,
Cn = {(z1 , z2 , . . . , zn ) : z1 , z2 , . . . , zn ∈ C}.
We refer to the elements of Cn as vectors in Cn . A typical vector in Cn is (z1 , z2 , . . . , zn ),
where each zk is a complex number.
Example 4.2.4 The following are examples of vectors in C2 and C4 , respectively:
u = (2.1 − 3i, −1.5 + 3.9i),
v = (5 + 7i, 2 − i, 3 + 4i, −9 − 17i).
In order to obtain a vector space, we must deﬁne appropriate operations of “vector
addition” and “multiplication by a scalar” on the set of vectors in question. In the case of
Cn , we are motivated by the corresponding operations in Rn and thus define the addition and scalar multiplication operations componentwise. Thus, if u = (u1 , u2 , . . . , un ) and
v = (v1 , v2 , . . . , vn ) are vectors in Cn and k is an arbitrary complex number, then
u + v = (u1 + v1 , u2 + v2 , . . . , un + vn ),
k u = (ku1 , ku2 , . . . , kun ).

Example 4.2.5 If u = (1 − 3i, 2 + 4i), v = (−2 + 4i, 5 − 6i), and k = 2 + i , find u + k v.

Solution: We have
u + k v = (1 − 3i, 2 + 4i) + (2 + i)(−2 + 4i, 5 − 6i)
= (1 − 3i, 2 + 4i) + (−8 + 6i, 16 − 7i) = (−7 + 3i, 18 − 3i).
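The componentwise arithmetic of Cn can be spot-checked with Python's built-in complex numbers; this simply reproduces the computation just carried out (an illustration, not part of the text):

```python
# Verifying the computation u + k*v from Example 4.2.5 componentwise.
u = (1 - 3j, 2 + 4j)
v = (-2 + 4j, 5 - 6j)
k = 2 + 1j

result = tuple(ui + k * vi for ui, vi in zip(u, v))
assert result == (-7 + 3j, 18 - 3j)  # matches the worked solution
```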
It is straightforward to show that Cn , together with the given operations of addition and
scalar multiplication, is a complex vector space.

Further Properties of Vector Spaces
The main reason for formalizing the deﬁnition of an abstract vector space is that any
results that we can prove based solely on the deﬁnition will then apply to all vector
spaces we care to examine; that is, we do not have to prove separate results for geometric
vectors, m × n matrices, vectors in Rn or Cn , or real-valued functions, and so on. The
next theorem lists some results that can be proved using the vector space axioms.
Theorem 4.2.6 Let V be a vector space over F .
1. The zero vector is unique.
2. 0u = 0 for all u ∈ V .
3. k 0 = 0 for all scalars k ∈ F .
4. The additive inverse of each element in V is unique.
5. For all u ∈ V , −u = (−1)u.
6. If k is a scalar and u ∈ V such that k u = 0, then either k = 0 or u = 0. Proof 1. Suppose there were two zero vectors in V , denoted 01 and 02 . Then, for
any v ∈ V , we would have
v + 01 = v (4.2.1) v + 02 = v. (4.2.2) and We must prove that 01 = 02 . But, applying (4.2.1) with v = 02 , we have
02 = 02 + 01
= 01 + 02 (Axiom A3)
= 01 (from (4.2.2) with v = 01 ).
Consequently, 01 = 02 , so the zero vector is unique in a vector space.
2. Let u be an arbitrary element in a vector space V . Since 0 = 0 + 0, we have
0u = (0 + 0)u = 0u + 0u,
by Axiom A10. Now Axiom A6 implies that the vector −(0u) exists, and adding
it to both sides of the previous equation yields
0u + [−(0u)] = (0u + 0u) + [−(0u)].
Thus, since addition in a vector space is associative (Axiom A4),
0u + [−(0u)] = 0u + (0u + [−(0u)]).
Applying Axiom A6 on both sides and then using Axiom A5, this becomes
0 = 0u + 0 = 0u ,
and this completes the veriﬁcation of (2).
3. Using the fact that 0 = 0 + 0 (by Axiom A5), the proof here proceeds along the
same lines as the proof of result 2. We leave the veriﬁcation to the reader as an
exercise (Problem 21 ).
4. Let u ∈ V be an arbitrary vector, and suppose that there were two additive inverses,
say v and w, for u. According to Axiom A6, this implies that
u+v =0 (4.2.3) u + w = 0. (4.2.4) and We wish to show that v = w. Now, Axiom A6 implies that a vector −v exists, so
adding it on the right to both sides of (4.2.3) yields
(u + v) + (−v) = 0 + (−v) = −v.
Applying Axioms A4 and A6 on the left side, we simplify this to
u = −v.
Substituting this into (4.2.4) yields
−v + w = 0.
Adding v to the left of both sides and applying Axioms A4 and A6 once more
yields v = w, as desired.
5. To verify that −u = (−1)u for all u ∈ V , we note that
0 = 0u = (1 + (−1))u = 1u + (−1)u = u + (−1)u,
where we have used property 2 and Axioms A10 and A7. The equation above
proves that (−1)u is an additive inverse of u, and by the uniqueness of additive
inverses that we just proved, we conclude that (−1)u = −u, as desired.
Finally, we leave the proof of result 6 in Theorem 4.2.6 as an exercise (Problem 22).

Remark The proof of Theorem 4.2.6 involved a number of tedious and seemingly
obvious steps. It is important to remember, however, that in an abstract vector space we are
not allowed to rely on past experience in deriving results for the ﬁrst time. For instance,
the statement “0 + 0 = 0” may seem intuitively clear, but in our newly developed
mathematical structure, we must appeal speciﬁcally to the rules A1–A10 given for a
vector space. Hence, the statement “0 + 0 = 0” should be viewed as a consequence of
Axiom A5 and nothing else. Once we have proved these basic results, of course, then
we are free to use them in any vector space context where they are needed. This is the
whole advantage to working in the general vector space setting.
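Although Theorem 4.2.6 is proved purely from the axioms A1–A10, its conclusions are easy to observe in a concrete vector space such as R3. A small sketch (the helper `scale` is our own, not from the text):

```python
# Observing Theorem 4.2.6 in R^3: 0u = 0, k0 = 0, and (-1)u = -u.
def scale(k, u):
    return tuple(k * a for a in u)

u = (2, -5, 7)
zero = (0, 0, 0)

assert scale(0, u) == zero          # property 2: 0u = 0
assert scale(4, zero) == zero       # property 3: k0 = 0
assert scale(-1, u) == (-2, 5, -7)  # property 5: (-1)u is the additive inverse of u
```

The point of the theorem, of course, is that these identities hold in every vector space, including ones with no numerical representation at hand.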
We end this section with a list of the most important vector spaces that will be
required throughout the remainder of the text. In each case the addition and scalar multiplication operations are the usual ones associated with the set of vectors.
• Rn , the (real) vector space of all ordered n-tuples of real numbers.
• Cn , the (complex) vector space of all ordered n-tuples of complex numbers.
• Mm×n (R), the (real) vector space of all m × n matrices with real elements.
• Mn (R), the (real) vector space of all n × n matrices with real elements.
• C k (I ), the vector space of all real-valued functions that are continuous and have
(at least) k continuous derivatives on I . We will show that this set of vectors is a
(real) vector space in the next section.
• Pn , the (real) vector space of all real-valued polynomials of degree ≤ n with real
coefﬁcients. That is,
Pn = {a0 + a1 x + a2 x 2 + · · · + an x n : a0 , a1 , . . . , an ∈ R}.
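Concretely, a polynomial in Pn is often encoded by its coefficient tuple (a0, a1, . . . , an), and the vector space operations then act coefficientwise, exactly as in Rn+1. A sketch under that assumption (`padd` and `pscale` are our own names):

```python
# P_n encoded as coefficient tuples (a0, a1, ..., an) of a fixed length;
# addition and scalar multiplication act coefficientwise.
def padd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def pscale(k, p):
    return tuple(k * a for a in p)

# p(x) = 1 + 2x - x^2 and q(x) = 3 - x + 4x^2 in P_2:
p, q = (1, 2, -1), (3, -1, 4)
assert padd(p, q) == (4, 1, 3)     # (p + q)(x) = 4 + x + 3x^2
assert pscale(2, p) == (2, 4, -2)  # (2p)(x) = 2 + 4x - 2x^2
```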
We leave the verification that Pn is a (real) vector space as an exercise (Problem 23).

Exercises for 4.2

Key Terms
Vector space (real or complex), Closure under addition, Closure under scalar multiplication, Commutativity of addition,
Associativity of addition, Existence of zero vector, Existence
of additive inverses, Unit property, Associativity of scalar
multiplication, Distributive properties, Examples: Rn , Cn ,
Mn (R), C k (I ), Pn .

Skills
• Be able to deﬁne a vector space. Speciﬁcally, be able
to identify and list the ten axioms A1–A10 governing
the vector space operations. • Know each of the standard examples of vector spaces
given at the end of the section, and know how to perform the vector operations in these vector spaces.
• Be able to check whether or not each of the axioms A1–
A10 holds for speciﬁc examples V . This includes, if
possible, closure of V under vector addition and scalar
multiplication, as well as identiﬁcation of the zero vector and the additive inverse of each vector in the set
V.
• Be able to prove basic properties that hold generally
for vector spaces V (see Theorem 4.2.6).

True-False Review
For Questions 1–8, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.
1. The zero vector in a vector space V is unique.
2. If v is a vector in a vector space V , and r and s are scalars such that r v = s v, then r = s .
3. The set Z of integers, together with the usual operations of addition and scalar multiplication, forms a vector space.
4. If x and y are vectors in a vector space V , then the additive inverse of x + y is (−x) + (−y).
5. The additive inverse of a vector v in a vector space V is unique.
6. The set {0}, with the usual operations of addition and scalar multiplication, forms a vector space.
7. The set {0, 1}, with the usual operations of addition and scalar multiplication, forms a vector space.
8. The set of positive real numbers, with the usual operations of addition and scalar multiplication, forms a vector space.

Problems
For Problems 1–5, determine whether the given set of vectors is closed under addition and closed under scalar multiplication. In each case, take the set of scalars to be the set of all real numbers.
1. The set of all rational numbers.
2. The set of all upper triangular n × n matrices with real elements.
3. The set of all solutions to the differential equation y′′ + 9y = 4x2. (Do not solve the differential equation.)
4. The set of all solutions to the differential equation y′′ + 9y = 0. (Do not solve the differential equation.)
5. The set of all solutions to the homogeneous linear system Ax = 0.
6. Let S = {A ∈ M2 (R) : det(A) = 0}.
(a) Is the zero vector from M2 (R) in S ?
(b) Give an explicit example illustrating that S is not closed under matrix addition.
(c) Is S closed under scalar multiplication? Justify your answer.
7. Let N = {1, 2, . . . } denote the set of all positive integers. Give three reasons why N, together with the usual operations of addition and scalar multiplication, is not a real vector space.
8. We have defined the set R2 = {(x, y) : x, y ∈ R}, together with the addition and scalar multiplication operations as follows:
(x1 , y1 ) + (x2 , y2 ) = (x1 + x2 , y1 + y2 ),   k(x1 , y1 ) = (kx1 , ky1 ).
Give a complete verification that each of the vector space axioms is satisfied.
9. Determine the zero vector in the vector space M2×3 (R), and the additive inverse of a general element. (Note that the vector space axioms A1–A4 and A7–A10 follow directly from matrix algebra.)
10. Generalize the previous exercise to find the zero vector and the additive inverse of a general element of Mm×n (R).
11. Let P denote the set of all polynomials whose degree is exactly 2. Is P a vector space? Justify your answer.
12. On R+ , the set of positive real numbers, define the operations of addition and scalar multiplication as follows:
x + y = xy,   c · x = x^c.
Note that the multiplication and exponentiation appearing on the right side of these formulas refer to the ordinary operations on real numbers. Determine whether R+ , together with these algebraic operations, is a vector space.
13. On R2 , define the operations of addition and multiplication by a real number as follows:
(x1 , y1 ) + (x2 , y2 ) = (x1 − x2 , y1 − y2 ),   k(x1 , y1 ) = (−kx1 , −ky1 ).
Which of the axioms for a vector space are satisfied by R2 with these algebraic operations?
14. On R2 , define the operation of addition by
(x1 , y1 ) + (x2 , y2 ) = (x1 x2 , y1 y2 ).
Do axioms A5 and A6 in the definition of a vector space hold? Justify your answer.
15. On M2 (R), define the operation of addition by A + B = AB, and use the usual scalar multiplication operation. Determine which axioms for a vector space are satisfied by M2 (R) with the above operations.
16. On M2 (R), define the operations of addition and multiplication by a real number (⊕ and ·, respectively) as follows:
A ⊕ B = −(A + B),   k · A = −kA,
where the operations on the right-hand sides of these equations are the usual ones associated with M2 (R). Determine which of the axioms for a vector space are satisfied by M2 (R) with the operations ⊕ and ·.
For Problems 17–18, verify that the given set of objects together with the usual operations of addition and scalar multiplication is a complex vector space.
17. C2 .
18. M2 (C), the set of all 2 × 2 matrices with complex entries.
19. Is C3 a real vector space? Explain.
20. Is R3 a complex vector space? Explain.
21. Prove part 3 of Theorem 4.2.6.
22. Prove part 6 of Theorem 4.2.6.
23. Prove that Pn is a vector space.

4.3 Subspaces
Let us try to make contact between the abstract vector space idea and the solution of an
applied problem. Vector spaces generally arise as the sets containing the unknowns in
a given problem. For example, if we are solving a differential equation, then the basic
unknown is a function, and therefore any solution to the differential equation will be an
element of the vector space V of all functions deﬁned on an appropriate interval. Consequently, the solution set of a differential equation is a subset of V . Similarly, consider
the system of linear equations Ax = b, where A is an m × n matrix with real elements.
The basic unknown in this system, x, is a column n-vector, or equivalently a vector in
Rn . Consequently, the solution set to the system is a subset of the vector space Rn . As
these examples illustrate, the solution set of an applied problem is generally a subset
of vectors from an appropriate vector space (schematically represented in Figure 4.3.1).
The question we will need to answer in the future is whether this subset of vectors is
a vector space in its own right. The following deﬁnition introduces the terminology we
will use:
[Figure 4.3.1: The solution set S of an applied problem is a subset of the vector space V of unknowns in the problem. Is S a vector space?]

DEFINITION 4.3.1
Let S be a nonempty subset of a vector space V . If S is itself a vector space under the
same operations of addition and scalar multiplication as used in V , then we say that
S is a subspace of V .
In establishing that a given subset S of vectors from a vector space V is a subspace of
V , it would appear as though we must check that each axiom in the vector space deﬁnition
is satisﬁed when we restrict our attention to vectors lying only in S . The ﬁrst and most
important theorem of the section tells us that all we need do, in fact, is check the closure
axioms A1 and A2. If these are satisﬁed, then the remaining axioms necessarily hold in
S . This is a very useful theorem that will be applied on several occasions throughout the
remainder of the text.
Theorem 4.3.2 Let S be a nonempty subset of a vector space V . Then S is a subspace of V if and only
if S is closed under the operations of addition and scalar multiplication in V . Proof If S is a subspace of V , then it is a vector space, and hence, it is certainly closed
under addition and scalar multiplication. Conversely, assume that S is closed under addition and scalar multiplication. We must prove that Axioms A3–A10 of Deﬁnition 4.2.1
hold when we restrict to vectors in S . Consider ﬁrst the axioms A3, A4, and A7–A10.
These are properties of the addition and scalar multiplication operations, hence since we
use the same operations in S as in V , these axioms are all inherited from V by the subset
S . Finally, we establish A5 and A6: Choose any vector1 u in S . Since S is closed under
scalar multiplication, both 0u and (−1)u are in S . But by Theorem 4.2.6, 0u = 0 and
(−1)u = −u, hence 0 and −u are both in S . Therefore, A5 and A6 are satisﬁed.
The idea behind Theorem 4.3.2 is that once we have a vector space V in place,
then any nonempty subset S , equipped with the same addition and scalar multiplication
operations, will inherit all of the axioms that involve those operations. The only possible
concern we have for S is whether or not it satisﬁes the closure axioms A1 and A2. Of
course, we presumably had to carry out the full veriﬁcation of A1–A10 for the vector
space V in the ﬁrst place, before gaining the shortcut of Theorem 4.3.2 for the subset S .
In determining whether a subset S of a vector space V is a subspace of V , we must
keep clear in our minds what the given vector space is and what conditions on the vectors
in V restrict them to lie in the subset S . This is most easily done by expressing S in set
notation as follows:
S = {v ∈ V : conditions on v}.
We illustrate with an example.
Example 4.3.3 Verify that the set of all real solutions to the following linear system is a subspace of R3 :
x1 + 2x2 − x3 = 0,
2x1 + 5x2 − 4x3 = 0.
(Footnote 1, for the proof of Theorem 4.3.2: This is possible since S is assumed to be nonempty.)

Solution: The reduced row-echelon form of the augmented matrix of the system is
[1, 0, 3, 0; 0, 1, −2, 0]
(rows separated by semicolons), so that the solution set of the system is
S = {x ∈ R3 : x = (−3r, 2r, r), r ∈ R},
which is a nonempty subset of R3 . We now use Theorem 4.3.2 to verify that S is a
subspace of R3 : If x = (−3r, 2r, r) and y = (−3s, 2s, s) are any two vectors in S , then
x + y = (−3r, 2r, r) + (−3s, 2s, s) = (−3(r + s), 2(r + s), r + s) = (−3t, 2t, t),
where t = r + s . Thus, x + y meets the required form for elements of S , and consequently,
if we add two vectors in S , the result is another vector in S . Similarly, if we multiply an
arbitrary vector x = (−3r, 2r, r) in S by a real number k , the resulting vector is
k x = k(−3r, 2r, r) = (−3kr, 2kr, kr) = (−3w, 2w, w),
where w = kr . Hence, k x again has the proper form for membership in the subset S ,
and so S is closed under scalar multiplication. By Theorem 4.3.2, S is a subspace of R3 .
Note, of course, that our application of Theorem 4.3.2 hinges on our prior knowledge
that R3 is a vector space.
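The closure computations in Example 4.3.3 can be mirrored numerically: vectors of the form (−3r, 2r, r) solve both equations, and sums and scalar multiples retain that form. A sketch on sample vectors (helper names are ours; a check, not a proof):

```python
# Closure check for S = {(-3r, 2r, r) : r in R} from Example 4.3.3.
def in_S(x):
    r = x[2]
    return x == (-3 * r, 2 * r, r)

def solves_system(x):
    x1, x2, x3 = x
    return x1 + 2 * x2 - x3 == 0 and 2 * x1 + 5 * x2 - 4 * x3 == 0

x = (-6, 4, 2)   # r = 2
y = (3, -2, -1)  # s = -1
assert solves_system(x) and solves_system(y)

total = tuple(a + b for a, b in zip(x, y))
assert in_S(total)                       # closed under addition (t = r + s)
k = 5
assert in_S(tuple(k * a for a in x))     # closed under scalar multiplication
```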
Geometrically, the vectors in S lie along the line of intersection of the planes with
the given equations. This is the line through the origin in the direction of the vector
v = (−3, 2, 1). (See Figure 4.3.2.)
[Figure 4.3.2: The solution set to the homogeneous system of linear equations in Example 4.3.3 is a subspace of R3 .]

Example 4.3.4 Verify that S = {x ∈ R2 : x = (r, −3r + 1), r ∈ R} is not a subspace of R2 .

Solution: One approach here, according to Theorem 4.3.2, is to demonstrate the
failure of closure under addition or scalar multiplication. For example, if we start with
two vectors in S , say x = (r, −3r + 1) and y = (s, −3s + 1), then
x + y = (r, −3r + 1) + (s, −3s + 1) = (r + s, −3(r + s) + 2) = (w, −3w + 2),
where w = r + s . We see that x + y does not have the required form for membership
in S . Hence, S is not closed under addition and therefore fails to be a subspace of R2 .
Alternatively, we can show similarly that S is not closed under scalar multiplication.
Observant readers may have noticed another reason that S cannot form a subspace.
Geometrically, the points in S correspond to those points that lie on the line with Cartesian
equation y = −3x + 1. Since this line does not pass through the origin, S does not contain
the zero vector 0 = (0, 0), and therefore we know S cannot be a subspace.

Remark  In general, we have the following important observation.
If a subset S of a vector space V fails to contain the zero vector 0,
then it cannot form a subspace.

This observation can often be made more quickly than deciding whether or not S is
closed under addition and closed under scalar multiplication. However, we caution that
if the zero vector does belong to S , then the observation is inconclusive and further
investigation is required to determine whether or not S forms a subspace of V .
Example 4.3.5 Let S denote the set of all real symmetric n × n matrices. Verify that S is a subspace of
Mn (R). Solution: The subset of interest is
S = {A ∈ Mn (R) : AT = A}. Note that S is nonempty, since, for example, it contains the zero matrix 0n . We now
verify closure of S under addition and scalar multiplication. Let A and B be in S . Then
AT = A and B T = B. Using these conditions and the properties of the transpose yields
(A + B)T = AT + B T = A + B
and
(kA)T = kAT = kA
for all real values of k . Consequently A + B and kA are both symmetric matrices, so
they are elements of S . Hence S is closed under both addition and scalar multiplication
and so is indeed a subspace of Mn (R). Remark Notice in Example 4.3.5 that it was not necessary to actually write out the
matrices A and B in terms of their elements [aij ] and [bij ], respectively. This shows the
advantage of using simple abstract notation to describe the elements of the subset S in
some situations.
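The coordinate-free argument of Example 4.3.5 can be spot-checked numerically. A sketch assuming NumPy; the random matrices and the helper name are ours, chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_symmetric(n):
    """Build a random symmetric n x n matrix: (M + M^T)/2 is always symmetric."""
    M = rng.normal(size=(n, n))
    return (M + M.T) / 2

A, B = random_symmetric(4), random_symmetric(4)
k = 3.7

# (A + B)^T = A^T + B^T = A + B, and (kA)^T = kA^T = kA
assert np.allclose((A + B).T, A + B)
assert np.allclose((k * A).T, k * A)
```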
Example 4.3.6 Let V be the vector space of all real-valued functions deﬁned on an interval [a, b], and
let S denote the set of all functions in V that satisfy f (a) = 0. Verify that S is a subspace
of V . Solution: We have
S = {f ∈ V : f (a) = 0}, which is nonempty since it contains, for example, the zero function
O(x) = 0 for all x in [a, b]. Assume that f and g are in S , so that f (a) = 0 and g(a) = 0. We now check for closure
of S under addition and scalar multiplication. We have
(f + g)(a) = f (a) + g(a) = 0 + 0 = 0,

which implies that f + g ∈ S . Hence, S is closed under addition. Further, if k is any real
number,
(kf )(a) = kf (a) = k 0 = 0,
so that S is also closed under scalar multiplication. Theorem 4.3.2 therefore implies that
S is a subspace of V . Some representative functions from S are sketched in Figure 4.3.3.
In the next theorem, we establish that the subset {0} of a vector space V is in fact a
subspace of V . We call this subspace the trivial subspace of V .
Theorem 4.3.7 Let V be a vector space with zero vector 0. Then S = {0} is a subspace of V . Proof Note that S is nonempty. Further, the closure of S under addition and scalar
multiplication follow, respectively, from
0+0=0 and k 0 = 0, where the second statement follows from Theorem 4.2.6.
We now use Theorem 4.3.2 to establish an important result pertaining to homogeneous systems of linear equations that has already been illustrated in Example 4.3.3.
Theorem 4.3.8 Let A be an m × n matrix. The solution set of the homogeneous system of linear equations
Ax = 0 is a subspace of Cn . Proof Let S denote the solution set of the homogeneous linear system. Then we can
write
S = {x ∈ Cn : Ax = 0},
Figure 4.3.3: Representative functions in the subspace S given in Example 4.3.6. Each function in S satisfies f (a) = 0.

a subset of Cn . Since a homogeneous system always admits the trivial solution x = 0, we know that S is nonempty. If x1 and x2 are in S , then

Ax1 = 0 and Ax2 = 0.

Using properties of the matrix product, we have

A(x1 + x2 ) = Ax1 + Ax2 = 0 + 0 = 0,

so that x1 + x2 also solves the system and therefore is in S . Furthermore, if k is any complex scalar, then

A(k x) = kAx = k 0 = 0,
so that k x is also a solution of the system and therefore is in S . Since S is closed under both
addition and scalar multiplication, it follows from Theorem 4.3.2 that S is a subspace of
Cn .
The preceding theorem has established that the solution set to any homogeneous
linear system of equations is a vector space. Owing to the importance of this vector
space, it is given a special name. DEFINITION 4.3.9
Let A be an m × n matrix. The solution set to the corresponding homogeneous linear
system Ax = 0 is called the null space of A and is denoted nullspace(A). Thus,
nullspace(A) = {x ∈ Cn : Ax = 0}.

Remarks
1. If the matrix A has real elements, then we will consider only the corresponding
real solutions to Ax = 0. Consequently, in this case,
nullspace(A) = {x ∈ Rn : Ax = 0},
a subspace of Rn .
2. The previous theorem does not hold for the solution set of a nonhomogeneous
linear system Ax = b, for b ≠ 0, since x = 0 is not in the solution set of the
system.
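For a concrete matrix, nullspace(A) can be computed directly with a computer algebra system. A sketch using SymPy's `Matrix.nullspace`; the matrix chosen here is an arbitrary illustration, not one taken from the text:

```python
from sympy import Matrix

# Rank-1 matrix: the second row is twice the first,
# so its null space in R^3 has dimension 3 - 1 = 2.
A = Matrix([[1, 2, -1],
            [2, 4, -2]])

basis = A.nullspace()   # list of column vectors spanning nullspace(A)
assert len(basis) == 2
for v in basis:
    assert A * v == Matrix([0, 0])   # every basis vector solves Ax = 0
```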
Next we introduce the vector space of primary importance in the study of linear
differential equations. This vector space arises as a subspace of the vector space of all
functions that are deﬁned on an interval I .
Example 4.3.10 Let V denote the vector space of all functions that are deﬁned on an interval I , and let
C k (I ) denote the set of all functions that are continuous and have (at least) k continuous
derivatives on the interval I , for a ﬁxed non-negative integer k . Show that C k (I ) is a
subspace of V .

Solution: In this case

C k (I ) = {f ∈ V : f, f ′ , f ′′ , . . . , f (k) exist and are continuous on I }.
This set is nonempty, as the zero function O(x) = 0 for all x ∈ I is an element of C k (I ).
Moreover, it follows from the properties of derivatives that if we add two functions in
C k (I ), the result is a function in C k (I ). Similarly, if we multiply a function in C k (I ) by
a scalar, then the result is a function in C k (I ). Thus, Theorem 4.3.2 implies that C k (I )
is a subspace of V .
Our ﬁnal result in this section ties together the ideas introduced here with the theory
of differential equations.
Theorem 4.3.11 The set of all solutions to the homogeneous linear differential equation
y ′′ + a1 (x)y ′ + a2 (x)y = 0    (4.3.1)

on an interval I is a vector space.

Proof Let S denote the set of all solutions to the given differential equation. Then S is
a nonempty subset of C 2 (I ), since the identically zero function y = 0 is a solution to
the differential equation. We establish that S is in fact a subspace of C 2 (I ).2 Let y1 and
y2 be in S , and let k be a scalar. Then we have the following:
y1′′ + a1 (x)y1′ + a2 (x)y1 = 0 and y2′′ + a1 (x)y2′ + a2 (x)y2 = 0.    (4.3.2)

Now, if y(x) = y1 (x) + y2 (x), then

y ′′ + a1 y ′ + a2 y = (y1 + y2 )′′ + a1 (x)(y1 + y2 )′ + a2 (x)(y1 + y2 )
= [y1′′ + a1 (x)y1′ + a2 (x)y1 ] + [y2′′ + a1 (x)y2′ + a2 (x)y2 ]
= 0 + 0 = 0,
2 It is important at this point that we have already established Example 4.3.10, so that S is a subset of a set
that is indeed a vector space.

where we have used (4.3.2). Consequently, y(x) = y1 (x) + y2 (x) is a solution to the
differential equation (4.3.1). Moreover, if y(x) = ky1 (x), then
y ′′ + a1 y ′ + a2 y = (ky1 )′′ + a1 (x)(ky1 )′ + a2 (x)(ky1 )
= k[y1′′ + a1 (x)y1′ + a2 (x)y1 ] = 0,
where we have once more used (4.3.2). This establishes that y(x) = ky1 (x) is a solution
to Equation (4.3.1). Therefore, S is closed under both addition and scalar multiplication.
Consequently, the set of all solutions to Equation (4.3.1) is a subspace of C 2 (I ).
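The closure argument above can be verified symbolically for a concrete instance of (4.3.1). A sketch assuming a1 (x) = 0 and a2 (x) = 4 — our choice for illustration — so the equation is y′′ + 4y = 0, with solutions cos 2x and sin 2x:

```python
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.cos(2 * x), sp.sin(2 * x)

# The differential operator L[y] = y'' + 4y for this instance of (4.3.1)
L = lambda y: sp.diff(y, x, 2) + 4 * y

c1, c2 = sp.symbols('c1 c2')
assert sp.simplify(L(y1)) == 0 and sp.simplify(L(y2)) == 0
# Closure: an arbitrary linear combination is again a solution
assert sp.simplify(L(c1 * y1 + c2 * y2)) == 0
```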
We will refer to the set of all solutions to a differential equation of the form (4.3.1)
as the solution space of the differential equation. A key theoretical result that we will
establish in Chapter 6 regarding the homogeneous linear differential equation (4.3.1) is
that every solution to the differential equation has the form
y(x) = c1 y1 (x) + c2 y2 (x),
where y1 , y2 are any two nonproportional solutions. The power of this result is impressive: It reduces the search for all solutions to Equation (4.3.1) to the search for just two
nonproportional solutions. In vector space terms, the result can be restated as follows:
Every vector in the solution space to the differential equation (4.3.1) can be written
as a linear combination of any two nonproportional solutions y1 and y2 .
We say that the solution space is spanned by y1 and y2 . Moreover, two nonproportional
solutions are referred to as linearly independent. For example, we saw in Example 1.2.16
that the set of all solutions to the differential equation
y ′′ + ω2 y = 0

is spanned by y1 (x) = cos ωx and y2 (x) = sin ωx, and y1 and y2 are linearly independent. We now begin our investigation as to whether this type of idea will work more
generally when the solution set to a problem is a vector space. For example, what about
the solution set to a homogeneous linear system Ax = 0? We might suspect that if there
are k free variables deﬁning the vectors in nullspace(A), then every solution to Ax = 0
can be expressed as a linear combination of k basic solutions. We will establish that this
is indeed the case in Section 4.9. The two key concepts we need to generalize are (1)
spanning a general vector space with a set of vectors, and (2) linear independence in a
general vector space. These will be addressed in turn in the next two sections.

Exercises for 4.3

Key Terms
Subspace, Trivial subspace, Null space of a matrix A.

Skills
• Be able to check whether or not a subset S of a vector space V is a subspace of V .
• Be able to compute the null space of an m × n matrix A.

True-False Review
For Questions 1–8, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.

1. The null space of an m × n matrix A with real elements is a subspace of Rm .

2. The solution set of any linear system of m equations in n variables forms a subspace of Cn .

3. The points in R2 that lie on the line y = mx + b form a subspace of R2 if and only if b = 0.

4. If m < n, then Rm is a subspace of Rn .

5. A nonempty set S of a vector space V that is closed
under scalar multiplication contains the zero vector of
V.
6. If V = R is a vector space under the usual operations
of addition and scalar multiplication, then the subset
R+ of positive real numbers, together with the operations deﬁned in Problem 12 of Section 4.2, forms a
subspace of V .
7. If V = R3 and S consists of all points on the xy -plane, the xz-plane, and the yz-plane, then S is a subspace of V .
8. If V is a vector space, then two different subspaces of
V can contain no common vectors other than 0.

Problems
1. Let S = {x ∈ R2 : x = (2k, −3k), k ∈ R}.
(a) Establish that S is a subspace of R2 .
(b) Make a sketch depicting the subspace S in the
Cartesian plane.
2. Let S = {x ∈ R3 : x = (r − 2s, 3r + s, s), r, s ∈ R}.
(a) Establish that S is a subspace of R3 .
(b) Show that the vectors in S lie on the plane with
equation 3x − y + 7z = 0.
For Problems 3–19, express S in set notation and determine
whether it is a subspace of the given vector space V .
3. V = R2 , and S is the set of all vectors (x, y) in V
satisfying 3x + 2y = 0.

6. V = Rn , and S is the set of all solutions to the nonhomogeneous linear system Ax = b, where A is a fixed m × n matrix and b (≠ 0) is a fixed vector.
7. V = R2 , and S consists of all vectors (x, y) satisfying
x 2 − y 2 = 0.
8. V = M2 (R), and S is the subset of all 2 × 2 matrices
with det(A) = 1.
9. V = Mn (R), and S is the subset of all n × n lower
triangular matrices.

10. V = Mn (R), and S is the subset of all n × n invertible
matrices.
11. V = M2 (R), and S is the subset of all 2 × 2 symmetric
matrices.
12. V = M2 (R), and S is the subset of all 2 × 2 skew-symmetric matrices.
13. V is the vector space of all real-valued functions deﬁned on the interval [a, b], and S is the subset of V
consisting of all functions satisfying f (a) = f (b).
14. V is the vector space of all real-valued functions deﬁned on the interval [a, b], and S is the subset of V
consisting of all functions satisfying f (a) = 1.
15. V is the vector space of all real-valued functions deﬁned on the interval (−∞, ∞), and S is the subset of V
consisting of all functions satisfying f (−x) = f (x)
for all x ∈ (−∞, ∞).
16. V = P2 , and S is the subset of P2 consisting of all
polynomials of the form p(x) = ax 2 + b.
17. V = P2 , and S is the subset of P2 consisting of all
polynomials of the form p(x) = ax 2 + 1.
18. V = C 2 (I ), and S is the subset of V consisting of
those functions satisfying the differential equation
y ′′ + 2y ′ − y = 0
on I .
19. V = C 2 (I ), and S is the subset of V consisting of those functions satisfying the differential equation

y ′′ + 2y ′ − y = 1

on I .

4. V = R4 , and S is the set of all vectors of the form (x1 , 0, x3 , 2).

5. V = R3 , and S is the set of all vectors (x, y, z) in V satisfying x + y + z = 1.
For Problems 20–22, determine the null space of the given matrix A.

20. A =
[  1 −2  1 ]
[  4 −7 −2 ]
[ −1  3  4 ].

21. A =
[ 1  3 −2  1 ]
[ 3 10 −4  6 ]
[ 2  5 −6 −1 ].

22. A =
[  1   i  −2 ]
[  3  4i  −5 ]
[ −1 −3i   i ].

23. Show that the set of all solutions to the nonhomogeneous differential equation

y ′′ + a1 y ′ + a2 y = F (x),

where F (x) is nonzero on an interval I , is not a subspace of C 2 (I ).

24. Let S1 and S2 be subspaces of a vector space V . Let

S1 ∪ S2 = {v ∈ V : v ∈ S1 or v ∈ S2 },
S1 ∩ S2 = {v ∈ V : v ∈ S1 and v ∈ S2 },

and let

S1 + S2 = {v ∈ V : v = x + y for some x ∈ S1 and y ∈ S2 }.

(a) Show that, in general, S1 ∪ S2 is not a subspace of V .
(b) Show that S1 ∩ S2 is a subspace of V .
(c) Show that S1 + S2 is a subspace of V .

4.4 Spanning Sets
The only algebraic operations that are deﬁned in a vector space V are those of addition
and scalar multiplication. Consequently, the most general way in which we can combine
the vectors v1 , v2 , . . . , vk in V is
c1 v1 + c2 v2 + · · · + ck vk , (4.4.1) where c1 , c2 , . . . , ck are scalars. An expression of the form (4.4.1) is called a linear
combination of v1 , v2 , . . . , vk . Since V is closed under addition and scalar multiplication, it follows that the foregoing linear combination is itself a vector in V . One of the
questions we wish to answer is whether every vector in a vector space can be obtained
by taking linear combinations of a ﬁnite set of vectors. The following terminology is
used in the case when the answer to this question is afﬁrmative: DEFINITION 4.4.1
If every vector in a vector space V can be written as a linear combination of v1 ,
v2 , . . . , vk , we say that V is spanned or generated by v1 , v2 , . . . , vk and call the
set of vectors {v1 , v2 , . . . , vk } a spanning set for V . In this case, we also say that
{v1 , v2 , . . . , vk } spans V .
This spanning idea was introduced in the preceding section within the framework
of differential equations. In addition, we are all used to representing geometric vectors
in R3 in terms of their components as (see Section 4.1)
v = a i + bj + ck,
where i, j, and k denote the unit vectors pointing along the positive x -, y -, and z-axes,
respectively, of a rectangular Cartesian coordinate system. Using the above terminology,
we say that v has been expressed as a linear combination of the vectors i, j, and k, and
that the vector space of all geometric vectors is spanned by i, j, and k.
We now consider several examples to illustrate the spanning concept in different
vector spaces.
Example 4.4.2 Show that R2 is spanned by the vectors
v1 = (1, 1) and v2 = (2, −1).

Solution: We must establish that for every v = (x1 , x2 ) in R2 , there exist constants
c1 and c2 such that
v = c1 v1 + c2 v2 .    (4.4.2)

That is, in component form,

(x1 , x2 ) = c1 (1, 1) + c2 (2, −1).

Equating corresponding components in this equation yields the following linear system:

c1 + 2c2 = x1 ,
c1 − c2 = x2 .

Figure 4.4.1: The vector v = (2, 1) expressed as a linear combination of v1 = (1, 1) and v2 = (2, −1).

In this system, we view x1 and x2 as fixed, while the variables we must solve for are c1 and c2 . The determinant of the matrix of coefficients of this system is

| 1  2 |
| 1 −1 | = −3.
Since this is nonzero regardless of the values of x1 and x2 , the matrix of coefﬁcients is
invertible, and hence for all (x1 , x2 ) ∈ R2 , the system has a (unique) solution according
to Theorem 2.6.4. Thus, Equation (4.4.2) can be satisﬁed for every vector v ∈ R2 , so the
given vectors do span R2 . Indeed, solving the linear system yields
c1 = (1/3)(x1 + 2x2 ),    c2 = (1/3)(x1 − x2 ).

Hence,

(x1 , x2 ) = (1/3)(x1 + 2x2 )v1 + (1/3)(x1 − x2 )v2 .

For example, if v = (2, 1), then c1 = 4/3 and c2 = 1/3, so that v = (4/3)v1 + (1/3)v2 . This is illustrated in Figure 4.4.1.

Figure 4.4.2: Any two noncollinear vectors in R2 span R2 .

More generally, any two nonzero and noncollinear vectors v1 and v2 in R2 span R2 ,
since, as illustrated geometrically in Figure 4.4.2, every vector in R2 can be written as a
linear combination of v1 and v2 . Example 4.4.3 Determine whether the vectors v1 = (1, −1, 4), v2 = (−2, 1, 3), and v3 = (4, −3, 5)
span R3 . Solution: Let v = (x1 , x2 , x3 ) be an arbitrary vector in R3 . We must determine
whether there are real numbers c1 , c2 , c3 such that
v = c1 v1 + c2 v2 + c3 v3 (4.4.3) or, in component form,
(x1 , x2 , x3 ) = c1 (1, −1, 4) + c2 (−2, 1, 3) + c3 (4, −3, 5).
Equating corresponding components on either side of this vector equation yields
c1 − 2c2 + 4c3 = x1 ,
−c1 + c2 − 3c3 = x2 ,
4c1 + 3c2 + 5c3 = x3 .

Reducing the augmented matrix of this system to row-echelon form, we obtain

[ 1 −2  4 |              x1 ]
[ 0  1 −1 |        −x1 − x2 ]
[ 0  0  0 | 7x1 + 11x2 + x3 ]
It follows that the system is consistent if and only if x1 , x2 , x3 satisfy
7x1 + 11x2 + x3 = 0. (4.4.4) Consequently, Equation (4.4.3) holds only for those vectors v = (x1 , x2 , x3 ) in R3
whose components satisfy Equation (4.4.4). Hence, v1 , v2 , and v3 do not span R3 .
Geometrically, Equation (4.4.4) is the equation of a plane through the origin in space,
and so by taking linear combinations of the given vectors, we can obtain only those
vectors which lie on this plane. We leave it as an exercise to verify that indeed the three
given vectors lie in the plane with Equation (4.4.4). It is worth noting that this plane
forms a subspace S of R3 , and that while R3 is not spanned by the vectors v1 , v2 , and v3 ,
S is.
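The conclusion of Example 4.4.3 can be reproduced numerically; a sketch using NumPy with the vectors of the example:

```python
import numpy as np

v1, v2, v3 = np.array([1, -1, 4]), np.array([-2, 1, 3]), np.array([4, -3, 5])
A = np.column_stack([v1, v2, v3])

# rank(A) = 2 < 3: the columns span only a plane, not all of R^3
assert np.linalg.matrix_rank(A) == 2

# Each vi lies on the plane 7x1 + 11x2 + x3 = 0 found in the text
n = np.array([7, 11, 1])
assert all(n @ v == 0 for v in (v1, v2, v3))
```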
The reason that the vectors in the previous example did not span R3 is that
they were coplanar. In general, any three noncoplanar vectors v1 , v2 , and v3 in R3
span R3 , since, as illustrated in Figure 4.4.3, every vector in R3 can be written as a linear
combination of v1 , v2 , and v3 . In subsequent sections we will make this same observation
from a more algebraic point of view.
Figure 4.4.3: Any three noncoplanar vectors in R3 span R3 .

Notice in the previous example that the linear combination (4.4.3) can be written as
the matrix equation
Ac = v,
where the columns of A are the given vectors v1 , v2 , and v3 : A = [v1 , v2 , v3 ]. Thus,
the question of whether or not the vectors v1 , v2 , and v3 span R3 can be formulated as
follows: Does the system Ac = v have a solution c for every v in R3 ? If so, then the
column vectors of A span R3 , and if not, then the column vectors of A do not span R3 .
This reformulation applies more generally to vectors in Rn , and we state it here for the
record.
Theorem 4.4.4 Let v1 , v2 , . . . , vk be vectors in Rn . Then {v1 , v2 , . . . , vk } spans Rn if and only if, for
the matrix A = [v1 , v2 , . . . , vk ], the linear system Ac = v is consistent for every v in
Rn .

Proof Rewriting the system Ac = v as the linear combination
c1 v1 + c2 v2 + · · · + ck vk = v,
we see that the existence of a solution (c1 , c2 , . . . , ck ) to this vector equation for each v
in Rn is equivalent to the statement that {v1 , v2 , . . . , vk } spans Rn .
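For Rn , the consistency condition of Theorem 4.4.4 is equivalent to the coefficient matrix A having rank n, which gives a quick computational test. A sketch — the helper `spans_Rn` is our own name, not the text's:

```python
import numpy as np

def spans_Rn(vectors):
    """True if the given vectors span R^n, where n = len(vectors[0]).
    Ac = v is consistent for every v exactly when rank(A) = n."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == A.shape[0]

assert spans_Rn([(1, 1), (2, -1)])                         # Example 4.4.2
assert not spans_Rn([(1, -1, 4), (-2, 1, 3), (4, -3, 5)])  # Example 4.4.3
```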
Next, we consider a couple of examples involving vector spaces other than Rn .
Example 4.4.5  Verify that

A1 = [ 1 0 ]    A2 = [ 1 1 ]    A3 = [ 1 1 ]    A4 = [ 1 1 ]
     [ 0 0 ],        [ 0 0 ],        [ 1 0 ],        [ 1 1 ]

span M2 (R).

Solution: An arbitrary vector in M2 (R) is of the form

A = [ a b ]
    [ c d ].

If we write
c1 A1 + c2 A2 + c3 A3 + c4 A4 = A,
then equating the elements of the matrices on each side of the equation yields the system
c1 + c2 + c3 + c4 = a,
     c2 + c3 + c4 = b,
          c3 + c4 = c,
               c4 = d.

Solving this by back substitution gives
c1 = a − b, c2 = b − c, c3 = c − d, c4 = d. Hence, we have
A = (a − b)A1 + (b − c)A2 + (c − d)A3 + dA4 .
Consequently every vector in M2 (R) can be written as a linear combination of A1 , A2 ,
A3 , and A4 , and therefore these matrices do indeed span M2 (R). Remark The most natural spanning set for M2 (R) is
[ 1 0 ]   [ 0 1 ]   [ 0 0 ]   [ 0 0 ]
[ 0 0 ] , [ 0 0 ] , [ 1 0 ] , [ 0 1 ] ,

a fact that we leave to the reader as an exercise.
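The back-substitution coefficients found in Example 4.4.5 can be checked numerically; a sketch in which the entries a, b, c, d are arbitrary test values of our choosing:

```python
import numpy as np

A1 = np.array([[1, 0], [0, 0]])
A2 = np.array([[1, 1], [0, 0]])
A3 = np.array([[1, 1], [1, 0]])
A4 = np.array([[1, 1], [1, 1]])

a, b, c, d = 5.0, -2.0, 3.0, 1.0
A = np.array([[a, b], [c, d]])

# Coefficients from the back substitution: c1 = a-b, c2 = b-c, c3 = c-d, c4 = d
combo = (a - b) * A1 + (b - c) * A2 + (c - d) * A3 + d * A4
assert np.allclose(combo, A)
```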
Example 4.4.6 Determine a spanning set for P2 , the vector space of all polynomials of degree 2 or less. Solution: The general polynomial in P2 is
p(x) = a0 + a1 x + a2 x 2 . If we let
p0 (x) = 1, p1 (x) = x, p2 (x) = x 2 , then
p(x) = a0 p0 (x) + a1 p1 (x) + a2 p2 (x).
Thus, every vector in P2 is a linear combination of 1, x , and x 2 , and so a spanning set
for P2 is {1, x, x 2 }. For practice, the reader might show that {x 2 , x + x 2 , 1 + x + x 2 } is
another spanning set for P2 , by making the appropriate modiﬁcations to the calculations
in this example.

The Linear Span of a Set of Vectors
Now let v1 , v2 , . . . , vk be vectors in a vector space V . Forming all possible linear combinations of v1 , v2 , . . . , vk generates a subset of V called the linear span of {v1 , v2 , . . . , vk },
denoted span{v1 , v2 , . . . , vk }. We have
span{v1 , v2 , . . . , vk } = {v ∈ V : v = c1 v1 + c2 v2 + · · · + ck vk , c1 , c2 , . . . , ck ∈ F }.
(4.4.5)
For example, suppose V = C 2 (I ), and let y1 (x) = sin x and y2 (x) = cos x . Then
span{y1 , y2 } = {y ∈ C 2 (I ) : y(x) = c1 cos x + c2 sin x, c1 , c2 ∈ R}.
From Example 1.2.16, we recognize y1 and y2 as being nonproportional solutions to
the differential equation y ′′ + y = 0. Consequently, in this example, the linear span of the given functions coincides with the set of all solutions to the differential equation y ′′ + y = 0 and therefore is a subspace of V . Our next theorem generalizes this to show
that any linear span of vectors in any vector space forms a subspace. Theorem 4.4.7 Let v1 , v2 , . . . , vk be vectors in a vector space V . Then span{v1 , v2 , . . . , vk } is a subspace
of V . Proof Let S = span{v1 , v2 , . . . , vk }. Then 0 ∈ S (corresponding to c1 = c2 = · · · =
ck = 0 in (4.4.5)), so S is nonempty. We now verify closure of S under addition and
scalar multiplication. If u and v are in S , then, from Equation (4.4.5),
u = a1 v1 + a2 v2 + · · · + ak vk and v = b1 v1 + b2 v2 + · · · + bk vk , for some scalars ai , bi . Thus,
u + v = (a1 v1 + a2 v2 + · · · + ak vk ) + (b1 v1 + b2 v2 + · · · + bk vk )
= (a1 + b1 )v1 + (a2 + b2 )v2 + · · · + (ak + bk )vk
= c1 v1 + c2 v2 + · · · + ck vk ,
where ci = ai + bi for each i = 1, 2, . . . , k . Consequently, u + v has the proper form
for membership in S according to (4.4.5), so S is closed under addition. Further, if r is
any scalar, then
r u = r(a1 v1 + a2 v2 + · · · + ak vk )
= (ra1 )v1 + (ra2 )v2 + · · · + (rak )vk
= d1 v1 + d2 v2 + · · · + dk vk ,
where di = rai for each i = 1, 2, . . . , k . Consequently, r u ∈ S , and so S is also closed
under scalar multiplication. Hence, S = span{v1 , v2 , . . . , vk } is a subspace of V . Remarks
1. We will also refer to span{v1 , v2 , . . . , vk } as the subspace of V spanned by
v1 , v2 , . . . , vk .
2. As a special case, we will declare that span(∅) = {0}.

Example 4.4.8  If V = R2 and v1 = (−1, 1), determine span{v1 }.

Solution: We have
span{v1 } = {v ∈ R2 : v = c1 v1 , c1 ∈ R}
= {v ∈ R2 : v = c1 (−1, 1), c1 ∈ R}
= {v ∈ R2 : v = (−c1 , c1 ), c1 ∈ R}. Geometrically, this is the line through the origin with parametric equations x = −c1 ,
y = c1 , so that the Cartesian equation of the line is y = −x . (See Figure 4.4.4.)
Figure 4.4.4: The subspace of R2 spanned by v1 = (−1, 1).

Example 4.4.9  If V = R3 , v1 = (1, 0, 1), and v2 = (0, 1, 1), determine the subspace of R3 spanned by
v1 and v2 . Does w = (1, 1, −1) lie in this subspace? Solution: We have span{v1 , v2 } = {v ∈ R3 : v = c1 v1 + c2 v2 , c1 , c2 ∈ R}
= {v ∈ R3 : v = c1 (1, 0, 1) + c2 (0, 1, 1), c1 , c2 ∈ R}
= {v ∈ R3 : v = (c1 , c2 , c1 + c2 ), c1 , c2 ∈ R}.
Since the vector w = (1, 1, −1) is not of the form (c1 , c2 , c1 + c2 ), it does not lie in
span{v1 , v2 }. Geometrically, span{v1 , v2 } is the plane through the origin determined by
the two given vectors v1 and v2 . It has parametric equations x = c1 , y = c2 , z = c1 + c2 ,
which implies that its Cartesian equation is z = x + y . Thus, the fact that w is not
in span{v1 , v2 } means that w does not lie in this plane. The subspace is depicted in
Figure 4.4.5.
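The membership question in Example 4.4.9 can also be settled numerically: w lies in span{v1 , v2 } exactly when the least-squares residual of the system Ac = w vanishes. A sketch:

```python
import numpy as np

v1, v2 = np.array([1, 0, 1]), np.array([0, 1, 1])
w = np.array([1, 1, -1])
A = np.column_stack([v1, v2])

# Solve Ac = w in the least-squares sense; a nonzero residual means
# w is NOT a linear combination of v1 and v2.
c, residual, *_ = np.linalg.lstsq(A, w, rcond=None)
assert residual[0] > 1e-9

# The plane z = x + y from the text gives the same verdict
assert w[2] != w[0] + w[1]
```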
Figure 4.4.5: The subspace of R3 spanned by v1 = (1, 0, 1) and v2 = (0, 1, 1) is the plane with Cartesian equation z = x + y .

Example 4.4.10  Let
A1 = [ 1 0 ]    A2 = [ 0 1 ]    A3 = [ 0 0 ]
     [ 0 0 ],        [ 1 0 ],        [ 0 1 ]

in M2 (R). Determine span{A1 , A2 , A3 }.

Solution: By definition we have

span{A1 , A2 , A3 } = {A ∈ M2 (R) : A = c1 A1 + c2 A2 + c3 A3 , c1 , c2 , c3 ∈ R}
= { A ∈ M2 (R) : A = c1 [ 1 0 ] + c2 [ 0 1 ] + c3 [ 0 0 ] }
                        [ 0 0 ]      [ 1 0 ]      [ 0 1 ]

= { A ∈ M2 (R) : A = [ c1 c2 ] , c1 , c2 , c3 ∈ R }.
                     [ c2 c3 ]

This is the set of all real 2 × 2 symmetric matrices.

Example 4.4.11  Determine the subspace of P2 spanned by
p1 (x) = 1 + 3x, p2 (x) = x + x 2 , and decide whether {p1 , p2 } is a spanning set for P2 . Solution: We have span{p1 , p2 } = {p ∈ P2 : p(x) = c1 p1 (x) + c2 p2 (x), c1 , c2 ∈ R}
= {p ∈ P2 : p(x) = c1 (1 + 3x) + c2 (x + x 2 ), c1 , c2 ∈ R}
= {p ∈ P2 : p(x) = c1 + (3c1 + c2 )x + c2 x 2 , c1 , c2 ∈ R}.
Next, we will show that {p1 , p2 } is not a spanning set for P2 . To establish this, we need
give only one example of a polynomial in P2 that is not in span{p1 , p2 }. There are many
such choices here, but suppose we consider p(x) = 1 + x . If this polynomial were in
span{p1 , p2 }, then we would have to be able to ﬁnd values of c1 and c2 such that
1 + x = c1 + (3c1 + c2 )x + c2 x 2 . (4.4.6) Since there is no x 2 term on the left-hand side of this expression, we must set c2 = 0.
But then (4.4.6) would reduce to
1 + x = c1 (1 + 3x).
Equating the constant terms on each side of this forces c1 = 1, but then the coefﬁcients of
x do not match. Hence, such an equality is impossible. Consequently, there are no values
of c1 and c2 such that the Equation (4.4.6) holds, and therefore, span{p1 , p2 } = P2 . Remark In the previous example, the reader may well wonder why we knew from
the beginning to select p(x) = 1 + x as a vector that would be outside of span{p1 , p2 }.
In truth, we only need to ﬁnd a polynomial that does not have the form p(x) = c1 +
(3c1 + c2 )x + c2 x 2 and in fact, “most” of the polynomials in P2 would have achieved
the desired result here.

Exercises for 4.4

Key Terms
Linear combination, Linear span, Spanning set.

Skills
• Be able to determine whether a given set of vectors S spans a vector space V , and be able to prove your answer mathematically.
• Be able to determine the linear span of a set of vectors. For vectors in Rn , be able to give a geometric description of the linear span.
• If S is a spanning set for a vector space V , be able to write any vector in V as a linear combination of the elements of S .
• Be able to construct a spanning set for a vector space V . As a special case, be able to determine a spanning set for the null space of an m × n matrix.
• Be able to determine whether a particular vector v in a vector space V lies in the linear span of a set S of vectors in V .

True-False Review
For Questions 1–12, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.

1. The linear span of a set of vectors in a vector space V forms a subspace of V .

2. If some vector v in a vector space V is a linear combination of vectors in a set S , then S spans V .

3. If S is a spanning set for a vector space V and W is a subspace of V , then S is a spanning set for W .

4. If S is a spanning set for a vector space V , then every vector v in V must be uniquely expressible as a linear combination of the vectors in S .

5. A set S of vectors in a vector space V spans V if and only if the linear span of S is V .

6. The linear span of two vectors in R3 is a plane through the origin.

7. Every vector space V has a finite spanning set.

8. If S is a spanning set for a vector space V , then any proper subset S ′ of S is not a spanning set for V .

9. The vector space of 3 × 3 upper triangular matrices is spanned by the matrices Eij where 1 ≤ i ≤ j ≤ 3.

10. A spanning set for the vector space P2 must contain a polynomial of each degree 0, 1, and 2.

11. If m < n, then any spanning set for Rn must contain more vectors than any spanning set for Rm .

12. The vector space P of all polynomials with real coefficients cannot be spanned by a finite set S .

Problems
For Problems 1–3, determine whether the given set of vectors
spans R2 .
1. {(1, −1), (2, −2), (2, 3)}.
2. {(2, 5), (0, 0)}.
3. {(6, −2), (−2, 2/3), (3, −1)}.
Recall that three vectors v1 , v2 , v3 in R3 are coplanar if and
only if
det([v1 , v2 , v3 ]) = 0.
For Problems 4–6, use this result to determine whether the
given set of vectors spans R3 .
4. {(1, −1, 1), (2, 5, 3), (4, −2, 1)}.
5. {(1, −2, 1), (2, 3, 1), (0, 0, 0), (4, −1, 2)}.
6. {(2, −1, 4), (3, −3, 5), (1, 1, 3)}.
7. Show that the set of vectors
{(1, 2, 3), (3, 4, 5), (4, 5, 6)}
does not span R3 , but that it does span the subspace
of R3 consisting of all vectors lying in the plane with
equation x − 2y + z = 0.
8. Show that v1 = (2, −1), v2 = (3, 2) span R2 , and express the vector v = (5, −7) as a linear combination
of v1 , v2 .
9. Show that v1 = (−1, 3, 2), v2 = (1, −2, 1), v3 =
(2, 1, 1) span R3 , and express v = (x, y, z) as a linear
combination of v1 , v2 , v3 . i i i i i i i “main”
2007/2/16
page 266
i 266 CHAPTER 4 Vector Spaces 10. Show that v1 = (1, 1), v2 = (−1, 2), v3 = (1, 4) span
R2 . Do v1 , v2 alone span R2 also? For Problems 22–24, determine whether the given vector v
lies in span{v1 , v2 }. 11. Let S be the subspace of R3 consisting of all vectors
of the form v = (c1 , c2 , c2 − 2c1 ). Show that S is
spanned by v1 = (1, 0, −2), v2 = (0, 1, 1). 22. v = (3, 3, 4), v1 = (1, −1, 2), v2 = (2, 1, 3) in R3 . 12. Let S be the subspace of R4 consisting of all vectors
of the form v = (c1 , c2 , c2 − c1 , c1 − 2c2 ). Determine
a set of vectors that spans S .
13. Let S be the subspace of R3 consisting of all solutions
to the linear system
x − 2y − z = 0. 23. v = (5, 3, −6), v1 = (−1, 1, 2), v2 = (3, 1, −4) in
R3 .
24. v = (1, 1, −2), v1 = (3, 1, 2), v2 = (−2, −1, 1) in
R3 .
25. If p1 (x) = x − 4 and p2 (x) = x 2 − x + 3, determine
whether p(x) = 2x 2 − x + 2 lies in span{p1 , p2 }.
26. Consider the vectors Determine a set of vectors that spans S .
For Problems 14–15, determine a spanning set for the null
space of the given matrix A. 123
14. A = 3 4 5 .
567 123 5
15. A = 1 3 4 2 .
2 4 6 −1
16. Let S be the subspace of M2 (R) consisting of all symmetric 2 × 2 matrices with real elements. Show that S
is spanned by the matrices
A1 = 10
,
00 A2 = 00
,
01 A3 = 01
.
10 17. Let S be the subspace of M2 (R) consisting of all skewsymmetric 2 × 2 matrices with real elements. Determine a matrix that spans S .
18. Let S be the subset of M2 (R) consisting of all upper
triangular 2 × 2 matrices. A1 = 20. v1 = (1, 2, −1), v2 = (−2, −4, 2).
R3 spanned by the vectors
21. Let S be the subspace of
v1 = (1, 1, −1), v2 = (2, 1, 3), v3 = (−2, −2, 2).
Show that S also is spanned by v1 and v2 only. 01
,
−2 1 A3 = 30
12 27. Consider the vectors
A1 = 12
,
−1 3 A2 = −2 1
1 −1 in M2 (R). Find span{A1 , A2 }, and determine whether
or not
31
B=
−2 4
lies in this subspace.
28. Let V = C ∞ (I ) and let S be the subspace of V
spanned by the functions
f (x) = cosh x, g(x) = sinh x. (a) Give an expression for a general vector in S .
(b) Show that S is also spanned by the functions
h(x) = ex , (b) Determine a set of 2 × 2 matrices that spans S . 19. v1 = (1, −1, 2), v2 = (2, −1, 3). A2 = in M2 (R). Determine span{A1 , A2 , A3 }. (a) Verify that S is a subspace of M2 (R).
For Problems 19–20, determine span{v1 , v2 } for the given
vectors in R3 , and describe it geometrically. 1 −1
,
20 j (x) = e−x . For Problems 29–32, give a geometric description of the subspace of R3 spanned by the given set of vectors.
29. {0}.
30. {v1 }, where v1 is any nonzero vector in R3 .
31. {v1 , v2 }, where v1 , v2 are nonzero and noncollinear
vectors in R3 . i i i i i i i “main”
2007/2/16
page 267
i 4.5 32. {v1 , v2 }, where v1 , v2 are collinear vectors in R3 .
33. Prove that if S and S are subsets of a vector space V
such that S is a subset of S , then span(S) is a subset
of span(S ). 4.5 Linear Dependence and Linear Independence 267 34. Prove that
span{v1 , v2 , v3 } = span{v1 , v2 }
if and only if v3 can be written as a linear combination
of v1 and v2 . Linear Dependence and Linear Independence
As indicated in the previous section, in analyzing a vector space we will be interested in
determining a spanning set. The reader has perhaps already noticed that a vector space
V can have many such spanning sets. Example 4.5.1 Observe that {(1, 0), (0, 1)}, {(1, 0), (1, 1)}, and {(1, 0), (0, 1), (1, 2)} are all spanning
sets for R2 .
As another illustration, two different spanning sets for V = M2 (R) were given in Example 4.4.5 and the remark that followed. Given the abundance of spanning sets available
for a given vector space V , we are faced with a natural question: Is there a “best class
of” spanning sets to use? The answer, to a large degree, is “yes”. For instance, in Example 4.5.1, the spanning set {(1, 0), (0, 1), (1, 2)} contains an “extra” vector, (1, 2), which
seems to be unnecessary for spanning R2 , since {(1, 0), (0, 1)} is already a spanning set.
In some sense, {(1, 0), (0, 1)} is a more efﬁcient spanning set. It is what we call a minimal spanning set, since it contains the minimum number of vectors needed to span the
vector space.3
But how will we know if we have found a minimal spanning set (assuming one
exists)? Returning to the example above, we have seen that
span{(1, 0), (0, 1)} = span{(1, 0), (0, 1), (1, 2)} = R2 .
Observe that the vector (1, 2) is already a linear combination of (1, 0) and (0, 1), and
therefore it does not add any new vectors to the linear span of {(1, 0), (0, 1)}.
As a second example, consider the vectors v1 = (1, 1, 1), v2 = (3, −2, 1), and
v3 = 4v1 + v2 = (7, 2, 5). It is easily veriﬁed that det([v1 , v2 , v3 ]) = 0. Consequently,
the three vectors lie in a plane (see Figure 4.5.1) and therefore, since they are not collinear,
the linear span of these three vectors is the whole of this plane. Furthermore, the same
plane is generated if we consider the linear span of v1 and v2 alone. As in the previous
example, the reason that v3 does not add any new vectors to the linear span of {v1 , v2 }
is that it is already a linear combination of v1 and v2 . It is not possible, however, to
generate all vectors in the plane by taking linear combinations of just one vector, as we
could generate only a line lying in the plane in that case. Consequently, {v1 , v2 } is a
minimal spanning set for the subspace of R3 consisting of all points lying on the plane.

[Footnote 3: Since a single (nonzero) vector in R2 spans only the line through the origin along which it points, it cannot span all of R2; hence, the minimum number of vectors required to span R2 is 2.]

[Figure 4.5.1: v3 = 4v1 + v2 lies in the plane through the origin containing v1 and v2, and so span{v1, v2, v3} = span{v1, v2}.]

As a final example, recall from Example 1.2.16 that the solution space to the differential equation

y′′ + y = 0

can be written as span{y1, y2}, where y1(x) = cos x and y2(x) = sin x. However, if we
let y3 (x) = 3 cos x − 2 sin x , for instance, then {y1 , y2 , y3 } is also a spanning set for the
solution space of the differential equation, since
span{y1 , y2 , y3 } = {c1 cos x + c2 sin x + c3 (3 cos x − 2 sin x) : c1 , c2 , c3 ∈ R}
= {(c1 + 3c3 ) cos x + (c2 − 2c3 ) sin x : c1 , c2 , c3 ∈ R}
= {d1 cos x + d2 sin x : d1 , d2 ∈ R}
= span{y1 , y2 }.
The reason that {y1 , y2 , y3 } is not a minimal spanning set for the solution space is that
y3 is a linear combination of y1 and y2 , and therefore, as we have just shown, it does not
add any new vectors to the linear span of {cos x, sin x }.
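The coefficient bookkeeping above can be mirrored in a few lines of Python. This is an illustrative sketch (the function names are ours, not the text's): it samples an arbitrary combination of y1, y2, y3 and confirms that it agrees with the reduced combination d1 cos x + d2 sin x, where d1 = c1 + 3c3 and d2 = c2 − 2c3.

```python
import math

def y3_combo(c1, c2, c3, x):
    # An arbitrary element of span{y1, y2, y3} = span{cos x, sin x, 3 cos x - 2 sin x}.
    return c1 * math.cos(x) + c2 * math.sin(x) + c3 * (3 * math.cos(x) - 2 * math.sin(x))

def reduced_combo(c1, c2, c3, x):
    # The same element, rewritten as d1 cos x + d2 sin x with
    # d1 = c1 + 3*c3 and d2 = c2 - 2*c3, exactly as in the computation above.
    d1, d2 = c1 + 3 * c3, c2 - 2 * c3
    return d1 * math.cos(x) + d2 * math.sin(x)

# The two expressions agree (up to rounding) for any coefficients and any x.
for x in (-2.0, -0.5, 0.0, 1.0, 3.7):
    assert abs(y3_combo(1.5, -2.0, 0.75, x) - reduced_combo(1.5, -2.0, 0.75, x)) < 1e-9
```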
More generally, it is not too difﬁcult to extend the argument used in the preceding
examples to establish the following general result. Theorem 4.5.2 Let {v1 , v2 , . . . , vk } be a set of at least two vectors in a vector space V . If one of the
vectors in the set is a linear combination of the other vectors in the set, then that vector
can be deleted from the given set of vectors and the linear span of the resulting set of
vectors will be the same as the linear span of {v1 , v2 , . . . , vk }. Proof The proof of this result is left for the exercises (Problem 48).
For instance, if v1 is a linear combination of v2 , v3 , . . . , vk , then Theorem 4.5.2 says
that
span{v1 , v2 , . . . , vk } = span{v2 , v3 , . . . , vk }.
In this case, the set {v1 , v2 , . . . , vk } is not a minimal spanning set.
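One way to see Theorem 4.5.2 concretely in Rn is to compare ranks: deleting a vector leaves the span unchanged exactly when it leaves the rank of the corresponding matrix unchanged. Below is a minimal sketch under that viewpoint (the `rank` helper is our own, using exact rational elimination; it is not from the text), applied to the earlier spanning set {(1, 0), (0, 1), (1, 2)}.

```python
from fractions import Fraction

def rank(vectors):
    # Row-reduce the matrix whose rows are the given vectors, using exact
    # rational arithmetic, and count how many pivot rows remain.
    m = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# (1, 2) is a linear combination of (1, 0) and (0, 1), so deleting it
# leaves the dimension of the span, and hence the span itself, unchanged.
assert rank([(1, 0), (0, 1), (1, 2)]) == rank([(1, 0), (0, 1)]) == 2
```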
To determine a minimal spanning set, the problem we face in view of Theorem 4.5.2
is that of determining when a vector in {v1 , v2 , . . . , vk } can be expressed as a linear
combination of the remaining vectors in the set. The correct formulation for solving this
problem requires the concepts of linear dependence and linear independence, which we
are now ready to introduce. First we consider linear dependence.

DEFINITION 4.5.3
A ﬁnite nonempty set of vectors {v1 , v2 , . . . , vk } in a vector space V is said to be
linearly dependent if there exist scalars c1 , c2 , . . . , ck , not all zero, such that
c1 v1 + c2 v2 + · · · + ck vk = 0.
Such a nontrivial linear combination of vectors is sometimes referred to as a linear
dependency among the vectors v1 , v2 , . . . , vk .
A set of vectors that is not linearly dependent is called linearly independent. This can be
stated mathematically as follows: DEFINITION 4.5.4
A ﬁnite, nonempty set of vectors {v1 , v2 , . . . , vk } in a vector space V is said to be
linearly independent if the only values of the scalars c1 , c2 , . . . , ck for which
c1 v1 + c2 v2 + · · · + ck vk = 0
are c1 = c2 = · · · = ck = 0. Remarks
1. It follows immediately from the preceding two deﬁnitions that a nonempty set of
vectors in a vector space V is linearly independent if and only if it is not linearly
dependent.
2. If {v1 , v2 , . . . , vk } is a linearly independent set of vectors, we sometimes informally
say that the vectors v1 , v2 , . . . , vk are themselves linearly independent. The same
remark applies to the linearly dependent condition as well.
Consider the simple case of a set containing a single vector v. If v = 0, then {v} is
linearly dependent, since for any nonzero scalar c1 ,
c1 0 = 0.
On the other hand, if v = 0, then the only value of the scalar c1 for which
c1 v = 0
is c1 = 0. Consequently, {v} is linearly independent. We can therefore state the next
theorem.
Theorem 4.5.5 A set consisting of a single vector v in a vector space V is linearly dependent if and only
if v = 0. Therefore, any set consisting of a single nonzero vector is linearly independent.
We next establish that linear dependence of a set containing at least two vectors is
equivalent to the property that we are interested in—namely, that at least one vector in
the set can be expressed as a linear combination of the remaining vectors in the set.

Theorem 4.5.6 Let {v1 , v2 , . . . , vk } be a set of at least two vectors in a vector space V . Then {v1 , v2 , . . . , vk }
is linearly dependent if and only if at least one of the vectors in the set can be expressed
as a linear combination of the others. Proof If {v1 , v2 , . . . , vk } is linearly dependent, then according to Deﬁnition 4.5.3, there
exist scalars c1 , c2 , . . . , ck , not all zero, such that
c1 v1 + c2 v2 + · · · + ck vk = 0.
Suppose that ci = 0. Then we can express vi as a linear combination of the other vectors
as follows:
vi = −(1/ci )(c1 v1 + c2 v2 + · · · + ci−1 vi−1 + ci+1 vi+1 + · · · + ck vk ).

Conversely, suppose that one of the vectors, say, vj , can be expressed as a linear combination of the remaining vectors. That is,
vj = c1 v1 + c2 v2 + · · · + cj −1 vj −1 + cj +1 vj +1 + · · · + ck vk .
Adding (−1)vj to both sides of this equation yields
c1 v1 + c2 v2 + · · · + cj −1 vj −1 − vj + cj +1 vj +1 + · · · + ck vk = 0.
Since the coefﬁcient of vj is −1 = 0, the set of vectors {v1 , v2 , . . . , vk } is linearly
dependent.
As far as the minimal-spanning-set idea is concerned, Theorems 4.5.6 and 4.5.2 tell
us that a linearly dependent spanning set for a (nontrivial) vector space V cannot be a
minimal spanning set. On the other hand, we will see in the next section that a linearly independent spanning set for V must be a minimal spanning set for V. For the remainder of this section, however, we focus more on the mechanics of determining whether a given set of vectors is linearly independent or linearly dependent. Sometimes this can be done by inspection. For example, Figure 4.5.2 illustrates that any set of three vectors in R2 is linearly dependent.

[Figure 4.5.2: The set of vectors {v1, v2, v3} is linearly dependent in R2, since v3 is a linear combination of v1 and v2.]

As another example, let V be the vector space of all functions defined on an interval I. If

f1(x) = 1,  f2(x) = 2 sin^2 x,  f3(x) = −5 cos^2 x,

then {f1, f2, f3} is linearly dependent in V, since the identity sin^2 x + cos^2 x = 1
implies that for all x ∈ I ,
f1(x) = (1/2) f2(x) − (1/5) f3(x).

We can therefore conclude from Theorem 4.5.2 that

span{1, 2 sin^2 x, −5 cos^2 x} = span{2 sin^2 x, −5 cos^2 x}.
In relatively simple examples, the following general results can be applied. They are a
direct consequence of the deﬁnition of linearly dependent vectors and are left for the
exercises (Problem 49).
Proposition 4.5.7 Let V be a vector space.
1. Any set of two vectors in V is linearly dependent if and only if the vectors are
proportional.

2. Any set of vectors in V containing the zero vector is linearly dependent.

Remark We emphasize that the first result in Proposition 4.5.7 holds only for the
case of two vectors. It cannot be applied to sets containing more than two vectors.
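Part 1 of Proposition 4.5.7 is easy to automate. The sketch below (the `proportional` helper is our own name, not from the text) tests proportionality via 2 × 2 cross-products, which avoids dividing by possibly zero components.

```python
def proportional(u, v):
    # u and v are proportional exactly when every 2 x 2 "cross-product"
    # u_i * v_j - u_j * v_i vanishes; no division by components is needed.
    return all(u[i] * v[j] == u[j] * v[i]
               for i in range(len(u)) for j in range(len(u)))

# The vectors of Example 4.5.8 below are proportional (v2 = -2 v1) ...
assert proportional((1, 2, -9), (-2, -4, 18))
# ... while these two are not, so by Proposition 4.5.7 they are independent.
assert not proportional((1, -1, 2), (2, 1, 0))
```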
Example 4.5.8 If v1 = (1, 2, −9) and v2 = (−2, −4, 18), then {v1 , v2 } is linearly dependent in R3 ,
since v2 = −2v1 . Geometrically, v1 and v2 lie on the same line. Example 4.5.9 If
A1 = [2 1; 3 4],  A2 = [0 0; 0 0],  A3 = [2 5; −3 2],

then {A1 , A2 , A3 } is linearly dependent in M2 (R), since it contains the zero vector from
M2 (R).
For more complicated situations, we must resort to Deﬁnitions 4.5.3 and 4.5.4,
although conceptually it is always helpful to keep in mind that the essence of the problem
we are solving is to determine whether a vector in a given set can be expressed as a linear
combination of the remaining vectors in the set. We now give some examples to illustrate
the use of Deﬁnitions 4.5.3 and 4.5.4.
Example 4.5.10 If v1 = (1, 2, −1), v2 = (2, −1, 1), and v3 = (8, 1, 1), show that {v1 , v2 , v3 } is linearly
dependent in R3 , and determine the linear dependency relationship. Solution: We must ﬁrst establish that there are values of the scalars c1 , c2 , c3 , not all
zero, such that
c1 v1 + c2 v2 + c3 v3 = 0. (4.5.1) Substituting for the given vectors yields
c1 (1, 2, −1) + c2 (2, −1, 1) + c3 (8, 1, 1) = (0, 0, 0).
That is,
(c1 + 2c2 + 8c3 , 2c1 − c2 + c3 , −c1 + c2 + c3 ) = (0, 0, 0).
Equating corresponding components on either side of this equation yields
c1 + 2c2 + 8c3 = 0,
2c1 − c2 + c3 = 0,
−c1 + c2 + c3 = 0.
The reduced row-echelon form of the augmented matrix of this system is

[1 0 2 0; 0 1 3 0; 0 0 0 0].

Consequently, the system has an infinite number of solutions for c1 , c2 , c3 , so the vectors
Consequently, the system has an inﬁnite number of solutions for c1 , c2 , c3 , so the vectors
are linearly dependent.
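The elimination just performed can be automated. The following sketch (the function name is ours, not the text's) row-reduces the matrix whose columns are v1, v2, v3 with exact rational arithmetic and reads off one nontrivial solution of the homogeneous system by setting the first free variable to 1; it reproduces the dependency relationship derived below.

```python
from fractions import Fraction

def nontrivial_null_vector(cols):
    # Solve A c = 0, where the COLUMNS of A are the given vectors.
    # Returns one nontrivial solution, or None if only c = 0 works.
    rows = [[Fraction(v[i]) for v in cols] for i in range(len(cols[0]))]
    n = len(cols)
    pivots = {}
    r = 0
    for col in range(n):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        pv = rows[r][col]
        rows[r] = [x / pv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots[col] = r
        r += 1
    free = [c for c in range(n) if c not in pivots]
    if not free:
        return None
    t = free[0]                       # set the first free variable to 1
    c = [Fraction(0)] * n
    c[t] = Fraction(1)
    for col, row in pivots.items():
        c[col] = -rows[row][t]
    return c

v1, v2, v3 = (1, 2, -1), (2, -1, 1), (8, 1, 1)
c = nontrivial_null_vector([v1, v2, v3])
# c comes out as (-2, -3, 1), i.e. -2*v1 - 3*v2 + v3 = 0.
assert c == [-2, -3, 1]
```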
In order to determine a speciﬁc linear dependency relationship, we proceed to ﬁnd
c1 , c2 , and c3 . Setting c3 = t , we have c2 = −3t and c1 = −2t . Taking t = 1 and substituting these values for c1 , c2 , c3 into (4.5.1), we obtain the linear dependency
relationship
−2v1 − 3v2 + v3 = 0,
or equivalently,
v1 = −(3/2)v2 + (1/2)v3 ,

which can be easily verified using the given expressions for v1 , v2 , and v3 . It follows
from Theorem 4.5.2 that
span{v1 , v2 , v3 } = span{v2 , v3 }.
Geometrically, we can conclude that v1 lies in the plane determined by the vectors v2
and v3 . Example 4.5.11 Determine whether the following matrices are linearly dependent or linearly independent
in M2 (R):
A1 = [1 −1; 2 0],  A2 = [2 1; 0 3],  A3 = [1 −1; 2 1].

Solution: The condition for determining whether these vectors are linearly dependent
or linearly independent,
c1 A1 + c2 A2 + c3 A3 = 0_2 ,

is equivalent in this case to

c1 [1 −1; 2 0] + c2 [2 1; 0 3] + c3 [1 −1; 2 1] = [0 0; 0 0],

which is satisfied if and only if
c1 + 2c2 + c3 = 0,
−c1 + c2 − c3 = 0,
2c1 + 2c3 = 0,
3c2 + c3 = 0.

The reduced row-echelon form of the augmented matrix of this homogeneous system is

[1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 0],
which implies that the system has only the trivial solution c1 = c2 = c3 = 0. It follows
from Deﬁnition 4.5.4 that {A1 , A2 , A3 } is linearly independent.
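The same test extends to matrices once each 2 × 2 matrix is flattened into a vector in R4, as in the solution above. A minimal sketch (the helper name is ours) confirms that the homogeneous system for A1, A2, A3 of Example 4.5.11 has only the trivial solution.

```python
from fractions import Fraction

def only_trivial_solution(vecs):
    # Row-reduce the matrix whose columns are the given vectors; the set is
    # linearly independent exactly when every column gains a pivot.
    rows = [[Fraction(v[i]) for v in vecs] for i in range(len(vecs[0]))]
    r = 0
    for col in range(len(vecs)):
        piv = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            return False          # a free variable exists: nontrivial solutions
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return True

# Flatten A1, A2, A3 from Example 4.5.11 row by row.
A1, A2, A3 = (1, -1, 2, 0), (2, 1, 0, 3), (1, -1, 2, 1)
assert only_trivial_solution([A1, A2, A3])
```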
As a corollary to Theorem 4.5.2, we establish the following result.
Corollary 4.5.12 Any nontrivial, ﬁnite set of linearly dependent vectors in a vector space V contains a
linearly independent subset that has the same linear span as the given set of vectors.

Proof Since the given set is linearly dependent, at least one of the vectors in the set is a
linear combination of the remaining vectors, by Theorem 4.5.6. Thus, by Theorem 4.5.2,
we can delete that vector from the set, and the resulting set of vectors will span the same
subspace of V as the original set. If the resulting set is linearly independent, then we
are done. If not, then we can repeat the procedure to eliminate another vector in the
set. Continuing in this manner (with a ﬁnite number of iterations), we will obtain a
linearly independent set that spans the same subspace of V as the subspace spanned by
the original set of vectors.
Remark Corollary 4.5.12 is actually true even if the set of vectors in question is
inﬁnite, but we shall not need to consider that case in this text. In the case of an inﬁnite
set of vectors, other techniques are required for the proof.
Note that the linearly independent set obtained using the procedure given in the proof of Corollary 4.5.12 is not unique, and therefore the question arises whether the number
of vectors in any resulting linearly independent set is independent of the manner in
which the procedure is applied. We will give an afﬁrmative answer to this question in
Section 4.6.
Example 4.5.13 Let v1 = (1, 2, 3), v2 = (−1, 1, 4), v3 = (3, 3, 2), and v4 = (−2, −4, −6). Determine a linearly independent set of vectors that spans the same subspace of R3 as
span{v1 , v2 , v3 , v4 }. Solution: Setting
c1 v1 + c2 v2 + c3 v3 + c4 v4 = 0 requires that
c1 (1, 2, 3) + c2 (−1, 1, 4) + c3 (3, 3, 2) + c4 (−2, −4, −6) = (0, 0, 0),
leading to the linear system
c1 − c2 + 3c3 − 2c4 = 0,
2c1 + c2 + 3c3 − 4c4 = 0,
3c1 + 4c2 + 2c3 − 6c4 = 0.
The augmented matrix of this system is

[1 −1 3 −2 0; 2 1 3 −4 0; 3 4 2 −6 0],

and the reduced row-echelon form of the augmented matrix of this system is

[1 0 2 −2 0; 0 1 −1 0 0; 0 0 0 0 0].
The system has two free variables, c3 = s and c4 = t , and so {v1 , v2 , v3 , v4 } is linearly
dependent. Then c2 = s and c1 = 2t − 2s . So the general form of the solution is
(2t − 2s, s, s, t) = s(−2, 1, 1, 0) + t (2, 0, 0, 1).
Setting s = 1 and t = 0 yields the linear combination
−2v1 + v2 + v3 = 0, (4.5.2)

and setting s = 0 and t = 1 yields the linear combination
2v1 + v4 = 0. (4.5.3) We can solve (4.5.2) for v3 in terms of v1 and v2 , and we can solve (4.5.3) for v4 in terms
of v1 . Hence, according to Theorem 4.5.2, we have
span{v1 , v2 , v3 , v4 } = span{v1 , v2 }.
By Proposition 4.5.7, v1 and v2 are linearly independent, so {v1 , v2 } is the linearly
independent set we are seeking. Geometrically, the subspace of R3 spanned by v1 and
v2 is a plane, and the vectors v3 and v4 lie in this plane.

Linear Dependence and Linear Independence in Rn
Let {v1 , v2 , . . . , vk } be a set of vectors in Rn , and let A denote the matrix that has
v1 , v2 , . . . , vk as column vectors. Thus,
A = [v1 , v2 , . . . , vk ]. (4.5.4) Since each of the given vectors is in Rn , it follows that A has n rows and is therefore an
n × k matrix.
The linear combination c1 v1 + c2 v2 + · · · + ck vk = 0 can be written in matrix form
as (see Theorem 2.2.9)
Ac = 0, (4.5.5) where A is given in Equation (4.5.4) and c = [c1 c2 . . . ck ]T . Consequently, we can
state the following theorem and corollary:
Theorem 4.5.14 Let v1 , v2 , . . . , vk be vectors in Rn and A = [v1 , v2 , . . . , vk ]. Then {v1 , v2 , . . . , vk } is
linearly dependent if and only if the linear system Ac = 0 has a nontrivial solution. Corollary 4.5.15 Let v1 , v2 , . . . , vk be vectors in Rn and A = [v1 , v2 , . . . , vk ].
1. If k > n, then {v1 , v2 , . . . , vk } is linearly dependent.
2. If k = n, then {v1 , v2 , . . . , vk } is linearly dependent if and only if det (A) = 0. Proof If k > n, the system (4.5.5) has an inﬁnite number of solutions (see Corollary 2.5.11), hence the vectors are linearly dependent by Theorem 4.5.14.
On the other hand, if k = n, the system (4.5.5) is n × n, and hence, from Corollary
3.2.5, it has an inﬁnite number of solutions if and only if det(A) = 0.
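The k = n case of Corollary 4.5.15 is a one-line determinant test. Below is a small sketch (our own cofactor-expansion `det`, adequate for the small matrices in this chapter) applied to the vectors v1 = (1, 1, 1), v2 = (3, −2, 1), v3 = 4v1 + v2 from the plane example earlier in this section.

```python
def det(m):
    # Cofactor expansion along the first row; fine for small matrices.
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

# Columns are v1 = (1,1,1), v2 = (3,-2,1), v3 = 4*v1 + v2 = (7,2,5).
# det = 0, so by Corollary 4.5.15 the set is linearly dependent.
A = [[1, 3, 7],
     [1, -2, 2],
     [1, 1, 5]]
assert det(A) == 0
```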
Example 4.5.16 Determine whether the given vectors are linearly dependent or linearly independent in
R4 .
1. v1 = (1, 3, −1, 0), v2 = (2, 9, −1, 3), v3 = (4, 5, 6, 11), v4 = (1, −1, 2, 5),
v5 = (3, −2, 6, 7).
2. v1 = (1, 4, 1, 7), v2 = (3, −5, 2, 3), v3 = (2, −1, 6, 9), v4 = (−2, 3, 1, 6).

Solution:
1. Since we have ﬁve vectors in R4 , Corollary 4.5.15 implies that {v1 , v2 , v3 , v4 , v5 }
is necessarily linearly dependent.
2. In this case, we have four vectors in R4 , and therefore, we can use the determinant:
det(A) = det[v1 , v2 , v3 , v4 ] = det [1 3 2 −2; 4 −5 −1 3; 1 2 6 1; 7 3 9 6] = −462.
Since the determinant is nonzero, it follows from Corollary 4.5.15 that the given
set of vectors is linearly independent.

Linear Independence of Functions
We now consider the general problem of determining whether a given set of
functions is linearly independent or linearly dependent. We begin by specializing the
general Deﬁnition 4.5.4 to the case of a set of functions deﬁned on an interval I . DEFINITION 4.5.17
The set of functions {f1 , f2 , . . . , fk } is linearly independent on an interval I if and
only if the only values of the scalars c1 , c2 , . . . , ck such that
c1 f1 (x) + c2 f2 (x) + · · · + ck fk (x) = 0, for all x ∈ I , (4.5.6) are c1 = c2 = · · · = ck = 0.
The main point to notice is that the condition (4.5.6) must hold for all x in I .
A key tool in deciding whether or not a collection of functions is linearly independent
on an interval I is the Wronskian. As we will see in Chapter 6, we can draw particularly
sharp conclusions from the Wronskian about the linear dependence or independence of
a family of solutions to a linear homogeneous differential equation. DEFINITION 4.5.18
Let f1 , f2 , . . . , fk be functions in C^{k−1}(I ). The Wronskian of these functions is the order k determinant defined by

W [f1 , f2 , . . . , fk ](x) =
| f1 (x)           f2 (x)           . . .   fk (x)           |
| f1′ (x)          f2′ (x)          . . .   fk′ (x)          |
|   .                .                        .              |
| f1^(k−1) (x)     f2^(k−1) (x)     . . .   fk^(k−1) (x)     |

Remark Notice that the Wronskian is a function defined on I . Also note that this
function depends on the order of the functions in the Wronskian. For example, using
properties of determinants,
W [f2 , f1 , . . . , fk ](x) = −W [f1 , f2 , . . . , fk ](x).

Example 4.5.19 If f1 (x) = sin x and f2 (x) = cos x on (−∞, ∞), then
W [f1 , f2 ](x) = | sin x   cos x ; cos x   −sin x | = (sin x)(−sin x) − (cos x)(cos x) = −(sin^2 x + cos^2 x) = −1.

Example 4.5.20 If f1 (x) = x , f2 (x) = x^2 , and f3 (x) = x^3 on (−∞, ∞), then
W [f1 , f2 , f3 ](x) = | x   x^2   x^3 ; 1   2x   3x^2 ; 0   2   6x | = x(12x^2 − 6x^2 ) − (6x^3 − 2x^3 ) = 2x^3 .
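Example 4.5.20 can be double-checked numerically. In the sketch below (names are ours), the derivative rows of the Wronskian are written out by hand and the 3 × 3 determinant is expanded directly; it agrees with the closed form 2x^3 at every sample point.

```python
def wronskian_3(x):
    # W[x, x^2, x^3](x), with the derivative rows written out by hand:
    # | x    x^2   x^3  |
    # | 1    2x    3x^2 |
    # | 0    2     6x   |
    a = [x, x * x, x ** 3]
    b = [1, 2 * x, 3 * x * x]
    c = [0, 2, 6 * x]
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

# Agrees with the closed form 2x^3 found in Example 4.5.20.
for x in (-2, -1, 0, 1, 3):
    assert wronskian_3(x) == 2 * x ** 3
```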
We can now state and prove the main result about the Wronskian.

Theorem 4.5.21 Let f1 , f2 , . . . , fk be functions in C^{k−1}(I ). If W [f1 , f2 , . . . , fk ] is nonzero at some
point x0 in I , then {f1 , f2 , . . . , fk } is linearly independent on I . Proof To apply Deﬁnition 4.5.17, assume that
c1 f1 (x) + c2 f2 (x) + · · · + ck fk (x) = 0,
for all x in I . Then, differentiating k − 1 times yields the linear system
c1 f1 (x) + c2 f2 (x) + · · · + ck fk (x) = 0,
c1 f1′ (x) + c2 f2′ (x) + · · · + ck fk′ (x) = 0,
. . .
c1 f1^(k−1) (x) + c2 f2^(k−1) (x) + · · · + ck fk^(k−1) (x) = 0,

where the unknowns in the system are c1 , c2 , . . . , ck . We wish to show that c1 = c2 =
· · · = ck = 0. The determinant of the matrix of coefﬁcients of this system is just
W [f1 , f2 , . . . , fk ](x). Consequently, if W [f1 , f2 , . . . , fk ](x0 ) = 0 for some x0 in I ,
then the determinant of the matrix of coefﬁcients of the system is nonzero at that point, and
therefore the only solution to the system is the trivial solution c1 = c2 = · · · = ck = 0.
That is, the given set of functions is linearly independent on I . Remarks
1. Notice that it is only necessary for W [f1 , f2 , . . . , fk ](x) to be nonzero at one point
in I for {f1 , f2 , . . . , fk } to be linearly independent on I .
2. Theorem 4.5.21 does not say that if W [f1 , f2 , . . . , fk ](x) = 0 for every x in I ,
then {f1 , f2 , . . . , fk } is linearly dependent on I . As we will see in the next example
below, the Wronskian of a linearly independent set of functions on an interval I can
be identically zero on I . Instead, the logical equivalent of the preceding theorem
is: If {f1 , f2 , . . . , fk } is linearly dependent on I , then W [f1 , f2 , . . . , fk ](x) = 0
at every point of I .

If W [f1 , f2 , . . . , fk ](x) = 0 for all x in I , Theorem 4.5.21 gives no information
as to the linear dependence or independence of {f1 , f2 , . . . , fk } on I . Example 4.5.22 Determine whether the following functions are linearly dependent or linearly independent on I = (−∞, ∞).
(a) f1 (x) = ex , f2 (x) = x 2 ex .
(b) f1 (x) = x , f2 (x) = x + x 2 , f3 (x) = 2x − x 2 .
(c) f1 (x) = x^2 , and f2 (x) = 2x^2 if x ≥ 0, f2 (x) = −x^2 if x < 0.

Solution:
(a)

W [f1 , f2 ](x) = | e^x   x^2 e^x ; e^x   e^x (x^2 + 2x) | = e^{2x}(x^2 + 2x) − x^2 e^{2x} = 2xe^{2x} .

Since W [f1 , f2 ](x) is nonzero (except at x = 0), the functions are linearly independent
on (−∞, ∞).
(b)
W [f1 , f2 , f3 ](x) = | x   x + x^2   2x − x^2 ; 1   1 + 2x   2 − 2x ; 0   2   −2 |
= x[(−2)(1 + 2x) − 2(2 − 2x)] − [(−2)(x + x^2 ) − 2(2x − x^2 )] = 0.
Thus, no conclusion can be drawn from Theorem 4.5.21. However, a closer inspection of the functions reveals, for example, that
f2 = 3f1 − f3 .
Consequently, the functions are linearly dependent on (−∞, ∞).
(c) If x ≥ 0, then
W [f1 , f2 ](x) = | x^2   2x^2 ; 2x   4x | = 0,

whereas if x < 0, then

W [f1 , f2 ](x) = | x^2   −x^2 ; 2x   −2x | = 0.

Thus, W [f1 , f2 ](x) = 0 for all x in (−∞, ∞), so no conclusion can be drawn
from Theorem 4.5.21. Again we take a closer look at the given functions. They
are sketched in Figure 4.5.3. In this case, we see that on the interval (−∞, 0), the
functions are linearly dependent, since
f1 + f2 = 0.

[Figure 4.5.3: Two functions that are linearly independent on (−∞, ∞), but whose Wronskian is identically zero on that interval.]

They are also linearly dependent on [0, ∞), since on this interval we have
2f1 − f2 = 0.
The key point is to realize that there is no set of nonzero constants c1 , c2 for which
c1 f1 + c2 f2 = 0
holds for all x in (−∞, ∞). Hence, the given functions are linearly independent
on (−∞, ∞). This illustrates our second remark following Theorem 4.5.21, and
it emphasizes the importance of the role played by the interval I when discussing
linear dependence and linear independence of functions. A collection of functions
may be linearly independent on an interval I1 , but linearly dependent on another
interval I2 .
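Part (c) of Example 4.5.22 is also easy to probe numerically. The sketch below (names are ours; the derivatives are hard-coded piecewise) checks that the Wronskian vanishes at every sample point, even though, as argued above, no single nonzero coefficient pair works on both sides of x = 0.

```python
def f1(x):
    return x * x

def f2(x):
    # 2x^2 for x >= 0 and -x^2 for x < 0, as in part (c) of Example 4.5.22.
    return 2 * x * x if x >= 0 else -x * x

def wronskian(x):
    # f1' = 2x everywhere; f2' = 4x for x >= 0 and -2x for x < 0.
    d1 = 2 * x
    d2 = 4 * x if x >= 0 else -2 * x
    return f1(x) * d2 - f2(x) * d1

# The Wronskian vanishes at every sample point on both sides of 0 ...
assert all(wronskian(x) == 0 for x in (-3, -1, 0, 1, 3))
# ... yet no single pair (c1, c2) != (0, 0) kills c1*f1 + c2*f2 on BOTH sides:
# on x >= 0 it forces c2 = -c1/2, on x < 0 it forces c2 = c1, so c1 = c2 = 0.
```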
It might appear at this stage that the usefulness of the Wronskian is questionable,
since if W [f1 , f2 , . . . , fk ] vanishes on an interval I , then no conclusion can be drawn
as to the linear dependence or linear independence of the functions f1 , f2 , . . . , fk on
I . However, the real power of the Wronskian is in its application to solutions of linear
differential equations of the form
y^(n) + a1 (x)y^(n−1) + · · · + an−1 (x)y′ + an (x)y = 0. (4.5.7)
In Chapter 6, we will establish that if we have n functions that are solutions of an equation
of the form (4.5.7) on an interval I , then if the Wronskian of these functions is identically
zero on I , the functions are indeed linearly dependent on I . Thus, the Wronskian does
completely characterize the linear dependence or linear independence of solutions of
such equations. This is a fundamental result in the theory of linear differential equations. Exercises for 4.5 Key Terms
Linearly dependent set, Linear dependency, Linearly independent set, Minimal spanning set, Wronskian of a set of
functions. Skills
• Be able to determine whether a given ﬁnite set of vectors is linearly dependent or linearly independent. For sets of one or two vectors, you should be able to do
this at a glance. If the set is linearly dependent, be able
to determine a linear dependency relationship among
the vectors. • Be able to take a linearly dependent set of vectors and
remove vectors until it becomes a linearly independent
set of vectors with the same span as the original set. i i i i i i i “main”
2007/2/16
page 279
i 4.5 Linear Dependence and Linear Independence 279 Problems
• Be able to produce a linearly independent set of vectors that spans a given subspace of a vector space V .
• Be able to conclude immediately that a set of k vectors
in Rn is linearly dependent if k > n, and know what
can be said in the case where k = n as well.
• Know what information the Wronskian does (and does
not) give about the linear dependence or linear independence of a set of functions on an interval I . For Problems 1–9, determine whether the given set of vectors
is linearly independent or linearly dependent in Rn . In the
case of linear dependence, ﬁnd a dependency relationship.
1. {(1, −1), (1, 1)}.
2. {(2, −1), (3, 2), (0, 1)}.
3. {(1, −1, 0), (0, 1, −1), (1, 1, 1)}.
4. {(1, 2, 3), (1, −1, 2), (1, −4, 1)}.
5. {(−2, 4, −6), (3, −6, 9)}. True-False Review 6. {(1, −1, 2), (2, 1, 0)}. For Questions 1–9, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false. 7. {(−1, 1, 2), (0, 2, −1), (3, 1, 2), (−1, −1, 1)}. 1. Every vector space V possesses a unique minimal
spanning set.
2. The set of column vectors of a 5 × 7 matrix A must be
linearly dependent. 8. {(1, −1, 2, 3), (2, −1, 1, −1), (−1, 1, 1, 1)}.
9. {(2, −1, 0, 1), (1, 0, −1, 2), (0, 3, 1, 2),
(−1, 1, 2, 1)}.
10. Let v1 = (1, 2, 3), v2 = (4, 5, 6), v3 = (7, 8, 9). Determine whether {v1 , v2 , v3 } is linearly independent in
R3 . Describe
span{v1 , v2 , v3 }
geometrically. 3. The set of column vectors of a 7 × 5 matrix A must be
linearly independent.
4. Any nonempty subset of a linearly independent set of
vectors is linearly independent.
5. If the Wronskian of a set of functions is nonzero at
some point x0 in an interval I , then the set of functions is linearly independent.
6. If it is possible to express one of the vectors in a set
S as a linear combination of the others, then S is a
linearly dependent set.
7. If a set of vectors S in a vector space V contains a
linearly dependent subset, then S is itself a linearly
dependent set.
8. A set of three vectors in a vector space V is linearly dependent if and only if all three vectors are proportional to one another.

9. If the Wronskian of a set of functions is identically zero at every point of an interval I, then the set of functions is linearly dependent.

11. Consider the vectors v1 = (2, −1, 5), v2 = (1, 3, −4), v3 = (−3, −9, 12) in R3.
(a) Show that {v1, v2, v3} is linearly dependent.
(b) Is v1 ∈ span{v2, v3}? Draw a picture illustrating your answer.

12. Determine all values of the constant k for which the vectors (1, 1, k), (0, 2, k), and (1, k, 6) are linearly dependent in R3.

For Problems 13–14, determine all values of the constant k for which the given set of vectors is linearly independent in R4.

13. {(1, 0, 1, k), (−1, 0, k, 1), (2, 0, 1, 3)}.

14. {(1, 1, 0, −1), (1, k, 1, 1), (2, 1, k, 1), (−1, 1, 1, k)}.

For Problems 15–17, determine whether the given set of vectors is linearly independent in M2(R). (Matrices are written by rows: [1 1; 0 1] has first row (1, 1) and second row (0, 1).)

15. A1 = [1 1; 0 1], A2 = [2 −1; 3 4].

16. A1 = [2 −1; 0 1], A2 = [3 6; 0 4], A3 = [−1 2; 1 3].

17. A1 = [1 0; 1 2], A2 = [−1 1; 2 1], A3 = [2 1; 5 7].

For Problems 18–19, determine whether the given set of vectors is linearly independent in P1.

18. p1(x) = 1 − x, p2(x) = 1 + x.

19. p1(x) = 2 + 3x, p2(x) = 4 + 6x.

20. Show that the vectors p1(x) = a + bx and p2(x) = c + dx are linearly independent in P1 if and only if the constants a, b, c, d satisfy ad − bc ≠ 0.

21. If f1(x) = cos 2x, f2(x) = sin²x, f3(x) = cos²x, determine whether {f1, f2, f3} is linearly dependent or linearly independent in C∞(−∞, ∞).

For Problems 22–28, determine a linearly independent set of vectors that spans the same subspace of V as that spanned by the original set of vectors.

22. V = R3, {(1, 2, 3), (−3, 4, 5), (1, −4/3, −5/3)}.

23. V = R3, {(3, 1, 5), (0, 0, 0), (1, 2, −1), (−1, 2, 3)}.

24. V = R3, {(1, 1, 1), (1, −1, 1), (1, −3, 1), (3, 1, 2)}.

25. V = R4, {(1, 1, −1, 1), (2, −1, 3, 1), (1, 1, 2, 1), (2, −1, 2, 1)}.

26. V = M2(R), {[1 2; 3 4], [−1 2; 5 7], [3 2; 1 1]}.

27. V = P1, {2 − 5x, 3 + 7x, 4 − x}.

28. V = P2, {2 + x², 4 − 2x + 3x², 1 + x}.

For Problems 29–33, use the Wronskian to show that the given functions are linearly independent on the given interval I.

29. f1(x) = 1, f2(x) = x, f3(x) = x², I = (−∞, ∞).

30. f1(x) = sin x, f2(x) = cos x, f3(x) = tan x, I = (−π/2, π/2).

31. f1(x) = 1, f2(x) = 3x, f3(x) = x² − 1, I = (−∞, ∞).

32. f1(x) = e^(2x), f2(x) = e^(3x), f3(x) = e^(−x), I = (−∞, ∞).

33. f1(x) = x² if x ≥ 0 and f1(x) = 3x³ if x < 0; f2(x) = 7x²; I = (−∞, ∞).

For Problems 34–36, show that the Wronskian of the given functions is identically zero on (−∞, ∞). Determine whether the functions are linearly independent or linearly dependent on that interval.

34. f1(x) = 1, f2(x) = x, f3(x) = 2x − 1.

35. f1(x) = e^x, f2(x) = e^(−x), f3(x) = cosh x.

36. f1(x) = 2x³; f2(x) = 5x³ if x ≥ 0 and f2(x) = −3x³ if x < 0.

37. Consider the functions f1(x) = x, and f2(x) = x if x ≥ 0, f2(x) = −x if x < 0.
(a) Show that f2 is not in C¹(−∞, ∞).
(b) Show that {f1, f2} is linearly dependent on the intervals (−∞, 0) and [0, ∞), while it is linearly independent on the interval (−∞, ∞). Justify your results by making a sketch showing both of the functions.

38. Determine whether the functions f1(x) = x, and f2(x) = x if x ≠ 0, f2(x) = 1 if x = 0, are linearly dependent or linearly independent on I = (−∞, ∞).

39. Show that the functions f1(x) = x − 1 if x ≥ 1, f1(x) = 2(x − 1) if x < 1; f2(x) = 2x; f3(x) = 3 form a linearly independent set on (−∞, ∞). Determine all intervals on which {f1, f2, f3} is linearly dependent.
40. (a) Show that {1, x, x², x³} is linearly independent
on every interval.
(b) If fk (x) = x k for k = 0, 1, . . . , n, show that
{f0 , f1 , . . . , fn } is linearly independent on every
interval for all ﬁxed n.
41. (a) Show that the functions
f1(x) = e^(r1 x), f2(x) = e^(r2 x), f3(x) = e^(r3 x) have Wronskian

W[f1, f2, f3](x) = e^((r1+r2+r3)x) |1 1 1; r1 r2 r3; r1² r2² r3²| = e^((r1+r2+r3)x) (r3 − r1)(r3 − r2)(r2 − r1),
and hence determine the conditions on r1 , r2 , r3
such that {f1 , f2 , f3 } is linearly independent on
every interval.
(b) More generally, show that the set of functions
{er1 x , er2 x , . . . , ern x }
is linearly independent on every interval if and
only if all of the ri are distinct. [Hint: Show that
the Wronskian of the given functions is a multiple
of the n × n Vandermonde determinant, and then
use Problem 21 in Section 3.3.]
42. Let {v1 , v2 } be a linearly independent set in a vector
space V , and let v = α v1 + v2 , w = v1 + α v2 , where
α is a constant. Use Deﬁnition 4.5.4 to determine all
values of α for which {v, w} is linearly independent.
43. If v1 and v2 are vectors in a vector space V , and
u1 , u2 , u3 are each linear combinations of them, prove
that {u1 , u2 , u3 } is linearly dependent.
44. Let v1 , v2 , . . . , vm be a set of linearly independent vectors in a vector space V and suppose that the vectors
u1 , u2 , . . . , un are each linear combinations of them.
It follows that we can write
uk = Σ_{i=1}^{m} aik vi,  k = 1, 2, . . . , n,

for appropriate constants aik.

(a) If n > m, prove that {u1, u2, . . . , un} is linearly dependent in V.
(b) If n = m, prove that {u1 , u2 , . . . , un } is linearly
independent in V if and only if det[aij] ≠ 0.
(c) If n < m, prove that {u1 , u2 , . . . , un } is linearly
independent in V if and only if rank(A) = n,
where A = [aij ].
(d) Which result from this section do these results
generalize?
45. Prove from the deﬁnition of “linearly independent”
that if {v1 , v2 , . . . , vn } is linearly independent and
if A is an invertible n × n matrix, then the set
{Av1 , Av2 , . . . , Avn } is linearly independent.
46. Prove that if {v1 , v2 } is linearly independent and v3
is not in span{v1 , v2 }, then {v1 , v2 , v3 } is linearly
independent.
47. Generalizing the previous exercise, prove that if
{v1 , v2 , . . . , vk } is linearly independent and vk +1 is
not in span{v1 , v2 , . . . , vk }, then {v1 , v2 , . . . , vk +1 } is
linearly independent.
48. Prove Theorem 4.5.2.
49. Prove Proposition 4.5.7.
50. Prove that if {v1 , v2 , . . . , vk } spans a vector space V ,
then for every vector v in V , {v, v1 , v2 , . . . , vk } is linearly dependent.
51. Prove that if V = Pn and S = {p1 , p2 , . . . , pk } is a
set of vectors in V each of a different degree, then S is
linearly independent. [Hint: Assume without loss of
generality that the polynomials are ordered in descending degree: deg(p1 ) > deg(p2 ) > · · · > deg(pk ).
Assuming that c1 p1 + c2 p2 + · · · + ck pk = 0, ﬁrst
show that c1 is zero by examining the highest degree.
Then repeat for lower degrees to show successively
that c2 = 0, c3 = 0, and so on.]

4.6 Bases and Dimension
The results of the previous section show that if a minimal spanning set exists in a
(nontrivial) vector space V , it cannot be linearly dependent. Therefore if we are looking
for minimal spanning sets for V , we should focus our attention on spanning sets that are
linearly independent. One of the results of this section establishes that every spanning
set for V that is linearly independent is indeed a minimal spanning set. Such a set will be called a basis. This is one of the most important concepts in this text and a cornerstone
of linear algebra.

DEFINITION 4.6.1
A set of vectors {v1 , v2 , . . . , vk } in a vector space V is called a basis4 for V if
(a) The vectors are linearly independent.
(b) The vectors span V .
Notice that if we have a ﬁnite spanning set for a vector space, then we can always,
in principle, determine a basis for V by using the technique of Corollary 4.5.12. Furthermore, the computational aspects of determining a basis have been covered in the
previous two sections, since all we are really doing is combining the two concepts of
linear independence and linear span. Consequently, this section is somewhat more theoretically oriented than the preceding ones. The reader is encouraged not to gloss over
the theoretical aspects, as these really are fundamental results in linear algebra.
There do exist vector spaces V for which it is impossible to ﬁnd a ﬁnite set of linearly
independent vectors that span V . The vector space C n (I ), n ≥ 1, is such an example
(Example 4.6.19). Such vector spaces are called inﬁnite-dimensional vector spaces.
Our primary interest in this text, however, will be vector spaces that contain a ﬁnite spanning set of linearly independent vectors. These are known as ﬁnite-dimensional vector
spaces, and we will encounter numerous examples of them throughout the remainder of
this section.
We begin with the vector space Rn . In R2 , the most natural basis, denoted {e1 , e2 },
consists of the two vectors
e1 = (1, 0), e2 = (0, 1),   (4.6.1)

and in R3, the most natural basis, denoted {e1, e2, e3}, consists of the three vectors

e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).   (4.6.2)

The verification that the sets (4.6.1) and (4.6.2) are indeed bases of R2 and R3, respectively, is straightforward and left as an exercise.5 These bases are referred to as the
standard basis on R2 and R3 , respectively. In the case of the standard basis for R3 given
in (4.6.2), we recognize the vectors e1 , e2 , e3 as the familiar unit vectors i, j, k pointing
along the positive x -, y -, and z-axes of the rectangular Cartesian coordinate system.
More generally, consider the set of vectors {e1 , e2 , . . . , en } in Rn deﬁned by
e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), ..., en = (0, 0, . . . , 1). These vectors are linearly independent by Corollary 4.5.15, since
det([e1, e2, . . . , en]) = det(In) = 1 ≠ 0.
Furthermore, the vectors span Rn , since an arbitrary vector v = (x1 , x2 , . . . , xn ) in Rn
can be written as
v = x1 (1, 0, . . . , 0) + x2 (0, 1, . . . , 0) + · · · + xn (0, 0, . . . , 1)
= x1 e1 + x2 e2 + · · · + xn en .
4 The plural of basis is bases.
5 Alternatively, the verification is a special case of that given shortly for the general case of Rn.

Consequently, {e1, e2, . . . , en} is a basis for Rn. We refer to this basis as the standard
basis for Rn .
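The two-part argument above — a nonzero determinant for independence, and an explicit linear combination for spanning — can be replayed numerically. The sketch below is ours, not the text's, and assumes NumPy is available; it checks both conditions for n = 4:

```python
import numpy as np

n = 4
E = np.eye(n)  # columns are the standard basis vectors e_1, ..., e_n

# Linear independence: det([e_1, ..., e_n]) = det(I_n) = 1, which is nonzero.
det = np.linalg.det(E)

# Spanning: an arbitrary v = (x_1, ..., x_n) equals x_1 e_1 + ... + x_n e_n.
v = np.array([3.0, -1.0, 0.5, 2.0])
combo = sum(v[i] * E[:, i] for i in range(n))
```

The same two checks work for any of the nonstandard bases of R3 mentioned later in this section, with E replaced by the matrix whose columns are the candidate basis vectors.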
The general vector in Rn has n components, and the standard basis vectors arise as
the n vectors that are obtained by sequentially setting one component to the value 1 and
the other components to 0. In general, this is how we obtain standard bases in vector
spaces whose vectors are determined by the speciﬁcation of n independent constants.
We illustrate with some examples.
Example 4.6.2 Determine the standard basis for M2 (R). Solution: The general matrix in M2 (R) is
[a b; c d].

Consequently, there are four independent parameters that give rise to four special vectors
in M2 (R). Sequentially setting one of these parameters to the value 1 and the others to
0 generates the following four matrices:
A1 = [1 0; 0 0],  A2 = [0 1; 0 0],  A3 = [0 0; 1 0],  A4 = [0 0; 0 1].

We see that {A1, A2, A3, A4} is a spanning set for M2(R). Furthermore,
c1 A1 + c2 A2 + c3 A3 + c4 A4 = 02
holds if and only if
c1 [1 0; 0 0] + c2 [0 1; 0 0] + c3 [0 0; 1 0] + c4 [0 0; 0 1] = [0 0; 0 0]

—that is, if and only if c1 = c2 = c3 = c4 = 0. Consequently, {A1, A2, A3, A4} is
a linearly independent spanning set for M2 (R), hence it is a basis. This is the standard
basis for M2(R).

Remark: More generally, consider the vector space of all m × n matrices with real
entries, Mm×n (R). If we let Eij denote the m × n matrix with value 1 in the (i, j )-position
and zeros elsewhere, then we can show routinely that
{Eij : 1 ≤ i ≤ m, 1 ≤ j ≤ n}
is a basis for Mm×n(R), and it is the standard basis.

Example 4.6.3 Determine the standard basis for P2.

Solution: We have
Example 4.6.3 Determine the standard basis for P2 . Solution: We have
P2 = {a0 + a1 x + a2 x 2 : a0 , a1 , a2 ∈ R}, so that the vectors in P2 are determined by specifying values for the three parameters
a0 , a1 , and a2 . Sequentially setting one of these parameters to the value 1 and the other
two to the value 0 yields the following vectors in P2 :
p0(x) = 1, p1(x) = x, p2(x) = x².

We have shown in Example 4.4.6 that {p0, p1, p2} is a spanning set for P2. Furthermore,

W[p0, p1, p2](x) = |1 x x²; 0 1 2x; 0 0 2| = 2 ≠ 0,
which implies that {p0 , p1 , p2 } is linearly independent on any interval.6 Consequently,
{p0, p1, p2} is a basis for P2. This is the standard basis for P2.

Remark: More generally, the reader can check that the standard basis for the vector
space of all polynomials of degree n or less, Pn , is
{1, x, x², . . . , xⁿ}.

Dimension of a Finite-Dimensional Vector Space
The reader has probably realized that there can be many different bases for a given
vector space V . In addition to the standard basis {e1 , e2 , e3 } on R3 , for example, it can
be checked7 that {(1, 2, 3), (4, 5, 6), (7, 8, 8)} and {(1, 0, 0), (1, 1, 0), (1, 1, 1)} are also
bases for R3 . And there are countless others as well.
Despite the multitude of different bases available for a vector space V , they all share
one common feature: the number of vectors in each basis for V is the same. This fact
will be deduced as a corollary of our next theorem, a fundamental result in the theory of
vector spaces.
Theorem 4.6.4 If a ﬁnite-dimensional vector space has a basis consisting of m vectors, then any set of
more than m vectors is linearly dependent. Proof Let {v1 , v2 , . . . , vm } be a basis for V , and consider an arbitrary set of vectors in V ,
say, {u1 , u2 , . . . , un }, with n > m. We wish to prove that {u1 , u2 , . . . , un } is necessarily
linearly dependent. Since {v1 , v2 , . . . , vm } is a basis for V , it follows that each uj can
be written as a linear combination of v1 , v2 , . . . , vm . Thus, there exist constants aij such
that
u1 = a11 v1 + a21 v2 + · · · + am1 vm ,
u2 = a12 v1 + a22 v2 + · · · + am2 vm ,
...
un = a1n v1 + a2n v2 + · · · + amn vm .
To prove that {u1 , u2 , . . . , un } is linearly dependent, we must show that there exist scalars
c1 , c2 , . . . , cn , not all zero, such that
c1 u1 + c2 u2 + · · · + cn un = 0.   (4.6.3)

Inserting the expressions for u1, u2, . . . , un into Equation (4.6.3) yields
c1 (a11 v1 + a21 v2 + · · · + am1 vm ) + c2 (a12 v1 + a22 v2 + · · · + am2 vm )
+ · · · + cn (a1n v1 + a2n v2 + · · · + amn vm ) = 0.
6 Alternatively, we can start with the equation c0 p0(x) + c1 p1(x) + c2 p2(x) = 0 for all x in R and show readily that c0 = c1 = c2 = 0.
7 The reader desiring extra practice at the computational aspects of verifying a basis is encouraged to pause here to check these examples.

Rearranging terms, we have
(a11 c1 + a12 c2 + · · · + a1n cn )v1 + (a21 c1 + a22 c2 + · · · + a2n cn )v2
+ · · · + (am1 c1 + am2 c2 + · · · + amn cn )vm = 0.
Since {v1 , v2 , . . . , vm } is linearly independent, we can conclude that
a11 c1 + a12 c2 + · · · + a1n cn = 0,
a21 c1 + a22 c2 + · · · + a2n cn = 0,
...
am1 c1 + am2 c2 + · · · + amn cn = 0.
This is an m × n homogeneous system of linear equations with m < n, and hence,
from Corollary 2.5.11, it has nontrivial solutions for c1 , c2 , . . . , cn . It therefore follows
from Equation (4.6.3) that {u1 , u2 , . . . , un } is linearly dependent.
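The final step of the proof rests on Corollary 2.5.11: a homogeneous system with more unknowns than equations always has nontrivial solutions. That fact is easy to witness computationally; the sketch below assumes SymPy is available, and the 2 × 4 coefficient matrix is an arbitrary illustration of ours:

```python
import sympy as sp

# An m x n homogeneous system with m < n: here m = 2 equations, n = 4 unknowns.
A = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 3]])
null_basis = A.nullspace()   # basis for the solution set of Ac = 0
c = null_basis[0]            # a nontrivial solution, as the corollary promises
```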
Corollary 4.6.5 All bases in a ﬁnite-dimensional vector space V contain the same number of vectors. Proof Suppose {v1 , v2 , . . . , vn } and {u1 , u2 , . . . , um } are two bases for V . From Theorem 4.6.4 we know that we cannot have m > n (otherwise {u1 , u2 , . . . , um } would be
a linearly dependent set and hence could not be a basis for V ). Nor can we have n > m
(otherwise {v1 , v2 , . . . , vn } would be a linearly dependent set and hence could not be a
basis for V ). Thus, it follows that we must have m = n.
We can now prove that any basis provides a minimal spanning set for V .
Corollary 4.6.6 If a ﬁnite-dimensional vector space V has a basis consisting of n vectors, then any
spanning set must contain at least n vectors. Proof If the spanning set contained fewer than n vectors, then there would be a subset
of fewer than n linearly independent vectors that spanned V; that is, there would be a basis consisting of fewer than n vectors. But this would contradict the previous corollary.
The number of vectors in a basis for a ﬁnite-dimensional vector space is clearly a
fundamental property of the vector space, and by Corollary 4.6.5 it is independent of the
particular chosen basis. We call this number the dimension of the vector space. DEFINITION 4.6.7
The dimension of a ﬁnite-dimensional vector space V , written dim[V ], is the number
of vectors in any basis for V . If V is the trivial vector space, V = {0}, then we deﬁne
its dimension to be zero. Remark We say that the dimension of the world we live in is three for the very reason
that the maximum number of independent directions that we can perceive is three. If a
vector space has a basis containing n vectors, then from Theorem 4.6.4, the maximum
number of vectors in any linearly independent set is n. Thus, we see that the terminology
dimension used in an arbitrary vector space is a generalization of a familiar idea.
Example 4.6.8 It follows from our examples earlier in this section that dim[R3 ] = 3, dim[M2 (R)] = 4,
and dim[P2] = 3.

More generally, the following dimensions should be remembered:
dim[Rn] = n, dim[Mm×n(R)] = mn, dim[Mn(R)] = n², dim[Pn] = n + 1.

These values have essentially been established previously in our discussion of standard
bases. The standard basis for Rn is {e1 , e2 , . . . , en }, where ei is the n-tuple with value
1 in the i th position and value 0 elsewhere. Thus, this basis contains n vectors. The
standard basis for Mm×n (R) is the set of matrices Eij (1 ≤ i ≤ m, 1 ≤ j ≤ n) with
value 1 in the (i, j ) position and value 0 elsewhere. There are mn such matrices in this
standard basis. The case of Mn (R) is just a special case of Mm×n (R) in which m = n.
Finally, the standard basis for Pn is {1, x, x 2 , . . . , x n }, a set of n + 1 vectors.
Next, let us return once more to Example 1.2.16 to cast its results in terms of the
basis concept. Example 4.6.9 Determine a basis for the solution space to the differential equation
y'' + y = 0
on any interval I . Solution: Our results from Example 1.2.16 tell us that all solutions to the given
differential equation are of the form
y(x) = c1 cos x + c2 sin x.
Consequently, {cos x, sin x } is a linearly independent spanning set for the solution space
of the differential equation and therefore is a basis.
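Both claims — that cos x and sin x solve the equation, and that they form a linearly independent set — can be confirmed symbolically. A sketch of ours, assuming SymPy is available (the Wronskian works out to cos²x + sin²x = 1):

```python
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.cos(x), sp.sin(x)

# Each function satisfies y'' + y = 0.
res1 = sp.simplify(sp.diff(y1, x, 2) + y1)
res2 = sp.simplify(sp.diff(y2, x, 2) + y2)

# Wronskian of {cos x, sin x}: identically 1, hence nonzero,
# so the set is linearly independent on any interval.
W = sp.simplify(sp.Matrix([[y1, y2],
                           [sp.diff(y1, x), sp.diff(y2, x)]]).det())
```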
More generally, we will show in Chapter 6 that all solutions to the differential
equation
y'' + a1(x)y' + a2(x)y = 0
on the interval I have the form
y(x) = c1 y1 (x) + c2 y2 (x),
where {y1 , y2 } is any linearly independent set of solutions to the differential equation.
Using the terminology introduced in this section, it will therefore follow that:
The set of all solutions to y'' + a1(x)y' + a2(x)y = 0 on an
interval I is a vector space of dimension two.
If a vector space has dimension n, then from Theorem 4.6.4, the maximum number
of vectors in any linearly independent set is n. On the other hand, from Corollary 4.6.6,
the minimum number of vectors that can span V is also n. Thus, a basis for V must be
a linearly independent set of n vectors. Our next theorem establishes that any set of n
linearly independent vectors is a basis for V.

Theorem 4.6.10 If dim[V] = n, then any set of n linearly independent vectors in V is a basis for V.

Proof Let v1, v2, . . . , vn be n linearly independent vectors in V. We need to show that
they span V . To do this, let v be an arbitrary vector in V . From Theorem 4.6.4, the set of
vectors {v, v1 , v2 , . . . , vn } is linearly dependent, and so there exist scalars c0 , c1 , . . . , cn ,
not all zero, such that
c0 v + c1 v1 + · · · + cn vn = 0.   (4.6.4)

If c0 = 0, then the linear independence of {v1, v2, . . . , vn} and (4.6.4) would imply that
c0 = c1 = · · · = cn = 0, a contradiction. Hence, c0 ≠ 0, and so, from Equation (4.6.4),
v = −(1/c0)(c1 v1 + c2 v2 + · · · + cn vn).

Thus v, and hence any vector in V, can be written as a linear combination of v1, v2, . . . , vn,
and hence, {v1 , v2 , . . . , vn } spans V , in addition to being linearly independent. Hence it
is a basis for V , as required.
Theorem 4.6.10 is one of the most important results of the section. In Chapter 6, we
will explicitly construct a basis for the solution space to the differential equation
y^(n) + a1(x)y^(n−1) + · · · + an−1(x)y' + an(x)y = 0
consisting of n vectors. That is, we will show that the solution space to this differential
equation is n-dimensional. It will then follow immediately from Theorem 4.6.10 that
every solution to this differential equation is of the form
y(x) = c1 y1 (x) + c2 y2 (x) + · · · + cn yn (x),
where {y1 , y2 , . . . , yn } is any linearly independent set of n solutions to the differential
equation. Therefore, determining all solutions to the differential equation will be reduced
to determining any linearly independent set of n solutions. A similar application of the
theorem will be used to develop the theory for systems of differential equations in
Chapter 7.
More generally, Theorem 4.6.10 says that if we know in advance that the dimension
of the vector space V is n, then n linearly independent vectors in V are already guaranteed
to form a basis for V without the need to explicitly verify that these n vectors also span
V . This represents a useful reduction in the work required to verify a basis. Here is an
example:
Example 4.6.11 Verify that {1 + x, 2 − 2x + x 2 , 1 + x 2 } is a basis for P2 . Solution: Since dim[P2 ] = 3, Theorem 4.6.10 will guarantee that the three given
vectors are a basis, once we conﬁrm only that they are linearly independent. The polynomials
p1 (x) = 1 + x, p2 (x) = 2 − 2x + x 2 , p3 (x) = 1 + x 2 have Wronskian
W[p1, p2, p3](x) = |1 + x, 2 − 2x + x², 1 + x²; 1, −2 + 2x, 2x; 0, 2, 2| = −6 ≠ 0.
Since the Wronskian is nonzero, the given set of vectors is linearly independent on any
interval. Consequently, {1 + x, 2 − 2x + x², 1 + x²} is indeed a basis for P2.
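The Wronskian computation in Example 4.6.11 is easy to check by machine. The sketch below (ours, assuming SymPy is available) builds the same 3 × 3 Wronskian matrix, row k holding the k-th derivatives of p1, p2, p3:

```python
import sympy as sp

x = sp.symbols('x')
p = [1 + x, 2 - 2*x + x**2, 1 + x**2]

# Row k holds the k-th derivatives of p1, p2, p3, for k = 0, 1, 2.
W = sp.simplify(sp.Matrix([[sp.diff(pi, x, k) for pi in p]
                           for k in range(3)]).det())
```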
There is a notable parallel result to Theorem 4.6.10 which can also cut down the
work required to verify that a set of vectors in V is a basis for V , provided that we know
the dimension of V in advance.
Theorem 4.6.12 If dim[V ] = n, then any set of n vectors in V that spans V is a basis for V . Proof Let v1 , v2 , . . . , vn be n vectors in V that span V . To conﬁrm that {v1 , v2 , . . . , vn }
is a basis for V , we need only show that this is a linearly independent set of vectors.
Suppose, to the contrary, that {v1 , v2 , . . . , vn } is a linearly dependent set. By Corollary 4.5.12, there is a linearly independent subset of {v1 , v2 , . . . , vn }, with fewer than n
vectors, which also spans V . But this implies that V contains a basis with fewer than n
vectors, a contradiction.
Putting the results of Theorems 4.6.10 and 4.6.12 together, the following result is
immediate.
Corollary 4.6.13 If dim[V ] = n and S = {v1 , v2 , . . . , vn } is a set of n vectors in V , the following
statements are equivalent:
1. S is a basis for V .
2. S is linearly independent.
3. S spans V .
We emphasize once more the importance of this result. It means that if we have a
set S of dim[V ] vectors in V , then to determine whether or not S is a basis for V , we
need only check if S is linearly independent or if S spans V , not both.
We next establish another corollary to Theorem 4.6.10. Corollary 4.6.14 Let S be a subspace of a ﬁnite-dimensional vector space V . If dim[V ] = n, then
dim[S ] ≤ n.
Furthermore, if dim[S ] = n, then S = V . Proof Suppose that dim[S ] > n. Then any basis for S would contain more than n
linearly independent vectors, and therefore we would have a linearly independent set of
more than n vectors in V . This would contradict Theorem 4.6.4. Thus, dim[S ] ≤ n.
Now consider the case when dim[S ] = n = dim[V ]. In this case, any basis for S
consists of n linearly independent vectors in S and hence n linearly independent vectors
in V . Thus, by Theorem 4.6.10, these vectors also form a basis for V . Hence, every
vector in V is spanned by the basis vectors for S , and hence, every vector in V lies in S .
Thus, V = S .
Example 4.6.15 Give a geometric description of the subspaces of R3 of dimensions 0, 1, 2, 3. Solution: Zero-dimensional subspace: This corresponds to the subspace {(0, 0, 0)},
and therefore it is represented geometrically by the origin of a Cartesian coordinate
system.
One-dimensional subspace: These are subspaces generated by a single (nonzero) basis
vector. Consequently, they correspond geometrically to lines through the origin.
Two-dimensional subspace: These are the subspaces generated by any two noncollinear
vectors and correspond geometrically to planes through the origin.

Three-dimensional subspace: Since dim[R3] = 3, it follows from Corollary 4.6.14 that
the only three-dimensional subspace of R3 is R3 itself.
Example 4.6.16 Determine a basis for the subspace of R3 consisting of all solutions to the equation
x1 + 2x2 − x3 = 0. Solution: We can solve this problem geometrically. The given equation is that of a
plane through the origin and therefore is a two-dimensional subspace of R3 . In order to
determine a basis for this subspace, we need only choose two linearly independent (i.e.,
noncollinear) vectors that lie in the plane. A simple choice of vectors is8 v1 = (1, 0, 1)
and v2 = (2, −1, 0). Thus, a basis for the subspace is {(1, 0, 1), (2, −1, 0)}.
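Since the plane is exactly the null space of the 1 × 3 matrix [1 2 −1], a computer algebra system can produce a basis directly and confirm that the vectors chosen above are a valid (noncollinear) pair. A sketch of ours, assuming SymPy is available:

```python
import sympy as sp

A = sp.Matrix([[1, 2, -1]])     # the plane x1 + 2x2 - x3 = 0 is null(A)
basis = A.nullspace()           # a two-element basis for the plane

# The text's chosen vectors lie in the plane and are noncollinear.
v1, v2 = sp.Matrix([1, 0, 1]), sp.Matrix([2, -1, 0])
rank = sp.Matrix.hstack(v1, v2).rank()   # rank 2 means noncollinear
```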
Corollary 4.6.14 has shown that if S is a subspace of a ﬁnite-dimensional vector
space V with dim[S ] = dim[V ], then S = V . Our next result establishes that, in general,
a basis for a subspace of a ﬁnite-dimensional vector space V can be extended to a basis
for V . This result will be required in the next section and also in Chapter 5.
Theorem 4.6.17 Let S be a subspace of a ﬁnite-dimensional vector space V . Any basis for S is part of a
basis for V . Proof Suppose dim[V ] = n and dim[S ] = k . By Corollary 4.6.14, k ≤ n. If k = n,
then S = V , so that any basis for S is a basis for V . Suppose now that k < n, and let
{v1 , v2 , . . . , vk } be a basis for S . These basis vectors are linearly independent, but they
fail to span V (otherwise they would form a basis for V , contradicting k < n). Thus,
there is at least one vector, say vk +1 , in V that is not in span{v1 , v2 , . . . , vk }. Hence,
{v1 , v2 , . . . , vk , vk +1 } is linearly independent. If k + 1 = n, then we have a basis for V by
Theorem 4.6.10, and we are done. Otherwise, we can repeat the procedure to obtain the
linearly independent set {v1 , v2 , . . . , vk , vk +1 , vk +2 }. The process will terminate when
we have a linearly independent set containing n vectors, including the original vectors
v1 , v2 , . . . , vk in the basis for S . This proves the theorem.
Remark: The process used in proving the previous theorem is referred to as extending a basis.

Example 4.6.18 Let S denote the subspace of M2(R) consisting of all symmetric 2 × 2 matrices. Determine
a basis for S , and ﬁnd dim[S ]. Extend this basis for S to obtain a basis for M2 (R). Solution: We ﬁrst express S in set notation as
S = {A ∈ M2 (R) : AT = A}. In order to determine a basis for S , we need to obtain the element form of the matrices
in S . We can write
S = {[a b; b c] : a, b, c ∈ R}.
Since
[a b; b c] = a[1 0; 0 0] + b[0 1; 1 0] + c[0 0; 0 1],

it follows that
S = span{[1 0; 0 0], [0 1; 1 0], [0 0; 0 1]}.

8 There are many others, of course.

Furthermore, it is easily shown that the matrices in this spanning set are linearly independent. Consequently, a basis for S is
{[1 0; 0 0], [0 1; 1 0], [0 0; 0 1]},

so that dim[S] = 3. Since dim[M2(R)] = 4, in order to extend the basis for S to a basis
for M2 (R), we need to add one additional matrix from M2 (R) such that the resulting
set is linearly independent. We must choose a nonsymmetric matrix, for any symmetric
matrix can be expressed as a linear combination of the three basis vectors for S , and this
would create a linear dependency among the matrices. A simple choice of nonsymmetric
matrix (although this is certainly not the only choice) is
[0 1; 0 0].
Adding this vector to the basis for S yields the linearly independent set
{[1 0; 0 0], [0 1; 1 0], [0 0; 0 1], [0 1; 0 0]}.   (4.6.5)

Since dim[M2(R)] = 4, Theorem 4.6.10 implies that (4.6.5) is a basis for M2(R).
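One way to verify a set like (4.6.5) by computation is to flatten each 2 × 2 matrix into a vector in R4: the four matrices form a basis of M2(R) exactly when the resulting 4 × 4 matrix is invertible. A sketch of ours, assuming NumPy is available:

```python
import numpy as np

mats = [np.array([[1, 0], [0, 0]]),   # the three basis matrices of S ...
        np.array([[0, 1], [1, 0]]),
        np.array([[0, 0], [0, 1]]),
        np.array([[0, 1], [0, 0]])]   # ... plus the added nonsymmetric matrix

# Columns of M are the flattened matrices; det(M) != 0 <=> basis of M2(R).
M = np.column_stack([m.flatten() for m in mats])
det = round(np.linalg.det(M))
```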
It is important to realize that not all vector spaces are ﬁnite dimensional. Some are
inﬁnite-dimensional. In an inﬁnite-dimensional vector space, we can ﬁnd an arbitrarily
large number of linearly independent vectors. We now give an example of an infinite-dimensional vector space that is of primary importance in the theory of differential
equations, C n (I ).
Example 4.6.19 Show that the vector space C n (I ) is an inﬁnite-dimensional vector space. Solution: Consider the functions 1, x, x 2 , . . . , x k in C n (I ). Of course, each of these
functions is in C k (I ) as well, and for each ﬁxed k , the Wronskian of these functions
is nonzero (the reader can check that the matrix involved in this calculation is upper
triangular, with nonzero entries on the main diagonal). Hence, the functions are linearly
independent on I by Theorem 4.5.21. Since we can choose k arbitrarily, it follows that
there are an arbitrarily large number of linearly independent vectors in C n (I ), hence
C n (I ) is inﬁnite-dimensional.
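The upper-triangular Wronskian argument in Example 4.6.19 can be replayed for any fixed k. In the sketch below (ours, assuming SymPy is available; k is kept small so the symbolic determinant stays cheap), entry (i, j) of the Wronskian matrix is the i-th derivative of x^j, which vanishes for i > j, so the determinant is the product of the diagonal factorials 0!·1!···k!:

```python
import sympy as sp

x = sp.symbols('x')
results = []
for k in [2, 3, 4]:
    funcs = [x**j for j in range(k + 1)]
    W = sp.Matrix([[sp.diff(f, x, i) for f in funcs]
                   for i in range(k + 1)])
    expected = 1
    for j in range(k + 1):
        expected *= sp.factorial(j)   # product of the diagonal entries
    results.append((sp.expand(W.det()), expected))
```

Since the determinant is nonzero for every k, arbitrarily large independent sets exist, which is exactly the infinite-dimensionality claim.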
In this example we showed that C n (I ) is an inﬁnite-dimensional vector space. Consequently, the use of our ﬁnite-dimensional vector space theory in the analysis of differential equations appears questionable. However, the key theoretical result that we will
establish in Chapter 6 is that the solution set of certain linear differential equations is a
ﬁnite-dimensional subspace of C n (I ), and therefore our basis results will be applicable
to this solution set.

Exercises for 4.6

Key Terms
Basis, Standard basis, Infinite-dimensional, Finite-dimensional, Dimension, Extension of a subspace basis.

Skills
• Be able to determine whether a given set of vectors forms a basis for a vector space V.
• Be able to construct a basis for a given vector space V.
• Be able to extend a basis for a subspace of V to V itself.
• Be familiar with the standard bases on Rn, Mm×n(R),
and Pn.
• Be able to give the dimension of a vector space V.
• Be able to draw conclusions about the properties of a set of vectors in a vector space (i.e., spanning or linear independence) based solely on the size of the set.
• Understand the usefulness of Theorems 4.6.10 and 4.6.12.

True-False Review

For Questions 1–11, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.

1. A basis for a vector space V is a set S of vectors that spans V.

2. If V and W are vector spaces of dimensions n and m, respectively, and if n > m, then W is a subspace of V.

3. A vector space V can have many different bases.

4. dim[Pn] = dim[Rn].

5. If V is an n-dimensional vector space, then any set S of m vectors with m > n must span V.

6. Five vectors in P3 must be linearly dependent.

7. Two vectors in P3 must be linearly independent.

8. Ten vectors in M3(R) must be linearly dependent.

9. If V is an n-dimensional vector space, then every set S with fewer than n vectors can be extended to a basis for V.

10. Every set of vectors that spans a finite-dimensional vector space V contains a subset which forms a basis for V.

11. The set of all 3 × 3 upper triangular matrices forms a three-dimensional subspace of M3(R).

Problems

For Problems 1–5, determine whether the given set of vectors is a basis for Rn.

1. {(1, 1), (−1, 1)}.

2. {(1, 2, 1), (3, −1, 2), (1, 1, −1)}.

3. {(1, −1, 1), (2, 5, −2), (3, 11, −5)}.

4. {(1, 1, −1, 2), (1, 0, 1, −1), (2, −1, 1, −1)}.

5. {(1, 1, 0, 2), (2, 1, 3, −1), (−1, 1, 1, −2), (2, −1, 1, 2)}.

6. Determine all values of the constant k for which the set of vectors {(0, −1, 0, k), (1, 0, 1, 0), (0, 1, 1, 0), (k, 0, 2, 1)} is a basis for R4.
7. Determine a basis S for P3 , and hence, prove that
dim[P3 ] = 4. Be sure to prove that S is a basis.
8. Determine a basis S for P3 whose elements all have
the same degree. Be sure to prove that S is a basis.
For Problems 9–12, ﬁnd the dimension of the null space of
the given matrix A.
9. A = [1 3; −2 −6].

10. A = [0 0 0; 0 0 0; 0 1 0].

11. A = [1 −1 4; 2 3 −2; 1 2 −2].

12. A = [1 −1 2 3; 2 −1 3 4; 1 0 1 1; 3 −1 4 5].

13. Let S be the subspace of R3 that consists of all solutions to the equation x − 3y + z = 0. Determine a
basis for S , and hence, ﬁnd dim[S ].
14. Let S be the subspace of R3 consisting of all vectors
of the form (r, r − 2s, 3s − 5r), where r and s are
real numbers. Determine a basis for S , and hence, ﬁnd
dim[S ].
15. Let S be the subspace of M2 (R) consisting of all 2 × 2
upper triangular matrices. Determine a basis for S , and
hence, ﬁnd dim[S ].
16. Let S be the subspace of M2 (R) consisting of all 2 × 2
matrices with trace zero. Determine a basis for S , and
hence, ﬁnd dim[S ].
17. Let S be the subspace of R3 spanned by the vectors
v1 = (1, 0, 1), v2 = (0, 1, 1), v3 = (2, 0, 2). Determine a basis for S, and hence, find dim[S].

292 CHAPTER 4 Vector Spaces

18. Let S be the vector space consisting of the set of
all linear combinations of the functions f1 (x) =
ex , f2 (x) = e−x , f3 (x) = sinh(x). Determine a basis
for S, and hence, find dim[S].

19. Determine a basis for the subspace of M2 (R) spanned by

[ 1  3]   [0 0]   [−1 4]   [ 5 −6]
[−1  2],  [0 0],  [ 1 1],  [−5  1].

20. Let v1 = (1, 1) and v2 = (−1, 1).
(a) Show that {v1, v2} spans R2.
(b) Show that {v1, v2} is linearly independent.
(c) Conclude from (a) or (b) that {v1, v2} is a basis for R2. What theorem in this section allows you to draw this conclusion from either (a) or (b), without proving both?

21. Let v1 = (2, 1) and v2 = (3, 1).
(a) Show that {v1, v2} spans R2.
(b) Show that {v1, v2} is linearly independent.
(c) Conclude from (a) or (b) that {v1, v2} is a basis for R2. What theorem in this section allows you to draw this conclusion from either (a) or (b), without proving both?

22. Let v1 = (0, 6, 3), v2 = (3, 0, 3), and v3 = (6, −3, 0). Show that {v1, v2, v3} is a basis for R3. [Hint: You need not show that the set is both linearly independent and a spanning set for R3. Use a theorem from this section to shorten your work.]

23. Determine all values of the constant α for which {1 + αx^2, 1 + x + x^2, 2 + x} is a basis for P2.

24. Let p1(x) = 1 + x, p2(x) = x(x − 1), p3(x) = 1 + 2x^2. Show that {p1, p2, p3} is a basis for P2. [Hint: You need not show that the set is both linearly independent and a spanning set for P2. Use a theorem from this section to shorten your work.]

25. The Legendre polynomial of degree n, pn(x), is defined to be the polynomial solution of the differential equation

(1 − x^2)y'' − 2xy' + n(n + 1)y = 0,

which has been normalized so that pn(1) = 1. The first three Legendre polynomials are p0(x) = 1, p1(x) = x, and p2(x) = (1/2)(3x^2 − 1). Show that {p0, p1, p2} is a basis for P2. [The hint for the previous problem applies again.]

26. Let

A1 = [ 1 3]   A2 = [−1 1]   A3 = [1 0]   A4 = [0 −1]
     [−1 0],       [ 0 1],       [1 2],       [2  3].

(a) Show that {A1, A2, A3, A4} is a basis for M2 (R). [The hint on the previous problems applies again.]

(b) Express the vector

[5 6]
[7 8]

as a linear combination of the basis in (a).
27. Let

A = [1  1 −1  1]
    [2 −3  5 −6]
    [5  0  2 −3],

and let v1 = (−2, 7, 5, 0) and v2 = (3, −8, 0, 5).
(a) Show that {v1 , v2 } is a basis for the null space of
A.
(b) Using the basis in part (a), write an expression for
an arbitrary vector (x, y, z, w) in the null space
of A.
28. Let V = M3 (R) and let S be the subset of all vectors
in V such that the sum of the entries in each row and
in each column is zero.
(a) Find a basis and the dimension of S .
(b) Extend the basis in (a) to a basis for V .
29. Let V = M3 (R) and let S be the subset of all vectors
in V such that the sum of the entries in each row and
in each column is the same.
(a) Find a basis and the dimension of S .
(b) Extend the basis in (a) to a basis for V .
For Problems 30–31, Symn (R) and Skewn (R) denote the
vector spaces consisting of all real n × n matrices that are
symmetric and skew-symmetric, respectively.
30. Find a basis for Sym2 (R) and Skew2 (R), and show
that
dim[Sym2 (R)] + dim[Skew2 (R)] = dim[M2 (R)].

31. Determine the dimensions of Symn (R) and Skewn (R), and show that

dim[Symn (R)] + dim[Skewn (R)] = dim[Mn (R)].
For Problems 32–34, a subspace S of a vector space V is
given. Determine a basis for S and extend your basis for S
to obtain a basis for V .
32. V = R3, S is the subspace consisting of all points lying on the plane with Cartesian equation x + 4y − 3z = 0.
33. V = M2 (R), S is the subspace consisting of all matrices of the form

[a b]
[b a].

34. V = P2, S is the subspace consisting of all polynomials of the form (2a1 + a2)x^2 + (a1 + a2)x + (3a1 − a2).
35. Let S be a basis for Pn−1 . Prove that S ∪ {x n } is a basis
for Pn .
36. Generalize the previous problem as follows. Let S be a
basis for Pn−1 , and let p be any polynomial of degree
n. Prove that S ∪ {p} is a basis for Pn .
37. (a) What is the dimension of Cn as a real vector
space? Determine a basis.
(b) What is the dimension of Cn as a complex vector
space? Determine a basis.

4.7 Change of Basis
Throughout this section, we restrict our attention to vector spaces that are ﬁnite-dimensional.
If we have a (ﬁnite) basis for such a vector space V , then, since the vectors in a basis
span V , any vector in V can be expressed as a linear combination of the basis vectors.
The next theorem establishes that there is only one way in which we can do this.

Theorem 4.7.1 If V is a vector space with basis {v1 , v2 , . . . , vn }, then every vector v ∈ V can be written
uniquely as a linear combination of v1 , v2 , . . . , vn . Proof Since v1 , v2 , . . . , vn span V , every vector v ∈ V can be expressed as
v = a1 v1 + a2 v2 + · · · + an vn , (4.7.1) for some scalars a1 , a2 , . . . , an . Suppose also that
v = b1 v1 + b2 v2 + · · · + bn vn , (4.7.2) for some scalars b1 , b2 , . . . , bn . We will show that ai = bi for each i , which will prove
the uniqueness assertion of this theorem. Subtracting Equation (4.7.2) from Equation
(4.7.1) yields
(a1 − b1 )v1 + (a2 − b2 )v2 + · · · + (an − bn )vn = 0. (4.7.3) But {v1 , v2 , . . . , vn } is linearly independent, and so Equation (4.7.3) implies that
a1 − b1 = 0, a2 − b2 = 0, . . . , an − bn = 0.

That is, ai = bi for each i = 1, 2, . . . , n.

Remark The converse of Theorem 4.7.1 is also true. That is, if every vector v in
a vector space V can be written uniquely as a linear combination of the vectors in
{v1 , v2 , . . . , vn }, then {v1 , v2 , . . . , vn } is a basis for V . The proof of this fact is left as
an exercise (Problem 38).
Up to this point, we have not paid particular attention to the order in which the
vectors of a basis are listed. However, in the remainder of this section, this will become an important consideration. By an ordered basis for a vector space, we mean a basis in which we are keeping track of the order in which the basis vectors are listed.

DEFINITION 4.7.2
If B = {v1 , v2 , . . . , vn } is an ordered basis for V and v is a vector in V , then the
scalars c1 , c2 , . . . , cn in the unique n-tuple (c1 , c2 , . . . , cn ) such that
v = c1 v1 + c2 v2 + · · · + cn vn
are called the components of v relative to the ordered basis B = {v1 , v2 , . . . , vn }.
We denote the column vector consisting of the components of v relative to the ordered
basis B by [v]B , and we call [v]B the component vector of v relative to B .

Example 4.7.3 Determine the components of the vector v = (1, 7) relative to the ordered basis B = {(1, 2), (3, 1)}.

Solution: If we let v1 = (1, 2) and v2 = (3, 1), then since these vectors are not collinear, B = {v1 , v2 } is a basis for R2 . We must determine constants c1 , c2 such that

c1 v1 + c2 v2 = v.

We write

c1 (1, 2) + c2 (3, 1) = (1, 7).

This requires that

c1 + 3c2 = 1 and 2c1 + c2 = 7.

The solution to this system is (4, −1), which gives the components of v relative to the ordered basis B = {v1 , v2 }. (See Figure 4.7.1.) Thus,

v = 4v1 − v2 .

Therefore, we have

[v]B = [ 4]
       [−1].

[Figure 4.7.1: The components of the vector v = (1, 7) relative to the basis {(1, 2), (3, 1)}. The figure plots v1 = (1, 2), v2 = (3, 1), 4v1 = (4, 8), and v = (1, 7) in the xy-plane.]

Remark In the preceding example, the component vector of v = (1, 7) relative to
the ordered basis B′ = {(3, 1), (1, 2)} is

[v]B′ = [−1]
        [ 4].

Thus, even though the bases B and B′ contain the same vectors, the fact that the vectors are listed in different order affects the components of the vectors in the vector space.
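The solve step in Example 4.7.3 is just a 2 × 2 linear system whose coefficient matrix has the basis vectors as its columns. As a numerical sanity check (not part of the original text; it assumes NumPy is available):

```python
import numpy as np

# Columns are the ordered basis vectors v1 = (1, 2) and v2 = (3, 1).
B = np.array([[1.0, 3.0],
              [2.0, 1.0]])
v = np.array([1.0, 7.0])

# Solving B @ c = v gives the components of v relative to the basis.
c = np.linalg.solve(B, v)
print(c)  # [ 4. -1.], i.e. v = 4*v1 - v2, so [v]_B = (4, -1)
```

The same pattern works in R^n for any ordered basis: stack the basis vectors as columns and solve.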
Example 4.7.4 In P2, determine the component vector of p(x) = 5 + 7x − 3x^2 relative to the following:

(a) The standard (ordered) basis B = {1, x, x^2}.

(b) The ordered basis C = {1 + x, 2 + 3x, 5 + x + x^2}.

Solution:
(a) The given polynomial is already written as a linear combination of the standard
basis vectors. Consequently, the components of p(x) = 5 + 7x − 3x 2 relative to
the standard basis B are 5, 7, and −3. We write

[p(x)]B = [ 5]
          [ 7]
          [−3].
(b) The components of p(x) = 5 + 7x − 3x 2 relative to the ordered basis
C = {1 + x, 2 + 3x, 5 + x + x 2 }
are c1 , c2 , and c3 , where
c1 (1 + x) + c2 (2 + 3x) + c3 (5 + x + x 2 ) = 5 + 7x − 3x 2 .
That is,
(c1 + 2c2 + 5c3 ) + (c1 + 3c2 + c3 )x + c3 x 2 = 5 + 7x − 3x 2 .
Hence, c1 , c2 , and c3 satisfy
c1 + 2c2 + 5c3 = 5,
c1 + 3c2 + c3 = 7,
c3 = −3.
The augmented matrix of this system has reduced row-echelon form

[1 0 0  40]
[0 1 0 −10]
[0 0 1  −3],
so that the system has solution (40, −10, −3), which gives the required components. Hence, we can write
5 + 7x − 3x 2 = 40(1 + x) − 10(2 + 3x) − 3(5 + x + x 2 ).
Therefore,

[p(x)]C = [ 40]
          [−10]
          [ −3].

Change-of-Basis Matrix
The preceding example naturally motivates the following question: If we are given two
different ordered bases for an n-dimensional vector space V , say
B = {v1 , v2 , . . . , vn } and C = {w1 , w2 , . . . , wn }, (4.7.4) and a vector v in V , how are [v]B and [v]C related? In practical terms, we may know
the components of v relative to B and wish to know the components of v relative to
a different ordered basis C . This question actually arises quite often, since different
bases are advantageous in different circumstances, so it is useful to be able to convert components of a vector relative to one basis to components relative to another basis.
The tool we need in order to do this efﬁciently is the change-of-basis matrix. Before
we describe this matrix, we pause to record the linearity properties satisﬁed by the
components of a vector. These properties will facilitate the discussion that follows.
Lemma 4.7.5 Let V be a vector space with ordered basis B = {v1 , v2 , . . . , vn }, let x and y be vectors
in V , and let c be a scalar. Then we have
(a) [x + y]B = [x]B + [y]B .
(b) [cx]B = c[x]B . Proof Write
x = a1 v1 + a2 v2 + · · · + an vn and y = b1 v1 + b2 v2 + · · · + bn vn , so that
x + y = (a1 + b1 )v1 + (a2 + b2 )v2 + · · · + (an + bn )vn .
Hence,

           [a1 + b1]   [a1]   [b1]
[x + y]B = [a2 + b2] = [a2] + [b2] = [x]B + [y]B ,
           [   ⋮   ]   [ ⋮ ]   [ ⋮ ]
           [an + bn]   [an]   [bn]

which establishes (a). The proof of (b) is left as an exercise (Problem 37).

DEFINITION 4.7.6
Let V be an n-dimensional vector space with ordered bases B and C given in (4.7.4).
We deﬁne the change-of-basis matrix from B to C by
PC←B = [[v1]C , [v2]C , . . . , [vn]C ]. (4.7.5)

In words, we determine the components of each vector in the “old basis” B with respect to the “new basis” C and write these component vectors as the columns of the change-of-basis matrix.

Remark Of course, there is also a change-of-basis matrix from C to B , given by
PB ←C = [w1 ]B , [w2 ]B , . . . , [wn ]B . We will see shortly that the matrices PB ←C and PC ←B are intimately related.
Our ﬁrst order of business at this point is to see why the matrix in (4.7.5) converts
the components of a vector relative to B into components relative to C . Let v be a vector
in V and write
v = a1 v1 + a2 v2 + · · · + an vn .

Then

       [a1]
[v]B = [a2]
       [ ⋮ ]
       [an].
Hence, using Theorem 2.2.9 and Lemma 4.7.5, we have
PC ←B [v]B = a1 [v1 ]C + a2 [v2 ]C +· · ·+ an [vn ]C = [a1 v1 + a2 v2 +· · ·+ an vn ]C = [v]C .
This calculation shows that premultiplying the component vector of v relative to B by
the change of basis matrix PC ←B yields the component vector of v relative to C :
[v]C = PC←B [v]B . (4.7.6)

Example 4.7.7 Let V = R2 , B = {(1, 2), (3, 4)}, C = {(7, 3), (4, 2)}, and v = (1, 0). It is routine to
verify that B and C are bases for V .
(a) Determine [v]B and [v]C .
(b) Find PC ←B and PB ←C .
(c) Use (4.7.6) to compute [v]C , and compare your answer with (a). Solution:
(a) Solving (1, 0) = a1 (1, 2) + a2 (3, 4), we find a1 = −2 and a2 = 1. Hence,

[v]B = [−2]
       [ 1].

Likewise, setting (1, 0) = b1 (7, 3) + b2 (4, 2), we find b1 = 1 and b2 = −1.5. Hence,

[v]C = [  1 ]
       [−1.5].

(b) A short calculation shows that

[(1, 2)]C = [−3 ]    and    [(3, 4)]C = [−5 ]
            [5.5]                       [9.5].

Thus, we have

PC←B = [−3  −5 ]
       [5.5 9.5].

Likewise, another short calculation shows that

[(7, 3)]B = [−9.5]    and    [(4, 2)]B = [−5]
            [ 5.5]                       [ 3].

Hence,

PB←C = [−9.5 −5]
       [ 5.5  3].

(c) We compute as follows:

PC←B [v]B = [−3  −5 ] [−2]   [  1 ]
            [5.5 9.5] [ 1] = [−1.5] = [v]C ,

as we found in part (a).
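The matrices of Example 4.7.7 can be rebuilt numerically by factoring through the standard basis E = {e1 , e2 }, the device derived at the end of this section: PC←B = (PE←C)−1 PE←B. A sketch with NumPy (an assumption of this note; the matrices themselves are taken from the example):

```python
import numpy as np

# P_{E<-B} and P_{E<-C}: basis vectors written as columns in the standard basis.
PEB = np.array([[1.0, 3.0],   # B = {(1, 2), (3, 4)}
                [2.0, 4.0]])
PEC = np.array([[7.0, 4.0],   # C = {(7, 3), (4, 2)}
                [3.0, 2.0]])

# P_{C<-B} = (P_{E<-C})^{-1} P_{E<-B}, and symmetrically for P_{B<-C}.
PCB = np.linalg.solve(PEC, PEB)   # [[-3., -5.], [5.5, 9.5]]
PBC = np.linalg.solve(PEB, PEC)   # [[-9.5, -5.], [5.5, 3.]]

print(np.allclose(PCB @ PBC, np.eye(2)))   # True: the matrices are inverses
print(PCB @ np.array([-2.0, 1.0]))         # [ 1.  -1.5] = [v]_C, as in part (c)
```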
The reader may have noticed a close resemblance between the two matrices PC ←B
and PB ←C computed in part (b) of the preceding example. In fact, a brief calculation
shows that
PC ←B PB ←C = I2 = PB ←C PC ←B .
The two change-of-basis matrices are inverses of each other. This turns out to be always
true. To see why, consider again Equation (4.7.6). If we premultiply both sides of (4.7.6)
by the matrix PB ←C , we get
PB ←C [v]C = PB ←C PC ←B [v]B . (4.7.7) Rearranging the roles of B and C in (4.7.6), the left side of (4.7.7) is simply [v]B . Thus,
PB ←C PC ←B [v]B = [v]B .
Since this is true for any vector [v]B in Rn , this implies that
PB ←C PC ←B = In ,
the n × n identity matrix. Likewise, a similar calculation shows that
PC ←B PB ←C = In .
Thus, we have proved that
The matrices PC←B and PB←C are inverses of one another.

Example 4.7.8 Let V = P2 , and let B = {1, 1 + x, 1 + x + x^2}, and C = {2 + x + x^2, x + x^2, x}. It is
routine to verify that B and C are bases for V . Find the change-of-basis matrix from B
to C , and use it to calculate the change-of-basis matrix from C to B . Solution: We set 1 = a1 (2 + x + x 2 ) + a2 (x + x 2 ) + a3 x . With a quick calculation,
we ﬁnd that a1 = 0.5, a2 = −0.5, and a3 = 0. Next, we set 1 + x = b1 (2 + x + x 2 )
+ b2 (x + x 2 ) + b3 x , and we ﬁnd that b1 = 0.5, b2 = −0.5, and b3 = 1. Finally, we
set 1 + x + x 2 = c1 (2 + x + x 2 ) + c2 (x + x 2 ) + c3 x , from which it follows that
c1 = 0.5, c2 = 0.5, and c3 = 0. Hence, we have

        [a1 b1 c1]   [ 0.5  0.5  0.5]
PC←B  = [a2 b2 c2] = [−0.5 −0.5  0.5]
        [a3 b3 c3]   [ 0    1    0  ].

Thus, we have

PB←C = (PC←B)−1 = [1 −1 −1]
                  [0  0  1]
                  [1  1  0].

In much the same way that we showed above that the matrices PC←B and PB←C
are inverses of one another, we can make the following observation.
Theorem 4.7.9 Let V be a vector space with ordered bases A, B , and C . Then
PC ←A = PC ←B PB ←A . (4.7.8) Proof Using (4.7.6), for every v ∈ V , we have
PC ←B PB ←A [v]A = PC ←B [v]B = [v]C = PC ←A [v]A ,
so that premultiplication of [v]A by either matrix in (4.7.8) yields the same result. Hence,
the matrices on either side of (4.7.8) are the same.
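The same numerics extend to polynomial bases once each polynomial is identified with its coefficient vector relative to the standard basis {1, x, x^2} of P2. A re-check of Example 4.7.8 under that identification (a sketch assuming NumPy; the bases are the ones from the example):

```python
import numpy as np

# Coefficient columns relative to {1, x, x^2}.
PEB = np.array([[1.0, 1.0, 1.0],   # B = {1, 1 + x, 1 + x + x^2}
                [0.0, 1.0, 1.0],
                [0.0, 0.0, 1.0]])
PEC = np.array([[2.0, 0.0, 0.0],   # C = {2 + x + x^2, x + x^2, x}
                [1.0, 1.0, 1.0],
                [1.0, 1.0, 0.0]])

PCB = np.linalg.solve(PEC, PEB)    # change-of-basis matrix from B to C
print(PCB)     # rows: [0.5, 0.5, 0.5], [-0.5, -0.5, 0.5], [0., 1., 0.]

PBC = np.linalg.inv(PCB)           # P_{B<-C}, matching the example
```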
We conclude this section by using Theorem 4.7.9 to show how an arbitrary change-of-basis matrix PC←B in Rn can be expressed as a product of change-of-basis matrices
involving the standard basis E = {e1 , e2 , . . . , en } of Rn . Let B = {v1 , v2 , . . . , vn } and
C = {w1 , w2 , . . . , wn } be arbitrary ordered bases for Rn . Since [v]E = v for all column
vectors v in Rn , the matrices
PE ←B = [[v1 ]E , [v2 ]E , . . . , [vn ]E ] = [v1 , v2 , . . . , vn ]
and
PE ←C = [[w1 ]E , [w2 ]E , . . . , [wn ]E ] = [w1 , w2 , . . . , wn ]
can be written down immediately. Using these matrices, together with Theorem 4.7.9,
we can compute the arbitrary change-of-basis matrix PC ←B with ease:
PC←B = PC←E PE←B = (PE←C)−1 PE←B .

Exercises for 4.7

Key Terms
Ordered basis, Components of a vector relative to an ordered basis, Change-of-basis matrix.

True-False Review
For Questions 1–8, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false.

Skills
• Be able to ﬁnd the components of a vector relative to
a given ordered basis for a vector space V .
• Be able to compute the change-of-basis matrix for a
vector space V from one ordered basis B to another
ordered basis C .
• Be able to use the change-of-basis matrix from B to
C to determine the components of a vector relative to
C from the components of the vector relative to B .
• Be familiar with the relationship between the two
change-of-basis matrices PC←B and PB←C .

1. Every vector in a finite-dimensional vector space V
can be expressed uniquely as a linear combination of
vectors comprising a basis for V .
2. The change-of-basis matrix PB ←C acts on the component vector of a vector v relative to the basis C and
produces the component vector of v relative to the basis B .
3. A change-of-basis matrix is always a square matrix.
4. A change-of-basis matrix is always invertible.

300 CHAPTER 4 Vector Spaces

5. For any vectors v and w in a finite-dimensional vector
space V with basis B , we have [v−w]B = [v]B −[w]B .
6. If the bases B and C for a vector space V contain the
same set of vectors, then [v]B = [v]C for every vector
v in V .
7. If B and C are bases for a ﬁnite-dimensional vector
space V , and v and w are in V such that [v]B = [w]C ,
then v = w.
8. The matrix PB ←B is the identity matrix for any basis
B for V .

Problems
For Problems 1–13, determine the component vector of the
given vector in the vector space V relative to the given ordered basis B .
1. V = R2 ; B = {(2, −2), (1, 4)}; v = (5, −10).
2. V = R2 ; B = {(−1, 3), (3, 2)}; v = (8, −2).
3. V = R3 ; B = {(1, 0, 1), (1, 1, −1), (2, 0, 1)}; v =
(−9, 1, −8).
4. V = R3 ; B = {(1, −6, 3), (0, 5, −1), (3, −1, −1)};
v = (1, 7, 7).
5. V = R3 ; B = {(3, −1, −1), (1, −6, 3), (0, 5, −1)};
v = (1, 7, 7).
6. V = R3 ; B = {(−1, 0, 0), (0, 0, −3), (0, −2, 0)};
v = (5, 5, 5).
7. V = P2 ; B = {x 2 + x, 2 + 2x, 1}; p(x) = −4x 2 +
2x + 6.
8. V = P2 ; B = {5 − 3x, 1, 1 + 2x 2 }; p(x) = 15 −
18x − 30x 2 . 12. V = M2 (R);
2 −1
04
11
3 −1
B=
,
,
,
35
−1 1
11
25
−10 16
A=
.
−15 −14
13. V = M2 (R);
−1 1
13
10
0 −1
B=
,
,
,
01
−1 0
12
23
56
A=
.
78 ; ; 14. Let v1 = (0, 6, 3), v2 = (3, 0, 3), and v3 =
(6, −3, 0). Determine the component vector of an arbitrary vector v = (x, y, z) relative to the ordered
basis {v1 , v2 , v3 }.
15. Let p1 (x) = 1 + x , p2 (x) = x(x − 1), and p3 (x) =
1 + 2x 2 . Determine the component vector of an arbitrary polynomial p(x) = a0 + a1 x + a2 x 2 relative to
the ordered basis {p1 , p2 , p3 }.
For Problems 16–25, ﬁnd the change-of-basis matrix PC ←B
from the given ordered basis B to the given ordered basis C
of the vector space V .
16. V = R2 ; B = {(9, 2), (4, −3)}; C = {(2, 1),
(−3, 1)}.
17. V = R2 ; B = {(−5, −3), (4, 28)}; C = {(6, 2),
(1, −1)}.
18. V = R3 ; B = {(2, −5, 0), (3, 0, 5), (8, −2, −9)};
C = {(1, −1, 1), (2, 0, 1), (0, 1, 3)}.
19. V = R3 ; B = {(−7, 4, 4), (4, 2, −1), (−7, 5, 0)};
C = {(1, 1, 0), (0, 1, 1), (3, −1, −1)}.
20. V = P1 ; B = {7 − 4x, 5x }; C = {1 − 2x, 2 + x }.
21. V = P2 ; B = {−4 + x − 6x 2 , 6 + 2x 2 , −6 − 2x + 4x 2 };
C = {1 − x + 3x 2 , 2, 3 + x 2 }. 9. V = P3 ; B = {1, 1 + x, 1 + x + x 2 , 1 + x + x 2 + x 3 };
p(x) = 4 − x + x 2 − 2x 3 . 22. V = P3 ;
B = {−2+3x +4x 2 −x 3 , 3x +5x 2 +2x 3 , −5x 2 −5x 3 ,
4 + 4x + 4x 2 }; C = {1 − x 3 , 1 + x, x + x 2 , x 2 + x 3 }. 10. V = P3 ; B = {x 3 + x 2 , x 3 − 1, x 3 + 1, x 3 + x };
p(x) = 8 + x + 6x 2 + 9x 3 . 23. V = P2 ; B = {2 + x 2 , −1 − 6x + 8x 2 , −7 − 3x − 9x 2 };
C = {1 + x, −x + x 2 , 1 + 2x 2 }. 11. V = M2 (R);
11
11
11
10
B=
,
,
,
11
10
00
00
−3 −2
A=
.
−1 2 24. V = M2 (R);
10
0 −1
35
−2 −4
B=
,
,
,
−1 −2
30
00
00
11
11
11
10
C=
,
,
,
.
11
10
00
00 ; ; i i i i i i i “main”
2007/2/16
page 301
i 4.8 Row Space and Column Space 301 33. v = (−1, 2, 0); V , B , and C from Problem 19. 25. V = M2 (R); B = {E12 , E22 , E21 , E11 };
C = {E22 , E11 , E21 , E12 }.
For Problems 26–31, ﬁnd the change-of-basis matrix PB ←C
from the given basis C to the given basis B of the vector
space V . 34. p(x) = 6 − 4x ; V , B , and C from Problem 20.
35. p(x) = 5 − x + 3x 2 ; V , B , and C from Problem 21.
−1 −1
; V , B , and C from Problem 24.
−4 5 26. V , B , and C from Problem 16. 36. A = 27. V , B , and C from Problem 17. 37. Prove part (b) of Lemma 4.7.5. 28. V , B , and C from Problem 18. 38. Prove that if every vector v in a vector space V can be
written uniquely as a linear combination of the vectors
in {v1 , v2 , . . . , vn }, then {v1 , v2 , . . . , vn } is a basis for
V. 29. V , B , and C from Problem 20.
30. V , B , and C from Problem 22.
31. V , B , and C from Problem 25.
For Problems 32–36, verify Equation (4.7.6) for the given
vector.
32. v = (−5, 3); V , B , and C from Problem 16. 4.8 39. Show that if B is a basis for a ﬁnite-dimensional vector space V , and C is a basis obtained by reordering
the vectors in B , then the matrices PC ←B and PB ←C
each contain exactly one 1 in each row and column,
and zeros elsewhere. Row Space and Column Space
In this section, we consider two vector spaces that can be associated with any m × n
matrix. For simplicity, we will assume that the matrices have real entries, although the
results that we establish can easily be extended to matrices with complex entries. Row Space
Let A = [aij ] be an m × n real matrix. The row vectors of this matrix are row n-vectors,
and therefore they can be associated with vectors in Rn . The subspace of Rn spanned by
these vectors is called the row space of A and denoted rowspace(A). For example, if
A = [2 −1  3]
    [5  9 −7],

then
rowspace(A) = span{(2, −1, 3), (5, 9, −7)}.
For a general m × n matrix A, how can we obtain a basis for rowspace(A)? By its
very deﬁnition, the row space of A is spanned by the row vectors of A, but these may
not be linearly independent, hence the row vectors of A do not necessarily form a basis
for rowspace(A). We wish to determine a systematic and efﬁcient method for obtaining
a basis for the row space. Perhaps not surprisingly, it involves the use of elementary row
operations.
If we perform elementary row operations on A, then we are merely taking linear
combinations of vectors in rowspace(A), and we therefore might suspect that the row
space of the resulting matrix coincides with the row space of A. This is the content of
the following theorem.
Theorem 4.8.1 If A and B are row-equivalent matrices, then
rowspace(A) = rowspace(B).

Proof We establish that the matrix that results from performing any of the three elementary row operations on a matrix A has the same row space as the row space of A. If
we interchange two rows of A, then clearly we have not altered the row space, since we
still have the same set of row vectors (listed in a different order).
Now let a1 , a2 , . . . , am denote the row vectors of A. We combine the remaining
two types of elementary row operations by considering the result of replacing ai by the
vector r ai + s aj , where r (≠ 0) and s are real numbers. If s = 0, then this corresponds
to scaling ai by a factor of r , whereas if r = 1 and s = 0, this corresponds to adding a
multiple of row j to row i . If B denotes the resulting matrix, then
rowspace(B) = {c1 a1 + c2 a2 + · · · + ci (r ai + s aj ) + · · · + cm am }
= {c1 a1 + c2 a2 + · · · + (rci )ai + · · · + (cj + sci )aj + · · · + cm am }
= {c1 a1 + c2 a2 + · · · + di ai + · · · + dj aj + · · · + cm am },
where di = rci and dj = cj + sci . Note that di and dj can take on arbitrary values,
hence the vectors in rowspace(B) consist precisely of arbitrary linear combinations of
a1 , a2 , . . . , am . That is,
rowspace(B) = span{a1 , a2 , . . . , am } = rowspace(A).
The previous theorem is the key to determining a basis for rowspace(A). The idea
we use is to reduce A to row-echelon form. If d1 , d2 , . . . , dk denote the nonzero row
vectors in this row-echelon form, then from the previous theorem,
rowspace(A) = span{d1 , d2 , . . . , dk }.
We now establish that {d1 , d2 , . . . , dk } is linearly independent. Consider
c1 d1 + c2 d2 + · · · + ck dk = 0. (4.8.1) Owing to the positioning of the leading ones in a row-echelon matrix, each of the row
vectors d1 , d2 , . . . , dk −1 will have a leading one in a position where each succeeding
row vector in the row-echelon form has a zero. Hence, Equation (4.8.1) is satisﬁed only if
c1 = c2 = · · · = ck −1 = 0,
and therefore, it reduces to
ck dk = 0.
However, dk is a nonzero vector, and so we must have ck = 0. Consequently, all of
the constants in Equation (4.8.1) must be zero, and therefore {d1 , d2 , . . . , dk } not only
spans rowspace(A), but also is linearly independent. Hence, {d1 , d2 , . . . , dk } is a basis
for rowspace(A). We have therefore established the next theorem.
Theorem 4.8.2 The set of nonzero row vectors in any row-echelon form of an m × n matrix A is a basis
for rowspace(A).
As a consequence of the preceding theorem, we can conclude that all row-echelon
forms of A have the same number of nonzero rows. For if this were not the case, then we
could ﬁnd two bases for rowspace(A) containing a different number of vectors, which
would contradict Corollary 4.6.5. We can therefore consider Theorem 2.4.10 as a direct
consequence of Theorem 4.8.2. i i i i i i i “main”
2007/2/16
page 303
i 4.8 Example 4.8.3 Row Space and Column Space 303 Determine a basis for the row space of 1 −1 1 3 2 2 −1 1 5 1 A= 3 −1 1 7 0 .
0 1 −1 −1 −3 Solution: We ﬁrst reduce A to row-echelon form: 1 −1 1 3 2
1 −1 1 3 2
1 −1 −1 −3 2 0 1 −1 −1 −3 1 0 A∼ 0 2 −2 −2 −6 ∼ 0 0 0 0 0 .
0 1 −1 −1 −3
00000
1. A12 (−2), A13 (−3) 2. A23 (−2), A24 (−1) Consequently, a basis for rowspace(A) is {(1, −1, 1, 3, 2), (0, 1, −1, −1, −3)}, and
therefore rowspace(A) is a two-dimensional subspace of R5 .
Theorem 4.8.2 also gives an efﬁcient method for determining a basis for the subspace
of Rn spanned by a given set of vectors. If we let A be the matrix whose row vectors are
the given vectors from Rn , then rowspace(A) coincides with the subspace of Rn spanned
by those vectors. Consequently, the nonzero row vectors in any row-echelon form of A
will be a basis for the subspace spanned by the given set of vectors. Example 4.8.4 Determine a basis for the subspace of R4 spanned by {(1, 2, 3, 4),(4, 5, 6, 7),(7, 8, 9, 10)}. Solution: We ﬁrst let A denote the matrix that has the given vectors as row vectors. Thus, 123 4
A = 4 5 6 7 .
7 8 9 10 We now reduce A to row-echelon form: 12
3
4
12
3
4
1234
1
2
3
2
3 ∼ 0 1 2 3.
A ∼ 0 −3 −6 −9 ∼ 0 1
0 −6 −12 −18
0 −6 −12 −18
0000
1. A12 (−4), A13 (−7) 2. M2 (− 1 )
3 3. A23 (6) Consequently, a basis for the subspace of R4 spanned by the given vectors is {(1, 2, 3, 4),
(0, 1, 2, 3)}. We see that the given vectors span a two-dimensional subspace of R4 . Column Space
If A is an m × n matrix, the column vectors of A are column m-vectors and therefore
can be associated with vectors in Rm . The subspace of Rm spanned by these vectors is
called the column space of A and denoted colspace(A).

Example 4.8.5 For the matrix
A = [2 −1  3]
    [5  9 −7],

we have colspace(A) = span{(2, 5), (−1, 9), (3, −7)}.
We now consider the problem of determining a basis for the column space of an
m × n matrix A. Since the column vectors of A coincide with the row vectors of AT , it
follows that
colspace(A) = rowspace(AT ).
Hence one way to obtain a basis for colspace(A) would be to reduce AT to row-echelon
form, and then the nonzero row vectors in the resulting matrix would form a basis for
colspace(A).
There is, however, a better method for determining a basis for colspace(A) directly
from any row-echelon form of A. The derivation of this technique is somewhat involved
and will require full attention.
We begin by determining the column space of an m × n reduced row-echelon matrix.
In order to introduce the basic ideas, consider the particular reduced row-echelon matrix

E = [1 2 0 3 0]
    [0 0 1 5 0]
    [0 0 0 0 1]
    [0 0 0 0 0].
In this case, we see that the ﬁrst, third, and ﬁfth column vectors, which are the column
vectors containing the leading ones, coincide with the ﬁrst three standard basis vectors
in R4 (written as column vectors):

     [1]        [0]        [0]
e1 = [0],  e2 = [1],  e3 = [0].
     [0]        [0]        [1]
     [0]        [0]        [0]
Consequently, these column vectors are linearly independent. Furthermore, the remaining column vectors in E (those that do not contain leading ones) are both linear combinations of e1 and e2 , columns that do contain leading ones. Therefore {e1 , e2 , e3 } is a linearly
independent set of vectors that spans colspace(E), and so a basis for colspace(E) is
{(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)}.
Clearly, the same arguments apply to any reduced row-echelon matrix E . Thus, if
E contains k (necessarily ≤ n) leading ones, a basis for colspace(E) is {e1 , e2 , . . . , ek }.
Now consider an arbitrary m × n matrix A, and let E denote the reduced row-echelon
form of A. Recall from Chapter 2 that performing elementary row operations on a linear
system does not alter its solution set. Hence, the two homogeneous systems of equations
Ac = 0 and Ec = 0 (4.8.2) have the same solution sets. If we write A and E in column-vector form as A =
[a1 , a2 , . . . , an ] and E = [d1 , d2 , . . . , dn ], respectively, then the two systems in (4.8.2)
can be written as
c1 a1 + c2 a2 + · · · + cn an = 0,
c1 d1 + c2 d2 + · · · + cn dn = 0,
respectively. Thus, the fact that these two systems have the same solution set means that
a linear dependence relationship will hold between the column vectors of E if and only if precisely the same linear dependence relation holds between the corresponding column
vectors of A. In particular, since our previous work shows that the column vectors in
E that contain leading ones give a basis for colspace(E), they give a maximal linearly
independent set in colspace(E). Therefore, the corresponding column vectors in A will
also be a maximal linearly independent set in colspace(A). Consequently, this set of
vectors from A will be a basis for colspace(A).
We have therefore shown that the set of column vectors of A corresponding to those
column vectors containing leading ones in the reduced row-echelon form of A is a basis
for colspace(A). But do we have to reduce A to reduced row-echelon form? The answer
is no. We need only reduce A to row-echelon form. The reason is that going further
to reduce a matrix from row-echelon form to reduced row-echelon form does not alter
the position or number of leading ones in a matrix, and therefore the column vectors
containing leading ones in any row-echelon form of A will correspond to the column
vectors containing leading ones in the reduced row-echelon form of A. Consequently,
we have established the following result.
Theorem 4.8.6 Let A be an m × n matrix. The set of column vectors of A corresponding to those column
vectors containing leading ones in any row-echelon form of A is a basis for colspace(A).

Example 4.8.7 Determine a basis for colspace(A) if

A = [ 1   2 −1 −2 −1]
    [ 2   4 −2 −3 −1]
    [ 5  10 −5 −3 −1]
    [−3  −6  3  2  1].

Solution: We first reduce A to row-echelon form:

    [1 2 −1 −2 −1]     [1 2 −1 −2 −1]     [1 2 −1 −2 −1]     [1 2 −1 −2 −1]
A ~ [0 0  0  1  1]  ~  [0 0  0  1  1]  ~  [0 0  0  1  1]  ~  [0 0  0  1  1]
    [0 0  0  7  4]     [0 0  0  0 −3]     [0 0  0  0  1]     [0 0  0  0  1]
    [0 0  0 −4 −2]     [0 0  0  0  2]     [0 0  0  0  2]     [0 0  0  0  0].

1. A12(−2), A13(−5), A14(3)  2. A23(−7), A24(4)  3. M3(−1/3)  4. A34(−2)
Since the ﬁrst, fourth, and ﬁfth column vectors in this row-echelon form of A contain
the leading ones, it follows from Theorem 4.8.6 that the set of corresponding column
vectors in A is a basis for colspace(A). Consequently, a basis for colspace(A) is
{(1, 2, 5, −3), (−2, −3, −3, 2), (−1, −1, −1, 1)}.
Hence, colspace(A) is a three-dimensional subspace of R4 . Notice from the row-echelon
form of A that a basis for rowspace(A) is {(1, 2, −1, −2, −1), (0, 0, 0, 1, 1), (0, 0, 0,
0, 1)} so that rowspace(A) is a three-dimensional subspace of R5 .
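Theorem 4.8.6 translates directly into a computation: row-reduce, read off the pivot columns, and take the corresponding columns of A itself. A sketch for the matrix of Example 4.8.7, assuming SymPy:

```python
from sympy import Matrix

A = Matrix([[ 1,  2, -1, -2, -1],
            [ 2,  4, -2, -3, -1],
            [ 5, 10, -5, -3, -1],
            [-3, -6,  3,  2,  1]])

# rref() reports the pivot (leading-one) columns of the reduced form; the
# pivot positions agree with those in any row-echelon form of A.
_, pivots = A.rref()
print(pivots)   # (0, 3, 4): the first, fourth, and fifth columns

basis = [list(A.col(j)) for j in pivots]
print(basis)    # [[1, 2, 5, -3], [-2, -3, -3, 2], [-1, -1, -1, 1]]
```

These are exactly the basis vectors for colspace(A) found in the example.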
We now summarize the discussion of row space and column space.

Summary: Let A be an m × n matrix. In order to determine a basis for rowspace(A)
and a basis for colspace(A), we reduce A to row-echelon form.
1. The row vectors containing the leading ones in the row-echelon form
give a basis for rowspace(A) (a subspace of Rn ).
2. The column vectors of A corresponding to the column vectors containing
the leading ones in the row-echelon form give a basis for colspace(A)
(a subspace of Rm ).
Since the number of vectors in a basis for rowspace(A) or in a basis for colspace(A)
is equal to the number of leading ones in any row-echelon form of A, it follows that
dim[rowspace(A)] = dim[colspace(A)] = rank(A).
However, we emphasize that rowspace(A) and colspace(A) are, in general, subspaces
of different vector spaces. In Example 4.8.7, for instance, rowspace(A) is a subspace
of R5 , while colspace(A) is a subspace of R4 . For an m × n matrix, rowspace(A) is a
subspace of Rn, whereas colspace(A) is a subspace of Rm.

Exercises for 4.8

Key Terms
Row space, Column space.

Skills
• Be able to compute a basis for the row space of a matrix.
• Be able to compute a basis for the column space of a matrix.

True-False Review
For Questions 1–6, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false.
1. If A is an m × n matrix such that rowspace(A) =
colspace(A), then m = n.
2. A basis for the row space of a matrix A consists of the
row vectors of any row-echelon form of A.
3. The nonzero column vectors of a row-echelon form of
a matrix A form a basis for colspace(A).
4. The sets rowspace(A) and colspace(A) have the same
dimension.
5. If A is an n × n invertible matrix, then rowspace(A) =
Rn.

6. If A is an n × n invertible matrix, then colspace(A) = Rn.

Problems
For Problems 1–6, determine a basis for rowspace(A) and a
basis for colspace(A).
1. A = [  1 −2
         −3  6 ].

2. A = [ 1 1  −3 2
         3 4 −11 7 ].

3. A = [ 1  2  3
         5  6  7
         9 10 11 ].

4. A = [ 0  3  1
         0 −6 −2
         0 12  4 ].

5. A = [ 1  2 −1  3
         3  6 −3  5
         1  2 −1 −1
         5 10 −5  7 ].

6. A = [ 1 −1  2 3
         1  1 −2 6
         3  1  4 2 ].

For Problems 7–10, use the ideas in this section to determine a basis for the subspace of Rn spanned by the given set of vectors.

7. {(1, −1, 2), (5, −4, 1), (7, −5, −4)}.

8. {(1, 3, 3), (1, 5, −1), (2, 7, 4), (1, 4, 1)}.

9. {(1, 1, −1, 2), (2, 1, 3, −4), (1, 2, −6, 10)}.

10. {(1, 4, 1, 3), (2, 8, 3, 5), (1, 4, 0, 4), (2, 8, 2, 6)}.

11. Let

A = [ −3  9
       1 −3 ].

Find a basis for rowspace(A) and colspace(A). Make a sketch to show each subspace in the xy-plane.

12. Let

A = [ 1  2  4
      5 11 21
      3  7 13 ].

(a) Find a basis for rowspace(A) and colspace(A).
(b) Show that rowspace(A) corresponds to the plane with Cartesian equation 2x + y − z = 0, whereas colspace(A) corresponds to the plane with Cartesian equation 2x − y + z = 0.

13. Give examples to show how each type of elementary row operation applied to a matrix can change the column space of the matrix.

14. Give an example of a square matrix A whose row space and column space have no nonzero vectors in common.

4.9 The Rank-Nullity Theorem
In Section 4.3, we deﬁned the null space of a real m × n matrix A to be the set of all real
solutions to the associated homogeneous linear system Ax = 0. Thus,
nullspace(A) = {x ∈ Rn : Ax = 0}.
The dimension of nullspace(A) is referred to as the nullity of A and is denoted nullity(A).
In order to ﬁnd nullity(A), we need to determine a basis for nullspace(A). Recall that if
rank(A) = r , then any row-echelon form of A contains r leading ones, which correspond
to the bound variables in the linear system. Thus, there are n − r columns without leading
ones, which correspond to n − r free variables in the solution of the system Ax = 0. We might therefore
suspect that nullity(A) = n − r . Our next theorem, often referred to as the Rank-Nullity
Theorem, establishes that this is indeed the case.

Theorem 4.9.1 (Rank-Nullity Theorem)
For any m × n matrix A,
rank(A) + nullity(A) = n.    (4.9.1)

Proof If rank(A) = n, then by the Invertible Matrix Theorem, the only solution to
Ax = 0 is the trivial solution x = 0. Hence, in this case, nullspace(A) = {0}, so
nullity(A) = 0 and Equation (4.9.1) holds.
Now suppose rank(A) = r < n. In this case, there are n − r > 0 free variables
in the solution to Ax = 0. Let t1 , t2 , . . . , tn−r denote these free variables (chosen as
those variables not attached to a leading one in any row-echelon form of A), and let
x1 , x2 , . . . , xn−r denote the solutions obtained by sequentially setting each free variable
to 1 and the remaining free variables to zero. Note that {x1 , x2 , . . . , xn−r } is linearly independent. Moreover, every solution to Ax = 0 is a linear combination of x1 , x2 , . . . , xn−r :
x = t1 x1 + t2 x2 + · · · + tn−r xn−r ,
which shows that {x1 , x2 , . . . , xn−r } spans nullspace(A). Thus, {x1 , x2 , . . . , xn−r } is a
basis for nullspace(A), and nullity(A) = n − r.
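The theorem is also easy to confirm for any particular matrix. A sketch using the sympy library (an assumed dependency) on an illustrative 3 × 5 matrix, not one taken from the text:

```python
from sympy import Matrix

# An illustrative 3 x 5 matrix; row 3 = row 1 + row 2, so rank(A) = 2.
A = Matrix([[1, 0, 2, 0, 1],
            [0, 1, 3, 1, 0],
            [1, 1, 5, 1, 1]])

n = A.cols                    # number of columns
r = A.rank()                  # number of leading ones in row-echelon form
nullity = len(A.nullspace())  # dimension of nullspace(A)

# Rank-Nullity Theorem (4.9.1): rank(A) + nullity(A) = n.
assert r + nullity == n
```

For this matrix, r = 2 and nullity = 3, so the two numbers do add up to n = 5.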
Example 4.9.2 If

A = [  1  1  2  3
       3  4 −1  2
      −1 −2  5  4 ],

find a basis for nullspace(A) and verify Theorem 4.9.1.

Solution: We must find all solutions to Ax = 0. Reducing the augmented matrix of this system yields

      [  1  1  2  3  0 ]  1  [ 1  1  2  3  0 ]  2  [ 1  1  2  3  0 ]
A# ∼  [  3  4 −1  2  0 ]  ∼  [ 0  1 −7 −7  0 ]  ∼  [ 0  1 −7 −7  0 ].
      [ −1 −2  5  4  0 ]     [ 0 −1  7  7  0 ]     [ 0  0  0  0  0 ]

1. A12(−3), A13(1)   2. A23(1)

Consequently, there are two free variables, x3 = t1 and x4 = t2, so that
x2 = 7t1 + 7t2 , x1 = −9t1 − 10t2 . Hence,
nullspace(A) = {(−9t1 − 10t2 , 7t1 + 7t2 , t1 , t2 ) : t1 , t2 ∈ R}
= {t1 (−9, 7, 1, 0) + t2 (−10, 7, 0, 1) : t1 , t2 ∈ R}
= span{(−9, 7, 1, 0), (−10, 7, 0, 1)}.
Since the two vectors in this spanning set are not proportional, they are linearly independent. Consequently, a basis for nullspace(A) is {(−9, 7, 1, 0), (−10, 7, 0, 1)}, so that
nullity(A) = 2. In this problem, A is a 3 × 4 matrix, and so, in the Rank-Nullity Theorem,
n = 4. Further, from the foregoing row-echelon form of the augmented matrix of the
system Ax = 0, we see that rank(A) = 2. Hence,
rank(A) + nullity(A) = 2 + 2 = 4 = n,
and the Rank-Nullity Theorem is verified.

Systems of Linear Equations
We now examine the linear structure of the solution set to the linear system Ax =
b in terms of the concepts introduced in the last few sections. First we consider the
homogeneous case b = 0.
Corollary 4.9.3 Let A be an m × n matrix, and consider the corresponding homogeneous linear system
Ax = 0.
1. If rank(A) = n, then Ax = 0 has only the trivial solution, so nullspace(A) = {0}.
2. If rank(A) = r < n, then Ax = 0 has an inﬁnite number of solutions, all of which
can be obtained from
x = c1 x1 + c2 x2 + · · · + cn−r xn−r ,    (4.9.2)

where {x1 , x2 , . . . , xn−r } is any linearly independent set of n − r solutions to Ax = 0.

Proof Note that part 1 is a restatement of previous results, or can be quickly deduced
from the Rank-Nullity Theorem. Now for part 2, assume that rank(A) = r < n.
By the Rank-Nullity Theorem, nullity(A) = n − r . Thus, from Theorem 4.6.10, if
{x1 , x2 , . . . , xn−r } is any set of n − r linearly independent solutions to Ax = 0, it is a
basis for nullspace(A), and so all vectors in nullspace(A) can be written as
x = c1 x1 + c2 x2 + · · · + cn−r xn−r ,
for appropriate values of the constants c1 , c2 , . . . , cn−r .

Remark The expression (4.9.2) is referred to as the general solution to the system Ax = 0.

We now turn our attention to nonhomogeneous linear systems. We begin by formulating Theorem 2.5.9 in terms of colspace(A).
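Computationally, the condition b ∈ colspace(A) amounts to checking that adjoining b to A does not raise the rank, and the full solution set is a particular solution shifted by nullspace(A). A sketch using sympy (an assumed dependency), on an illustrative system that also happens to reappear in the examples of this section:

```python
from sympy import Matrix, linsolve, symbols

A = Matrix([[ 1,  1,  2, 3],
            [ 3,  4, -1, 2],
            [-1, -2,  5, 4]])
b = Matrix([3, 10, -4])

# b lies in colspace(A) exactly when adjoining b does not raise the rank.
consistent = A.rank() == A.row_join(b).rank()

# linsolve returns the full solution set: a particular solution shifted
# by arbitrary multiples of a basis for nullspace(A).
x1, x2, x3, x4 = symbols("x1 x2 x3 x4")
solutions = linsolve((A, b), [x1, x2, x3, x4])
```

Setting the free symbols (here x3 and x4) to zero picks out one particular solution; every other choice of the free symbols yields another solution of Ax = b.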
Theorem 4.9.4 Let A be an m × n matrix and consider the linear system Ax = b.
1. If b is not in colspace(A), then the system is inconsistent.
2. If b ∈ colspace(A), then the system is consistent and has the following:
(a) a unique solution if and only if dim[colspace(A)] = n.
(b) an infinite number of solutions if and only if dim[colspace(A)] < n.

Proof If we write A in terms of its column vectors as A = [a1 , a2 , . . . , an ], then the
linear system Ax = b can be written as
x1 a1 + x2 a2 + · · · + xn an = b.
Consequently, the linear system is consistent if and only if the vector b is a linear
combination of the column vectors of A. Thus, the system is consistent if and only if b ∈
colspace(A). This proves part 1. Parts 2(a) and 2(b) follow directly from Theorem 2.5.9,
since rank(A) = dim[colspace(A)].
The set of all solutions to a nonhomogeneous linear system is not a vector space,
since, for example, it does not contain the zero vector, but the linear structure of
nullspace(A) can be used to determine the general form of the solution of a nonhomogeneous system.
Theorem 4.9.5 Let A be an m × n matrix. If rank(A) = r < n and b ∈ colspace(A), then all solutions
to Ax = b are of the form
x = c1 x1 + c2 x2 + · · · + cn−r xn−r + xp ,    (4.9.3)

where xp is any particular solution to Ax = b, and {x1 , x2 , . . . , xn−r } is a basis for
nullspace(A).

Proof Since xp is a solution to Ax = b, we have

Axp = b.    (4.9.4)

Let x = u be an arbitrary solution to Ax = b. Then we also have

Au = b.    (4.9.5)

Subtracting (4.9.4) from (4.9.5) yields
Au − Axp = 0,
or equivalently,
A(u − xp ) = 0.
Consequently, the vector u − xp is in nullspace(A), and so there exist scalars c1 , c2 ,
. . . , cn−r such that
u − xp = c1 x1 + c2 x2 + · · · + cn−r xn−r ,
since {x1 , x2 , . . . , xn−r } is a basis for nullspace(A). Hence,
u = c1 x1 + c2 x2 + · · · + cn−r xn−r + xp ,
as required.

Remark The expression given in Equation (4.9.3) is called the general solution to
Ax = b. It has the structure
x = xc + xp ,
where
xc = c1 x1 + c2 x2 + · · · + cn−r xn−r
is the general solution of the associated homogeneous system and xp is one particular
solution of the nonhomogeneous system. In later chapters, we will see that this structure
is also apparent in the solution of all linear differential equations and in all linear systems
of differential equations. It is a result of the linearity inherent in the problem, rather than
the speciﬁc problem that we are studying. The unifying concept, in addition to the vector
space, is the idea of a linear transformation, which we will study in the next chapter.

Example 4.9.6 Let

A = [  1  1  2  3
       3  4 −1  2
      −1 −2  5  4 ]    and    b = [  3
                                    10
                                    −4 ].

Verify that xp = (1, 1, −1, 1) is a particular solution to Ax = b, and use Theorem 4.9.5 to determine the general solution to the system.

Solution: For the given xp, we have

       [  1  1  2  3 ] [  1 ]   [  3 ]
Axp =  [  3  4 −1  2 ] [  1 ] = [ 10 ] = b.
       [ −1 −2  5  4 ] [ −1 ]   [ −4 ]
                       [  1 ]

Consequently, xp = (1, 1, −1, 1) is a particular solution to Ax = b. Further, from
Example 4.9.2, a basis for nullspace(A) is {x1 , x2 }, where x1 = (−9, 7, 1, 0) and x2 =
(−10, 7, 0, 1). Thus, the general solution to Ax = 0 is
xc = c1 x1 + c2 x2 ,

and therefore, from Theorem 4.9.5, the general solution to Ax = b is
x = c1 x1 + c2 x2 + xp = c1 (−9, 7, 1, 0) + c2 (−10, 7, 0, 1) + (1, 1, −1, 1),
which can be written as
x = (−9c1 − 10c2 + 1, 7c1 + 7c2 + 1, c1 − 1, c2 + 1).

Exercises for 4.9

Skills
• For a given matrix A, be able to determine the rank
from the nullity, or the nullity from the rank.
• Know the relationship between the rank of a matrix A
and the consistency of a linear system Ax = b.
• Know the relationship between the column space of
a matrix A and the consistency of a linear system
Ax = b.
• Be able to formulate the solution set to a linear system
Ax = b in terms of the solution set to the corresponding homogeneous linear equation.

True-False Review
For Questions 1–9, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false.
1. For an m × n matrix A, the nullity of A must be at
least |m − n|.
2. If A is a 7 × 9 matrix with nullity(A) = 2, then
rowspace(A) = R7 .
3. If A is a 9 × 7 matrix with nullity(A) = 0, then
rowspace(A) = R7 .
4. The nullity of an n × n upper triangular matrix A
is simply the number of zeros appearing on the main
diagonal of A.
5. An n × n matrix A for which nullspace(A) =
colspace(A) cannot be invertible.
6. For all m × n matrices A and B , nullity(A + B) =
nullity(A)+ nullity(B).
7. For all n × n matrices A and B , nullity(AB) =
nullity(A) · nullity(B).

8. For all n × n matrices A and B, nullity(AB) ≥
nullity(B).
9. If xp is a solution to the linear system Ax = b, then
y + xp is also a solution for any y in nullspace(A).

Problems
For Problems 1–4, determine the null space of A and verify
the Rank-Nullity Theorem.
1. A = [ 1 0 −6 −1 ].

2. A = [  2 −1
         −4  2 ].

3. A = [ 1 1 −1
         3 4  4
         1 1  0 ].

4. A = [ 1 4 −1 3
         2 9 −1 7
         2 8 −2 6 ].

For Problems 5–8, determine the nullity of A “by inspection” by appealing to the Rank-Nullity Theorem. Avoid computations.

5. A = [  2  −3
          0   0
         −4   6
         22 −33 ].

6. A = [  1   3  −3   2   5
         −4 −12  12  −8 −20
          0   0   0   0   0
          1   3  −3   2   6 ].

7. A = [ 0 1 0
         0 1 0
         0 0 1
         0 0 1 ].

8. A = [ 0 0 0 −2 ].

For Problems 9–12, determine the solution set to Ax = b,
and show that all solutions are of the form (4.9.3).

9. A = [ 1 3 −1
         2 7  9
         1 5 21 ],   b = [  4
                           11
                           10 ].

10. A = [ 2 −1 1 4
          1 −1 2 3
          1 −2 5 5 ],   b = [  5
                               6
                              13 ].

11. A = [ 1  1 −2
          3 −1 −7
          1  1  1
          2  2 −4 ],   b = [ −3
                              2
                              0
                             −6 ].

12. A = [ 1 1 −1  5
          0 2 −1  7
          4 2 −3 13 ],   b = [ 0
                               0
                               0 ].

13. Show that a 3 × 7 matrix A with nullity(A) = 4 must have colspace(A) = R3. Is rowspace(A) = R3?

14. Show that a 6 × 4 matrix A with nullity(A) = 0 must have rowspace(A) = R4. Is colspace(A) = R4?

15. Prove that if rowspace(A) = nullspace(A), then A contains an even number of columns.

16. Show that a 5 × 7 matrix A must have 2 ≤ nullity(A) ≤ 7. Give an example of a 5 × 7 matrix A with nullity(A) = 2 and an example of a 5 × 7 matrix A with nullity(A) = 7.

17. Show that a 3 × 8 matrix A must have 5 ≤ nullity(A) ≤ 8. Give an example of a 3 × 8 matrix A with nullity(A) = 5 and an example of a 3 × 8 matrix A with nullity(A) = 8.

18. Prove that if A and B are n × n matrices and A is invertible, then

nullity(AB) = nullity(B).

[Hint: Bx = 0 if and only if ABx = 0.]

4.10 The Invertible Matrix Theorem II
In Section 2.8, we gave a list of characterizations of invertible matrices (Theorem 2.8.1).
In view of the concepts introduced in this chapter, we are now in a position to add to the
list that was begun there.

Theorem 4.10.1 (Invertible Matrix Theorem)
Let A be an n×n matrix with real elements. The following conditions on A are equivalent:
(a) A is invertible.
(h) nullity(A) = 0.
(i) nullspace(A) = {0}.
(j) The columns of A form a linearly independent set of vectors in Rn .
(k) colspace(A) = Rn (that is, the columns of A span Rn ).
(l) The columns of A form a basis for Rn .
(m) The rows of A form a linearly independent set of vectors in Rn .
(n) rowspace(A) = Rn (that is, the rows of A span Rn ).
(o) The rows of A form a basis for Rn .
(p) AT is invertible.

Proof The equivalence of (a) and (h) follows at once from Theorem 2.8.1(d) and the
Rank-Nullity Theorem (Theorem 4.9.1). The equivalence of (h) and (i) is immediately
clear. The equivalence of (a) and (j) is immediate from Theorem 2.8.1(c) and Theorem 4.5.14. Since the dimension of colspace(A) is simply rank(A), the equivalence of
(a) and (k) is immediate from Theorem 2.8.1(d). Next, from the definition of a basis, we see that (j) and (k) are logically equivalent to (l). Moreover, since the row space
and column space of A have the same dimension, (k) and (n) are equivalent. Since
rowspace(A) = colspace(AT ), the equivalence of (k) and (n) proves that (a) and (p) are
equivalent. Finally, the equivalence of (a) and (p) proves that (j) is equivalent to (m) and
that (l) is equivalent to (o).
Example 4.10.2 Do the rows of the matrix below span R4?

A = [ −2 −2  1  3
       3  3  0 −1
      −1 −1 −2 −5
       2  2  1  1 ]

Solution: We see by inspection that the columns of A are linearly dependent, since
the ﬁrst two columns are identical. Therefore, by the equivalence of (j) and (n) in the
Invertible Matrix Theorem, the rows of A do not span R4 .
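In computational terms, this span test is just a rank check. A sketch using sympy (an assumed dependency):

```python
from sympy import Matrix

# The matrix of Example 4.10.2.
A = Matrix([[-2, -2,  1,  3],
            [ 3,  3,  0, -1],
            [-1, -1, -2, -5],
            [ 2,  2,  1,  1]])

# By the Invertible Matrix Theorem, the rows of A span R^4
# exactly when rank(A) = 4 (equivalently, when det(A) != 0).
rows_span = (A.rank() == 4)
```

Since the first two columns coincide, rank(A) ≤ 3, and rows_span comes out False, in agreement with the example.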
Example 4.10.3 If A is an n × n matrix such that the linear system AT x = 0 has no nontrivial solution
x, then nullspace(AT ) = {0}, and thus AT is invertible by the equivalence of (a) and (i)
in the Invertible Matrix Theorem. Thus, by the same theorem, we can conclude that the
columns of A form a linearly independent set.
Despite the lengthy list of characterizations of invertible matrices that we have been
able to develop so far, this list is still by no means complete. In the next chapter, we
will use linear transformations and eigenvalues to provide further characterizations of
invertible matrices.

Exercises for 4.10

Skills
• Be well familiar with all of the conditions (a)–(p) in the
Invertible Matrix Theorem that characterize invertible
matrices.

True-False Review
For Questions 1–10, decide if the given statement is true or
false, and give a brief justiﬁcation for your answer. If true,
you can quote a relevant deﬁnition or theorem from the text.
If false, provide an example, illustration, or brief explanation
of why the statement is false.
1. The set of all row vectors of an invertible matrix is
linearly independent.

2. An n × n matrix can have n linearly independent rows and n linearly dependent columns.

3. The set of all row vectors of an n × n matrix can be linearly dependent while the set of all columns is linearly independent.

4. If A is an n × n matrix with det(A) = 0, then the columns of A must form a basis for Rn.

5. If A and B are row-equivalent n × n matrices such that rowspace(A) = Rn, then colspace(B) = Rn.

6. If E is an n × n elementary matrix and A is an n × n matrix with nullspace(A) = {0}, then det(EA) = 0.

7. If A and B are n × n invertible matrices, then nullity([A|B]) = 0, where [A|B] is the n × 2n matrix with the blocks A and B as shown.

8. A matrix of the form

[ 0 a 0
  b 0 c
  0 d 0 ]

cannot be invertible.
9. A matrix of the form

[ 0 a 0 b
  c 0 d 0
  0 e 0 f
  g 0 h 0 ]

cannot be invertible.

10. A matrix of the form

[ a b c
  d e f
  g h i ]

such that ae − bd = 0 cannot be invertible.

4.11 Inner Product Spaces
We now extend the familiar idea of a dot product for geometric vectors to an arbitrary
vector space V . This enables us to associate a magnitude with each vector in V and also
to deﬁne the angle between two vectors in V . The major reason that we want to do this
is that, as we will see in the next section, it enables us to construct orthogonal bases in
a vector space, and the use of such a basis often simpliﬁes the representation of vectors.
We begin with a brief review of the dot product.
Let x = (x1 , x2 , x3 ) and y = (y1 , y2 , y3 ) be two arbitrary vectors in R3 , and consider
the corresponding geometric vectors
x = x1 i + x2 j + x3 k, y = y1 i + y2 j + y3 k. The dot product of x and y can be deﬁned in terms of the components of these vectors as
x · y = x1 y1 + x2 y2 + x3 y3 .    (4.11.1)

An equivalent geometric definition of the dot product is

x · y = ||x|| ||y|| cos θ,    (4.11.2)

[Figure 4.11.1: Defining the dot product in R3.]

where ||x||, ||y|| denote the lengths of x and y respectively, and 0 ≤ θ ≤ π is the angle
between them. (See Figure 4.11.1.)
Taking y = x in Equations (4.11.1) and (4.11.2) yields

||x||^2 = x · x = x1^2 + x2^2 + x3^2 ,

so that the length of a geometric vector is given in terms of the dot product by

||x|| = sqrt(x · x) = sqrt(x1^2 + x2^2 + x3^2).

Furthermore, from Equation (4.11.2), the angle between any two nonzero vectors x and
y is

cos θ = (x · y) / (||x|| ||y||),    (4.11.3)

which implies that x and y are orthogonal (perpendicular) if and only if
x · y = 0.
In a general vector space, we do not have a geometrical picture to guide us in deﬁning
the dot product, hence our deﬁnitions must be purely algebraic. We begin by considering
the vector space Rn , since there is a natural way to extend Equation (4.11.1) in this
case. Before proceeding, we note that from now on we will use the standard terms inner
product and norm in place of dot product and length, respectively.

DEFINITION 4.11.1
Let x = (x1 , x2 , . . . , xn ) and y = (y1 , y2 , . . . , yn ) be vectors in Rn . We deﬁne the
standard inner product in Rn, denoted ⟨x, y⟩, by

⟨x, y⟩ = x1 y1 + x2 y2 + · · · + xn yn .
The norm of x is
||x|| = sqrt(⟨x, x⟩) = sqrt(x1^2 + x2^2 + · · · + xn^2).

Example 4.11.2 If x = (1, −1, 0, 2, 4) and y = (2, 1, 1, 3, 0) in R5, then

⟨x, y⟩ = (1)(2) + (−1)(1) + (0)(1) + (2)(3) + (4)(0) = 7,
||x|| = sqrt(1^2 + (−1)^2 + 0^2 + 2^2 + 4^2) = sqrt(22),
||y|| = sqrt(2^2 + 1^2 + 1^2 + 3^2 + 0^2) = sqrt(15).

Basic Properties of the Standard Inner Product in Rn
In the case of Rn , the deﬁnition of the standard inner product was a natural extension of
the familiar dot product in R3 . To generalize this deﬁnition further to an arbitrary vector
space, we isolate the most important properties of the standard inner product in Rn and
use them as the deﬁning criteria for a general notion of an inner product. Let us examine
the inner product in Rn more closely. We view it as a mapping that associates with any
two vectors x = (x1 , x2 , . . . , xn ) and y = (y1 , y2 , . . . , yn ) in Rn the real number
⟨x, y⟩ = x1 y1 + x2 y2 + · · · + xn yn .
This mapping has the following properties:
For all x, y, and z in Rn and all real numbers k ,
1. ⟨x, x⟩ ≥ 0. Furthermore, ⟨x, x⟩ = 0 if and only if x = 0.
2. ⟨y, x⟩ = ⟨x, y⟩.
3. ⟨kx, y⟩ = k⟨x, y⟩.
4. ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩.
These properties are easily established using Deﬁnition 4.11.1. For example, to prove
property 1, we proceed as follows. From Deﬁnition 4.11.1,
⟨x, x⟩ = x1^2 + x2^2 + · · · + xn^2 .

Since this is a sum of squares of real numbers, it is necessarily nonnegative. Further, ⟨x, x⟩ = 0 if and only if x1 = x2 = · · · = xn = 0—that is, if and only if x = 0.
Similarly, for property 2, we have
⟨y, x⟩ = y1 x1 + y2 x2 + · · · + yn xn = x1 y1 + x2 y2 + · · · + xn yn = ⟨x, y⟩.
We leave the verification of properties 3 and 4 for the reader.
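Properties 1–4 can also be exercised numerically. A small sketch of the standard inner product and norm in Rn, using only the Python standard library, with the vectors of Example 4.11.2:

```python
import math

def inner(x, y):
    # Standard inner product in R^n (Definition 4.11.1).
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Induced norm: ||x|| = sqrt(<x, x>).
    return math.sqrt(inner(x, x))

x = (1, -1, 0, 2, 4)
y = (2, 1, 1, 3, 0)

ip = inner(x, y)   # 7, as in Example 4.11.2
```

Checking inner(x, y) == inner(y, x) and inner(x, x) >= 0 illustrates properties 2 and 1 for these particular vectors.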
Definition of a Real Inner Product Space
We now use properties 1–4 as the basic deﬁning properties of an inner product in a real
vector space. DEFINITION 4.11.3
Let V be a real vector space. A mapping that associates with each pair of vectors u
and v in V a real number, denoted ⟨u, v⟩, is called an inner product in V , provided
it satisﬁes the following properties. For all u, v, and w in V , and all real numbers k ,
1. ⟨u, u⟩ ≥ 0. Furthermore, ⟨u, u⟩ = 0 if and only if u = 0.
2. ⟨v, u⟩ = ⟨u, v⟩.
3. ⟨ku, v⟩ = k⟨u, v⟩.
4. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
The norm of u is deﬁned in terms of an inner product by
||u|| = sqrt(⟨u, u⟩).

A real vector space together with an inner product defined in it is called a real inner
product space. Remarks
1. Observe that ||u|| = sqrt(⟨u, u⟩) takes a well-defined nonnegative real value, since
property 1 of an inner product guarantees that the norm evaluates the square root
of a nonnegative real number.
2. It follows from the discussion above that Rn together with the inner product deﬁned
in Deﬁnition 4.11.1 is an example of a real inner product space.
One of the fundamental inner products arises in the vector space C 0 [a, b] of all
real-valued functions that are continuous on the interval [a, b]. In this vector space, we
define the mapping ⟨f, g⟩ by

⟨f, g⟩ = ∫_a^b f (x)g(x) dx,    (4.11.4)

for all f and g in C0[a, b]. We establish that this mapping defines an inner product in C0[a, b] by verifying properties 1–4 of Definition 4.11.3. If f is in C0[a, b], then

⟨f, f⟩ = ∫_a^b [f (x)]^2 dx.

[Figure 4.11.2: ⟨f, f⟩ gives the area between the graph of y = [f (x)]^2 and the x-axis, lying over the interval [a, b].]

Since the integrand, [f (x)]^2, is a nonnegative continuous function, it follows that ⟨f, f⟩ measures the area between the graph y = [f (x)]^2 and the x-axis on the interval [a, b].
(See Figure 4.11.2.)
Consequently, f, f ≥ 0. Furthermore, f, f = 0 if and only if there is zero area
between the graph y = [f (x)]2 and the x -axis—that is, if and only if
[f (x)]^2 = 0 for all x in [a, b].

Hence, ⟨f, f⟩ = 0 if and only if f (x) = 0 for all x in [a, b], so f must be the zero function. (See Figure 4.11.3.) Consequently, property 1 of Definition 4.11.3 is satisfied.

[Figure 4.11.3: ⟨f, f⟩ = 0 if and only if f is the zero function.]

Now let f, g, and h be in C0[a, b], and let k be an arbitrary real number. Then

⟨g, f⟩ = ∫_a^b g(x)f (x) dx = ∫_a^b f (x)g(x) dx = ⟨f, g⟩.

Hence, property 2 of Definition 4.11.3 is satisfied.
For property 3, we have
⟨kf, g⟩ = ∫_a^b (kf )(x)g(x) dx = ∫_a^b kf (x)g(x) dx = k ∫_a^b f (x)g(x) dx = k⟨f, g⟩,

as needed. Finally,

⟨f + g, h⟩ = ∫_a^b (f + g)(x)h(x) dx = ∫_a^b [f (x) + g(x)]h(x) dx
           = ∫_a^b f (x)h(x) dx + ∫_a^b g(x)h(x) dx = ⟨f, h⟩ + ⟨g, h⟩,

so that property 4 of Definition 4.11.3 is satisfied. We can now conclude that Equation (4.11.4) does define an inner product in the vector space C0[a, b].
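Inner products of this kind can be computed symbolically. The sketch below uses sympy (an assumed dependency) with two illustrative functions on [0, 1] that are not taken from the text:

```python
from sympy import Rational, integrate, sqrt, symbols

t = symbols("t")

def inner(f, g, a, b):
    # Equation (4.11.4): <f, g> = integral from a to b of f(t) g(t) dt.
    return integrate(f * g, (t, a, b))

def norm(f, a, b):
    # Induced norm: ||f|| = sqrt(<f, f>).
    return sqrt(inner(f, f, a, b))

f = t        # illustrative choice f(t) = t
g = t**2     # illustrative choice g(t) = t^2

ip = inner(f, g, 0, 1)   # integral of t^3 from 0 to 1 = 1/4
```

The same two helpers evaluate the examples that follow, with the interval endpoints changed as needed.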
Example 4.11.4 Use Equation (4.11.4) to determine the inner product of the following functions in
C0[0, 1]: f (x) = 8x, g(x) = x^2 − 1. Also find ||f || and ||g||.

Solution: From Equation (4.11.4),

⟨f, g⟩ = ∫_0^1 8x(x^2 − 1) dx = [2x^4 − 4x^2]_0^1 = −2.

Moreover, we have

||f || = sqrt(∫_0^1 64x^2 dx) = 8/sqrt(3)

and

||g|| = sqrt(∫_0^1 (x^2 − 1)^2 dx) = sqrt(∫_0^1 (x^4 − 2x^2 + 1) dx) = sqrt(8/15).

We have already seen that the norm concept generalizes the length of a geometric
vector. Our next goal is to show how an inner product enables us to deﬁne the angle
between two vectors in an abstract vector space. The key result is the Cauchy-Schwarz
inequality established in the next theorem.
Theorem 4.11.5 (Cauchy-Schwarz Inequality)
Let u and v be arbitrary vectors in a real inner product space V . Then
|⟨u, v⟩| ≤ ||u|| ||v||.    (4.11.5)

Proof Let k be an arbitrary real number. For the vector u + kv, we have
0 ≤ ||u + kv||^2 = ⟨u + kv, u + kv⟩.    (4.11.6)

But, using the properties of a real inner product,

⟨u + kv, u + kv⟩ = ⟨u + kv, u⟩ + ⟨u + kv, kv⟩
                 = ⟨u, u⟩ + ⟨kv, u⟩ + ⟨u, kv⟩ + ⟨kv, kv⟩
                 = ⟨u, u⟩ + 2⟨kv, u⟩ + ⟨kv, kv⟩
                 = ⟨u, u⟩ + 2k⟨v, u⟩ + k⟨kv, v⟩
                 = ⟨u, u⟩ + 2k⟨v, u⟩ + k^2⟨v, v⟩
                 = ||u||^2 + 2k⟨v, u⟩ + k^2 ||v||^2 .
Consequently, (4.11.6) implies that
||v||^2 k^2 + 2⟨u, v⟩k + ||u||^2 ≥ 0.    (4.11.7)

The left-hand side of this inequality defines the quadratic expression

P (k) = ||v||^2 k^2 + 2⟨u, v⟩k + ||u||^2 .

The discriminant of this quadratic is

Δ = 4(⟨u, v⟩)^2 − 4||u||^2 ||v||^2 .

If Δ > 0, then P (k) has two real and distinct roots. This would imply that the graph of P crosses the k-axis and, therefore, P would assume negative values, contrary to (4.11.7). Consequently, we must have Δ ≤ 0. That is,

4(⟨u, v⟩)^2 − 4||u||^2 ||v||^2 ≤ 0,
or equivalently,

(⟨u, v⟩)^2 ≤ ||u||^2 ||v||^2 .

Hence,

|⟨u, v⟩| ≤ ||u|| ||v||.
If u and v are arbitrary vectors in a real inner product space V , then ⟨u, v⟩ is a real number, and so (4.11.5) can be written in the equivalent form

−||u|| ||v|| ≤ ⟨u, v⟩ ≤ ||u|| ||v||.

Consequently, provided that u and v are nonzero vectors, we have

−1 ≤ ⟨u, v⟩ / (||u|| ||v||) ≤ 1.

Thus, each pair of nonzero vectors in a real inner product space V determines a unique angle θ by

cos θ = ⟨u, v⟩ / (||u|| ||v||),    0 ≤ θ ≤ π.    (4.11.8)

We call θ the angle between u and v. In the case when u and v are geometric vectors,
the formula (4.11.8) coincides with Equation (4.11.3).
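Formula (4.11.8) translates directly into code; a minimal sketch using only the Python standard library, with an illustrative pair of vectors:

```python
import math

def angle(u, v):
    # (4.11.8): cos(theta) = <u, v> / (||u|| ||v||), with 0 <= theta <= pi.
    ip = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return math.acos(ip / (nu * nv))

# For example, the angle between (1, 0) and (1, 1) in R^2 is pi/4.
theta = angle((1, 0), (1, 1))
```

Applied to the vectors of the next example, angle((1, -1, 2, 3), (-2, 1, 2, -2)) returns approximately 1.937 radians.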
Example 4.11.6 Determine the angle between the vectors u = (1, −1, 2, 3) and v = (−2, 1, 2, −2) in
R4.

Solution: Using the standard inner product in R4 yields

⟨u, v⟩ = −5,    ||u|| = sqrt(15),    ||v|| = sqrt(13),

so that the angle between u and v is given by

cos θ = −5/(sqrt(15) sqrt(13)) = −sqrt(195)/39,    0 ≤ θ ≤ π.

Hence,
θ = arccos(−sqrt(195)/39) ≈ 1.937 radians ≈ 110°58′.

Example 4.11.7 Use the inner product (4.11.4) to determine the angle between the functions f1(x) = sin 2x and f2(x) = cos 2x on the interval [−π, π].

Solution: Using the inner product (4.11.4), we have

⟨f1, f2⟩ = ∫_{−π}^{π} sin 2x cos 2x dx = (1/2) ∫_{−π}^{π} sin 4x dx = [−(1/8) cos 4x]_{−π}^{π} = 0.

Consequently, the angle between the two functions satisfies
cos θ = 0, 0 ≤ θ ≤ π, which implies that θ = π/2. We say that the functions are orthogonal on the interval
[−π, π ], relative to the inner product (4.11.4). In the next section we will have much
more to say about orthogonality of vectors.

Complex Inner Products [9]
The preceding discussion has been concerned with real vector spaces. In order to generalize the deﬁnition of an inner product to a complex vector space, we ﬁrst consider the
case of Cn . By analogy with Deﬁnition 4.11.1, one might think that the natural inner
product in Cn would be obtained by summing the products of corresponding components
of vectors in Cn in exactly the same manner as in the standard inner product for Rn .
However, one reason for introducing an inner product is to obtain a concept of “length”
of a vector. In order for a quantity to be considered a reasonable measure of length, we
would want it to be a nonnegative real number that vanishes if and only if the vector
itself is the zero vector (property 1 of a real inner product). But, if we apply the inner
product in Rn given in Deﬁnition 4.11.1 to vectors in Cn , then, since the components
of vectors in Cn are complex numbers, it follows that the resulting norm of a vector in
[Footnote 9: In the remainder of the text, the only complex inner product that we will require is the standard inner product in Cn, and this is needed only in Section 5.10.]

Cn would be a complex number also. Furthermore, applying the R2 inner product to, for
example, the vector u = (1 − i, 1 + i), we obtain
||u||^2 = (1 − i)^2 + (1 + i)^2 = 0,
which means that a nonzero vector would have zero “length.” To rectify this situation,
we must deﬁne an inner product in Cn more carefully. We take advantage of complex
conjugation to do this, as the definition shows.

DEFINITION 4.11.8
If u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ) are vectors in Cn , we deﬁne the
standard inner product in Cn by [10]

⟨u, v⟩ = u1 v̄1 + u2 v̄2 + · · · + un v̄n .

The norm of u is defined to be the real number

||u|| = sqrt(⟨u, u⟩) = sqrt(|u1|^2 + |u2|^2 + · · · + |un|^2).

The preceding inner product is a mapping that associates with the two vectors
u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ) in Cn the scalar
⟨u, v⟩ = u1 v̄1 + u2 v̄2 + · · · + un v̄n .
In general, ⟨u, v⟩ will be complex (i.e., it may have a nonzero imaginary part). The key
point to notice is that the norm of u is always a real number, even though the separate
components of u are complex numbers. Example 4.11.9 If u = (1 + 2i, 2 − 3i) and v = (2 − i, 3 + 4i), ﬁnd u, v and ||u||. Solution: Using Deﬁnition 4.11.8, u, v = (1 + 2i)(2 + i) + (2 − 3i)(3 − 4i) = 5i − 6 − 17i = −6 − 12i,
√
√
||u|| = u, u = (1 + 2i)(1 − 2i) + (2 − 3i)(2 + 3i) = 5 + 13 = 3 2.
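Definition 4.11.8 can be exercised with Python's built-in complex numbers; a sketch reproducing Example 4.11.9 (standard library only):

```python
import math

def cinner(u, v):
    # Standard inner product in C^n: sum of u_j times conjugate(v_j).
    return sum(a * b.conjugate() for a, b in zip(u, v))

def cnorm(u):
    # <u, u> is real and nonnegative, so the square root is well defined.
    return math.sqrt(cinner(u, u).real)

u = (1 + 2j, 2 - 3j)
v = (2 - 1j, 3 + 4j)

ip = cinner(u, v)   # -6 - 12j, as computed above
nu = cnorm(u)       # 3 * sqrt(2)
```

Checking that cinner(v, u) equals the complex conjugate of cinner(u, v) illustrates the conjugate-symmetry property derived in the discussion that follows.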
The standard inner product in Cn satisﬁes properties (1), (3), and (4), but not property
(2). We now derive the appropriate generalization of property (2) when using the standard
inner product in Cn . Let u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ) be vectors in
Cn . Then, from Deﬁnition 4.11.8,
⟨v, u⟩ = v1 ū1 + v2 ū2 + · · · + vn ūn = conj(u1 v̄1 + u2 v̄2 + · · · + un v̄n) = conj(⟨u, v⟩).

Thus,

⟨v, u⟩ = conj(⟨u, v⟩),

where conj denotes complex conjugation.
We now use the properties satisﬁed by the standard inner product in Cn to deﬁne an inner
product in an arbitrary (that is, real or complex) vector space.
10 Recall that if z = a + ib, then z = a − ib and |z|2 = zz = (a + ib)(a − ib) = a 2 + b2 . i i i i i i i “main”
DEFINITION 4.11.10
Let V be a (real or complex) vector space. A mapping that associates with each pair of vectors u, v in V a scalar, denoted ⟨u, v⟩, is called an inner product in V, provided it satisfies the following properties. For all u, v, and w in V and all (real or complex) scalars k,

1. ⟨u, u⟩ ≥ 0. Furthermore, ⟨u, u⟩ = 0 if and only if u = 0.

2. ⟨v, u⟩ = conj⟨u, v⟩.

3. ⟨ku, v⟩ = k⟨u, v⟩.

4. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.

The norm of u is defined in terms of the inner product by
||u|| = √⟨u, u⟩.

Remark  Notice that the properties in the preceding definition reduce to those in Definition 4.11.3 in the case that V is a real vector space, since in that case the complex conjugates are unnecessary. Thus, this definition is a consistent extension of Definition 4.11.3.
Example 4.11.11  Use properties 2 and 3 of Definition 4.11.10 to prove that in an inner product space

⟨u, kv⟩ = conj(k)⟨u, v⟩

for all vectors u, v and all scalars k.

Solution:  From properties 2 and 3, we have

⟨u, kv⟩ = conj⟨kv, u⟩ = conj(k⟨v, u⟩) = conj(k) conj⟨v, u⟩ = conj(k)⟨u, v⟩.

Notice that in the particular case of a real vector space, the foregoing result reduces to ⟨u, kv⟩ = k⟨u, v⟩, since in such a case the scalars are real numbers.

Exercises for 4.11

Key Terms
Inner product, Axioms of an inner product, Real (complex) inner product space, Norm, Angle, Cauchy-Schwarz inequality.

Skills

• Know the four inner product space axioms.
• Be able to check whether or not a proposed inner product on a vector space V satisfies the inner product space axioms.
• Be able to compute the inner product of two vectors in an inner product space.
• Be able to find the norm of a vector in an inner product space.
• Be able to find the angle between two vectors in an inner product space.

True-False Review

For Questions 1–7, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.
1. If v and w are linearly independent vectors in an inner product space V, then ⟨v, w⟩ = 0.

2. In any inner product space V, we have ⟨kv, kw⟩ = k⟨v, w⟩.

3. If ⟨v1, w⟩ = ⟨v2, w⟩ = 0 in an inner product space V, then ⟨c1v1 + c2v2, w⟩ = 0.

4. In any inner product space V, ⟨x + y, x − y⟩ < 0 if and only if ||x|| < ||y||.

5. In any vector space V, there is at most one valid inner product that can be defined on V.

6. The angle between the vectors v and w in an inner product space V is the same as the angle between the vectors −2v and −2w.

7. If p(x) = a0 + a1x + a2x² and q(x) = b0 + b1x + b2x², then we can define an inner product on P2 via ⟨p, q⟩ = a0b0.

Problems

1. Use the standard inner product in R4 to determine the angle between the vectors v = (1, 3, −1, 4) and w = (−1, 1, −2, 1).

2. If f(x) = sin x and g(x) = x on [0, π], use the function inner product defined in the text to determine the angle between f and g.

3. If v = (2 + i, 3 − 2i, 4 + i) and w = (−1 + i, 1 − 3i, 3 − i), use the standard inner product in C3 to determine ⟨v, w⟩, ||v||, and ||w||.

4. Let A = [a11 a12; a21 a22] and B = [b11 b12; b21 b22] be vectors in M2(R). Show that the mapping

⟨A, B⟩ = a11b11 + a12b12 + a21b21 + a22b22    (4.11.9)

defines an inner product in M2(R).

5. Referring to A and B in the previous problem, show that the mapping

⟨A, B⟩ = a11b22 + a12b21 + a21b12 + a22b11

does not define a valid inner product on M2(R).

For Problems 6–7, use the inner product (4.11.9) to determine ⟨A, B⟩, ||A||, and ||B||.

6. A = [2 −1; 3 5], B = [3 1; −1 2].

7. A = [3 2; −2 4], B = [1 1; −2 1].

8. Let p1(x) = a + bx and p2(x) = c + dx be vectors in P1. Determine a mapping ⟨p1, p2⟩ that defines an inner product on P1.
Consider the vector space R2. Define the mapping ⟨ , ⟩ by

⟨v, w⟩ = 2v1w1 + v1w2 + v2w1 + 2v2w2    (4.11.10)

for all vectors v = (v1, v2) and w = (w1, w2) in R2. This mapping is required for Problems 9–12.

9. Verify that Equation (4.11.10) defines an inner product on R2.

For Problems 10–12, determine the inner product of the given vectors using (a) the inner product (4.11.10), (b) the standard inner product in R2.

10. v = (1, 0), w = (−1, 2).

11. v = (2, −1), w = (3, 6).

12. v = (1, −2), w = (2, 1).

13. Consider the vector space R2. Define the mapping ⟨ , ⟩ by

⟨v, w⟩ = v1w1 − v2w2    (4.11.11)

for all vectors v = (v1, v2) and w = (w1, w2). Verify that all of the properties in Definition 4.11.3 except (1) are satisfied by (4.11.11).

The mapping (4.11.11) is called a pseudo-inner product in R2 and, when generalized to R4, is of fundamental importance in Einstein's special relativity theory.

14. Using Equation (4.11.11), determine all nonzero vectors satisfying ⟨v, v⟩ = 0. Such vectors are called null vectors.

15. Using Equation (4.11.11), determine all vectors satisfying ⟨v, v⟩ < 0. Such vectors are called timelike vectors.

16. Using Equation (4.11.11), determine all vectors satisfying ⟨v, v⟩ > 0. Such vectors are called spacelike vectors.

17. Make a sketch of R2 and indicate the position of the null, timelike, and spacelike vectors.

18. Consider the vector space Rn, and let v = (v1, v2, . . . , vn) and w = (w1, w2, . . . , wn) be vectors in Rn. Show that the mapping ⟨ , ⟩ defined by

⟨v, w⟩ = k1v1w1 + k2v2w2 + · · · + knvnwn

is a valid inner product on Rn if and only if the constants k1, k2, . . . , kn are all positive.

19. Prove from the inner product axioms that, in any inner product space V, ⟨v, 0⟩ = 0 for all v in V.

20. Let V be a real inner product space.

(a) Prove that for all v, w ∈ V,

||v + w||² = ||v||² + 2⟨v, w⟩ + ||w||².

[Hint: ||v + w||² = ⟨v + w, v + w⟩.]

(b) Two vectors v and w in an inner product space V are called orthogonal if ⟨v, w⟩ = 0. Use (a) to prove the general Pythagorean theorem: If v and w are orthogonal in an inner product space V, then

||v + w||² = ||v||² + ||w||².

(c) Prove that for all v, w in V,

(i) ||v + w||² − ||v − w||² = 4⟨v, w⟩.

(ii) ||v + w||² + ||v − w||² = 2(||v||² + ||w||²).

21. Let V be a complex inner product space. Prove that for all v, w in V,

||v + w||² = ||v||² + 2Re(⟨v, w⟩) + ||w||²,

where Re denotes the real part of a complex number.

4.12 Orthogonal Sets of Vectors and the Gram-Schmidt Process
The discussion in the previous section has shown how an inner product can be used to define the angle between two nonzero vectors. In particular, if the inner product of two nonzero vectors is zero, then the angle between those two vectors is π/2 radians, and therefore it is natural to call such vectors orthogonal (perpendicular). The following definition extends the idea of orthogonality to an arbitrary inner product space.

DEFINITION 4.12.1

Let V be an inner product space.

1. Two vectors u and v in V are said to be orthogonal if ⟨u, v⟩ = 0.

2. A set of nonzero vectors {v1, v2, . . . , vk} in V is called an orthogonal set of vectors if

⟨vi, vj⟩ = 0 whenever i ≠ j.

(That is, every vector is orthogonal to every other vector in the set.)

3. A vector v in V is called a unit vector if ||v|| = 1.

4. An orthogonal set of unit vectors is called an orthonormal set of vectors. Thus, {v1, v2, . . . , vk} in V is an orthonormal set if and only if

(a) ⟨vi, vj⟩ = 0 whenever i ≠ j.

(b) ⟨vi, vi⟩ = 1 for all i = 1, 2, . . . , k.

Remarks
1. The conditions in (4a) and (4b) can be written compactly in terms of the Kronecker delta symbol as

⟨vi, vj⟩ = δij,  i, j = 1, 2, . . . , k.

2. Note that the inner products occurring in Definition 4.12.1 will depend upon which inner product space we are working in.

3. If v is any nonzero vector, then (1/||v||)v is a unit vector, since the properties of an inner product imply that

⟨(1/||v||)v, (1/||v||)v⟩ = (1/||v||²)⟨v, v⟩ = (1/||v||²)||v||² = 1.

Using Remark 3 above, we can take an orthogonal set of vectors {v1, v2, . . . , vk} and create a new set {u1, u2, . . . , uk}, where ui = (1/||vi||)vi is a unit vector for each i. Using the properties of an inner product, it is easy to see that the new set {u1, u2, . . . , uk} is an orthonormal set (see Problem 31). The process of replacing the vi by the ui is called normalization.
Example 4.12.2  Verify that {(−2, 1, 3, 0), (0, −3, 1, −6), (−2, −4, 0, 2)} is an orthogonal set of vectors in R4, and use it to construct an orthonormal set of vectors in R4.

Solution:  Let v1 = (−2, 1, 3, 0), v2 = (0, −3, 1, −6), and v3 = (−2, −4, 0, 2). Then

⟨v1, v2⟩ = 0,  ⟨v1, v3⟩ = 0,  ⟨v2, v3⟩ = 0,

so that the given set of vectors is an orthogonal set. Since ||v1||² = 14, ||v2||² = 46, and ||v3||² = 24, dividing each vector in the set by its norm yields the following orthonormal set:

{(1/√14)v1, (1/√46)v2, (1/√24)v3}.
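The orthogonality checks and the normalization step can be reproduced numerically; the sketch below (my own, using numpy) works through the data of Example 4.12.2.

```python
import numpy as np

# The three vectors of Example 4.12.2.
vs = [np.array([-2.0, 1, 3, 0]),
      np.array([0.0, -3, 1, -6]),
      np.array([-2.0, -4, 0, 2])]

# Pairwise standard inner products vanish, so the set is orthogonal.
for i in range(3):
    for j in range(i + 1, 3):
        assert np.dot(vs[i], vs[j]) == 0

# Normalization (Remark 3): divide each vector by its norm.
us = [v / np.linalg.norm(v) for v in vs]
```

The squared norms come out as 14, 46, and 24, matching the divisors 1/√14, 1/√46, 1/√24 above.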
Example 4.12.3  Verify that the functions f1(x) = 1, f2(x) = sin x, and f3(x) = cos x are orthogonal in C0[−π, π], and use them to construct an orthonormal set of functions in C0[−π, π].

Solution:  In this case, we have

⟨f1, f2⟩ = ∫_{−π}^{π} sin x dx = 0,  ⟨f1, f3⟩ = ∫_{−π}^{π} cos x dx = 0,

⟨f2, f3⟩ = ∫_{−π}^{π} sin x cos x dx = [(1/2) sin² x]_{−π}^{π} = 0,

so that the functions are indeed orthogonal on [−π, π]. Taking the norm of each function, we obtain

||f1|| = √(∫_{−π}^{π} 1 dx) = √(2π),

||f2|| = √(∫_{−π}^{π} sin² x dx) = √(∫_{−π}^{π} (1/2)(1 − cos 2x) dx) = √π,

||f3|| = √(∫_{−π}^{π} cos² x dx) = √(∫_{−π}^{π} (1/2)(1 + cos 2x) dx) = √π.

Thus an orthonormal set of functions on [−π, π] is

{1/√(2π), (1/√π) sin x, (1/√π) cos x}.
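These function-space inner products can also be checked numerically. The sketch below (my own illustration; the helper `ip` and its midpoint rule are not from the text) approximates ⟨f, g⟩ = ∫ f(x)g(x) dx on [−π, π].

```python
import math

def ip(f, g, a=-math.pi, b=math.pi, n=100000):
    """Midpoint-rule approximation of <f, g> = integral of f*g over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

one = lambda x: 1.0
# Orthogonality, as in Example 4.12.3:
assert abs(ip(one, math.sin)) < 1e-6
assert abs(ip(one, math.cos)) < 1e-6
assert abs(ip(math.sin, math.cos)) < 1e-6
# Squared norms: ||f1||^2 = 2*pi, ||f2||^2 = ||f3||^2 = pi.
assert abs(ip(one, one) - 2 * math.pi) < 1e-6
assert abs(ip(math.sin, math.sin) - math.pi) < 1e-6
```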
Orthogonal and Orthonormal Bases
In the analysis of geometric vectors in elementary calculus courses, it is usual to use
the standard basis {i, j, k}. Notice that this set of vectors is in fact an orthonormal set.
The introduction of an inner product in a vector space opens up the possibility of using
similar bases in a general ﬁnite-dimensional vector space. The next deﬁnition introduces
the appropriate terminology. DEFINITION 4.12.4
A basis {v1, v2, . . . , vn} for a (finite-dimensional) inner product space is called an orthogonal basis if

⟨vi, vj⟩ = 0 whenever i ≠ j,

and it is called an orthonormal basis if

⟨vi, vj⟩ = δij,  i, j = 1, 2, . . . , n.

There are two natural questions at this point: (1) How can we obtain an orthogonal
or orthonormal basis for an inner product space V ? (2) Why is it beneﬁcial to work with
an orthogonal or orthonormal basis of vectors? We address the second question ﬁrst.
In light of our work in previous sections of this chapter, the importance of our next
theorem should be self-evident.
Theorem 4.12.5  If {v1, v2, . . . , vk} is an orthogonal set of nonzero vectors in an inner product space V, then {v1, v2, . . . , vk} is linearly independent.

Proof  Assume that

c1v1 + c2v2 + · · · + ckvk = 0.    (4.12.1)

We will show that c1 = c2 = · · · = ck = 0. Taking the inner product of each side of (4.12.1) with vi, we find that

⟨c1v1 + c2v2 + · · · + ckvk, vi⟩ = ⟨0, vi⟩ = 0.

Using the inner product properties on the left side, we have

c1⟨v1, vi⟩ + c2⟨v2, vi⟩ + · · · + ck⟨vk, vi⟩ = 0.

Finally, using the fact that for all j ≠ i we have ⟨vj, vi⟩ = 0, we conclude that

ci⟨vi, vi⟩ = 0.

Since vi ≠ 0, it follows that ci = 0, and this holds for each i with 1 ≤ i ≤ k.
Example 4.12.6  Let V = M2(R), let W be the subspace of all 2 × 2 symmetric matrices, and let

S = {[2 −1; −1 0], [1 1; 1 2], [2 2; 2 −3]}.

Define an inner product on V via11

⟨[a11 a12; a21 a22], [b11 b12; b21 b22]⟩ = a11b11 + a12b12 + a21b21 + a22b22.

Show that S is an orthogonal basis for W.

Solution:  According to Example 4.6.18, we already know that dim[W] = 3. Using the given inner product, it can be directly shown that S is an orthogonal set, and hence, Theorem 4.12.5 implies that S is linearly independent. Therefore, by Theorem 4.6.10, S is a basis for W.
Let V be a (finite-dimensional) inner product space, and suppose that we have an orthogonal basis {v1, v2, . . . , vn} for V. As we saw in Section 4.7, any vector v in V can be written uniquely in the form

v = c1v1 + c2v2 + · · · + cnvn,    (4.12.2)

where the unique n-tuple (c1, c2, . . . , cn) consists of the components of v relative to the given basis. It is easier to determine the components ci in the case of an orthogonal basis than it is for other bases, because we can simply form the inner product of both sides of (4.12.2) with vi as follows:

⟨v, vi⟩ = ⟨c1v1 + c2v2 + · · · + cnvn, vi⟩
        = c1⟨v1, vi⟩ + c2⟨v2, vi⟩ + · · · + cn⟨vn, vi⟩
        = ci||vi||²,

where the last step follows from the orthogonality properties of the basis {v1, v2, . . . , vn}. Therefore, we have proved the following theorem.

Theorem 4.12.7  Let V be a (finite-dimensional) inner product space with orthogonal basis {v1, v2, . . . , vn}. Then any vector v ∈ V may be expressed in terms of the basis as

v = (⟨v, v1⟩/||v1||²)v1 + (⟨v, v2⟩/||v2||²)v2 + · · · + (⟨v, vn⟩/||vn||²)vn.

Theorem 4.12.7 gives a simple formula for writing an arbitrary vector in an inner
product space V as a linear combination of vectors in an orthogonal basis for V . Let us
illustrate with an example.
Example 4.12.8  Let V, W, and S be as in Example 4.12.6. Find the components of the vector

v = [0 −1; −1 2]

relative to S.

Solution:  From the formula given in Theorem 4.12.7, we have

v = (1/3)[2 −1; −1 0] + (2/7)[1 1; 1 2] − (10/21)[2 2; 2 −3],

so the components of v relative to S are (1/3, 2/7, −10/21).

11 This defines a valid inner product on V by Problem 4 in Section 4.11.
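The component formula of Theorem 4.12.7 is easy to apply mechanically; the sketch below (my own numerical check of Example 4.12.8, not the book's worked solution) computes each cᵢ = ⟨v, vᵢ⟩/||vᵢ||² and verifies that the combination reconstructs v.

```python
import numpy as np

def ip(A, B):
    # Inner product on M2(R): sum of products of corresponding entries.
    return float(np.sum(A * B))

# Orthogonal basis S for the symmetric 2x2 matrices (Example 4.12.6).
S = [np.array([[2.0, -1], [-1, 0]]),
     np.array([[1.0, 1], [1, 2]]),
     np.array([[2.0, 2], [2, -3]])]
v = np.array([[0.0, -1], [-1, 2]])

# Theorem 4.12.7: c_i = <v, v_i> / ||v_i||^2 for an orthogonal basis.
coeffs = [ip(v, A) / ip(A, A) for A in S]     # [1/3, 2/7, -10/21]
recon = sum(c * A for c, A in zip(coeffs, S))  # should equal v
```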
If the orthogonal basis {v1, v2, . . . , vn} for V is in fact orthonormal, then since ||vi|| = 1 for each i, we immediately deduce the following corollary of Theorem 4.12.7.

Corollary 4.12.9  Let V be a (finite-dimensional) inner product space with an orthonormal basis {v1, v2, . . . , vn}. Then any vector v ∈ V may be expressed in terms of the basis as

v = ⟨v, v1⟩v1 + ⟨v, v2⟩v2 + · · · + ⟨v, vn⟩vn.

Remark  Corollary 4.12.9 tells us that the components of a given vector v relative to the orthonormal basis {v1, v2, . . . , vn} are precisely the numbers ⟨v, vi⟩, for 1 ≤ i ≤ n. Thus, by working with an orthonormal basis for a vector space, we have a simple method for getting the components of any vector in the vector space.
Example 4.12.10  We can write an arbitrary vector in Rn, v = (a1, a2, . . . , an), in terms of the standard basis {e1, e2, . . . , en} by noting that ⟨v, ei⟩ = ai. Thus, v = a1e1 + a2e2 + · · · + anen.

Example 4.12.11  We can equip the vector space P1 of all polynomials of degree ≤ 1 with the inner product

⟨p, q⟩ = ∫_{−1}^{1} p(x)q(x) dx,

thus making P1 into an inner product space. Verify that the vectors p0 = 1/√2 and p1 = √(3/2) x form an orthonormal basis for P1, and use Corollary 4.12.9 to write the vector q = 1 + x as a linear combination of p0 and p1.

Solution:  We have

⟨p0, p1⟩ = ∫_{−1}^{1} (1/√2) · √(3/2) x dx = 0,

||p0||² = ⟨p0, p0⟩ = ∫_{−1}^{1} (1/2) dx = 1,

||p1||² = ⟨p1, p1⟩ = ∫_{−1}^{1} (3/2) x² dx = [(1/2) x³]_{−1}^{1} = 1.

Thus, {p0, p1} is an orthonormal (and hence linearly independent) set of vectors in P1. Since dim[P1] = 2, Theorem 4.6.10 shows that {p0, p1} is an (orthonormal) basis for P1.

Finally, we wish to write q = 1 + x as a linear combination of p0 and p1, by using Corollary 4.12.9. We leave it to the reader to verify that ⟨q, p0⟩ = √2 and ⟨q, p1⟩ = √(2/3). Thus, we have

1 + x = √2 p0 + √(2/3) p1 = √2 · (1/√2) + √(2/3) · √(3/2) x.

So the component vector of 1 + x relative to {p0, p1} is (√2, √(2/3))ᵀ.

The Gram-Schmidt Process
Next, we return to address the first question we raised earlier: How can we obtain an orthogonal or orthonormal basis for an inner product space V? The idea behind the process is to begin with any basis for V, say {x1, x2, . . . , xn}, and to successively replace these vectors with vectors v1, v2, . . . , vn that are orthogonal to one another, while ensuring that, throughout the process, the span of the vectors remains unchanged. This is known as the Gram-Schmidt process. To describe it, we shall once more appeal to geometric vectors.

If v and w are any two linearly independent (noncollinear) geometric vectors, then the orthogonal projection of w on v is the vector P(w, v) shown in Figure 4.12.1. We see from the figure that an orthogonal basis for the subspace (plane) of 3-space spanned by v and w is {v1, v2}, where

v1 = v  and  v2 = w − P(w, v).

[Figure 4.12.1: Obtaining an orthogonal basis for a two-dimensional subspace of R3.]

In order to generalize this result to an arbitrary inner product space, we need to derive an expression for P(w, v) in terms of the dot product. We see from Figure 4.12.1 that the norm of P(w, v) is

||P(w, v)|| = ||w|| cos θ,

where θ is the angle between v and w. Thus

P(w, v) = (||w|| cos θ)(v/||v||),

which we can write as

P(w, v) = (||w|| ||v|| cos θ / ||v||²) v.    (4.12.3)

Recalling that the dot product of the vectors w and v is defined by

w · v = ||w|| ||v|| cos θ,

it follows from Equation (4.12.3) that

P(w, v) = ((w · v)/||v||²) v,

or equivalently, using the notation for the inner product introduced in the previous section,

P(w, v) = (⟨w, v⟩/||v||²) v.

Now let x1 and x2 be linearly independent vectors in an arbitrary inner product
space V. We show next that the foregoing formula can also be applied in V to obtain an orthogonal basis {v1, v2} for the subspace of V spanned by {x1, x2}. Let

v1 = x1  and  v2 = x2 − P(x2, v1) = x2 − (⟨x2, v1⟩/||v1||²) v1.    (4.12.4)

Note from (4.12.4) that v2 can be written as a linear combination of {x1, x2}, and hence, v2 ∈ span{x1, x2}. Since we also have that x2 ∈ span{v1, v2}, it follows that span{v1, v2} = span{x1, x2}. Next we claim that v2 is orthogonal to v1. We have

⟨v2, v1⟩ = ⟨x2 − (⟨x2, v1⟩/||v1||²) v1, v1⟩ = ⟨x2, v1⟩ − (⟨x2, v1⟩/||v1||²)⟨v1, v1⟩ = 0,

which verifies our claim. We have shown that {v1, v2} is an orthogonal set of vectors which spans the same subspace of V as x1 and x2.

The calculations just presented can be generalized to prove the following useful result (see Problem 32).
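The projection formula is one line of code. The following sketch (my own; the sample vectors are hypothetical, chosen only for illustration) implements P(w, v) and checks that subtracting the projection leaves a vector orthogonal to v1.

```python
import numpy as np

def proj(w, v):
    """P(w, v) = (<w, v>/||v||^2) v, the orthogonal projection of w on v."""
    return (np.dot(w, v) / np.dot(v, v)) * v

# Hypothetical linearly independent vectors (illustration only).
x1 = np.array([3.0, 1.0])
x2 = np.array([2.0, 2.0])
v1 = x1
v2 = x2 - proj(x2, v1)   # subtract the component of x2 along v1
```

By the calculation in the text, ⟨v2, v1⟩ = 0 exactly, up to floating-point rounding.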
Lemma 4.12.12  Let {v1, v2, . . . , vk} be an orthogonal set of vectors in an inner product space V. If x ∈ V, then the vector

x − P(x, v1) − P(x, v2) − · · · − P(x, vk)

is orthogonal to vi for each i.

Now suppose we are given a linearly independent set of vectors {x1, x2, . . . , xm} in an inner product space V. Using Lemma 4.12.12, we can construct an orthogonal basis for the subspace of V spanned by these vectors. We begin with the vector v1 = x1 as above, and we define vi by subtracting off appropriate projections of xi on v1, v2, . . . , vi−1. The resulting procedure is called the Gram-Schmidt orthogonalization procedure. The formal statement of the result is as follows.

Theorem 4.12.13 (Gram-Schmidt Process)
Let {x1, x2, . . . , xm} be a linearly independent set of vectors in an inner product space V. Then an orthogonal basis for the subspace of V spanned by these vectors is {v1, v2, . . . , vm}, where

v1 = x1,

v2 = x2 − (⟨x2, v1⟩/||v1||²) v1,

v3 = x3 − (⟨x3, v1⟩/||v1||²) v1 − (⟨x3, v2⟩/||v2||²) v2,

. . .

vi = xi − Σ_{k=1}^{i−1} (⟨xi, vk⟩/||vk||²) vk,

. . .

vm = xm − Σ_{k=1}^{m−1} (⟨xm, vk⟩/||vk||²) vk.

Proof  Lemma 4.12.12 shows that {v1, v2, . . . , vm} is an orthogonal set of vectors. Thus, both {v1, v2, . . . , vm} and {x1, x2, . . . , xm} are linearly independent sets, and hence span{v1, v2, . . . , vm} and span{x1, x2, . . . , xm} are m-dimensional subspaces of V. (Why?) Moreover, from the formulas given in Theorem 4.12.13, we see that each xi ∈ span{v1, v2, . . . , vm}, and so span{x1, x2, . . . , xm}
is a subset of span{v1 , v2 , . . . , vm }. Thus, by Corollary 4.6.14,
span{v1 , v2 , . . . , vm } = span{x1 , x2 , . . . , xm }.
We conclude that {v1 , v2 , . . . , vm } is a basis for the subspace of V spanned by x1 , x2 ,
. . . , xm.

Example 4.12.14  Obtain an orthogonal basis for the subspace of R4 spanned by

x1 = (1, 0, 1, 0),  x2 = (1, 1, 1, 1),  x3 = (−1, 2, 0, 1).

Solution:  Following the Gram-Schmidt process, we set v1 = x1 = (1, 0, 1, 0). Next, we have

v2 = x2 − (⟨x2, v1⟩/||v1||²) v1 = (1, 1, 1, 1) − (2/2)(1, 0, 1, 0) = (0, 1, 0, 1)

and

v3 = x3 − (⟨x3, v1⟩/||v1||²) v1 − (⟨x3, v2⟩/||v2||²) v2
   = (−1, 2, 0, 1) + (1/2)(1, 0, 1, 0) − (3/2)(0, 1, 0, 1)
   = (−1/2, 1/2, 1/2, −1/2).

The orthogonal basis so obtained is

{(1, 0, 1, 0), (0, 1, 0, 1), (−1/2, 1/2, 1/2, −1/2)}.
Of course, once an orthogonal basis {v1, v2, . . . , vm} is obtained for a subspace of V, we can normalize this basis by setting ui = (1/||vi||)vi to obtain an orthonormal basis {u1, u2, . . . , um}. For instance, an orthonormal basis for the subspace of R4 in the preceding example is

{(1/√2, 0, 1/√2, 0), (0, 1/√2, 0, 1/√2), (−1/2, 1/2, 1/2, −1/2)}.
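The procedure of Theorem 4.12.13 translates directly into code. The sketch below (my own implementation, not from the text) runs Gram-Schmidt on the data of Example 4.12.14 and recovers the basis computed above.

```python
import numpy as np

def gram_schmidt(xs):
    """Return an orthogonal basis for span(xs), following Theorem 4.12.13."""
    vs = []
    for x in xs:
        v = x.astype(float)
        for u in vs:
            # Subtract the projection of x on each previously built vector.
            v = v - (np.dot(x, u) / np.dot(u, u)) * u
        vs.append(v)
    return vs

xs = [np.array([1, 0, 1, 0]), np.array([1, 1, 1, 1]), np.array([-1, 2, 0, 1])]
v1, v2, v3 = gram_schmidt(xs)
# v2 = (0, 1, 0, 1) and v3 = (-1/2, 1/2, 1/2, -1/2), as in Example 4.12.14.
```

Dividing each vᵢ by its norm then yields the orthonormal basis displayed above.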
Example 4.12.15  Determine an orthogonal basis for the subspace of C0[−1, 1] spanned by the functions f1(x) = x, f2(x) = x³, f3(x) = x⁵, using the same inner product introduced in the previous section.

Solution:  In this case, we let {g1, g2, g3} denote the orthogonal basis, and we apply the Gram-Schmidt process. Thus, g1(x) = x, and

g2(x) = f2(x) − (⟨f2, g1⟩/||g1||²) g1(x).    (4.12.5)

We have

⟨f2, g1⟩ = ∫_{−1}^{1} f2(x)g1(x) dx = ∫_{−1}^{1} x⁴ dx = 2/5  and  ||g1||² = ⟨g1, g1⟩ = ∫_{−1}^{1} x² dx = 2/3.

Substituting into Equation (4.12.5) yields

g2(x) = x³ − (3/5)x = (1/5)x(5x² − 3).

We now compute g3(x). According to the Gram-Schmidt process,

g3(x) = f3(x) − (⟨f3, g1⟩/||g1||²) g1(x) − (⟨f3, g2⟩/||g2||²) g2(x).    (4.12.6)

We first evaluate the required inner products:

⟨f3, g1⟩ = ∫_{−1}^{1} f3(x)g1(x) dx = ∫_{−1}^{1} x⁶ dx = 2/7,

⟨f3, g2⟩ = ∫_{−1}^{1} f3(x)g2(x) dx = (1/5) ∫_{−1}^{1} x⁶(5x² − 3) dx = (1/5)(10/9 − 6/7) = 16/315,

||g2||² = ∫_{−1}^{1} [g2(x)]² dx = (1/25) ∫_{−1}^{1} x²(5x² − 3)² dx = (1/25) ∫_{−1}^{1} (25x⁶ − 30x⁴ + 9x²) dx = 8/175.

Substituting into Equation (4.12.6) yields

g3(x) = x⁵ − (3/7)x − (2/9)x(5x² − 3) = (1/63)(63x⁵ − 70x³ + 15x).

Thus, an orthogonal basis for the subspace of C0[−1, 1] spanned by f1, f2, and f3 is

{x, (1/5)x(5x² − 3), (1/63)x(63x⁴ − 70x² + 15)}.

Exercises for 4.12

Key Terms
Orthogonal vectors, Orthogonal set, Unit vector, Orthonormal vectors, Orthonormal set, Normalization, Orthogonal basis, Orthonormal basis, Gram-Schmidt process, Orthogonal projection.

Skills

• Be able to determine whether a given set of vectors is orthogonal and/or orthonormal.
• Be able to determine whether a given set of vectors forms an orthogonal and/or orthonormal basis for an inner product space.
• Be able to replace an orthogonal set with an orthonormal set via normalization.
• Be able to readily compute the components of a vector v in an inner product space V relative to an orthogonal (or orthonormal) basis for V.
• Be able to compute the orthogonal projection of one vector w along another vector v: P(w, v).
• Be able to carry out the Gram-Schmidt process to replace a basis for V with an orthogonal (or orthonormal) basis for V.

True-False Review

For Questions 1–7, decide if the given statement is true or false, and give a brief justification for your answer. If true, you can quote a relevant definition or theorem from the text. If false, provide an example, illustration, or brief explanation of why the statement is false.

1. Every orthonormal basis for an inner product space V is also an orthogonal basis for V.

2. Every linearly independent set of vectors in an inner product space V is orthogonal.

3. With the inner product ⟨f, g⟩ = ∫_0^π f(t)g(t) dt, the functions f(x) = cos x and g(x) = sin x are an orthogonal basis for span{cos x, sin x}.

4. The Gram-Schmidt process applied to the vectors {x1, x2, x3} yields the same basis as the Gram-Schmidt process applied to the vectors {x3, x2, x1}.

5. In expressing the vector v as a linear combination of the orthogonal basis {v1, v2, . . . , vn} for an inner product space V, the coefficient of vi is ci = ⟨v, vi⟩/||vi||².

6. If u and v are orthogonal vectors and w is any vector, then P(P(w, v), u) = 0.

7. If w1, w2, and v are vectors in an inner product space V, then P(w1 + w2, v) = P(w1, v) + P(w2, v).

Problems

For Problems 1–4, determine whether the given set of vectors is an orthogonal set in Rn. For those that are, determine a corresponding orthonormal set of vectors.

1. {(2, −1, 1), (1, 1, −1), (0, 1, 1)}.

2. {(1, 3, −1, 1), (−1, 1, 1, −1), (1, 0, 2, 1)}.

3. {(1, 2, −1, 0), (1, 0, 1, 2), (−1, 1, 1, 0), (1, −1, −1, 0)}.

4. {(1, 2, −1, 0, 3), (1, 1, 0, 2, −1), (4, 2, −4, −5, −4)}.

5. Let v1 = (1, 2, 3), v2 = (1, 1, −1). Determine all nonzero vectors w such that {v1, v2, w} is an orthogonal set. Hence obtain an orthonormal set of vectors in R3.

For Problems 6–7, show that the given set of vectors is an orthogonal set in Cn, and hence obtain an orthonormal set of vectors in Cn in each case.

6. {(1 − i, 3 + 2i), (2 + 3i, 1 − i)}.

7. {(1 − i, 1 + i, i), (0, i, 1 − i), (−3 + 3i, 2 + 2i, 2i)}.

8. Consider the vectors v = (1 − i, 1 + 2i), w = (2 + i, z) in C2. Determine the complex number z such that {v, w} is an orthogonal set of vectors, and hence obtain an orthonormal set of vectors in C2.

For Problems 9–10, show that the given functions in C0[−1, 1] are orthogonal, and use them to construct an orthonormal set of functions in C0[−1, 1].

9. f1(x) = 1, f2(x) = sin πx, f3(x) = cos πx.

10. f1(x) = 1, f2(x) = x, f3(x) = (1/2)(3x² − 1). These are the Legendre polynomials that arise as solutions of the Legendre differential equation

(1 − x²)y″ − 2xy′ + n(n + 1)y = 0,

when n = 0, 1, 2, respectively.

For Problems 11–12, show that the given functions are orthonormal on [−1, 1].

11. f1(x) = sin πx, f2(x) = sin 2πx, f3(x) = sin 3πx. [Hint: The trigonometric identity sin a sin b = (1/2)[cos(a − b) − cos(a + b)] will be useful.]

12. f1(x) = cos πx, f2(x) = cos 2πx, f3(x) = cos 3πx.

13. Let

A1 = [1 1; −1 2],  A2 = [−1 1; 2 1],  and  A3 = [−1 −3; 0 2].

Use the inner product

⟨A, B⟩ = a11b11 + a12b12 + a21b21 + a22b22

to find all matrices

A4 = [a b; c d]

such that {A1, A2, A3, A4} is an orthogonal set of matrices in M2(R).
For Problems 14–19, use the Gram-Schmidt process to determine an orthonormal basis for the subspace of Rn spanned by the given set of vectors.

14. {(1, −1, −1), (2, 1, −1)}.

15. {(2, 1, −2), (1, 3, −1)}.

16. {(−1, 1, 1, 1), (1, 2, 1, 2)}.

17. {(1, 0, −1, 0), (1, 1, −1, 0), (−1, 1, 0, 1)}.

18. {(1, 2, 0, 1), (2, 1, 1, 0), (1, 0, 2, 1)}.

19. {(1, 1, −1, 0), (−1, 0, 1, 1), (2, −1, 2, 1)}.

20. If

A = [3 1 4; 1 −2 1; 1 5 2],

determine an orthogonal basis for rowspace(A).

For Problems 21–22, determine an orthonormal basis for the subspace of C3 spanned by the given set of vectors. Make sure that you use the appropriate inner product in C3.

21. {(1 − i, 0, i), (1, 1 + i, 0)}.

22. {(1 + i, i, 2 − i), (1 + 2i, 1 − i, i)}.

For Problems 23–25, determine an orthogonal basis for the subspace of C0[a, b] spanned by the given functions, for the given interval [a, b].

23. f1(x) = 1, f2(x) = x, f3(x) = x², a = 0, b = 1.

24. f1(x) = 1, f2(x) = x², f3(x) = x⁴, a = −1, b = 1.

25. f1(x) = 1, f2(x) = sin x, f3(x) = cos x, a = −π/2, b = π/2.

On M2(R) define the inner product ⟨A, B⟩ by

⟨A, B⟩ = 5a11b11 + 2a12b12 + 3a21b21 + 5a22b22

for all matrices A = [aij] and B = [bij]. For Problems 26–27, use this inner product in the Gram-Schmidt procedure to determine an orthogonal basis for the subspace of M2(R) spanned by the given matrices.

26. A1 = [1 −1; 2 1], A2 = [2 −3; 4 1].

27. A1 = [0 1; 1 0], A2 = [0 1; 1 1], A3 = [1 1; 1 0]. Also identify the subspace of M2(R) spanned by {A1, A2, A3}.

On Pn, define the inner product ⟨p1, p2⟩ by

⟨p1, p2⟩ = a0b0 + a1b1 + · · · + anbn

for all polynomials p1(x) = a0 + a1x + · · · + anxⁿ, p2(x) = b0 + b1x + · · · + bnxⁿ. For Problems 28–29, use this inner product to determine an orthogonal basis for the subspace of Pn spanned by the given polynomials.

28. p1(x) = 1 − 2x + 2x², p2(x) = 2 − x − x².

29. p1(x) = 1 + x², p2(x) = 2 − x + x³, p3(x) = 2x² − x.

30. Let {u1, u2, v} be linearly independent vectors in an inner product space V, and suppose that u1 and u2 are orthogonal. Define the vector u3 in V by

u3 = v + λu1 + μu2,

where λ, μ are scalars. Derive the values of λ and μ such that {u1, u2, u3} is an orthogonal basis for the subspace of V spanned by {u1, u2, v}.

31. Prove that if {v1, v2, . . . , vk} is an orthogonal set of vectors in an inner product space V and if ui = (1/||vi||)vi for each i, then {u1, u2, . . . , uk} forms an orthonormal set of vectors.

32. Prove Lemma 4.12.12.

Let V be an inner product space, and let W be a subspace of V. Set

W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}.

The set W⊥ is called the orthogonal complement of W in V. Problems 33–38 explore this concept in some detail. Deeper applications can be found in Project 1 at the end of this chapter.

33. Prove that W⊥ is a subspace of V.

34. Let V = R3 and let W = span{(1, 1, −1)}. Find W⊥.

35. Let V = R4 and let W = span{(0, 1, −1, 3), (1, 0, 0, 3)}. Find W⊥.
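For a concrete problem such as the one above, note that v ∈ W⊥ exactly when v is orthogonal to each spanning vector of W, i.e. when Mv = 0 for the matrix M whose rows are those spanning vectors. The sketch below is my own numerical approach via the SVD, not the hand computation the exercise intends.

```python
import numpy as np

# Rows of M span W (the data of the R^4 problem above); v in W-perp iff M v = 0.
M = np.array([[0.0, 1, -1, 3],
              [1.0, 0, 0, 3]])

# Null space from the SVD: right singular vectors for the zero singular values.
_, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]   # rows form an orthonormal basis for W-perp
```

Here dim W = 2, so W⊥ is a 2-dimensional subspace of R4, and every row of `null_basis` is orthogonal to both spanning vectors of W.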
36. Let V = M2(R) and let W be the subspace of 2 × 2 symmetric matrices. Compute W⊥.

37. Prove that W ∩ W⊥ = {0}. (That is, W and W⊥ have no nonzero elements in common.)

38. Prove that if W1 is a subset of W2, then (W2)⊥ is a subset of (W1)⊥.

39. The subject of Fourier series is concerned with the representation of a 2π-periodic function f as the following infinite linear combination of the set of functions {1, sin nx, cos nx}, n = 1, 2, . . . :

f(x) = (1/2)a0 + Σ_{n=1}^{∞} (an cos nx + bn sin nx).    (4.12.7)

In this problem, we investigate the possibility of performing such a representation.

(a) Use appropriate trigonometric identities, or some form of technology, to verify that the set of functions {1, sin nx, cos nx}, n = 1, 2, . . . , is orthogonal on the interval [−π, π].

(b) By multiplying (4.12.7) by cos mx and integrating over the interval [−π, π], show that

a0 = (1/π) ∫_{−π}^{π} f(x) dx  and  am = (1/π) ∫_{−π}^{π} f(x) cos mx dx.

[Hint: You may assume that interchange of the infinite summation with the integral is permissible.]

(c) Use a similar procedure to show that

bm = (1/π) ∫_{−π}^{π} f(x) sin mx dx.

It can be shown that if f is in C1(−π, π), then Equation (4.12.7) holds for each x ∈ (−π, π). The series appearing on the right-hand side of (4.12.7) is called the Fourier series of f, and the constants in the summation are called the Fourier coefficients for f.

(d) Show that the Fourier coefficients for the function f(x) = x, −π < x ≤ π, f(x + 2π) = f(x), are

an = 0, n = 0, 1, 2, . . . ,  bn = −(2/n) cos nπ, n = 1, 2, . . . ,

and thereby determine the Fourier series of f.
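As a numerical illustration of part (d) (my own check, not part of the exercise), the coefficients bn = (1/π)∫ x sin nx dx can be approximated with a midpoint rule and compared against the closed form −(2/n) cos nπ:

```python
import math

def b_n(n, samples=100000):
    """Midpoint-rule approximation of b_n = (1/pi) * integral of x*sin(nx) over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += x * math.sin(n * x)
    return total * h / math.pi

# Part (d): b_n = -(2/n) cos(n*pi), i.e. 2, -1, 2/3 for n = 1, 2, 3.
for n in (1, 2, 3):
    assert abs(b_n(n) + (2 / n) * math.cos(n * math.pi)) < 1e-6
```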
(e) Using some form of technology, sketch the approximations to f(x) = x on the interval (−π, π) obtained by considering the first three terms, first five terms, and first ten terms in the Fourier series for f. What do you conclude?

4.13 Chapter Review
In this chapter we have derived some basic results in linear algebra regarding vector
spaces. These results form the framework for much of linear mathematics. Following
are listed some of the chapter highlights.

The Definition of a Vector Space
A vector space consists of four different components:
1. A set of vectors V .
2. A set of scalars F (either the set of real numbers R, or the set of complex numbers
C).
3. A rule, +, for adding vectors in V .
4. A rule, · , for multiplying vectors in V by scalars in F .
Then (V , +, ·) is a vector space over F if and only if axioms A1–A10 of Deﬁnition 4.2.1
are satisﬁed. If F is the set of all real numbers, then (V , +, ·) is called a real vector
space, whereas if F is the set of all complex numbers, then (V , +, ·) is called a complex vector space. Since it is usually quite clear what the addition and scalar multiplication
operations are, we usually specify a vector space by giving only the set of vectors V .
The major vector spaces we have dealt with are the following:
Rn: the (real) vector space of all ordered n-tuples of real numbers.
Cn: the (complex) vector space of all ordered n-tuples of complex numbers.
Mn(R): the (real) vector space of all n × n matrices with real elements.
C^k(I): the vector space of all real-valued functions that are continuous and have (at least) k continuous derivatives on I.
Pn: the vector space of all polynomials of degree ≤ n with real coefficients.

Subspaces
Usually the vector space V that underlies a given problem is known. It is often one that
appears in the list above. However, the solution of a given problem in general involves
only a subset of vectors from this vector space. The question that then arises is whether
this subset of vectors is itself a vector space under the same operations of addition and
scalar multiplication as in V . In order to answer this question, Theorem 4.3.2 tells us
that a nonempty subset of a vector space V is a subspace of V if and only if the subset
is closed under addition and closed under scalar multiplication. Spanning Sets
A set of vectors {v1 , v2 , . . . , vk } in a vector space V is said to span V if every vector in
V can be written as a linear combination of v1 , v2 , . . . , vk —that is, if for every v ∈ V ,
there exist scalars c1 , c2 , . . . , ck such that
v = c1 v1 + c2 v2 + · · · + ck vk .
Given a set of vectors {v1 , v2 , . . . , vk } in a vector space V , we can form the set of all
vectors that can be written as a linear combination of v1 , v2 , . . . , vk . This collection of
vectors is a subspace of V called the subspace spanned by {v1 , v2 , . . . , vk }, and denoted
span{v1 , v2 , . . . , vk }. Thus,
span{v1 , v2 , . . . , vk } = {v ∈ V : v = c1 v1 + c2 v2 + · · · + ck vk }. Linear Dependence and Linear Independence
Let {v1, v2, . . . , vk} be a set of vectors in a vector space V, and consider the vector equation

c1 v1 + c2 v2 + · · · + ck vk = 0.    (4.13.1)

Clearly this equation will hold if c1 = c2 = · · · = ck = 0. The question of interest is whether there are nonzero values of some or all of the scalars c1, c2, . . . , ck such that (4.13.1) holds. This leads to the following two ideas:

Linear dependence: There exist scalars c1, c2, . . . , ck, not all zero, such that (4.13.1) holds.

Linear independence: The only values of the scalars c1, c2, . . . , ck such that (4.13.1) holds are c1 = c2 = · · · = ck = 0.

To determine whether a set of vectors is linearly dependent or linearly independent we usually have to use (4.13.1). However, if the vectors are from Rn, then we can use Corollary 4.5.15, whereas for vectors in C^(k−1)(I) the Wronskian can be useful.
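For vectors in Rn, this test amounts to a rank computation: k vectors are linearly independent exactly when the matrix having them as its rows has rank k. The sketch below (not from the text; plain floating-point Gaussian elimination, with the vectors of Problem 26 below used purely as an illustration) makes this concrete:

```python
def rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) via Gaussian elimination
    with partial pivoting; tol decides when a pivot counts as zero."""
    m = [list(map(float, row)) for row in rows]
    nrows, ncols = len(m), len(m[0])
    r = 0                                   # next pivot row
    for c in range(ncols):
        if r == nrows:
            break
        p = max(range(r, nrows), key=lambda i: abs(m[i][c]))
        if abs(m[p][c]) < tol:
            continue                        # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, nrows):
            f = m[i][c] / m[r][c]
            for j in range(c, ncols):
                m[i][j] -= f * m[r][j]
        r += 1
    return r

def linearly_independent(vectors):
    """Vectors in R^n are independent iff rank equals their number."""
    return rank(vectors) == len(vectors)

# The set {(6, -3, 2), (1, 1, 1), (1, -8, -1)} from Problem 26:
print(linearly_independent([(6, -3, 2), (1, 1, 1), (1, -8, -1)]))  # → True
```

A dependent set such as {(1, 2), (2, 4)} yields rank 1 and the function returns False.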
Bases and Dimension
A linearly independent set of vectors that spans a vector space V is called a basis for V .
If {v1 , v2 , . . . , vk } is a basis for V , then any vector in V can be written uniquely as
v = c1 v1 + c2 v2 + · · · + ck vk ,
for appropriate values of the scalars c1 , c2 , . . . , ck .
1. All bases in a ﬁnite-dimensional vector space V contain the same number of
vectors, and this number is called the dimension of V , denoted dim[V ].
2. We can view the dimension of a ﬁnite-dimensional vector space V in two different
ways. First, it gives the minimum number of vectors that span V . Alternatively, we
can regard dim[V ] as determining the maximum number of vectors that a linearly
independent set in V can contain.
3. If dim[V ] = n, then any linearly independent set of n vectors in V is a basis for
V . Alternatively, any set of n vectors that spans V is a basis for V . Inner Product Spaces
An inner product is a mapping that associates, with any two vectors u and v in a vector space V, a scalar that we denote by ⟨u, v⟩. This mapping must satisfy the properties given in Definition 4.11.10. The main reason for introducing the idea of an inner product is that it enables us to extend the familiar idea of orthogonality and length of vectors in R3 to a general vector space. Thus u and v are said to be orthogonal in an inner product space if and only if

⟨u, v⟩ = 0.

The Gram-Schmidt Orthonormalization Process
The Gram-Schmidt procedure is a process that takes a linearly independent set of vectors {x1 , x2 , . . . , xm } in an inner product space V and returns an orthogonal basis
{v1 , v2 , . . . , vm } for span{x1 , x2 , . . . , xm }. Additional Problems
For Problems 1–2, let r and s denote scalars and let v and w
denote vectors in R5 .
1. Prove that (r + s)v = r v + s v.
2. Prove that r(v + w) = r v + r w.
For Problems 3–13, determine whether the given set (together with the usual operations on that set) forms a vector
space over R. In all cases, justify your answer carefully. 3. The set of polynomials of degree 5 or less whose coefﬁcients are even integers.
4. The set of all polynomials of degree 5 or less whose
coefﬁcients of x 2 and x 3 are zero.
5. The set of solutions to the linear system
− 2x2 + 5x3 = 7,
4x1 − 6x2 + 3x3 = 0.

6. The set of solutions to the linear system
4x1 − 7x2 + 2x3 = 0,
5x1 − 2x2 + 9x3 = 0.

7. The set of 2 × 2 real matrices whose entries are either all zero or all nonzero.

8. The set of 2 × 2 real matrices that commute with the matrix [1 2; 0 2].

9. The set of all functions f : [0, 1] → [0, 1] such that f(0) = f(1/4) = f(1/2) = f(3/4) = f(1) = 0.

10. The set of all functions f : [0, 1] → [0, 1] such that f(x) ≤ x for all x in [0, 1].

11. The set of n × n matrices A such that A^2 is symmetric.

12. The set of all points (x, y) in R2 that are equidistant from (−1, 2) and (1, −2).

13. The set of all points (x, y, z) in R3 that are a distance 5 from the point (0, −3, 4).

14. Let V = {(a1, a2) : a1, a2 ∈ R, a2 > 0}. Define addition and scalar multiplication on V as follows:
(a1, a2) + (b1, b2) = (a1 + b1, a2 b2),
k(a1, a2) = (ka1, a2^k), k ∈ R.
Explicitly verify that V is a vector space over R.

15. Show that W = {(a, 2^a) : a ∈ R} is a subspace of the vector space V given in the preceding problem.

16. Show that {(1, 2), (3, 8)} is a linearly dependent set in the vector space V in Problem 14.

17. Show that {(1, 4), (2, 1)} is a basis for the vector space V in Problem 14.

18. What is the dimension of the subspace of P2 given by W = span{2 + x^2, 4 − 2x + 3x^2, 1 + x}?

For Problems 19–24, decide (with justification) whether W is a subspace of V.

19. V = R2, W = {(x, y) : x^2 − y = 0}.

20. V = R2, W = {(x, x^3) : x ∈ R}.

21. V = M2(R), W = {2 × 2 orthogonal matrices}. [An n × n matrix A is orthogonal if it is invertible and A^(−1) = A^T.]

22. V = C[a, b], W = {f ∈ V : f(a) = 2f(b)}.

23. V = C[a, b], W = {f ∈ V : ∫_a^b f(x) dx = 0}.

24. V = M3×2(R), W = { [a b; c d; e f] : a + b = c + f and a − c = e − f − d }.

For Problems 25–32, decide (with justification) whether or not the given set S of vectors (a) spans V, and (b) is linearly independent.

25. V = R3, S = {(5, −1, 2), (7, 1, 1)}.

26. V = R3, S = {(6, −3, 2), (1, 1, 1), (1, −8, −1)}.

27. V = R4, S = {(6, −3, 2, 0), (1, 1, 1, 0), (1, −8, −1, 0)}.

28. V = R3, S = {(10, −6, 5), (3, −3, 2), (0, 0, 0), (6, 4, −1), (7, 7, −2)}.

29. V = P3, S = {2x − x^3, 1 + x + x^2, 3, x}.

30. V = P4, S = {x^4 + x^2 + 1, x^2 + x + 1, x + 1, x^4 + 2x + 3}.

31. V = M2×3(R), S = { [−1 0 0; 0 1 1], [3 2 1; 1 2 3], [−1 −2 −3; 3 2 1], [−11 6 −5; 1 −2 −5] }.

32. V = M2(R), S = { [1 2; 2 1], [3 4; 4 3], [−2 −1; −1 −2], [−3 0; 0 3], [2 0; 0 0] }.

33. Prove that if {v1, v2, v3} is linearly independent and v4 is not in span{v1, v2, v3}, then {v1, v2, v3, v4} is linearly independent.
34. Let A be an m × n matrix, let v ∈ colspace(A) and let w ∈ nullspace(A^T). Prove that v and w are orthogonal.

35. Let W denote the set of all 3 × 3 skew-symmetric matrices.
(a) Show that W is a subspace of M3(R).
(b) Find a basis and the dimension of W.
(c) Extend the basis you constructed in part (b) to a basis for M3(R).

36. Let W denote the set of all 3 × 3 matrices whose rows and columns add up to zero.
(a) Show that W is a subspace of M3(R).
(b) Find a basis and the dimension of W.
(c) Extend the basis you constructed in part (b) to a basis for M3(R).

37. Let (V, +V, ·V) and (W, +W, ·W) be vector spaces and define
V ⊕ W = {(v, w) : v ∈ V and w ∈ W}.
Prove that
(a) V ⊕ W is a vector space, under componentwise operations.
(b) Via the identification v → (v, 0), V is a subspace of V ⊕ W, and likewise for W.
(c) If dim[V] = n and dim[W] = m, then dim[V ⊕ W] = m + n. [Hint: Write a basis for V ⊕ W in terms of bases for V and W.]

38. Show that a basis for P3 need not contain a polynomial of each degree 0, 1, 2, 3.

39. Prove that if A is a matrix whose nullspace and column space are the same, then A must have an even number of columns.

40. Let B = [b1; b2; . . . ; bn] (a column vector) and C = [c1 c2 . . . cn] (a row vector). Prove that if all entries b1, b2, . . . , bn and c1, c2, . . . , cn are nonzero, then the n × n matrix A = BC has nullity n − 1.

For Problems 41–44, find a basis and the dimension for the row space, column space, and null space of the given matrix A.

41. A = [−3 −6; −6 −12].

42. A = [−1 6 2 0; 3 3 1 5; 7 21 7 15].

43. A = [−4 0 3; 0 10 13; 6 5 2; −2 5 10].

44. A = [3 5 5 2 0; 1 0 2 2 1; 1 1 1 −2 −2; −2 0 −4 −2 −2].

For Problems 45–46, find an orthonormal basis for the row space, column space, and null space of the given matrix A.

45. A = [1 2 6; 2 1 6; 0 1 2; 1 0 2].

46. A = [1 3 5; −1 −3 1; 0 2 3; 1 5 2; 1 5 8].

For Problems 47–50, find an orthogonal basis for the span of the set S, where S is given in

47. Problem 25.

48. Problem 26.

49. Problem 29, using p · q = ∫_0^1 p(t)q(t) dt.

50. Problem 32, using the inner product defined in Problem 4 of Section 4.11.

For Problems 51–54, determine the angle between the given vectors u and v using the standard inner product on Rn.

51. u = (2, 3) and v = (4, −1).

52. u = (−2, −1, 2, 4) and v = (−3, 5, 1, 1).

53. Repeat Problems 51–52 for the inner product on Rn given by
⟨u, v⟩ = 2u1v1 + u2v2 + u3v3 + · · · + unvn.

54. Let t0, t1, . . . , tn be real numbers. For p and q in Pn,
deﬁne
p · q = p(t0 )q(t0 ) + p(t1 )q(t1 ) + · · · + p(tn )q(tn ).
(a) Prove that p · q deﬁnes a valid inner product on
Pn .
(b) Let t0 = −3, t1 = −1, t2 = 1, and t3 = 3.
Let p0 (t) = 1, p1 (t) = t , and p2 (t) = t 2 . Find
a polynomial q that is orthogonal to p0 and p1 ,
such that {p0 , p1 , q } is an orthogonal basis for
span{p0, p1, p2}.

55. Find the distance from the point (2, 3, 4) to the line in
R3 passing through (0, 0, 0) and (6, −1, −4).
56. Let V be an inner product space with basis
{v1 , v2 , . . . , vn }. If x and y are vectors in V such that
x · vi = y · vi for each i = 1, 2, . . . , n, prove that
x = y.
57. State as many conditions as you can on an n × n matrix
A that are equivalent to its invertibility. Project I: Orthogonal Complement
Let V be an inner product space and let W be a subspace of V .
Part 1 Deﬁnition Let
W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for all w ∈ W}. Show that W⊥ is a subspace of V and that W⊥ and W share only the zero vector:
W ⊥ ∩ W = {0}.
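To make these objects concrete before the examples, here is a small numerical sketch (not part of the project statement; the choice of W and v is illustrative only). It works in V = R^3 with the dot product, runs Gram-Schmidt on a spanning set of a plane W, projects a vector v onto W, and checks that the leftover piece v − projW(v) is orthogonal to all of W, i.e. lies in W⊥:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def scale(c, v):
    return tuple(c * x for x in v)

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthogonal basis for span(xs); assumes xs linearly independent."""
    basis = []
    for x in xs:
        w = x
        for b in basis:
            # subtract the component of x along b
            w = sub(w, scale(dot(x, b) / dot(b, b), b))
        basis.append(w)
    return basis

def project(v, basis):
    """Projection of v onto span(basis); basis must be orthogonal."""
    p = (0.0,) * len(v)
    for b in basis:
        p = sub(p, scale(-dot(v, b) / dot(b, b), b))
    return p

# W is the plane spanned by (1,1,0) and (1,0,1); v is arbitrary.
W = gram_schmidt([(1.0, 1.0, 0.0), (1.0, 0.0, 1.0)])
v = (3.0, -1.0, 2.0)
w = project(v, W)
w_perp = sub(v, w)          # v = w + w_perp with w in W ...
for b in W:                 # ... and w_perp orthogonal to all of W
    assert abs(dot(w_perp, b)) < 1e-12
print(w, w_perp)
```

The assertion is exactly the membership test ⟨w_perp, b⟩ = 0 for each basis vector b of W, which suffices by linearity.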
Part 2 Examples
(a) Let V = M2(R) with inner product

⟨ [a11 a12; a21 a22], [b11 b12; b21 b22] ⟩ = a11 b11 + a12 b12 + a21 b21 + a22 b22.

Find the orthogonal complement of the set W of 2 × 2 symmetric matrices.
(b) Let A be an m × n matrix. Show that
(rowspace(A))⊥ = nullspace(A)
and
(colspace(A))⊥ = nullspace(AT ). Use this to ﬁnd the orthogonal complement of the row space and column space of
the matrices below:
(i) A = [3 1 −1; 6 0 −4].

(ii) A = [−1 0 6 2; 3 −1 0 4; 1 1 1 −1].
(c) Find the orthogonal complement of
(i) the line in R3 containing the points (0, 0, 0) and (2, −1, 3).
(ii) the plane 2x + 3y − 4z = 0 in R3.
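For matrix (i) of part (b), the identity (rowspace(A))⊥ = nullspace(A) is easy to spot-check. The sketch below (exact rational arithmetic; the helper is a hypothetical convenience that assumes a 2 × 3 matrix of rank 2 with nonzero (1,1) entry) solves Ax = 0 directly and confirms that the solution is orthogonal to each row of A:

```python
from fractions import Fraction

def nullspace_2x3(a):
    """Null space basis vector of a rank-2 2x3 matrix with a[0][0] != 0,
    found by one elimination step and back-substitution."""
    (a11, a12, a13), (a21, a22, a23) = [list(map(Fraction, r)) for r in a]
    f = a21 / a11                     # eliminate x1 from row 2
    b22, b23 = a22 - f * a12, a23 - f * a13
    x3 = Fraction(1)                  # free variable set to 1
    x2 = -b23 * x3 / b22
    x1 = (-a12 * x2 - a13 * x3) / a11
    return (x1, x2, x3)

A = [(3, 1, -1), (6, 0, -4)]
v = nullspace_2x3(A)
# v lies in (rowspace(A))-perp: it is orthogonal to every row of A.
for row in A:
    assert sum(r * x for r, x in zip(row, v)) == 0
print(v)
```

Since A has rank 2, its row space is a plane in R^3 and the orthogonal complement is the one-dimensional null space spanned by v.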
Part 3 Some Theoretical Results Let W be a subspace of a finite-dimensional inner product space V.

(a) Show that every vector in V can be written uniquely in the form w + w⊥, where
w ∈ W and w⊥ ∈ W ⊥ . [Hint: By Gram-Schmidt, v can be projected onto the
subspace W as, say, projW (v), and so v = projW (v) + w⊥ , where w⊥ ∈ W ⊥ . For
the uniqueness, use the fact that W ∩ W ⊥ = {0}.]
(b) Use part (a) to show that
dim[V ] = dim[W ] + dim[W ⊥ ].
(c) Show that (W ⊥ )⊥ = W. Project II: Line-Fitting Data Points
Suppose data points (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ) in the xy -plane have been collected.
Unless these data points are collinear, there will be no line that contains all of them. We
wish to ﬁnd a line, commonly known as a least-squares line, that approximates the data
points as closely as possible.
How do we go about finding such a line? The approach we take¹² is to write the line
as y = mx + b, where m and b are unknown constants.
Part 1 Derivation of the Least-Squares Line
(a) By substituting the data points (xi, yi) for x and y in the equation y = mx + b, show that the matrix equation Ax = y is obtained, where

A = [x1 1; x2 1; . . . ; xn 1] (one row (xi, 1) for each data point),  x = [m; b],  and  y = [y1; y2; . . . ; yn].

Unless the data points are collinear, the system Ax = y obtained in part (a) has no
solution for x. In other words, the vector y does not lie in the column space of A.
The goal then becomes to ﬁnd x0 such that the distance ||y − Ax0 || is as small as
possible. This will happen precisely when y − Ax0 is perpendicular to the column
space of A. In other words, for all x ∈ R2 , we must have
(Ax) · (y − Ax0 ) = 0.
(b) Using the fact that the dot product of vectors u and v can be written as a matrix
multiplication,
u · v = uT v ,
show that
(Ax) · (y − Ax0 ) = x · (AT y − AT Ax0 ).
(c) Conclude that
AT y = AT Ax0 .
Provided that A has linearly independent columns, the matrix AT A is invertible
(see Problem 34, in Section 4.13).
12 We can also obtain the least-squares line by using optimization techniques from multivariable calculus,
but the goal here is to illustrate the use of linear systems and projections.

(d) Show that the least-squares solution is
x0 = (AT A)−1 AT y
and therefore,
Ax0 = A(AT A)−1 AT y
is the point in the column space of A that is closest to y. Therefore, it is the
projection of y onto the column space of A, and we write
Ax0 = A(AT A)−1 AT y = P y,
where
P = A(AT A)−1 AT    (4.13.2)

is called a projection matrix. If A is m × n, what are the dimensions of P ?
(e) Referring to the projection matrix P in (4.13.2), show that P A = A and P 2 = P .
Geometrically, why are these facts to be expected? Also show that P is a symmetric
matrix.
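As a numerical companion to the derivation, the sketch below (illustrative only, not part of the project statement) solves the 2 × 2 normal system (A^T A)x0 = A^T y in closed form for the data of Part 2(a); the formulas are simply Cramer's rule applied to the normal equations:

```python
def lsq_line(points):
    """Least-squares line y = m x + b via the normal equations
    (A^T A) x0 = A^T y, where A has rows (x_i, 1)."""
    n   = len(points)
    sx  = sum(x for x, _ in points)
    sxx = sum(x * x for x, _ in points)
    sy  = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    det = sxx * n - sx * sx     # det(A^T A); nonzero unless all x_i coincide
    m = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b

# Data points from Part 2(a):
pts = [(0, -2), (1, -1), (2, 1), (3, 2), (4, 2)]
m, b = lsq_line(pts)
print(m, b)   # → 1.1 -1.8
```

So for these points the least-squares line is y = 1.1x − 1.8; the same slope and intercept come out of x0 = (A^T A)^(−1) A^T y.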
Part 2 Some Applications In parts (a)–(d) below, ﬁnd the equation of the least-squares
line to the given data points.
(a) (0, −2), (1, −1), (2, 1), (3, 2), (4, 2).
(b) (−1, 5), (1, 1), (2, 1), (3, −3).
(c) (−4, −1), (−3, 1), (−2, 3), (0, 7).
(d) (−3, 1), (−2, 0), (−1, 1), (0, −1), (2, −1).
In parts (e)–(f), by using the ideas in this project, ﬁnd the distance from the point
P to the given plane.
(e) P (0, 0, 0); 2x − y + 3z = 6.
(f) P (−1, 3, 5); −x + 3y + 3z = 8.
Part 3 A Further Generalization Instead of ﬁtting data points to a least-squares line,
one could also attempt to do a parabolic approximation of the form ax 2 + bx + c. By
following the outline in Part 1 above, try to determine a procedure for ﬁnding the best
parabolic approximation to a set of data points. Then try out your procedure on the data
points given in Part 2, (a)–(d).
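Following the outline of Part 1, here is a sketch of one possible such procedure (an illustration, not the unique answer): the rows of A become (x_i^2, x_i, 1), and the normal equations (A^T A)c = A^T y are solved by Gaussian elimination. The data from Part 2(a) are reused as a try-out; the resulting coefficients are whatever the normal system produces, not values given in the text.

```python
def solve(M, v):
    """Solve the square system M c = v by Gaussian elimination
    with partial pivoting and back-substitution."""
    n = len(M)
    a = [row[:] + [v[i]] for i, row in enumerate(M)]   # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n + 1):
                a[r][j] -= f * a[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][j] * x[j] for j in range(r + 1, n))) / a[r][r]
    return x

def lsq_parabola(points):
    """Best fit y = a x^2 + b x + c: rows of A are (x^2, x, 1),
    then solve the normal equations (A^T A) coeffs = A^T y."""
    A = [[x * x, x, 1.0] for x, _ in points]
    y = [yv for _, yv in points]
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(3)]
           for i in range(3)]
    Aty = [sum(A[k][i] * y[k] for k in range(len(A))) for i in range(3)]
    return solve(AtA, Aty)

# Try-out on the data of Part 2(a):
a, b, c = lsq_parabola([(0, -2), (1, -1), (2, 1), (3, 2), (4, 2)])
print(round(a, 3), round(b, 3), round(c, 3))
```

The same pattern extends to any polynomial degree: widen each row of A with further powers of x_i and solve the correspondingly larger normal system.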