SOS MATH 235 MIDTERM 2 REVIEW PACKAGE

Hello MATH 235 students, my name is Tai Cai and I am the SOS tutor this term for MATH 235. This package is designed to supplement your studying for the second midterm on November 16, 2010. Whenever possible, I have included examples that are not from class or your textbook. This gives you a wide variety of problems to look over, which is very important since this course is all about practice.

You can expect a calculation problem from each of the main calculation topics: Gram-Schmidt, Diagonalization, Least Squares, Triangularization, and Quadratic Forms, although Gram-Schmidt may be tested in conjunction with another topic, such as Orthogonal Diagonalization. All of these calculation topics can involve messy algebra, so it is very important to be familiar with the process involved. On a test, you will need as much time as possible just to do the calculations for these types of questions.

Other than these calculation questions, you can expect a few proof questions. Proof questions are a bit harder to predict. They can test any theorem or definition discussed in class, so it is important to go over all the definitions and theorems in your notes and in this package. Furthermore, you need to be familiar with matrix algebra. The best way to succeed in this (or any math) course is to get plenty of practice, so go on and practice!

THE GRAM-SCHMIDT PROCEDURE

The Gram-Schmidt procedure allows us to construct an orthonormal basis for an inner product space V from ANY basis of V. Suppose dim V = n.

Step 1: Start with any basis {v1, ..., vn} of V.

Step 2: Use {v1, ..., vn} to construct an orthogonal basis B = {w1, ..., wn} of V. The idea is to let

  w1 = v1

and then construct

  w2 = v2 − (⟨v2, w1⟩ / ⟨w1, w1⟩) w1,

which is orthogonal to w1. Continue by constructing w3 orthogonal to w1 and w2, all the way up to wn orthogonal to EACH of w1, ..., w(n−1):

  wk = vk − Σ_{i=1}^{k−1} (⟨vk, wi⟩ / ⟨wi, wi⟩) wi.

Step 3: When we have our orthogonal basis B = {w1, ..., wn}, we normalize each element of B to get our orthonormal basis {w1/‖w1‖, ..., wn/‖wn‖}.

Sounds simple enough, right? Most of the trouble is the messy algebra involved in Step 2, so it's important to do plenty of practice questions to get familiar with the process.
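In coordinates (say Rⁿ with the dot product), the three steps translate directly into code. Here is a minimal numpy sketch; the function name and the example basis are my own, not from the course:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a basis of R^n into an orthonormal basis.

    Step 2: subtract from each v its projections onto the
    previously built orthogonal vectors w_1, ..., w_{k-1}.
    Step 3: normalize each orthogonal vector.
    """
    orthogonal = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for u in orthogonal:
            w = w - (np.dot(v, u) / np.dot(u, u)) * u
        orthogonal.append(w)
    return [w / np.linalg.norm(w) for w in orthogonal]

basis = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
e = gram_schmidt(basis)

# The Gram matrix of pairwise inner products should be the identity.
G = np.array([[np.dot(a, b) for b in e] for a in e])
assert np.allclose(G, np.eye(3))
```

Checking that the Gram matrix is the identity is exactly the "verify your answer is orthonormal" step mentioned below, done by machine.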
Ex: Let V = P2, the set of real polynomials with degree ≤ 2, and let V have the inner product ⟨p, q⟩ = ∫ from −1 to 1 of p(x)q(x) dx. Find an orthonormal basis for V.

Step 1: Start with any basis for V. If you are not given a starting basis, it is usually easiest to use the standard basis. In this case, let's start with the standard basis {1, x, x²}.

Step 2: Construct an orthogonal basis B = {w1, w2, w3}:

  w1 = 1

  w2 = x − (⟨x, 1⟩ / ⟨1, 1⟩)·1 = x − (0/2)·1 = x

Note that it is just a coincidence that ⟨x, 1⟩ = 0. It just means that x and 1 were already orthogonal.

  w3 = x² − (⟨x², 1⟩ / ⟨1, 1⟩)·1 − (⟨x², x⟩ / ⟨x, x⟩)·x = x² − ((2/3)/2)·1 − 0·x = x² − 1/3

So our orthogonal basis is B = {1, x, x² − 1/3}.

Note: we could have saved some work above if we noted that the integral of an odd function from −1 to 1 is 0. You can always check if your answer is correct by verifying that the vectors in B are orthogonal, although on a test you usually won't have time to do this.

Step 3: Normalize each vector in B. We have

  ‖1‖² = 2,  ‖x‖² = 2/3,  ‖x² − 1/3‖² = ∫ from −1 to 1 of (x⁴ − (2/3)x² + 1/9) dx = 2/5 − 4/9 + 2/9 = 8/45.

So our orthonormal basis is { 1/√2, √(3/2)·x, √(45/8)·(x² − 1/3) }.

You HAVE to be familiar with the process, since on an exam you'll need as much time as you can get to perform the calculations.

Suppose W is a k-dimensional subspace of an n-dimensional inner product space V.

Defn: A vector v is orthogonal to W if v is orthogonal to every vector in W.

Defn: The subspace of all v ∈ V orthogonal to W is called the orthogonal complement of W in V, written W⊥.

Before we do an example, let's make one thing clear: to show that v ∈ W⊥, it suffices to show that ⟨v, wi⟩ = 0 for i = 1, ..., k, where {w1, ..., wk} is a basis for W. Why? If w ∈ W, then w = Σ ai·wi for some scalars ai, and so

  ⟨v, w⟩ = ⟨v, Σ ai·wi⟩ = Σ ai·⟨v, wi⟩ = 0.

In English: if v is orthogonal to each element in a BASIS of W, then it is orthogonal to each element in W.

Ex: Given W = span{(0, 0, 1)} in R³, find W⊥. Begin by letting v = (v1, v2, v3) be any element of W⊥. We must have ⟨v, w⟩ = 0 for all w ∈ W, but as we showed above, it suffices to show that ⟨v, (0, 0, 1)⟩ = 0.
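The polynomial example can be checked numerically: the inner product ⟨p, q⟩ = ∫ from −1 to 1 of p(x)q(x) dx is computable exactly from coefficient arrays using numpy's polynomial helpers. A sketch (the helper name `inner` is mine):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx.

    p and q are coefficient arrays, constant term first."""
    antideriv = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, antideriv) - P.polyval(-1.0, antideriv)

# Orthogonal basis from the example: 1, x, x^2 - 1/3
w1 = np.array([1.0])
w2 = np.array([0.0, 1.0])
w3 = np.array([-1.0 / 3.0, 0.0, 1.0])

# Pairwise orthogonal, with the squared norms computed in the example.
assert abs(inner(w1, w2)) < 1e-12
assert abs(inner(w1, w3)) < 1e-12
assert abs(inner(w2, w3)) < 1e-12
assert abs(inner(w1, w1) - 2.0) < 1e-12
assert abs(inner(w2, w2) - 2.0 / 3.0) < 1e-12
assert abs(inner(w3, w3) - 8.0 / 45.0) < 1e-12
```

This is the same "verify orthogonality" check suggested in the note above, just automated.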
In R³, we'll use the dot product as the inner product. So we get that v · (0, 0, 1) = v3 = 0. Since there are no other conditions, we conclude that all elements of W⊥ have the form (v1, v2, 0), and so W⊥ = span{(1, 0, 0), (0, 1, 0)}, which is the xy-plane.

Lemma: Suppose W is a subspace of an n-dimensional inner product space V. Then:
1. W ∩ W⊥ = {0}
2. dim W + dim W⊥ = n
3. If {e1, ..., ek} is an orthonormal basis of W and {e(k+1), ..., en} is an orthonormal basis of W⊥, then {e1, ..., en} is an orthonormal basis of V.
This is Theorem 2 in Section 7-1.

Recall that for an orthonormal basis {e1, ..., en} of an n-dimensional inner product space V, we can write any v ∈ V as v = ⟨v, e1⟩e1 + ... + ⟨v, en⟩en.

Suppose W is a subspace of an n-dimensional inner product space V, with orthonormal basis {e1, ..., ek}.

Defn: The projection of v onto W is defined by

  proj_W(v) = Σ_{i=1}^{k} ⟨v, ei⟩ ei.

When given an orthonormal basis, you can use the above definition. A more general definition is given on page 36 of your textbook, but that definition is the projection of a vector on another vector. The definition here is the projection of a vector onto a SUBSPACE.

Defn: The projection of v perpendicular to W is defined by

  perp_W(v) = v − proj_W(v).

Lemma: If {e1, ..., en} is an orthonormal basis of V with {e1, ..., ek} an orthonormal basis of W, then
1. perp_W(v) is orthogonal to every vector in W, and so perp_W(v) ∈ W⊥.
2. perp_W(v) = proj_{W⊥}(v).

pf: 1. For any j = 1, ..., k,

  ⟨perp_W(v), ej⟩ = ⟨v, ej⟩ − Σ_{i=1}^{k} ⟨v, ei⟩⟨ei, ej⟩.

But we know that ⟨ei, ej⟩ = 0 for i ≠ j since {e1, ..., en} is an orthonormal basis. Then the only non-zero term in the above summation is the i = j term, which equals ⟨v, ej⟩ since ‖ej‖ = 1. Thus ⟨perp_W(v), ej⟩ = ⟨v, ej⟩ − ⟨v, ej⟩ = 0, as required. Since perp_W(v) is orthogonal to each element of a basis of W, it is orthogonal to every vector in W.
2. Writing v = Σ_{i=1}^{n} ⟨v, ei⟩ ei, we get perp_W(v) = v − Σ_{i=1}^{k} ⟨v, ei⟩ ei = Σ_{i=k+1}^{n} ⟨v, ei⟩ ei = proj_{W⊥}(v), since {e(k+1), ..., en} is an orthonormal basis of W⊥.

Thm: Suppose W is a subspace of an n-dimensional inner product space V, and let v ∈ V.
1. The only way to write v as the sum of a vector in W and a vector in W⊥ is v = proj_W(v) + perp_W(v).
2. proj_W(v) is the unique vector in W that is closest to v. That is, proj_W(v) minimizes the distance ‖v − w‖ over all w ∈ W.

DIRECT SUMS

Defn: If U and W are subspaces of a vector space V, then the direct sum of U and W is defined as

  U ⊕ W = {u + w : u ∈ U, w ∈ W}.

Thm: U ⊕ W is a subspace of V.

Ex: Let U be the xy-plane and W be the yz-plane in R³. Then U ⊕ W = R³ because for any vector (a, b, c) ∈ R³, we have

  (a, b, c) = (a, b, 0) + (0, 0, c), where (a, b, 0) ∈ U and (0, 0, c) ∈ W.

Note that the representation of (a, b, c) as a sum of vectors in U and W is NOT unique, because we could've also written

  (a, b, c) = (a, 0, 0) + (0, b, c), with (a, 0, 0) ∈ U and (0, b, c) ∈ W.
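With an orthonormal basis of W in hand, proj_W and perp_W are one-liners. A small numpy sketch, using the xy-plane in R³ as W (the vectors here are my own example):

```python
import numpy as np

def proj(v, onb):
    """proj_W(v) = sum_i <v, e_i> e_i, for an orthonormal basis of W."""
    return sum(np.dot(v, e) * e for e in onb)

# W = xy-plane in R^3, with orthonormal basis {(1,0,0), (0,1,0)}.
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
v = np.array([3.0, -2.0, 7.0])

p = proj(v, [e1, e2])   # the closest vector to v inside W
perp = v - p            # perp_W(v), which lies in W-perp

assert np.allclose(p, [3.0, -2.0, 0.0])
assert np.allclose(perp, [0.0, 0.0, 7.0])
# perp_W(v) is orthogonal to every basis vector of W:
assert np.dot(perp, e1) == 0 and np.dot(perp, e2) == 0
```

The two assertions at the end are the two parts of the lemma above, checked on one concrete vector.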
Thm: Suppose V is a finite-dimensional inner product space with subspace W. Then V = W ⊕ W⊥.

Note that here the representation IS unique: v = proj_W(v) + perp_W(v) is the only way to write v as a sum of a vector in W and a vector in W⊥.

FUNDAMENTAL SUBSPACES

Let A be an m × n matrix. Then the 4 fundamental subspaces of A are:
1. The columnspace (or rangespace) Col(A) = {Ax : x ∈ Rⁿ}
2. The nullspace Null(A) = {x ∈ Rⁿ : Ax = 0}
3. The rowspace Row(A) = Col(Aᵀ)
4. The left nullspace Null(Aᵀ) = {y ∈ Rᵐ : Aᵀy = 0}

Ex: Find the fundamental subspaces of the 4 × 5 matrix A from the first review package. We can find its RREF as

  R = [ 1  0  1  0  1 ]
      [ 0  1 −2  0  3 ]
      [ 0  0  0  1 −5 ]
      [ 0  0  0  0  0 ]

Using previous theorems, we get (details in the first review package):

Basis for the rowspace of A: {(1, 0, 1, 0, 1), (0, 1, −2, 0, 3), (0, 0, 0, 1, −5)}.

Basis for the columnspace of A: the leading 1s of R are in columns 1, 2, and 4, so columns 1, 2, and 4 of A itself form a basis for the columnspace of A.

To find a basis for the nullspace of A, we are finding a basis for the set of all solutions to Ax = 0. From previous courses we know this involves row reducing A to its RREF R and solving the equivalent relation Rx = 0. We have that any x = (x1, x2, x3, x4, x5) satisfying Ax = 0 only needs to satisfy Rx = 0, or the following system of equations:

  x1 + x3 + x5 = 0
  x2 − 2x3 + 3x5 = 0
  x4 − 5x5 = 0

Since x1, x2, and x4 are pivot variables (their coefficients are leading 1s), x3 and x5 are free variables, and so we write the pivot variables in terms of the free variables:

  x1 = −x3 − x5,  x2 = 2x3 − 3x5,  x4 = 5x5.

Thus the solution to Ax = 0 is of the form

  x = x3 (−1, 2, 1, 0, 0) + x5 (−1, −3, 0, 5, 1),

where x3 and x5 can be anything (hence "free" variables). Thus a basis for the nullspace of A is {(−1, 2, 1, 0, 0), (−1, −3, 0, 5, 1)}.

To find a basis for the left nullspace of A, we are finding a basis for the set of all solutions to Aᵀy = 0. We find the RREF of Aᵀ (note that this is NOT the same as Rᵀ) and solve the resulting system exactly as before. Since rank(Aᵀ) = rank(A) = 3 and Aᵀ has 4 columns, there is one free variable, and so the basis for the left nullspace consists of a single vector in R⁴.

Thm: THE FUNDAMENTAL THEOREM OF LINEAR ALGEBRA
Let A be an m × n matrix. Then (Row A)⊥ = Null(A) and (Col A)⊥ = Null(Aᵀ). Therefore Rⁿ = Row(A) ⊕ Null(A) and Rᵐ = Col(A) ⊕ Null(Aᵀ).
You might get asked to state this theorem, or to verify it using an example.

LEAST SQUARES APPROXIMATION

Suppose you have some data points (x1, y1), ..., (xn, yn) that you've collected through an experiment.
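Before fitting any models, here is a quick numerical check of the fundamental subspaces and the Fundamental Theorem. The nullspace can be read off the SVD, and every row of the matrix must be orthogonal to it. This is a sketch; I use the RREF matrix above as the test matrix, since a matrix and its RREF have the same nullspace:

```python
import numpy as np

def nullspace(A, tol=1e-10):
    """Orthonormal basis for Null(A): the right-singular vectors
    whose singular values are (numerically) zero."""
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T          # columns span Null(A)

A = np.array([[1., 0.,  1., 0.,  1.],
              [0., 1., -2., 0.,  3.],
              [0., 0.,  0., 1., -5.],
              [0., 0.,  0., 0.,  0.]])
N = nullspace(A)

# Fundamental Theorem: Null(A) is the orthogonal complement of Row(A),
# so every row of A is orthogonal to every nullspace basis vector.
assert np.allclose(A @ N, 0)
# Dimension count: rank + nullity = number of columns (here 3 + 2 = 5).
assert np.linalg.matrix_rank(A) + N.shape[1] == A.shape[1]
```

The SVD-based basis is orthonormal rather than the "free variable" basis found by hand, but it spans the same nullspace.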
You'd like to fit a model to these data points: linear, quadratic, logarithmic, etc. What is the best fit? The figure below shows five data points and a fitted line; in statistics, this is a linear regression. For example, x could be height, and x1 to x5 are the recorded heights of five people. Each x has an associated y. For example, y1 to y5 could be the recorded weights of the same five people.

Suppose we fit the red line ŷ = a + bx. This means we are trying to use height (x) to predict weight (y). Each person has a true weight, y1 to y5, but also has an estimated (fitted) weight, ŷ1 to ŷ5. That is, using our fitted model, a person with height xi is predicted to have weight ŷi = a + b·xi, which can be different from that person's true weight yi. As you see in the figure, there exist differences between the true values and the fitted values: the dark blue vertical lines represent those differences.

Least squares approximation fits a model that minimizes the squared distance between the true (yi) and fitted (ŷi) values. We square the distances to avoid positive and negative differences cancelling each other out. So in our above model, least squares involves minimizing Σ_{i=1}^{5} (yi − ŷi)². In the general linear case, least squares involves minimizing Σ_{i=1}^{n} (yi − (a + b·xi))², where you have n data points. If we define

  y = (y1, ..., yn)ᵀ,  β = (a, b)ᵀ,  X = [ 1  x1 ]
                                          [ 1  x2 ]
                                          [ ⋮   ⋮ ]
                                          [ 1  xn ],

then we can rewrite the sum in matrix form as ‖y − Xβ‖². At the end of the day, we are looking for the β that minimizes ‖y − Xβ‖². The quantity ‖y − Xβ‖² is called the least squares error.

The above example assumed we wanted to fit a linear model. If we wanted to fit a quadratic ŷ = a + bx + cx², then the error would be Σ_{i=1}^{n} (yi − (a + b·xi + c·xi²))², with

  β = (a, b, c)ᵀ  and  X = [ 1  x1  x1² ]
                           [ ⋮   ⋮   ⋮  ]
                           [ 1  xn  xn² ].

Note: when constructing our matrix X, each column corresponds to a parameter we are trying to estimate. For example, in the quadratic case above, the first column of X contains all 1s, which corresponds to the intercept term a.
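In numpy, building the design matrix and minimizing ‖y − Xβ‖² takes a few lines. The data here is hypothetical, not the package's example; np.linalg.lstsq performs the least squares minimization directly:

```python
import numpy as np

# Hypothetical data points.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-7.5, -1.5, 1.0, 9.0])

# Design matrix for yhat = a + b*x: one column per parameter,
# with the column of 1s corresponding to the intercept a.
X = np.column_stack([np.ones_like(x), x])

# beta = (a, b) minimizing ||y - X beta||^2.
beta, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)

# The minimizer satisfies the normal equations X^T X beta = X^T y
# (this is the theorem proved below).
assert np.allclose(X.T @ X @ beta, X.T @ y)

err = np.sum((y - X @ beta) ** 2)   # the least squares error
```

Swapping `X` for the quadratic design matrix `np.column_stack([np.ones_like(x), x, x**2])` fits the quadratic model with no other changes.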
If we were trying to fit our data to a quadratic with an intercept term of 0 (that is, ŷ = bx + cx²), then we would simply drop the column of 1s: β = (b, c)ᵀ and the rows of X are (xi, xi²).

Thm: The β that minimizes ‖y − Xβ‖² is the solution to XᵀXβ = Xᵀy.

pf: We are trying to minimize ‖y − Xβ‖² = (y − Xβ)ᵀ(y − Xβ). We do so by taking its derivative with respect to β:

  d/dβ (y − Xβ)ᵀ(y − Xβ) = −2Xᵀ(y − Xβ).

Note that usually (by the Chain Rule) we'd expect a factor of X here, but due to the rules of matrix differentiation, we must use the transpose Xᵀ. Setting this derivative equal to zero, we get Xᵀ(y − Xβ) = 0, and so XᵀXβ = Xᵀy.

Ex: Suppose you are given four data points and asked to fit them to a given model. Setting up the vector y and the matrix X as above, the solution is the β that solves XᵀXβ = Xᵀy (messy matrix calculations omitted), and plugging β back into the model gives the fitted model.

Note that the data I picked is intended to be linear (although with four data points, it's hard to tell), and if you graph the fitted model, you will see that it is roughly linear from −2 to 2. The actual least squares error is 4. Note that the size of the error itself is only useful when compared to something else. If we tried fitting the data to a linear model, we would (surprisingly) get a larger least squares error. Does that mean our original model is a better fit? All we can say is that on the interval from −2 to 2, the original model is a better fit than a linear model. It's helpful to know ahead of time what kind of model we are trying to fit, but in reality that is usually not the case. However, on a test, the question will ask you to fit the data to a specific model. On an exam, you will probably not get something as complicated as this, especially if calculators are not allowed.

DIAGONALIZATION OF SYMMETRIC MATRICES

Before we begin diagonalization, let's do a quick review of eigenvectors and eigenvalues (real ones only for now).

Defn: Let A be an n × n matrix with entries in R. A NON-ZERO vector v such that Av = λv for some scalar λ is called an eigenvector of A, and λ is called an eigenvalue of A. Note that λ can be zero but v cannot.

The roots of the characteristic polynomial det(A − λI) are the eigenvalues of A.

Defn: The eigenspace of A (for a given λ) consists of the zero vector and all the eigenvectors of A corresponding to λ. We denote it E_λ, and note that E_λ is just the nullspace of the matrix A − λI.

Defn: Let A and B be n × n matrices. We say A is similar to B if there exists an invertible matrix P such that B = P⁻¹AP. Similar matrices have the same determinant and trace.

Defn: A diagonal matrix is a square matrix whose non-diagonal entries are zero.
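These definitions map directly onto numpy: np.linalg.eig returns eigenvalue/eigenvector pairs, and an eigenspace is the nullspace of A − λI. A quick sketch with a 2 × 2 matrix of my own choosing:

```python
import numpy as np

# A small example matrix (not from the package).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose COLUMNS
# are the corresponding eigenvectors.
evals, evecs = np.linalg.eig(A)

# Each pair satisfies the definition A v = lambda v.
for lam, v in zip(evals, evecs.T):
    assert np.allclose(A @ v, lam * v)

# The eigenvalues are the roots of det(A - tI) = t^2 - 7t + 10 = (t-2)(t-5).
assert np.allclose(sorted(evals), [2.0, 5.0])

# The eigenspace E_2 is the nullspace of A - 2I = [[2, 1], [2, 1]],
# which contains (1, -2).
assert np.allclose((A - 2.0 * np.eye(2)) @ np.array([1.0, -2.0]), 0.0)
```

Note that eig makes no promise about the scaling or order of the eigenvectors, which matches the remark below that P and D are not unique.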
Diagonal matrices are easy to work with.

Defn: A square matrix is diagonalizable if it is similar to a diagonal matrix. In other words, A is diagonalizable if there exists an invertible matrix P such that P⁻¹AP = D for some diagonal matrix D.

By writing A this way, we may have an easier time with certain calculations involving A. For example: A² = (PDP⁻¹)(PDP⁻¹) = PD²P⁻¹, and in general Aᵏ = PDᵏP⁻¹, where Dᵏ is found by just taking powers of the diagonal entries.

Not all matrices are diagonalizable, as we'll see below.

Thm: The Diagonalization Theorem
Let A be an n × n matrix. Then:
1. A is diagonalizable if and only if A has n linearly independent eigenvectors.
2. P⁻¹AP = D for a diagonal matrix D if and only if the columns of P are n linearly independent eigenvectors of A.
3. If P⁻¹AP = D for a diagonal matrix D, then the diagonal entries of D are the eigenvalues of A, and the order of the eigenvectors in P corresponds to the order of the eigenvalues in D.

Ex: Is

  A = [ 8  0  0  0 ]
      [ 0  3  1  1 ]
      [ 0  0  8  0 ]
      [ 0  0  0  2 ]

diagonalizable? If so, find P and D.

The characteristic equation is

  det(A − λI) = (8 − λ)(3 − λ)(8 − λ)(2 − λ) = 0.

So the eigenvalues are 8, 3, and 2. We say that the eigenvalue 8 has multiplicity 2 since it is a root repeated two times. To see if A is diagonalizable, we need to check if A has four linearly independent eigenvectors. We do this by finding the eigenspaces for each eigenvalue.

For λ = 8, we find the nullspace of

  A − 8I = [ 0  0  0  0 ]      [ 0  1  −1/5  0 ]
           [ 0 −5  1  1 ]  ~   [ 0  0   0   1 ]
           [ 0  0  0  0 ]      [ 0  0   0   0 ]
           [ 0  0  0 −6 ]      [ 0  0   0   0 ].

So we get that vectors in the nullspace must satisfy x2 = (1/5)x3 and x4 = 0, with x1 and x3 free, meaning that the general solution to (A − 8I)x = 0 is

  x = x1 (1, 0, 0, 0) + x3 (0, 1/5, 1, 0).

Scaling the second vector, E_8 = span{(1, 0, 0, 0), (0, 1, 5, 0)}.

Similarly, E_3 = span{(0, 1, 0, 0)} and E_2 = span{(0, −1, 0, 1)}.
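We can check this example numerically: with the eigenvectors just found as the columns of P (in the order 8, 8, 3, 2), P⁻¹AP comes out diagonal. A numpy sketch:

```python
import numpy as np

A = np.array([[8., 0., 0., 0.],
              [0., 3., 1., 1.],
              [0., 0., 8., 0.],
              [0., 0., 0., 2.]])

# Columns of P are the eigenvectors, ordered to match diag(8, 8, 3, 2).
P = np.array([[1., 0., 0.,  0.],
              [0., 1., 1., -1.],
              [0., 5., 0.,  0.],
              [0., 0., 0.,  1.]])
D = np.diag([8., 8., 3., 2.])

assert np.allclose(np.linalg.inv(P) @ A @ P, D)
# Equivalently, A P = P D, which avoids computing the inverse.
assert np.allclose(A @ P, P @ D)
```

Note the AP = PD check mirrors the hand-verification suggested below: it confirms each column of P is an eigenvector without ever finding P⁻¹.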
There is a theorem that says eigenvectors corresponding to different eigenvalues are linearly independent, so we see that we are able to find four linearly independent eigenvectors, namely

  (1, 0, 0, 0), (0, 1, 5, 0), (0, 1, 0, 0), (0, −1, 0, 1),

but we could've used scalar multiples of these as well. Note that by definition {(1, 0, 0, 0), (0, 1, 5, 0)} is a basis for E_8. We can construct P and D as

  P = [ 1  0  0   0 ]      D = [ 8  0  0  0 ]
      [ 0  1  1  −1 ]          [ 0  8  0  0 ]
      [ 0  5  0   0 ]          [ 0  0  3  0 ]
      [ 0  0  0   1 ],         [ 0  0  0  2 ],

or more simply D = diag(8, 8, 3, 2). You can check that AP = PD (this way you don't need to find P⁻¹). A final note: P and D are not unique, since we could have used any ordering. For example, we could have also used D = diag(2, 8, 3, 8), with the columns of P reordered to match: (0, −1, 0, 1), (1, 0, 0, 0), (0, 1, 0, 0), (0, 1, 5, 0). Make sure that the order of the eigenvectors in P corresponds to the order of the eigenvalues in D. We also quickly see why part 1 of the Diagonalization Theorem is true: you need the n linearly independent eigenvectors to construct an invertible P.

Defn: A square matrix A is symmetric if Aᵀ = A.

Defn: A square matrix A is orthogonally similar to B if B = PᵀAP for some orthogonal matrix P. Note that Pᵀ = P⁻¹ for orthogonal matrices, so this is just the definition for "A similar to B" with P orthogonal.

Defn: A square matrix A is orthogonally diagonalizable if PᵀAP = D for some orthogonal matrix P and diagonal matrix D. Once again, this is just the definition of "diagonalizable" with P orthogonal.

In this course we show that every symmetric matrix is orthogonally diagonalizable. The process for orthogonally diagonalizing a matrix is very similar to the process for diagonalizing a matrix.

Ex: Is

  A = [ 8  0  0  0 ]
      [ 0  3  1  1 ]
      [ 0  0  8  0 ]
      [ 0  0  0  2 ]

orthogonally diagonalizable? If so, find P and D.

This is the same matrix I showed above, and we showed above that it is diagonalizable. However, I'll tell you now that this matrix is not orthogonally diagonalizable. Note that this matrix is not symmetric. We'll continue to do this question as if we thought it was orthogonally diagonalizable. We showed that we can find four linearly independent eigenvectors (1, 0, 0, 0), (0, 1, 5, 0), (0, 1, 0, 0), (0, −1, 0, 1). When we diagonalize, we can go ahead and use these four vectors as the columns of P.
But to orthogonally diagonalize, since we require an orthogonal matrix P, we need these four vectors to be orthonormal (definition of orthogonal matrix) but to still have the same span. We use Gram-Schmidt to find four orthonormal vectors. As usual, we first find four orthogonal vectors w1, w2, w3, w4 and then normalize them. Using Gram-Schmidt with v1 = (1, 0, 0, 0), v2 = (0, 1, 5, 0), v3 = (0, 1, 0, 0), v4 = (0, −1, 0, 1):

  w1 = v1 = (1, 0, 0, 0)

  w2 = v2 − (⟨v2, w1⟩/⟨w1, w1⟩) w1 = (0, 1, 5, 0) − 0·w1 = (0, 1, 5, 0)

  w3 = v3 − (⟨v3, w1⟩/⟨w1, w1⟩) w1 − (⟨v3, w2⟩/⟨w2, w2⟩) w2 = (0, 1, 0, 0) − 0·w1 − (1/26)(0, 1, 5, 0) = (0, 25/26, −5/26, 0)

  w4 = v4 − 0·w1 − (−1/26)(0, 1, 5, 0) − (−1)(0, 25/26, −5/26, 0) = (0, −1, 0, 1) + (0, 1/26, 5/26, 0) + (0, 25/26, −5/26, 0) = (0, 0, 0, 1)

So our four orthogonal vectors are (1, 0, 0, 0), (0, 1, 5, 0), (0, 25/26, −5/26, 0), (0, 0, 0, 1), which correspond to the eigenvalues 8, 8, 3, 2 (note that the order matters here). Normalizing these vectors gives

  (1, 0, 0, 0), (0, 1/√26, 5/√26, 0), (0, 5/√26, −1/√26, 0), (0, 0, 0, 1),

and it's pretty easy to see that this is an orthonormal set of vectors. To summarize what we did, we used Gram-Schmidt to find an orthonormal set which has the same span as our original set of eigenvectors. We would then finish up this diagonalization exercise like we normally do. That is, we would construct P and D as

  P = [ 1    0       0     0 ]      D = [ 8  0  0  0 ]
      [ 0  1/√26   5/√26   0 ]          [ 0  8  0  0 ]
      [ 0  5/√26  −1/√26   0 ]          [ 0  0  3  0 ]
      [ 0    0       0     1 ],         [ 0  0  0  2 ].

BUT NOTE THAT PᵀAP ≠ D; in fact, PᵀAP is not even diagonal, because after Gram-Schmidt the third vector is no longer an eigenvector of A. Using our methods, this matrix is NOT orthogonally diagonalizable.

Note: to save work, we could have observed that (1, 0, 0, 0) is already orthogonal to each of the other three eigenvectors, and used Gram-Schmidt on just {(0, 1, 0, 0), (0, −1, 0, 1), (0, 1, 5, 0)}. This gives (after scaling) the orthogonal set {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 0, 1), (0, 0, 1, 0)}. Note that by coincidence, this set is also orthonormal. In that case we would've had P equal to a permutation of the identity matrix, and we would've seen immediately that A is not orthogonally diagonalizable, since PᵀAP is not diagonal.

When you are asked to orthogonally diagonalize something on an exam, you'll most likely be given a symmetric matrix, which is orthogonally diagonalizable. The following lemmas and theorems show that all symmetric matrices are orthogonally diagonalizable.

Lemma: A is symmetric if and only if ⟨Ax, y⟩ = ⟨x, Ay⟩ for all x, y.

Thm: Suppose A is symmetric. If v1 and v2 are eigenvectors of A corresponding to different eigenvalues λ1 and λ2, then v1 and v2 are orthogonal.

Thm: Suppose a real matrix A is symmetric. Then A has at least one eigenvalue, and all its eigenvalues are real.

Theorems 3 and 4 of Section 8-1 finish up this section, with Theorem 4 concluding that all symmetric matrices are orthogonally diagonalizable.

Thm: If A is a real matrix which is orthogonally diagonalizable, then A is symmetric.

This is the converse to Theorem 4 in Section 8-1. It shows that ONLY symmetric matrices are orthogonally diagonalizable. This equivalence is formalized as the Principal Axis Theorem. The proofs of all the theorems discussed so far are either in the textbook or in your notes. It's very important to be familiar with all of these theorems to best prepare for proof-type questions.

TRIANGULARIZATION

We showed earlier that not all matrices are diagonalizable. The second best situation is triangularization.
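In numpy, np.linalg.eigh orthogonally diagonalizes a symmetric matrix directly, which makes the Principal Axis Theorem easy to see in action (the matrix S below is my own example):

```python
import numpy as np

# Any symmetric matrix is orthogonally diagonalizable.
S = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])

# For a symmetric input, np.linalg.eigh returns real eigenvalues and
# an ORTHOGONAL matrix whose columns are the eigenvectors.
evals, P = np.linalg.eigh(S)

assert np.allclose(P.T @ P, np.eye(3))           # P is orthogonal
assert np.allclose(P.T @ S @ P, np.diag(evals))  # P^T S P is diagonal
```

Contrast this with the non-symmetric example above, where no orthogonal P can make PᵀAP diagonal.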
Thm: If an n × n matrix A has all real eigenvalues, then there is an orthogonal matrix P and an upper triangular matrix T such that PᵀAP = T.

The proof of the above theorem is important to read over, since it outlines the process for triangularizing a matrix. Recall that triangular matrices have zero entries either above or below the diagonal: upper triangular matrices have zero entries below the diagonal, and lower triangular matrices have zero entries above the diagonal.

Note that the entire time for diagonalizing, I've been writing P⁻¹AP = D or PᵀAP = D, which is equivalent to A = PDP⁻¹. It doesn't really matter how you write it, but for triangularization, it's easiest to write it like PᵀAP = T.

All triangularization questions will require the process demonstrated in the following exercise. Make sure you know how to follow this process. Given time constraints, it is doubtful that you will have to triangularize any matrix larger than 3 × 3.

Ex: Triangularize the 3 × 3 matrix from class. This question should be in some of your notes, but given the limited variation in questions, I am repeating it just to outline the process. The original solution started with a different eigenvalue; 4 is also an eigenvalue, so we will use λ = 4 for this example, just to be different. You can find the eigenspace for λ = 4 as E_4 = span{(1, 1, 1)}, so an eigenvector we can use is (1, 1, 1). We have to extend this vector to an orthonormal basis for R³.
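The "extend an eigenvector to an orthonormal basis" step can be sketched numerically: QR-factor a matrix whose first column is the (normalized) eigenvector, and the first column of PᵀAP then becomes (λ, 0, 0)ᵀ; recursing on the lower 2 × 2 block finishes the triangularization. The matrix below is a stand-in with eigenvalue 4 and eigenvector (1, 1, 1), not the matrix from the notes:

```python
import numpy as np

# Stand-in matrix with all real eigenvalues (4, 2, 2); every row sums
# to 4, so (1, 1, 1) is an eigenvector with eigenvalue 4.
A = np.array([[2., 1., 1.],
              [0., 3., 1.],
              [1., 0., 3.]])
v = np.array([1., 1., 1.]) / np.sqrt(3.0)
assert np.allclose(A @ v, 4.0 * v)

# Extend v to an orthonormal basis of R^3: QR-factor a matrix whose
# first column is v; the columns of Q are orthonormal and the first
# column is +/- v.
M = np.column_stack([v, [0., 1., 0.], [0., 0., 1.]])
Q, _ = np.linalg.qr(M)

# Because column 1 of Q is an eigenvector, Q^T A Q has the eigenvalue
# in the top-left corner and zeros below it; triangularizing the
# lower-right 2x2 block would complete the process.
T = Q.T @ A @ Q
assert np.isclose(T[0, 0], 4.0)
assert np.allclose(T[1:, 0], 0.0)
```

The assertions only check the first column, which is exactly what one step of the process guarantees; repeating on the trailing block is what the proof's induction does.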