The Spectral Theorem, part 1: Complex Version.
July 29, 2010
(Note) So, the general spectral theorem is pretty sweet, but (as Sheldon Axler does in Linear Algebra Done Right, the book that I’m essentially following in this blog) I’m going to split it up into two parts. In “real” math, I suppose we should consider two cases: when the field is algebraically closed and when it is not. The algebraically closed case is going to be nearly identical to the complex case. But because we don’t know “how far” from algebraically closed the other field is, I’m not entirely certain that the “not algebraically closed” case follows from the real case of the theorem. For example, if we were to use the rationals in place of the reals, we would most likely be able to produce examples which did not follow the real version of the spectral theorem. Either way, we will mostly be using this “in real life” in the case that the field is either the reals or the complexes. Thus, I do not feel too bad for not proving this in its full generality.
So, let’s wonder something for a second: why have I been proving all these random things? What the hell were we looking for again?
Oh, right, we wanted to have a space $V$ be partitioned into a bunch of little one-dimensional invariant subspace things so that we have

$$V = U_{1} \oplus U_{2} \oplus \cdots \oplus U_{n}$$

for some linear map $T: V \to V$, where each $U_{i}$ is invariant under $T$. It’s cute, because it looks like what happens to the elements! But, how the heck do we get one-dimensional invariant subspaces again? Oh, right, having $U$ be some one-dimensional invariant subspace is the same thing as saying

$$U = \{\lambda u : \lambda \in \mathbb{F}\}$$

for some $u \in V$, where $\mathbb{F}$ is the underlying field. Or, in less math-y terms, this means that there’s some $u$ such that every time we apply $T$ to some element of $U$, we get some multiple of $u$. In other, slightly more sophisticated, words, “$u$ generates $U$”. But then, obviously, we have $Tu = \lambda u$, for some $\lambda \in \mathbb{F}$, and so, in fact, $u$ is an eigenvector of $T$! Okay, nice review, huh?
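If you want to see the review in action, here is a tiny numerical instance (a numpy sketch; the matrix and names are mine, not from the post): for $T = \operatorname{diag}(2, 3)$ on $\mathbb{C}^{2}$, the span of $u = (1, 0)$ is a one-dimensional invariant subspace, and $u$ is an eigenvector.

```python
import numpy as np

# Illustrative example (not from the post): T = diag(2, 3) acting on C^2.
# span{u} with u = (1, 0) is a one-dimensional invariant subspace, and
# u is an eigenvector of T with eigenvalue 2.
T = np.diag([2.0, 3.0])
u = np.array([1.0, 0.0])

print(np.allclose(T @ u, 2.0 * u))  # True: Tu = 2u
```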
One New Lemma Before We Begin.
Okay, during the proof of the spectral theorem, I realized that we needed something that I didn’t actually prove. It’s not difficult, so I’ll do it now. It should really seem reasonable: it’s the fact that $\|Tv\| = \|T^{*}v\|$ for all $v \in V$ if and only if $T$ is normal. It kind of seems like a reasonable statement if we think about what normalcy really “means.” But let’s not get all “deep” here, and let’s just prove it:

Lemma: $T$ is normal if and only if we have $\|Tv\| = \|T^{*}v\|$ for all $v \in V$.
Proof. This is not really a hard proof, but it has a clever step. We’re gonna prove this in one fell swoop, and this proof is directly out of Linear Algebra Done Right, simply because I can’t think of any nicer way to do it.

$$\begin{aligned} T \text{ is normal } &\Leftrightarrow T^{*}T - TT^{*} = 0 \\ &\Leftrightarrow \langle (T^{*}T - TT^{*})v, v \rangle = 0 \text{ for all } v \in V \\ &\Leftrightarrow \langle T^{*}Tv, v \rangle = \langle TT^{*}v, v \rangle \text{ for all } v \in V \\ &\Leftrightarrow \|Tv\|^{2} = \|T^{*}v\|^{2} \text{ for all } v \in V \end{aligned}$$

(the clever step is the second “$\Leftrightarrow$”: since $T^{*}T - TT^{*}$ is self-adjoint, it is $0$ exactly when $\langle (T^{*}T - TT^{*})v, v \rangle = 0$ for every $v$) which proves the theorem. $\Box$
Note that “$\Leftrightarrow$” means “if and only if,” so if you read this proof forwards and backwards, you prove both ways.
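If you like, the lemma can be sanity-checked numerically. The sketch below (numpy, with made-up names; not from Axler) builds a random normal matrix by conjugating a diagonal matrix by a unitary, then compares $\|Tv\|$ with $\|T^{*}v\|$ for a random $v$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Build a random normal matrix T = Q D Q* (Q unitary, D diagonal);
# conjugating a diagonal matrix by a unitary always yields a normal matrix.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
D = np.diag(rng.standard_normal(n) + 1j * rng.standard_normal(n))
T = Q @ D @ Q.conj().T

# The lemma: ||Tv|| = ||T*v|| for every v (here, one random v).
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
print(np.allclose(np.linalg.norm(T @ v), np.linalg.norm(T.conj().T @ v)))  # True
```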
The New Stuff.
Okay, so, now, if we want to totally decompose $V$ into one-dimensional invariant subspaces, we need some things. First, we’re going to say, as usual, that $V$ is nontrivial and finite-dimensional, and also that $T: V \to V$ is linear. So we want $\dim V$ eigenvectors (automatically linearly independent, as we proved before) to create a basis for $V$. Now, this basis is nice, but you know what would be better? If we scaled the eigenvectors such that they were all unit length. Well, we’d need an inner product in order to do that, but they’d be SO MUCH NICER, no? Then we’d have an orthonormal basis for $V$ made entirely out of eigenvectors. Holy crap, that’d be awesome, right? Yes, it would be. But when does that crap ever happen? What kind of space does $V$ need to be? What kind of map does $T$ have to be? Well, funny you should ask that…
Theorem (The Spectral Theorem for Complex Vector Spaces): Suppose that $V$ is a nontrivial finite-dimensional complex inner product space (a space which has an inner product), and $T: V \to V$ is a linear map. Then $V$ has an orthonormal basis consisting of eigenvectors of $T$ if and only if $T$ is normal.
Before we prove this, let’s remind ourselves what normal means. A linear map $T$ is normal if we have that $TT^{*} = T^{*}T$. Also, we’re going to need one particular fact: if we have a finite-dimensional complex vector space $V$ and a linear map $T: V \to V$, then there does exist an orthonormal basis $(e_{1}, \ldots, e_{n})$ of $V$ such that $\mathcal{M}(T)$, the matrix of $T$ with respect to this basis, is upper triangular. We will actually use this, and it’s not too difficult to prove. If you think about it, it’s just a nice application of the Gram-Schmidt process using some nice original set of vectors. It shouldn’t seem too unbelievable.
Proof. Okay. Let’s prove the easy part first ($\Rightarrow$ direction). Let’s suppose that $V$ has an orthonormal basis that’s all eigenvectors of $T$. Well, then, what’s $\mathcal{M}(T)$ with respect to this basis? It’s a diagonal matrix. Then what’s $\mathcal{M}(T^{*})$? It’s also a diagonal matrix, which is just the conjugate of $\mathcal{M}(T)$. Do these commute with each other? Yes they do, as any two diagonal matrices of the same size commute with each other, so $TT^{*} = T^{*}T$ and $T$ is normal. If you have not seen this proof, you should do it. It’s kind’a kickin’.
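As a quick numerical aside (numpy; the setup is mine, not the book’s), here is the “diagonal matrices commute” fact in action:

```python
import numpy as np

rng = np.random.default_rng(1)

# Any two diagonal matrices of the same size commute: entrywise,
# (DE)_{jj} = d_j e_j = e_j d_j, and all off-diagonal entries are zero.
D = np.diag(rng.standard_normal(5) + 1j * rng.standard_normal(5))
E = np.diag(rng.standard_normal(5) + 1j * rng.standard_normal(5))

print(np.allclose(D @ E, E @ D))  # True
```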
Okay, now, ($\Leftarrow$ direction). Suppose that $T$ is a normal linear map. We have, by that note before this proof, that there’s some orthonormal basis $(e_{1}, \ldots, e_{n})$ such that $\mathcal{M}(T)$ is upper triangular. In other words, given this basis, we have

$$\mathcal{M}(T) = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ & \ddots & \vdots \\ 0 & & a_{n,n} \end{pmatrix}$$

where that $0$ in the corner means “the lower left-hand corner has all $0$’s.” Now, the point is to show that this matrix is “really” a diagonal matrix. How are we going to do that? Well, well, well.
Okay, so, note that $Te_{1} = a_{1,1}e_{1}$, which implies that $\|Te_{1}\|^{2} = |a_{1,1}|^{2}$. Yes. Now, here’s a clever little thing: what does $T^{*}e_{1}$ look like?

$$T^{*}e_{1} = \overline{a_{1,1}}e_{1} + \overline{a_{1,2}}e_{2} + \cdots + \overline{a_{1,n}}e_{n}$$

So note that the coefficients of $T^{*}e_{1}$ are the conjugates of the first row of $\mathcal{M}(T)$, since $\mathcal{M}(T^{*})$ is the conjugate transpose of $\mathcal{M}(T)$. This means, in particular,

$$\|T^{*}e_{1}\|^{2} = |a_{1,1}|^{2} + |a_{1,2}|^{2} + \cdots + |a_{1,n}|^{2}$$
Yeah? BUT, WAIT A SECOND. $T$ is normal, and so by that lemma in the previous section, $\|Te_{1}\| = \|T^{*}e_{1}\|$! Then, we have that

$$|a_{1,1}|^{2} = |a_{1,1}|^{2} + |a_{1,2}|^{2} + \cdots + |a_{1,n}|^{2}$$

which means that everything besides $|a_{1,1}|^{2}$ is $0$! WHAT. Yes. Really. Check it. Getting rid of the $|a_{1,1}|^{2}$’s from both sides leaves us with

$$0 = |a_{1,2}|^{2} + \cdots + |a_{1,n}|^{2}$$
and because all of these values are non-negative, it follows that all of them are zero.
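Here’s the same computation done numerically (a numpy sketch with made-up names, not from the post): for an upper-triangular matrix $A$, applying $A$ to $e_{1}$ picks out the first column (only the $(1,1)$ entry survives), while applying $A^{*}$ to $e_{1}$ picks out the conjugate of the whole first row.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# For an upper-triangular A in the basis e_1, ..., e_n:
#   A e_1  is the first column of A, whose only non-zero entry is A[0, 0];
#   A* e_1 is the conjugate of the entire first row of A.
A = np.triu(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
e1 = np.zeros(n)
e1[0] = 1.0

print(np.allclose(np.linalg.norm(A @ e1), abs(A[0, 0])))                      # True
print(np.allclose(np.linalg.norm(A.conj().T @ e1), np.linalg.norm(A[0, :]))) # True
```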
Now, let’s do this for the rest of the $e_{j}$’s. In general (the entries above the diagonal in the earlier rows have already been killed off, so $Te_{j} = a_{j,j}e_{j}$), we have

$$\|Te_{j}\|^{2} = |a_{j,j}|^{2} = \|T^{*}e_{j}\|^{2} = |a_{j,j}|^{2} + |a_{j,j+1}|^{2} + \cdots + |a_{j,n}|^{2}$$

which means that only $a_{j,j}$ is potentially non-zero, and everything else in the $j$-th row is zero. This means that our matrix is

$$\mathcal{M}(T) = \begin{pmatrix} a_{1,1} & & 0 \\ & \ddots & \\ 0 & & a_{n,n} \end{pmatrix}$$
which means that $\mathcal{M}(T)$ is diagonal. Note, then, that the diagonal entries are eigenvalues (as $Te_{j} = a_{j,j}e_{j}$) with the associated eigenvectors $e_{j}$! This means that we actually have that our basis $(e_{1}, \ldots, e_{n})$ is an orthonormal basis made of eigenvectors. Which is what we wanted! $\Box$.
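To see the theorem “happen” numerically, here is a sketch (numpy; the setup and names are mine, not the post’s): build a normal matrix with distinct eigenvalues and check that the eigenvectors numpy returns form an orthonormal basis.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

# Build a normal T with distinct eigenvalues (a unitary conjugate of a
# diagonal matrix); illustrative setup, not from the post.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
d = np.arange(1, n + 1) + 1j * np.arange(n)  # distinct eigenvalues
T = Q @ np.diag(d) @ Q.conj().T

# For a normal matrix with distinct eigenvalues, eigenvectors for different
# eigenvalues are automatically orthogonal, and np.linalg.eig normalizes
# each one, so the eigenvector matrix comes out unitary.
eigvals, V = np.linalg.eig(T)

print(np.allclose(V.conj().T @ V, np.eye(n)))    # True: orthonormal eigenbasis
print(np.allclose(T @ V, V @ np.diag(eigvals)))  # True: columns are eigenvectors
```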
This tells us a lot. If we have that $T$ is normal, then we can say a lot about the way that $V$ can be decomposed. In fact, if $T$ is normal and $V$ is complex, then we can actually have one of the nicest decompositions of $V$!
Next time, we will plow right on through to the reals version of the spectral theorem. The proof (which I will essentially be paraphrasing from Axler, as usual) is actually significantly different from the proof of the complex case, and it requires a few theorems which will not seem to be at all related until we actually do the proof. The reason I’m doing the real version of the proof, and including all those little weird lemmas, is that it actually does tell us quite a bit about the character of real spaces, which are the kind of spaces that I feel we’re most familiar with.
The secret, which I’ll tell you now, is that for real spaces we essentially replace “normal” with “self-adjoint” and the same conclusion holds. This is kind of nice, but because self-adjoint is a much stronger condition (self-adjoint actually implies normal; why?), there are fewer maps that actually satisfy the hypotheses of the real spectral theorem. Sad.
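The “why?” has a one-line answer, which we can also sanity-check numerically (numpy sketch, names are mine): if $T = T^{*}$, then $TT^{*} = TT = T^{*}T$.

```python
import numpy as np

rng = np.random.default_rng(4)

# If S is self-adjoint (S = S*), then S S* = S S = S* S, so S is normal.
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
S = A + A.conj().T  # S is self-adjoint by construction

print(np.allclose(S, S.conj().T))                   # True: self-adjoint
print(np.allclose(S @ S.conj().T, S.conj().T @ S))  # True: hence normal
```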