Instructor (Stephen Boyd):Let me first answer your – there was a question about one of the homework exercises due today. That's Homework 5? Right. Okay. And what was the question?
Instructor (Stephen Boyd):10.2? I haven't memorized them, so you'll have to tell me what it is.
Instructor (Stephen Boyd):What did we ask [inaudible] – I'd have to see the context.
Instructor (Stephen Boyd):Oh. It's your choice, but what's it for. Is it a simple one?
Instructor (Stephen Boyd):Yeah. Okay. Do you know – I believe that to sketch a – not sketch, I guess to get a – there's a command. This could be completely wrong. I believe it's "quiver." Is that right? So some people are shaking their heads affirming that this function in MATLAB will actually plot – I guess quiver is for a group of arrows or something. Don't ask me. Apparently that's what it does. That would sketch it, but you're also absolutely welcome to sketch it by hand, just to draw – take a couple of points, see where they go. That's the important part. Okay. Any questions about last time? If not, we'll continue. We're looking at the idea of left and right eigenvectors. Our next example, very important, [inaudible] is a Markov chain. So in a Markov chain, the way we do it, the state is a vector. It's a column vector, and it actually is a probability distribution. It's the probability that you're in each of N states. So p of t is a column vector. Its entries are all positive or not negative, and they add up to one, and they represent a probability distribution. That probability distribution evolves in time by being multiplied by the state transition matrix, capital P. I warned you last time that if you see this material in – I guess, by the way – is that coming out? Is that straight there? My monitor has this kinda twisted at a five-degree angle or something like that. It's okay. So I warned you last time that if you take a course in statistics or in other areas that this is the transpose of what you'll see there. Their matrix P will be the transpose, and they actually don't propagate column vectors. They propagate row vectors. Yes? Yes, what?
Instructor (Stephen Boyd):It is tilted. Ah ha. Well, then. I wonder if you guys can rotate your overhead camera a little bit to make this straight. There we go. Thank you. Okay. So these matrices have the sum over columns equal to one, so that's what this is. Every column sum is one. And a matrix like that is called stochastic. Now you can rewrite that this way. If I put a vector in front of all ones – that's a row vector multiplied by P – I get this row vector here. This is the matrix way of saying that the column sums of P are all one. So this also, if you look at it – if you like, I could put a lambda in there and say lambda is one. This basically says that P – that the vector of all ones is a left eigenvector of P, associated with eigenvalue lambda equals one. It tells you in particular P has an eigenvalue of one. But if it has an eigenvalue of one, it also has a right eigenvector associated with lambda equals one. And that means there is some nonzero vector with PV equals V. That's what it means to have a right eigenvector with eigenvalue one, so PV is V. Some people would say that V is invariant under P. It's invariant when multiplied by P. Now it turns out that because the entries in P are nonnegative – this is not something we're gonna look at now – this eigenvector here can actually always be chosen so that its entries are nonnegative, and that means we can normalize it so that the sum of them is one. Now the interpretation then is, if you have PV equals V, the interpretation is just beautiful. It's an invariant distribution. It basically says if you're in the distribution given by V, that probability distribution, and you update one step in time, you will remain in the same distribution, so it's an invariant distribution. By the way, that obviously does not mean the state is invariant or anything like that. The state is jumping around. This describes the propagation of the probabilities of where the state is. The state is stochastically invariant.
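[Editor's note: a small NumPy sketch of the invariant-distribution idea above. The matrix P is made up for illustration; the lecture doesn't specify one.]

```python
import numpy as np

# A column-stochastic matrix (illustrative values): each column sums to 1,
# so p(t+1) = P p(t) propagates a probability distribution.
P = np.array([[0.90, 0.20, 0.10],
              [0.05, 0.70, 0.30],
              [0.05, 0.10, 0.60]])

# 1^T P = 1^T: the all-ones vector is a left eigenvector with eigenvalue 1.
assert np.allclose(np.ones(3) @ P, np.ones(3))

# The right eigenvector for eigenvalue 1 is the invariant distribution.
w, V = np.linalg.eig(P)
i = np.argmin(np.abs(w - 1.0))   # locate the eigenvalue closest to 1
v = np.real(V[:, i])
v = v / v.sum()                  # normalize so the entries sum to 1
assert np.all(v >= 0)            # a valid probability distribution
assert np.allclose(P @ v, v)     # P v = v: updating leaves it unchanged
```

Since every entry of this particular P is positive, the eigenvector for eigenvalue one can indeed be scaled to have nonnegative entries summing to one, exactly as claimed.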
That's what PV equals V means. Okay. Now we'll see soon in fact that in most cases, no matter how you start off the Markov chain – and this depends on some properties of P – you will in fact converge to this equilibrium distribution. So that's something we'll see actually very soon. Okay. So let's look at this idea of diagonalization. Let's suppose that you can actually choose a linearly independent set of eigenvectors for an N by N matrix A. So we're gonna call those V1 through VN. By the way, we're soon gonna see this is not always possible. It is not always possible to choose an independent set of eigenvectors of A, but assuming right now it is possible, you write this as A Vi equals lambda i Vi, and we'll express this set of matrix-vector multiplies as – we'll just concatenate everything and make it in matrix language. And it says basically AV – A times a matrix formed – that's a square matrix formed by just concatenating the columns – is equal to V, and to multiply each of these Vs, that is multiplication on the right by a diagonal matrix, so we have this equation. So you can write this this way: AT equals T lambda. So there you go: five ASCII characters which express the fact that the columns of T are eigenvectors associated with the eigenvalues lambda i. So a very, very snappy notation to write this. So actually, now I can explain this business. I said mysteriously earlier that you can write an eigenvector this way. And in fact, that's the way most people write it, but in fact, if you wanna sort of follow this, and make this work for T being a single column or something like that, you can actually write it as this, and that's – if you see somebody write that, it's for one of two reasons. Either they're just kind of like weird, or being perverse by writing the scalar on the right of the vector, not the left. Also of course, this requires – to interpret this requires loose parsing.
Or they're actually quite sophisticated, and they're simply writing down the scalar version of this eigenvector equation – I should call this the eigenvector equation. That's AT equals T lambda. Okay. Now T, because these are independent – T, which we haven't used yet – this matrix T is invertible because it's a bunch of – it's N independent vectors. That's nonsingular. It can be inverted. I multiply this equation on the left by T inverse, and I get T inverse AT is lambda. Okay? So that's the big – this you've seen before. That is a similarity transformation, and that's a diagonal matrix. It's the eigenvalues, and in fact some people would say the matrix A is diagonalized by a similarity transformation. And in fact, the diagonal matrix is the matrix whose diagonal entries are the eigenvalues, and the matrix that diagonalizes A is actually the eigenvector matrix, so it looks like that. Okay. Now this is just – it's really in some sense just a change of notation, but it's okay. Suppose you had any matrix T that diagonalized A by similarity. So T inverse AT is diagonal. Well, let's see. I'll call these diagonal ones – why don't I just call them lambda 1 through lambda N? And why don't I call the columns of V – of T – V1 through VN? If I do that, and I take this equation, I rewrite this as AT is T lambda. Now I examine these column by column, and column by column that says this. So basically, if you see an equation like T inverse AT equals lambda, or AT equals T lambda – if you see either of these equations, or ones like that – it means the same thing. They're just different ways of writing out in different matrix form what this means. Okay? So actually technically, these two make sense even if the Vi are not independent. But for this last one to make sense, obviously you have to have the Vi independent. Okay? So that's the idea. Okay.
So we'll say that a matrix is diagonalizable if there is a T for which T inverse AT is diagonal, and that's exactly the same as saying A has a set of linearly independent eigenvectors. This is identical. Now sometimes you're gonna hear – I think it's an old word. I hope it's going away. It says that if A is not diagonalizable, it's sometimes called defective. So I don't know where or how that came about, but that's simply what it's called. You still do hear that occasionally. Otherwise, it's called diagonalizable. And it's very important to quickly point out not all matrices are diagonalizable. Here's the simplest example: zero, one, zero, zero. Characteristic polynomial is S squared; the only eigenvalue is lambda equals zero. There is no other eigenvalue here. Let's try to find now two independent eigenvectors, each with an associated eigenvalue zero. That's the only eigenvalue there is. Well, to say that you're an eigenvector of A with eigenvalue zero, that's just a longwinded way to say you're in the null space. So let's try to find vectors that satisfy zero, one, zero, zero times V1, V2 equals zero – if you like, zero times V1, V2, like that, same thing. Well, the first equation, if you look at the first row, it says V2 is zero. So V2 has to be zero, so V1 is the only – that's the only thing we have to mess with here. The second row is always satisfied. And there's no way you can pick two vectors of this form that are gonna be independent. It's impossible. Okay? So here's the canonical example of an A which is not diagonalizable. Now we're gonna see actually later today that this issue – it's an interesting one. It's a bit delicate. We'll get to it. It's gonna turn out nondiagonalizability – hey, that came out right. I was about halfway through that and wondering whether it was gonna come out right, but I think it did. I won't attempt it again. Anyway, that property is not – it's something that does – it actually does come up.
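[Editor's note: a quick numerical look at the canonical non-diagonalizable example above, using NumPy's `eig`; the tolerance-based check of dependence is an editorial choice, not from the lecture.]

```python
import numpy as np

# The canonical defective matrix: characteristic polynomial s^2,
# only eigenvalue lambda = 0.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])

w, V = np.linalg.eig(A)
assert np.allclose(w, 0)   # both eigenvalues are zero

# Every eigenvector must satisfy A v = 0, which forces v2 = 0, so all
# eigenvectors are multiples of e1. The two columns eig returns are
# therefore (numerically) parallel -- no independent pair exists.
assert abs(np.linalg.det(V)) < 1e-8
```

The near-zero determinant of the returned eigenvector matrix is exactly the failure of diagonalizability: there is no invertible T of eigenvectors.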
It actually has real implications, but in many cases matrices that you get are diagonalizable. Not all, sadly, because that would mean a whole chunk of – a bit of the class we could cut out, which I would not only cut out, but cut out with pleasure. Sadly, can't do that. So there are a couple – we're gonna see the general picture very soon, but for now we can say the following. If a matrix has distinct eigenvalues – so if they're all different, all N eigenvalues are different – A is diagonalizable. Okay? And we'll show this later, but that's just something to know. By the way, the converse is false. You can have repeated eigenvalues but still be diagonalizable. Actually, somebody give me an example of that. What's the simplest – just give a simple matrix with repeated eigenvalues and yet – identity, there we go. So the identity in R – what's your favorite number? Seven. Thank you. So that's in R seven by seven, and very nice choice by the way. I don't know if you chose that, but good taste. R seven by seven, so it's got – well, it depends if you count multiplicities – well, I can tell you right now the characteristic polynomial is S minus one to the seventh. That is the characteristic polynomial. It has, you would say, seven eigenvalues at S equals one. So that's about as repeated as you can get. All the eigenvalues are the same. Does I have a set? Can it be – well, I could ask stupid questions like this. Can I be diagonalizable? Can it be diagonalized? And the answer is it already is. So you'd say, well, what would be T? You could take T equals I. So you could take as eigenvectors E1 through EN. By the way, is it true that any set of N eigenvectors of I will diagonalize I, or is independent? Let me try – I'll try the logic on you one more time. You can choose a set of seven independent eigenvectors for I. For example, E1, E2, E3, up to E7. Now the other question is this. Suppose you pick seven eigenvectors of I. Are they independent? No.
Obviously not, because I can pick E1, 2E1, 3E1 and so on, up to 7E1. They're all eigenvectors. By no means are they independent, so okay. All right. Now we're gonna connect diagonalization and left eigenvectors, and it's actually a very cool connection. So we're gonna write T inverse AT equals lambda. Before, we had multiplied by T and put it over here: AT is T lambda. But in this case, we're gonna write it as T inverse A equals lambda T inverse. Now we're gonna write this out row by row. I'm gonna call the rows of T inverse W1 transpose through WN transpose. I'm gonna call those the rows. And then in this way, you can write matrix-matrix multiplication as a batch operation in which you multiply all the rows of the first thing by this matrix here. In other words, this is nothing but this. This one is this. And now, if you simply multiply out on the left, the rows of the left-hand side are simply this. And the rows of the right-hand side are this, because I'm multiplying a diagonal matrix by a matrix. If you multiply by a diagonal matrix on the left, it means you're actually scaling the rows of the matrix. If you multiply by a diagonal matrix on the right, you're actually scaling the columns. You might ask, how do I remember that? Well, I remember that when I'm teaching 263 mostly, but then I forget. And then what I usually do is, if I have to express row multiplication, I will secretly – because this is not the kind of thing you wanna do in public, let anyone see you doing – I secretly sneak off to the side and I write down the two by two example to see if row multiplication is on the left or the right. And then it's on the left, I'll find out, and I'll say I knew that, and that's how I do it sometimes. Just letting you know that's how I know. But at least for the purpose of – while we are doing 263, I will likely remember. All right, so this says this. If you look at that equation, that's very simple.
This says nothing more than that these Wi's are left eigenvectors, period. Now in this case, the rows are independent. Why? Because T inverse is invertible. Its inverse is T, so the rows are independent, and this means that you have an independent set of left eigenvectors. And they're chosen – in this case, their scaling is very important. Actually, let me just review what this says. It says that if you take a linearly independent set of eigenvectors for a matrix and concatenate them column by column into an N by N matrix, you invert that matrix and you look at the rows of the result. Those are left eigenvectors, so that's what that tells you. Those are left eigenvectors. But it's not as simple as just saying that it's any set of left eigenvectors. They're actually normalized in a very specific way. They're normalized so that Wi transpose Vj is delta ij. Right? Which is – basically this is saying T inverse times T equals I. Okay? So what you would say is in this case, if you choose the left – if you scale the left eigenvectors this way, they're dual bases. Now I should make a warning here, and that is that if a vector is an eigenvector, so is three times it, and so is seven times it. In fact, any scalar multiple. So when you actually ask for something like left eigenvectors or something like that in some numerical computation, they have to be normalized somehow. Generally speaking – I believe this is the case – most codes, which by the way all go back to something called LAPACK, will return an eigenvector normalized in norm. It'll simply have norm one. If you just walk up to someone in the street and say "Please normalize my eigenvector," or "I'm thinking of a direction. Where's the direction?" people will just normalize it by the two norm. There are other cases where they don't. I wanna point out that that will not produce this result. These are not normalized here, so be very careful. These are not normalized. Okay.
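[Editor's note: a NumPy sketch of the recipe just described – invert the eigenvector matrix and read off the rows. The random matrix is illustrative; a generic random matrix is diagonalizable with probability one.]

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # generic matrix: diagonalizable

lam, T = np.linalg.eig(A)         # columns of T: right eigenvectors
W = np.linalg.inv(T)              # rows of W: left eigenvectors

# Each row w_i^T satisfies w_i^T A = lambda_i w_i^T ...
for i in range(4):
    assert np.allclose(W[i, :] @ A, lam[i] * W[i, :])

# ... and they are dual to the v_j: w_i^T v_j = delta_ij, i.e. W T = I.
assert np.allclose(W @ T, np.eye(4))
```

Note that left eigenvectors obtained directly from an eigensolver are typically scaled to unit norm, which in general does not satisfy this dual-basis normalization; inverting T does.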
We'll take a look and see how this comes out. This'll come up soon. Okay. Now we can talk about the idea of modal form. So let's suppose A is diagonalizable by T. From now on, you are to simply recognize this statement as saying this is exactly the same as: suppose A has an independent set of eigenvectors V1 through VN. If you shove them together into a matrix, I'm gonna call that T. That's what this means. Well, if you take new coordinates to be X equals T X tilde, what this means is – X tilde are the coordinates of X in the T expansion. Or if the columns of T are the Vs, it's in the Vs. So you would actually say X tilde gives the coordinates of X in what people would call in this case the modal expansion, or the eigenvector expansion. In other words, instead of writing X, which is in the standard basis, the coefficients X tilde are the mixture of V1 through VN that you need to reconstruct – to construct X. That's what X tilde is. Well, this thing is X dot, and that's AX. X is T X tilde, so X tilde dot is T inverse AT X tilde. That's diagonal, and you get this. So in that new coordinate system, you get X tilde dot equals lambda X tilde. And that means that by this change of coordinates, the autonomous linear system X dot equals AX is decoupled. So in X dot equals AX, the way you should imagine that is a bank of integrators with a matrix A in a feedback loop. But a matrix A basically takes N inputs and produces N outputs, but it's got all these horrible cross gains. I mean if A has all entries nonzero, it means every output depends on every input, and so you're mixing the dynamics – [inaudible] basically what's happening. When you diagonalize like this, you have completely decoupled all the dynamics and it looks like this. Okay? And that says that at least in this coordinate system, it's very simple. The trajectories give you N independent modes, and the modes are just simply – well, obviously they have nothing to do with each other.
They're totally decoupled. And this is called modal form. That's a very common form. Now these can become complex, in which case it's a bit weird, and you have to explain a little bit, and make a story about how the real and the imaginary parts are separately solutions, and all that kind of stuff. Another very common form you'll see is real modal form, and you'll see this for example in mechanical engineering a lot, as real modal form for example for a structure. That's how they would describe the dynamics of a structure: by giving real modal form. Now in this case, there's actually a way to construct a real matrix S so that S inverse AS is not diagonal, but it's block diagonal with one by one or two by two blocks like this. Okay? So every time in T inverse AT – for every entry in this which is complex, that means for every complex eigenvalue, you'll actually collect that and its conjugate, and then actually you can take the real and imaginary parts and so on, and you'll actually get a form like that. I'm not sure – I don't remember if we've actually – hey, we don't tell you how to construct S. That would make an excellent homework exercise. Yeah, okay. So that's the [inaudible], so find S. It's a good exercise to see if any of this makes sense, mess with matrices and things like that. Okay. So you get all these little blocks – by the way, these little blocks like this, you should by now start recognizing. So a little block that looks like sigma N, omega N, minus omega N, sigma N – the characteristic polynomial of this is S minus sigma N – sorry – squared, plus omega N squared, like that. Now assuming here that omega is less than sigma – I believe that's the condition here. Oh sorry. Absolute – no, sorry – let me keep this out of here. The roots of this thing are gonna be minus – no, they're gonna be sigma N plus minus square root of – let's see – I guess it's omega N squared minus sigma N squared, something like that. Okay? So that's what it's gonna look like.
Now these things are the complex eigenvalues, so that is actually negative. Otherwise, this would be two real eigenvalues and it would split. And it should kinda make sense, because this is the self-exciting component between – if this were a two-dimensional system with X1 and X2, these are the cross components which actually give you the rotation. So this condition basically says you get enough rotation. Otherwise, it splits into two. I can tell by facial – just a quick gross look at facial expressions – I've now confused almost everyone, except [inaudible]. Yes?
Instructor (Stephen Boyd):How did I look at it and immediately say it? Well, it wasn't totally immediate, but let's – I – two by two [inaudible] two by two matrices, that's where there are basically no excuses, right? So for two by two matrices you should be able to work out everything – the eigenvalues, the inverse, and things like that. Above that, no one could possibly hold you responsible for it. But two by two, let's just do it. It's det of SI minus A, B, C, D. I mean these are the kind of things I guess you should know, like that. And so I did this in my head, but I sort of knew what the answer was, so you get this times this, minus BC. And now if I like, I could write my quadratic formula, which I don't dare do. That would be some horrible expression. This was easier because after all, B was equal to minus C, and A was equal to D, so fewer things to keep track of, but this is what I did in principle. Now one of these two by two blocks, which is a complex mode – they look like this, and they're really quite pretty. They are essentially cross-coupled. It's a pair of integrators cross-coupled. And by the way, if sigma is zero, you get the pure oscillation, and this is something we've seen before. In fact, that matrix is zero, omega, minus omega, zero. That's when sigma's zero, so you get this. You get that. And this you should recognize by now as a rotation matrix – I mean maybe – sorry, this is not a rotation matrix. Well, it is, but in this case – this you should recognize as a harmonic oscillator. By the way, you don't have to recognize this as a harmonic oscillator, but we talked about it, and a little bit of playing around with this would convince you that it is. Yes?
Instructor (Stephen Boyd):Yeah. Sorry. What's that? Oh, this is not it. Thank you. This is this? Like that? Sorry? With a square.
Instructor (Stephen Boyd):Really? Good. Great. Fine. Sorry. Actually, you're right. Of course, this is – fine, so it's sigma N plus minus J omega N. There we go. It's that? Thank you for fixing it. How come no one else caught that? Hm. Well, that's one more demerit for this class. That's bad. All right. So diagonalization – it simplifies a lot of matrix expressions. So diagonalization, I'll say a few things about it. It's mostly a conceptual tool. There are a few places where it actually has some actual use. It's widely used for example in mechanical engineering. There are in fact very famous codes that will take a description of a mechanical thing and then spit out the modal form, so it'll list eigenvalues, and it'll actually give you the eigenvectors, which in fact are these modal ones, not the complex ones – the two by two blocks. But in fact, mostly it's a conceptual tool, and let's see why. It's gonna simplify a lot of things. So if you have SI minus A inverse, that's the resolvent. Generally, if A is two by two, no problem, you can work out what this is. If A is three by three, this is already not pretty – you don't wanna see the expression for SI minus A inverse. Trust me. You can compute it easily for some particular A, but that's another story. However, we can do the following. Wherever you see A, you replace it with T lambda T inverse, like that. So we do that here, and now I'll show you a trick. This identity matrix you write as T T inverse. Okay? Now I pull a T out on the left, and a T inverse out on the right in this inner matrix. Inverting a triple product is the same as the product of the inverses the other way around, so I get T, SI minus lambda inverse, T inverse. By the way, inverting a diagonal matrix, that's fine. That's easy to do. You invert the diagonals, and you get this. That's the resolvent. Okay? By the way, this is sometimes called the spectral resolution of the identity or some – there's some name for it.
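[Editor's note: a NumPy check of the resolvent identity just derived, (sI − A)⁻¹ = T (sI − Λ)⁻¹ T⁻¹; the random matrix and the choice s = 5 are illustrative.]

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
lam, T = np.linalg.eig(A)
Tinv = np.linalg.inv(T)

s = 5.0   # some value of s that is not an eigenvalue of A

# Direct computation of the resolvent (sI - A)^{-1}.
R_direct = np.linalg.inv(s * np.eye(3) - A)

# Via diagonalization: inverting sI - Lambda just inverts the diagonals.
R_diag = T @ np.diag(1.0 / (s - lam)) @ Tinv
assert np.allclose(R_direct, R_diag)
```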
There's a name for this way to represent the resolvent. Actually, let me say a little bit about that. Some of you might know about the idea of residues in complex analysis. Then again, maybe none of you know about residues or partial fraction expansions. Partial fraction expansions? Come on. Somebody's heard of that. Did you learn about that in some stupid signals and systems course? Is that – yes. Is that where you heard about it? Okay, great. So this says the partial fraction expansion of the resolvent is this. It's really quite cool. Let me try to get it right. Oh, I think I can get it right. It's this, and I'm using informal syntax here. So that's the partial fraction expansion of this. The partial fraction expansion of a rational function is to write it out as a sum of terms, each of which is one over S minus a pole, so that's the partial fraction expansion. Okay? The other way to say it is that these rank one matrices, Vi Wi transpose, are the residues of this function at – that's the residue of this function at the pole lambda i. Okay. So the diagonalization simplifies tremendously this, the resolvent. It's also true for the powers. For example, if you raise a matrix to a power, if you know its eigenvectors and eigenvalues, it's very straightforward, because you simply write T lambda T inverse to the K. Now when you do this, you just line these up K times, and the T inverse here annihilates the T next to it. And this happens in all of these, and you get T lambda to the K T inverse. Okay? So that means it's very straightforward in fact to calculate powers of a matrix. And in fact, this already is a method – perhaps not – maybe not a good one, but at least conceptually this gives you a very good way, for example, of simulating a dynamical system.
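[Editor's note: the partial fraction expansion above, checked numerically – the resolvent as a sum of rank-one residues Vi Wiᵀ over (s − λi). Matrix and s value are made up for illustration.]

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
lam, T = np.linalg.eig(A)
W = np.linalg.inv(T)   # rows: dual-normalized left eigenvectors

s = 4.0                # some s that is not an eigenvalue
R = np.linalg.inv(s * np.eye(3) - A)

# (sI - A)^{-1} = sum_i v_i w_i^T / (s - lam_i):
# the rank-one matrices v_i w_i^T are the residues at the poles lam_i.
R_pf = sum(np.outer(T[:, i], W[i, :]) / (s - lam[i]) for i in range(3))
assert np.allclose(R, R_pf)
```

Note the residues only come out right with the dual normalization Wiᵀ Vj = δij, i.e. W taken as the inverse of T.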
Or if someone walks up to you on the street and asks what's A to the one million, probably the worst thing you could do would be to write a loop that keeps multiplying an N by N matrix by A and let it run a million times. That would be one way to do it. And the cost of that would be ten to the six times N cubed if you did that loop. Okay? Now you can also do the following. You could also calculate the eigenvectors and eigenvalues, and although I'm not gonna get into the details, that could also be done in order N cubed, like five N cubed or something like it. This doesn't matter, but I just wanna make a point here. Once you calculate this and this, which costs N cubed – let's say five – this is nothing but a bunch of scalar powers. It costs basically zero – it costs order N, which is completely dominated by the N cubed. So this can be done in five N cubed. Okay? And the point is that that is a whole lot smaller than that. So diagonalization in this case actually gives you a very serious advantage in how to compute something. Okay. Let's look at the exponential. You wanna look at E to the A in general. Calculating E to the A is a pain. It's not fun, unless A is two by two or has some other special structure like diagonal or something like that. It's actually quite difficult. Let's write out the power series, and if you write out the power series here, not surprisingly – we've already seen when you have a power, it's the same as simply putting the T and the T inverse here, and having these things – these are all diagonal matrices, and that's nothing but this. Exp and diag commute: the exponential of a diagonal matrix is the diagonal matrix of the exponentials. You get this, like that, and that gives you a very simple way to compute the exponential. That's not quite how it's done, but it's one of the methods that can be used. Okay.
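[Editor's note: a small sketch of the point about powers – one eigendecomposition plus scalar powers reproduces repeated matrix multiplication. The matrix and the exponent k = 20 are illustrative.]

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)) / 2   # scaled down so powers stay bounded
lam, T = np.linalg.eig(A)
Tinv = np.linalg.inv(T)

k = 20
# Direct way: repeated multiplication, order k * n^3 flops.
Ak_loop = np.linalg.matrix_power(A, k)
# Diagonalized way: one eigendecomposition (order n^3), then k-th powers
# of n scalars -- A^k = T Lambda^k T^{-1}.
Ak_diag = np.real(T @ np.diag(lam ** k) @ Tinv)
assert np.allclose(Ak_loop, Ak_diag)
```

For k of a million the loop costs a million matrix multiplies, while the diagonalized route still costs just the one eigendecomposition plus a million scalar powers per eigenvalue.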
Now in fact, this idea extends to analytic functions, so if you have a function which is given by a power series, that's an analytic function – you don't need to know this, but it's just kind of cool, so it doesn't hurt. So if you have an analytic function – that's a function given by a power series; it could be a rational function, it could be the exponential, anything else – then it's absolutely standard to overload an analytic function to be called on N by N matrices. So F of A is given by beta zero I plus beta one A and so on, where the power series expansion of F is this. Okay? You've seen one specific example so far: the exponential. This allows you to work out this thing. So here for example, we would have the following. We would have F of A is T times – well, let me write it this way: diag of F of lambda i – times T inverse. There's your formula, like that. Okay? So it gives you a very quick way. This is actually something good to know about, because there are a lot of times when you do see things that are polynomials of matrices, rational functions of matrices, and things like that. Those do come up a lot, and it's good to know that if you diagonalize, they will simplify. They'll simplify actually just to this single analytic function applied to the eigenvalues. Okay? Actually, I'm not quite sure why this part didn't get into the notes. That's very strange. Okay. Let's look at the solution of X dot equals AX. Well, we already know what it is. It's simply this. It's X of T is E to the TA times X of zero. That's what it is. That's the solution. And we already have a rough idea of what this looks like and things like that, or we have some idea of it anyway. Although, this is not the kind of thing a person can look at right off the bat – just take a look at a four by four matrix, or for that matter a forty by forty matrix – look at it and say, "Oh, yeah. That's gonna be a rough ride in that vehicle," or something like that, or "Yeah, that's gonna be some business cycles.
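[Editor's note: a check of the formula F(A) = T diag(F(λi)) T⁻¹ for an analytic function, taking F = exp, against the raw power series; the random matrix and the 30-term truncation are editorial choices.]

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
lam, T = np.linalg.eig(A)
Tinv = np.linalg.inv(T)

# f(A) = T diag(f(lambda_i)) T^{-1}, here with f = exp.
expA_diag = np.real(T @ np.diag(np.exp(lam)) @ Tinv)

# Compare with the power series sum_k A^k / k!, truncated far enough out.
expA_series = np.zeros((3, 3))
term = np.eye(3)
for k in range(1, 30):
    expA_series += term      # adds A^{k-1} / (k-1)!
    term = term @ A / k
assert np.allclose(expA_diag, expA_series)
```

The same one-liner works for any analytic F: apply F to the eigenvalues, conjugate by T.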
I can see them." I mean there's just no way anybody can do that, unless A is diagonal or something like that. Okay. Well, let's start with X dot equals AX. T inverse AT is lambda. X of T is E to the TA X of zero. That's the solution, but this thing is this, and then this has really the most beautiful interpretation, because T inverse X of zero – I write it this way: Wi transpose X of zero. That's a number. Then it's multiplied by this thing, which actually tells you for that eigenvalue-eigenvector pair whether it – this tells you whether it grows, shrinks, if it's complex, oscillates, and all that kind of stuff. And then this gives you the Vi, reconstructs it. So the interpretation of this formula is really quite beautiful, and every single part can be interpreted. It's this. What happens is you take the initial state X of zero, you multiply by Wi transpose, and you get something very specific. That is the component of X of zero in the Vi expansion. So for example, if W3 transpose X of zero is zero, that has a meaning. It says if you expand X of zero in a V expansion, you will not have any V3. That's what this says. Okay? Because that's what this does. It decomposes it this way. In fact, let me write that down right now. X of zero is sum Wi transpose X of zero times Vi. That's the expansion. And that's true for any X zero. Okay? This looks fancy. This comes from the fact that – I mean this is actually quite straightforward. This basically is a restatement of this. There's nothing [inaudible]. Okay? So Wi transpose, which are the left eigenvectors, they decompose a state into the modal components, if you wanna call the V1 through VN the modal components. That's what this does. All right, that's fine, so that you decompose it. Then this thing time-propagates that mode. That's what E to the lambda T does. It propagates the ith mode in time, very simple formula. Why? That's the point of a mode. A mode propagates in a very simple way. It grows. It shrinks. It oscillates.
That's about it so far, by the way. And then it comes out along a direction Vi, so all the parts of this are supposed to make sense, so that's the interpretation of this. Okay, but now we can actually ask some really interesting questions and answer them. You might ask this. You have X dot equals AX. Now remember, we say by definition A – the system is stable. By the way, you would normally say the system X dot equals AX is stable. Sometimes, as a matter of slang, you'll hear people talk about A being stable, but that's – it should be understood that's slang. So this system is stable provided all solutions of X dot equals AX converge to zero; they all decay. But now we're asking this. For a general A, for what initial conditions do you have X of T goes to zero as T goes to infinity? By the way, this one answer you can always give, no matter what A is. What's that answer? Zero. If X of zero is zero, then it stays zero, period. And therefore, it goes to zero. So the initial state zero, no matter what A is, gives you at least one trajectory that converges to zero. I mean, converges is a little bit technical there. It is always zero, but that means it converges to zero. Okay. Now the way to answer this is you divide the eigenvalues into those with negative real part – so let's say that's the first S of them – and the others, so these have nonnegative real part. Now we can answer the question lots of different ways. One is this: you say, now that's just a formula for X of T. This thing will go to zero provided the following holds. The first S terms in here shrink. The remaining terms, S plus one through N, do not shrink. Therefore, this will go to zero provided these numbers are zero for S plus one, S plus two, and so on. That's one way to say it. So one way to say it is that these numbers should be zero for S plus one, S plus two, up to N. By the way, that is identical to saying that you're in the span of V1 through VS. Why?
Because X of zero is equal to sum WI transpose X of zero times VI like that. You have this. Therefore, to say that these are zero from S plus one to N means that X of zero is a linear combination of V1 through VS. Okay? So these are two different ways to say this. And there'd be all sorts of names people would give this. They would refer to this span. They would call that the stable eigenspace or something like that - or some people would just call it the stable subspace, and the idea would be this. If you start in this subspace, the trajectory will go to zero. It might oscillate. It'll go to zero. If you're not in this subspace, it will not. So that's how that works. Okay, so that's the kind of question you can answer now. And finally we'll handle this issue of stability of discrete time systems. So suppose the matrix is diagonalizable, and you have the linear dynamical [inaudible] XT plus one is AX of T; then the solution is trivial. It's just powers of A. But if you write A as T lambda T inverse, then A to the K is this. Now I know what powers of complex numbers do. That I can actually handle, and so you get this. Powers of complex numbers go to zero only if their absolute value is less than one. Their angle tells you how much of a rotation you get at each step, but their magnitude tells you how the magnitude scales, and you realize that XT plus one is AX of T is stable if and only if the eigenvalues are less than one in absolute value. Okay? And it turns out this is gonna be true even when A is not diagonalizable, so I don't mind stating it as a fact right now. XT plus one is AX of T is stable if and only if all the eigenvalues have magnitudes less than one, so that's the condition. Actually, as in the continuous time case, there's a much more subtle statement.
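The stable-subspace idea can be checked numerically. The sketch below, assuming scipy and a made-up diagonal 2x2 example with one stable eigenvalue (-1) and one unstable eigenvalue (+1), shows that an initial condition in the span of the stable eigenvector decays, while a generic one blows up:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical example: eigenvalue -1 (stable) and +1 (unstable)
A = np.array([[-1.0, 0.0],
              [0.0, 1.0]])

v_stable = np.array([1.0, 0.0])   # eigenvector for lambda = -1
x_generic = np.array([1.0, 1.0])  # has a component along the unstable mode

Phi = expm(10.0 * A)              # state-transition matrix e^{tA} at t = 10

print(np.linalg.norm(Phi @ v_stable))   # about e^{-10}: decays toward zero
print(np.linalg.norm(Phi @ x_generic))  # about e^{10}: unstable mode dominates
```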
The spectral radius of a matrix is defined to be the maximum of the absolute values of the eigenvalues. Okay? This is called the spectral radius. It's denoted rho. This is relatively standard notation. What this says is the discrete time autonomous system XT plus one is AX of T is stable if and only if the spectral radius of the dynamics matrix A, or update matrix, or whatever you wanna call it - the spectral radius is less than one. That's the condition here. Now more generally, rho of A gives you the asymptotic growth or decay magnitude. So for example, if rho is 1.05 - in other words, there is at least one eigenvalue with a magnitude of 1.05 - it says that X of T can grow asymptotically. It depends on the initial condition, but it can grow as fast as 1.05 to the T. If the spectral radius is 0.7, it says that after ten steps, the state has decayed roughly by 0.7 to the ten. That's a small number. Okay? So for discrete time systems, this is the analog of the maximum of the real part of the eigenvalues of a matrix, which plays this role in continuous time. Okay, so enough on all that. We'll now do a bit on - maybe I'll even cover it - the Jordan canonical form. So here I actually have to ask, how many people have seen - or perhaps the right verb is been subjected to - the Jordan canonical form? So a handful of people. Did it make any sense at the time? How should we interpret this?
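In code, the spectral radius is a one-liner, and the 0.7 decay estimate can be checked directly. This sketch assumes numpy and uses a made-up example, a rotation scaled by 0.7, so rho equals 0.7 exactly:

```python
import numpy as np

def spectral_radius(A):
    """rho(A) = max |lambda_i|; x(t+1) = A x(t) is stable iff rho(A) < 1."""
    return max(abs(np.linalg.eigvals(A)))

# Hypothetical example: a rotation by 0.3 rad scaled by 0.7
theta = 0.3
A = 0.7 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])

print(spectral_radius(A))   # about 0.7, so the system is stable

# After ten steps the state has shrunk by a factor of roughly 0.7**10
x = np.array([1.0, 0.0])
for _ in range(10):
    x = A @ x
print(np.linalg.norm(x))    # about 0.7**10, roughly 0.028
```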
Instructor (Stephen Boyd):It did? Okay. Good. Let's look at it. So the Jordan canonical form is essentially as close as you can get to a diagonalization when you can't diagonalize the matrix. So let me explain that. It's this. Any N by N matrix - any, no exceptions - can be put in something called Jordan canonical form by a similarity transformation.
So there's a matrix T, obviously invertible because I'm about to refer to T inverse, for which T inverse AT is J. J - that's the Jordan form - is a block diagonal matrix. Each of these blocks, which is called a Jordan block, looks like this. It's a bunch of lambdas with ones on the superdiagonal. So that's a so-called Jordan block. By the way, a one by one Jordan block is a little matrix that looks like this. Okay? So a diagonal matrix is a special case of a Jordan form. It's the special case when there are N Jordan blocks and each block is one by one. So basically, you have these little ones on the superdiagonal. We'll see soon what the ones mean. So that's a Jordan block. Now a couple of comments about Jordan blocks. First of all, the Jordan form is upper bidiagonal. That's a name meaning it's got a diagonal, and the one diagonal above it is nonzero. It's actually much more than that, because the entries on the superdiagonal are zero or one, and an entry can only be one where the lambdas around it are repeated. So diagonal is the special case of N one by one Jordan blocks. And it's gonna turn out the Jordan form is unique. Now you have to interpret that very carefully. This is not, of course, a course on the details of the mathematics of linear algebra, so it's not like we're gonna get crazy with all this, but it's important to understand what it means to say it's unique. It says basically that if two people calculate a Jordan form for a matrix, the results can actually be different. One difference is simply this. They might order the blocks in a different way. However, the following is true. If two people work out a Jordan form - they have different Ts here, possibly - then there's a block permutation which will change one Jordan form into the other. So the way you would say this is you would say the Jordan form is unique up to permutations of the blocks.
So the type of thing people can disagree on is what J1 is. No one can agree on that, because it depends on what you chose to put as the first Jordan block. However, no one can possibly disagree about, for example, the number of Jordan blocks, their sizes, and the sizes associated with a given eigenvalue. So for example, if you say this has three eigenvalues, and this eigenvalue is associated with one Jordan block of size two by two, everyone computing a Jordan decomposition will agree with that. Okay. Now I should also mention the Jordan canonical form is an interesting thing. It is almost strictly a conceptual tool. So it's used to show things, to illuminate ideas, and things like that. It is actually not used in almost any numerical computations. Okay? So if you go to Google and type something like "source code Jordan canonical form," what you'll mostly get is a bunch of harangues about how terrible it is, and how no one should ever compute the Jordan canonical form numerically, and so on and so forth. That's probably what you'll get - you'll find some strange things there - but it's basically not done. Even when you do find algorithms for it, every paper will start this way. It will say, "It's well-known that it basically doesn't make any sense numerically to compute the Jordan form." Then it goes on, and it says, "But let's suppose you did. You really had to. Then this paper's about how you might do it, if you were to do it, but we don't recommend it." So that would be the kind of abstract you'd find. Not that this matters. I'm just mentioning it. Okay. Maybe it's not never, but it's awfully close, and boy do you have to justify yourself if you actually do anything like this in any numerical application. All right.
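Consistent with that warning, when a Jordan form is computed in software at all, it is usually done in exact arithmetic rather than floating point. A toy sketch with sympy's `jordan_form` on a hypothetical 3x3 matrix with eigenvalue 2 of multiplicity three (an illustration only, not a numerical recommendation):

```python
import sympy as sp

# Hypothetical matrix: eigenvalue 2 repeated three times,
# one 2x2 Jordan block plus one 1x1 block
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 2]])

P, J = A.jordan_form()   # exact arithmetic: A == P * J * P**-1

# The diagonal of J carries the eigenvalue; the single superdiagonal 1
# reveals one block of size two (n minus number of blocks = 3 - 2 = 1)
print(J)
```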
Now the characteristic polynomial of A - of course, if J is block diagonal - so the characteristic polynomial, actually, under a similarity transformation is the same. Wasn't that a homework problem? It wasn't? That's terrible. Well, similarity - wait a minute. Oh well. Maybe it shouldn't have to be. Was it one? Well, I'm glad to see though that everyone thought about that problem a long time - in fact, that's great because it's actually below even your consciousness now. It's so ingrained -
Student:I think it's the current homework.
Instructor (Stephen Boyd):It's what? Oh, the current homework. Oh well, that would explain it, because that would've been terrible. I mean it's a quick calculation, but a similarity transformation doesn't change the characteristic polynomial, so it doesn't change the eigenvalues. So the eigenvalues of A are the eigenvalues of this thing. That's a block matrix. The eigenvalues of a block diagonal matrix are the eigenvalues of all the blocks stuck together. Eigenvalues of that matrix - that's upper triangular. The eigenvalues of this block are lambda I with multiplicity NI. In fact, the characteristic polynomial of this block is S minus lambda I to the NI; that's just the determinant of SI minus JI. Okay. So basically, this tells you the following. If you wanna get the characteristic polynomial of the matrix, you take the eigenvalues associated with the blocks, raised to the block sizes. And now we immediately see the following. Once you believe in the Jordan canonical form - I will not go through the week-long proof that any matrix has a Jordan canonical form, especially because the computational, algorithmic payoff is, to say dubious is putting it very nicely, so I won't go through that, but assuming you believe it, and you should - that is after all what we have mathematicians for, and they assure us that it's true - then it says immediately that if a matrix is diagonalizable, its Jordan form can only have block sizes one by one. And to say that a Jordan form has all block sizes one by one says it's diagonalizable. That's basically what it says. Okay. Now when you see repeated eigenvalues - in fact, let me explain how this works. If you see repeated eigenvalues, it means maybe you have a nontrivial Jordan form. Oh, I should mention something here. If the Jordan blocks are all one by one, this is diagonal. If any block is two by two or bigger, people call that a nontrivial Jordan form - trivial meaning just diagonalizable.
So what this says is the following. If the eigenvalues are distinct, your matrix is diagonalizable. And if someone says, "Yeah? What's the Jordan form?" You'd say, "I just said it's diagonalizable." Okay. The Jordan form is N Jordan blocks, each one by one. That's the trivial Jordan form. If you see repeated eigenvalues, it does not guarantee that you're gonna have a nontrivial Jordan form. In fact, somebody quickly give me an example of a matrix that has repeated eigenvalues and yet has a trivial Jordan form.
Instructor (Stephen Boyd):I. What size? This is very important. We talked about this earlier. Seven. Thank you. So the seven by seven identity matrix has seven absolutely equal eigenvalues. Its Jordan form is trivial, which is a pedantic way of saying it's diagonalizable. On the other hand - in fact, seven by seven, that's kinda big. Let's talk about four by four. So here, that's the identity. The eigenvalues are all one. It's got four Jordan blocks of size one by one. Okay? How about this one? The eigenvalues are all the same. The eigenvalues of that matrix are one, one, one, one, period. What's the Jordan block structure? Well, there's one block here, and it looks like that. So you can have a block of two, and then two separate blocks of one. Are there others for this one? Well, I could do this. I could have a block of - let's see if I can get it right. No. Yes. If I'd given myself enough room, it would've been right. How about that? That matrix - what's the block - what am I doing? My god. Okay, let me just get rid of that. I didn't do that. There we go. Okay, here. Since I can't even get it straight, I'll show you the blocks. Okay? What about this one? What's the Jordan - the block structure here you would describe as: it's got two Jordan blocks, one three by three, one one by one. By the way, the eigenvalues of this matrix and of the identity are all the same. Any others? One more. So we could have a single Jordan block - I don't know what I'm doing. Here we go. One, one, one - there we go. It's a single block of size four by four. Any others? Let's list all possible Jordan forms of a four by four matrix with four eigenvalues of one. There's one we missed. What is it?
Student:Two two by two.
Instructor (Stephen Boyd):Two two by twos, exactly. So you can have this, like that, and that's it. These, along with I, are the five possible Jordan forms for a matrix with four eigenvalues of one. Okay? Of course, some natural questions would come to mind - well, let me list some. The first might be, who cares? We'll get to that. And the second might be - and it's related - what are the implications? What would it mean? How would you know? How would this show up, for example, in the dynamics of a system? How would you know you had one of these, and what would be the consequences of it? Okay. And we'll get to that. I promise. So the connection between the eigenvalues and the Jordan blocks and sizes is a bit complicated, but it all comes from this. It says basically that the characteristic polynomial is a product of S minus lambda I to the block size NI. And the dimension of the null space of lambda I minus A, for example, is the number of Jordan blocks with eigenvalue lambda. And we can check that, because what happens is if you look at lambda I minus A - I multiply by T inverse and T like this, and the T inverse and T go in there and cancel. That's basically lambda I minus J, and that is equal to a block matrix that looks like this. It's lambda minus lambda one, and then there are some minus ones on the superdiagonal, like that. And I won't draw the other blocks. Okay? Now if you wanna know what's the null space of this matrix: at the leading edge of each Jordan block, you have a column whose only possibly nonzero entry is lambda minus lambda I. So if lambda is equal to lambda I, you get a zero column, and that means that matrix is gonna drop rank. It is not gonna be invertible. So that's what happens. So in fact, this will happen.
At the beginning of every Jordan block, you will get a zero column, and that says that in fact the dimension of the null space of lambda I minus A is exactly equal to the number of Jordan blocks associated with lambda I. So over here, let's look at that. Let's look at the null space of lambda, which is one, times I minus the matrix A. And let's look at the different cases. This is one times I minus A. If you take I, and I ask you what's the null space of one times I minus A, that's the null space of the four by four zero matrix. What's the null space of the four by four zero matrix?
Instructor (Stephen Boyd):It's what? It's R4. It's all four-vectors. So it's four-dimensional in this case. What is the null space of I minus A for this matrix? What is it?
Instructor (Stephen Boyd):Well, I wouldn't say R1, because that's just the set of real numbers. It's one-dimensional, which is I think what you meant. It's one-dimensional, and it's all vectors of the form something zero zero zero - yes, [inaudible] something zero zero zero. It's one-dimensional in this case. Why? There's one Jordan block. It makes perfect sense.
In this case, if you take I minus this matrix, these become minus ones and these become zeroes, and then you ask what is the dimension of the null space of that matrix, and the answer is two. That's two Jordan blocks. Same here, and here the dimension of the null space is actually three. Okay? So basically, the amount of rank that lambda I minus A drops when lambda is an eigenvalue tells you exactly the number of Jordan blocks. That's not enough, by the way, to give you the full block structure. That comes out of lambda I minus A raised to various powers, and I'm not gonna go into that, except that we can actually - let me just ask a couple of questions. Suppose a five by five matrix has eigenvalues minus one with multiplicity three, and three, and five. Let's enumerate all possible Jordan forms for that matrix. Let's start. What are the possible Jordan forms? What's the simplest one?
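The count "dimension of the null space equals number of Jordan blocks" is easy to verify numerically. A sketch assuming numpy, building each of the five 4x4 Jordan forms with eigenvalue one from the discussion above:

```python
import numpy as np

def jordan(lam, sizes):
    """Block-diagonal matrix of Jordan blocks J_size(lam)."""
    n = sum(sizes)
    J = lam * np.eye(n)
    i = 0
    for s in sizes:
        for k in range(s - 1):
            J[i + k, i + k + 1] = 1.0   # ones on the superdiagonal
        i += s
    return J

def num_jordan_blocks(A, lam):
    """dim null(lam I - A) = number of Jordan blocks for eigenvalue lam."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(lam * np.eye(n) - A)

# The five possible Jordan forms with four eigenvalues of one:
for sizes in [[1, 1, 1, 1], [2, 1, 1], [2, 2], [3, 1], [4]]:
    print(sizes, num_jordan_blocks(jordan(1.0, sizes), 1.0))
```

Note that the block lists [2, 2] and [3, 1] both report two blocks: the nullity alone does not pin down the block sizes, which is exactly the warning issued a little further on.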
Instructor (Stephen Boyd):The trivial one, which is just diagonal: minus one, minus one, minus one, three, five. And if someone asked you how many Jordan blocks, and what are the sizes, what would you say here? How would you describe the trivial - it's diagonalizable here.
Student:Five one by one.
Instructor (Stephen Boyd):Yeah, so you'd say it's five one by ones. But you'd also say, by the way, that it's pedantic to talk about Jordan blocks when a matrix is diagonalizable. That should be the second part of your reply when someone asks you about this. Okay. Are there any others? Any other possible Jordan forms? For example, could the eigenvalue three correspond to a Jordan block of size two? No. Out of the question, because its multiplicity is one. Same for five. So no matter what happens, this matrix has two Jordan blocks of size one by one, one associated with eigenvalue three, one with five, period. The only place where there's any ambiguity would be this little block of repeated minus ones, and what are all possible Jordan forms for three repeated eigenvalues of minus one? We've got one that's diagonal. What else? A two and a one, and what?
Instructor (Stephen Boyd):And a three. Okay. In this case, if I told you the dimension of the null space of minus I minus A, could you then uniquely determine the Jordan form of A? I'm getting this out and ready. What do you think? You could. And the reason is the dimension of the null space of minus I minus A can be either one, two, or three. If it is three, it means A is diagonalizable, end of story. If it is two, it means there is one block of size two by two and one of size one by one. If it is one, it says there's a single Jordan block, period. And therefore, you have determined it. Warning! That is not always the case. If I told you that a matrix is four by four, has four eigenvalues of one, and the dimension of the null space of one times I minus A is two, that does not tell you the Jordan form of A. Why? Because you don't know if it is this one or this one. Each of these has two Jordan blocks, and you don't know which it is. Okay? So that's the idea. Okay. Let's look at - well, naturally, the columns of T in T inverse AT equals J are called generalized eigenvectors. You group these columns according to the blocks, and these are the columns that you associate with the Jordan blocks. If you split these out as columns, then you'll get something like this. The first one comes out just the way you think it does. It's AV1 is lambda times V1 here. That's the first one. But for the next ones, because of that superdiagonal of ones, you inherit this: each AVIJ is actually the previous one plus lambda times this one, and so these are called generalized eigenvectors. You actually won't need to know this. They don't come up that often, but you will see this every now and then. You'll see people refer to generalized eigenvectors.
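The generalized-eigenvector relations can be checked symbolically. A sketch assuming sympy, with a hypothetical 2x2 Jordan block and an arbitrary invertible T of my own choosing: reading A T = T J column by column gives A v1 = lambda v1 and A v2 = v1 + lambda v2.

```python
import sympy as sp

lam = 3
J = sp.Matrix([[lam, 1],
               [0, lam]])          # one 2x2 Jordan block
T = sp.Matrix([[1, 1],
               [1, 2]])            # any invertible T (hypothetical choice)
A = T * J * T.inv()                # so T**-1 * A * T == J exactly

v1, v2 = T.col(0), T.col(1)
print(A * v1 == lam * v1)          # True: v1 is an ordinary eigenvector
print(A * v2 == v1 + lam * v2)     # True: v2 is a generalized eigenvector
```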
Now if you have X dot is AX and you put in a change of coordinates - if you put it in Jordan form - that basically splits the dynamics into, I guess, K separate independent blocks. Each one is a Jordan block. Now you should have an overwhelming urge to draw a block diagram of a Jordan block system, and this is it. It's a chain of integrators. The chain of integrators, by the way, corresponds to that superdiagonal of ones - it gives you a chain. So you should start thinking of a superdiagonal of ones as giving you things like a shift. It's a shift, or it's a chain in this case, like that, and so on. And then the lambdas are simply wrapped around this way. So interestingly, people who do engineering and mathematicians both sometimes refer to Jordan blocks as Jordan chains, for totally different reasons. People in engineering refer to it as a chain because its dynamics is built around a chain of integrators. And in math, it's a Jordan chain because it's a chain of subspaces. This picture only shows the engineering reason - that's the dynamics you see. By the way, let me actually explain a little bit now, because the main thing is to get a rough idea of what on earth it means if you have X dot equals AX and A has a Jordan block. This says that some of the dynamics is connected in - would you call that series, or cascade, or something like that? That's what it means. It means that some of the dynamics feeds into the other. Remember what the diagonal system looked like. It's N boxes that look like this. So it's gonna turn out the Jordan blocks have to do with dynamics blocks that cannot be decoupled, because they're connected in cascade, not in parallel. Okay.
And we can look at things like the resolvent and the exponential of a Jordan block. If you look at the resolvent, you see you have this upper thing here, but if you take the inverse of that - the inverse of an upper triangular matrix is not that bad to work out - it looks like this. Actually, it's quite interesting, because now you can see something else. You can see that when you take the resolvent of a Jordan block, you're gonna get S minus lambda to negative higher powers. You didn't have that before in the resolvent. So it turns out repeated poles in the resolvent are gonna correspond to Jordan blocks. You could work out the Laplace transform, and this will actually at least give you part of the idea of what the meaning of these things is. When you work out the exponential of a Jordan block, it turns out sure enough you get this E to the T lambda part - we're hardly surprised to see that - but now you can see what a Jordan block does. It gives you polynomials. So let me say a little bit here. This tells you what you needed to know. When you see X dot equals AX - and let's make it simple, let's say all the eigenvalues are lambda, period - now I can tell you what the Jordan blocks in A mean. If A is diagonalizable, the only thing you will see in the solution will be things that look like E to the lambda T, period. If there's a Jordan block of size two by two, you will not only see exponentials, but you will see terms of the form T E to the lambda T. That's only if there's a Jordan block of size two by two or larger. If there's a Jordan block of size three by three, you will see not only T E to the lambda T, but T squared E to the lambda T. You can turn it around and say that if you see a solution with a term T to the K times E to the lambda T, that says that there is a Jordan block of size K plus one there. Did I say that right? Yes, K plus one. That's what it says.
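The polynomial-times-exponential structure of e^{tJ} can be checked directly. A sketch assuming scipy, with a hypothetical single 3x3 Jordan block with eigenvalue lambda: writing J = lambda I + N with N nilpotent, e^{tJ} = e^{t lambda} (I + t N + t^2 N^2 / 2).

```python
import numpy as np
from scipy.linalg import expm

lam, t = -0.5, 2.0
J = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])    # single 3x3 Jordan block

E = expm(t * J)
# e^{tJ} = e^{t lam} * (I + t N + t^2 N^2 / 2), where N is the shift part of J
expected = np.exp(t * lam) * np.array([[1.0, t, t**2 / 2],
                                       [0.0, 1.0, t],
                                       [0.0, 0.0, 1.0]])
print(np.allclose(E, expected))    # True: entries are polynomials times e^{t lam}
```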
So Jordan blocks are basically gonna be the matrix attribute which you're gonna associate with T to the K - these terms which are a polynomial times an exponential. Okay? And let's actually just look at one example just for fun, and then we're gonna quit. Let's look at X dot - I believe this might have come up yesterday in your section, so there. I allocated on the page enough for a four by four. There you go. Thank you. It's fixed. Let's look at that. What are the eigenvalues? What on earth have I done? My god. That was a terrible crime. There we go. Okay. But you didn't violate the eight-second rule or something like that. When I write something that stupid down, somebody should say something within some amount of time. Okay, fine. What are the eigenvalues?
Instructor (Stephen Boyd):All zero. Okay. So just based on that, what do you expect to see in the solution when you look at X? Constants, right? And someone says, "Anything else?" Now in this case, what do the solutions look like? That's a single Jordan block. You are gonna see solutions that look like this. E to the zero T, that's one. You're also gonna see T, T squared, and T cubed. The solutions of X dot equals AX for this thing are gonna be polynomials. Everybody cool on that? And they're polynomials of up to degree three. Now let's do one more thing. Let's change that so that it looks like this. Here's the block structure. What do you expect to see? Not expect - what would you see?
Instructor (Stephen Boyd):You don't have this, and you don't have that, but you do have this. And finally, if it was just this, if it's all zeros, you just see constants. And of course, that's correct, because for X dot equals zero, the solution is that everything is just constant. Okay? So you really wanna kinda understand all of this, but the real neural connection you need to make is that dynamically, Jordan blocks correspond to these annoying terms of the form T to the K times E to the lambda T. When you see those, it tells you there's a Jordan block somewhere in the neighborhood. Okay. So we'll quit here.
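The all-zero-eigenvalue example just discussed, a single 4x4 Jordan block with eigenvalue zero (a nilpotent shift matrix), can be verified in a few lines; this is a sketch assuming scipy. The first row of e^{tA} is exactly the polynomials 1, t, t^2/2, t^3/6, so every solution is a polynomial of degree at most three.

```python
import numpy as np
from scipy.linalg import expm

# Single 4x4 Jordan block with eigenvalue zero: ones on the superdiagonal
A = np.diag([1.0, 1.0, 1.0], k=1)

t = 3.0
E = expm(t * A)   # = I + tA + t^2 A^2/2 + t^3 A^3/6, since A^4 = 0
print(E[0])       # [1, t, t^2/2, t^3/6] = [1.0, 3.0, 4.5, 4.5]
```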
[End of Audio]
Duration: 74 minutes