Instructor (Stephen Boyd): Our main screen is not working today. For the first ten minutes, while they're desperately trying to get our big screen up and running, I'll say some things about the midterm to give them as much time as possible.
You can go down to the pad here. Let me say a couple of things about it. You're welcome to move to a seat where the monitor is more visible or something like that. There's probably plenty back there. If you can go down to the pad, I could make a couple of announcements.
The first thing is, well, I'm sure there is no one who doesn't know that the midterm is coming up. In fact, it's even possible we're gonna have an alpha tester take it tonight, which will be interesting. So it's coming along very well. It's, of course, at the end of this week. Let me say what it covers, just to remind you. It covers through homework four, that's the one you're working on now, and that will be printed on Thursday, and it will include through lecture eight, that's today's lecture. In fact, we're gonna finish lecture eight probably before the class is over. So it'll cover all material in all lectures up to there, including even material that we accidentally forgot to exercise you on in the homework. So there were some glaring omissions. That was just our fault, but it's still valid material; it's fair game for the midterm. Okay. Let's see. I'm gonna hold extra office hours on Thursday from 1:00 to 3:00. I could do it also today from 1:00 to 3:00 if anyone was gonna come by or something like that. Oh, a hand went up. Okay. Sure, I'll do it today too. Why not? There we go. So I'll be around both today and on Thursday from 1:00 to 3:00. Watch, I probably have some meetings scheduled today, but we'll see. No, I'll probably be there. Let's see. Those who are taking the course remotely via S.C.P.D., we would strongly encourage you, if you're local, to come and pick up the exam like everyone else and drop it off. That's what we'd really prefer to do. If, however, that's inconvenient or something like that, we will send you a PDF of the exam, but please send email to the T.A.s, or, sorry, to the staff address, to let us know when you would like to take the exam so that we can do that. Don't just sit there, wherever you are, waiting for it to arrive. And make sure you get a response from us saying, "Acknowledged.
We're sending you the exam on this date at this time." Let's see. Homework: we posted homework three solutions last night. We'll post homework four solutions; those are the ones you're working on now. And what we'll do is this. We'll post those Thursday evening. So Thursday you'll hand in homework four, and we'll post homework four solutions. Now in the past, we've always let a few people with, generally speaking, very, very good excuses, such as joining the class late or whatever, turn in a homework a bit late. Unfortunately, we won't be able to do that for homework four. So homework four, you hand it in, and within hours we're gonna post the solutions. That's Thursday evening. Okay. We did post last year's midterm just so that you get to see what a midterm looks like. I think as part of that you found out where homework problems come from, or how homework problems are born. They're often born as midterm and final exam problems. I also have a question for you. And the question is, when should we post the solutions for last year's midterm?
Instructor (Stephen Boyd): Now, okay. It includes, like, one or two problems on homework four, right? Or is it just gonna have one on it? One overlaps? Okay, fine, no problem. We'll post it now. Great. So that's fine. That suggests that people have actually looked at it.
Instructor (Stephen Boyd): Good. Well, we know some people have looked at it because we got requests to post the M-files required to do it. That's not absolute proof that people looked at it, but we'll take it as a good sign.
Okay. Any more questions?
Student: What time will the exam start?
Instructor (Stephen Boyd): I think we've posted that on the website. The question was, "When will the exam start?" I think you pick it up between 5:00 and 5:30 or something like that; is that right? But, again, you should never trust me. You should trust the website.
Oh, I do want to say thank you. We rearranged the website a little bit last week. And I guess I was in the middle of doing 50 things and didn't come back, and messed a few things up. And actually, we're very happy that people caught my mistakes very quickly, and we fixed them. So if you ever find anything that's off on the website, like a missing link or something like that, please do let us know, because often it's just because, well, we messed up. So thanks to those who corrected that last week.
Okay. Any other questions? If not, we'll continue our discussion of least norm solution. Now, come on, there's no way anybody can read that. Can you actually read that? No, okay. So there's a couple of things you could do. If you can't read it, you can move closer to a monitor, or you can extract just enough information out of this little TV to get a rough idea of where I am in the notes, and do it as sort of a correspondence. That's your other method. But, anyway, your choice. You're also free to just move somewhere where you can read it. So you can either crowd up here or go in back at one of those. I guess they're working on trying to get the big screen routed. Okay.
So, least norm solution. As I said last time, this is something like the dual of least squares approximate solution. In least norm solution we're studying the equation AX=Y, but in this case A is fat. And we're assuming it's full rank, so that means you have M equations that constrain a variable X. But you have fewer equations than unknowns, so you have extra degrees of freedom. What that means is that AX=Y actually has lots of solutions. It means the null space of A is more than just the zero vector. In fact, it's exactly N minus M dimensional, the null space. So there's a lot of freedom in choosing X. One particular X that satisfies AX=Y is the vector of least norm. That's the least norm solution, XLN, and it's given by the following formula: A transpose times (A A transpose) inverse times Y. It's easy to see it's a solution, because if you multiply this by A, you get A A transpose times (A A transpose) inverse times Y; the A A transpose and its inverse annihilate each other and you get Y. So you get a solution, that's clear.
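As a small numerical sketch of the formula just stated (not part of the lecture; the random 3 by 7 matrix and the variable names are assumptions made only for illustration), the following computes the least norm solution, builds a second solution by adding a null space vector, and compares the two.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 7                      # fat: fewer equations (m) than unknowns (n)
A = rng.standard_normal((m, n))  # illustrative data; a random Gaussian matrix is full rank with probability 1
y = rng.standard_normal(m)

# Least norm solution x_ln = A^T (A A^T)^{-1} y, using a linear solve instead of an explicit inverse.
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

# Any other solution differs from x_ln by a null space vector.
z = rng.standard_normal(n)
z_null = z - A.T @ np.linalg.solve(A @ A.T, A @ z)  # remove the part of z outside null(A)
x_other = x_ln + z_null

print("A x_ln = y:   ", np.allclose(A @ x_ln, y))
print("A x_other = y:", np.allclose(A @ x_other, y))
print("norms:", np.linalg.norm(x_ln), np.linalg.norm(x_other))
```

Both vectors solve the equations, and the second one should never come out with the smaller norm.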
This relies on the fact that if A is fat and full rank, A A transpose is invertible. That's a basic fact, and it's one you can now show easily using QR factorization. In fact, for all practical purposes, we're gonna do that ourselves in a few minutes.
Okay. So this is a least norm solution. It's a solution. Now, watch out, because it looks very similar to the least squares approximate solution. The formulas look the same. Everything looks similar. But be careful to sort out in your mind which is which, just because they look so dangerously close. So this X least norm is actually a solution of AX=Y, whereas XLS, which is (A transpose A) inverse times A transpose Y (and that formula is only for a skinny full rank matrix A), is generally not a solution of AX=Y. It is the X that minimizes the distance, or the error, or the residual, and it is generally not a solution of AX=Y, whereas this one here certainly is. Okay.
So this point, X least norm, essentially solves this optimization problem. It says among the vectors that satisfy AX=B (I don't know where the B came in, but AX=Y), you should take the one of minimum norm, and that's this optimization problem. The solution is unique and it is given by X least norm. Now, we can show this by direct argument; that's easy. Let's let X be any other solution of AX=Y. Well, then A times (X minus X least norm) is zero, because AX is Y and so is A times XLN. They're both Y, so you subtract them and get zero. And now let's calculate the inner product of X minus X least norm and X least norm. Well, you simply plug this in and do some matrix manipulations here. Here you have this thing transposed times A transpose. But the product of two transposes is the same as the product in reverse order, quantity transposed. So I write it this way. Now, this is gonna be zero because AX minus A times XLN is zero. And so actually, the right-hand side doesn't even matter. This vector is zero, so that's zero. That says that X minus X least norm and X least norm are perpendicular. Now, when two vectors are perpendicular, it means that if you want to calculate the norm squared of the sum, it's very simple. It's the sum of the norms squared of the individual components. Some people call that the generalized Pythagorean theorem or something. Anyway, it's nothing. You write out the formula for the norm squared of a sum and the cross term goes away. So if we write out X in a strange way, as X least norm plus (X minus X least norm), no one could argue with that. But this thing and this are orthogonal, and therefore the norm squared of the sum is the sum of their norms squared separately. So you get this thing plus that. Well, this thing, of course, is going to be non-negative. 
And you can see immediately that the norm squared of X is at least as big as the norm squared of X least norm. And since X was any solution of AX=Y, that tells you that any solution of AX=Y is gonna have a norm at least as big as X least norm. And this is the proof that X least norm, in fact, minimizes the norm among all solutions of AX=Y. So that's just sort of a direct argument. And the geometry is pretty easy to see.
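Written out in symbols, the direct argument above reads as follows. This is only a transcription of the spoken steps, with x any solution of Ax = y and x_ln = A^T (A A^T)^{-1} y.

```latex
\begin{align*}
A(x - x_{\mathrm{ln}}) &= y - y = 0,\\
(x - x_{\mathrm{ln}})^T x_{\mathrm{ln}}
  &= (x - x_{\mathrm{ln}})^T A^T (AA^T)^{-1} y
   = \bigl(A(x - x_{\mathrm{ln}})\bigr)^T (AA^T)^{-1} y = 0,\\
\|x\|^2 &= \|x_{\mathrm{ln}} + (x - x_{\mathrm{ln}})\|^2
   = \|x_{\mathrm{ln}}\|^2 + \|x - x_{\mathrm{ln}}\|^2
   \;\ge\; \|x_{\mathrm{ln}}\|^2 .
\end{align*}
```

Equality holds only when x - x_ln = 0, which is why the minimizer is unique.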
You consider the set of vectors that satisfy AX=Y. Now, I mean, this is silly because it's in R2, and here this is a one-dimensional set; it's an affine set. In general, it's just an affine set, with a dimension which is N minus M in the general case. And so you can imagine that as a plane or something if this is in R3 with just one equation. It's a plane. And then you're asked to find the one of least norm. That's the point on that plane or hyperplane or affine set which is closest to the origin. It's the one of least norm. And that's this one here. And you can see if you shift this, you get the null space of A. That's sort of the parallel part of AX=Y. It's shifted to the origin. And you can see, in fact, just visually here that X least norm is actually gonna be orthogonal to the null space of A, and that's this orthogonality condition. And of course, you can have a projection interpretation. X least norm is the projection of the point zero on the solution set of AX=Y. So that's it. Okay.
Now, this formula, A transpose times (A A transpose) inverse, that's also the pseudo-inverse. But this is the pseudo-inverse of a full rank fat A. So the symbol dagger, I guess, has two overloadings. It's overloaded and it applies in two contexts. A dagger applies when the matrix A is skinny and full rank, in which case A dagger means (A transpose A) inverse times A transpose, and it's associated with least squares approximate solutions. You also now have an interpretation, or a definition, of A dagger when A is fat and full rank, in which case it's A transpose times (A A transpose) inverse. And it's actually something that gives you the least norm solution. So that's A dagger. By the way, in about three weeks we will complete the overloading of dagger. I think the machine just turned all the way... okay, gonna reboot it. Or does that mean you're giving up? Okay. No, sounds like it's... yeah, it's reboot minus H, that's hard. Okay. So in a couple of weeks we're gonna complete our overloading of A dagger, and we're actually gonna assign a meaning to A dagger for any matrix except the zero matrix. So all non-zero matrices will actually have a pseudo-inverse. Only zero will not.
Hey, great. Yeah, great. Thank you. Okay. Great, all right. So we'll get to that. But for the moment, the only contexts in which you know about the pseudo-inverse are full rank matrices. So all full rank matrices have a pseudo-inverse. They have different formulas that apply in different contexts. That's what overloading means. Okay.
Now, this matrix, A transpose times (A A transpose) inverse, that's a right inverse of A, we know that. I minus A transpose (A A transpose) inverse A gives projection onto the null space of A. By the way, this matrix alone, A transpose (A A transpose) inverse A, gives projection onto the orthogonal complement of the null space of A. Okay. Now, the formulas for a full rank skinny matrix are not the same. The analogous formulas are something like this: A dagger, or the pseudo-inverse, or I guess in the U.K. the Moore-Penrose inverse, is (A transpose A) inverse times A transpose. And that's a left inverse, and interestingly, in this case A times (A transpose A) inverse times A transpose is projection onto the range of A. So the anomaly you see is the 'I minus' here. That's the anomaly essentially, so that's it. Okay. So do watch out for these. I always check. My mnemonic is real simple. Let me see if I can do it right. I'll try to draw it right. If you see this, everything is cool. You know what I mean by that? So fat times skinny, inverse. Well, sorry, it's not cool, but it's not obviously uncool. Okay. This is always trouble. See that? That is never cool, ever. Okay. Oh, and by the way, of course if these multiplied out and became non-square, that's super uncool, because that's a syntax error. Okay. So my mnemonic is this, and you might ask, really? You mean when I'm working and doing stuff I actually... yes, I do. So I draw this picture. I don't let anyone see it, you know, because it's embarrassing a little bit. But this is what I do, okay. That's cool. That is totally uncool. Not totally. Totally uncool is this times... see if I can get it right. There we go. See that? That's really uncool. Okay. By the way, I think now you should be able to read the little note on the course website that's called Crimes Against Matrices, so you should just read it. Should make sense. Okay.
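The inverse and projection formulas above, and the shape mnemonic, can be checked numerically. This is a sketch on assumed random data (the matrices `A` and `B` and all variable names are illustrative, not from the lecture); explicit inverses are used only to mirror the spoken formulas.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))   # fat, full rank (illustrative)
B = rng.standard_normal((6, 3))   # skinny, full rank (illustrative)

A_dag = A.T @ np.linalg.inv(A @ A.T)   # fat case: right inverse of A
B_dag = np.linalg.inv(B.T @ B) @ B.T   # skinny case: left inverse of B

P_null = np.eye(6) - A_dag @ A    # projection onto null(A)
P_perp = A_dag @ A                # projection onto the orthogonal complement of null(A)
P_range = B @ B_dag               # projection onto range(B)

print("A A_dag = I:", np.allclose(A @ A_dag, np.eye(3)))
print("B_dag B = I:", np.allclose(B_dag @ B, np.eye(3)))
print("A P_null = 0:", np.allclose(A @ P_null, 0))
print("P_range B = B:", np.allclose(P_range @ B, B))

# The "uncool" product: skinny times fat gives a 6x6 matrix of rank 3, so it is singular.
rank_uncool = np.linalg.matrix_rank(A.T @ A)
print("rank of A^T A for fat A:", rank_uncool, "out of", A.shape[1])
```

The last line is the point of the picture: a skinny matrix times a fat one passes the dimension check but can never be inverted.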
Well, let's see how the solution connects to QR factorization. It does. A is fat and full rank, therefore A transpose is skinny and full rank. And that means that when you do the QR factorization of a skinny full rank matrix, here is what it looks like. You're gonna have A transpose is Q and then R, and R is square and it's invertible in this case. So it's non-singular, R. And it turns out, you work out the formulas. You just plug in QR for A transpose, so A is R transpose Q transpose, and you just plug in the formulas and let things cancel, carefully. You should do this yourself. I'm not gonna do it now. Carefully let things cancel, watching out for the usual things. Like Q transpose Q, that's I, but Q Q transpose is not. When you do this carefully, you find out, not surprisingly, that this A dagger works out to be nothing but Q R minus transpose, that is, Q times R inverse transpose, like that. So that's what it works out to be. And I forget what the formula is for the least squares one, but it's very similar. Does anyone remember? Maybe it's R inverse Q transpose. It's something like this. This is from a few lectures ago, in the context of least squares. Is that it? You have the notes there. Is that right? Yeah, so [inaudible] close. Okay. And, you know, after a while you're gonna get used to these things where these things look similar, but the order is different and some things are transposed and all that sort of stuff. So it's why you have to be careful. Okay. Oh, and the norm of the least norm solution is simply the norm of R minus transpose Y. So that gives you, in fact, the norm. Okay. So that's the idea.
Okay. Now, what we want to do is go up in abstraction to the parent of both least norm and least squares. It's useful to know because they're obviously deeply related. Let's see how they're related.
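A hedged sketch of the QR route described above, on assumed random data: factor A transpose as QR with NumPy's reduced QR, then form Q times R inverse transpose times y with a triangular-system solve. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 7))   # fat, full rank (illustrative data)
y = rng.standard_normal(3)

# Reduced QR of the skinny matrix A^T: Q is 7x3 with orthonormal columns, R is 3x3 upper triangular.
Q, R = np.linalg.qr(A.T)

w = np.linalg.solve(R.T, y)       # w = R^{-T} y
x_qr = Q @ w                      # x_ln = Q R^{-T} y

# Reference value from the formula A^T (A A^T)^{-1} y.
x_formula = A.T @ np.linalg.solve(A @ A.T, y)

print("same solution:", np.allclose(x_qr, x_formula))
print("norm of x_ln:", np.linalg.norm(x_qr), " norm of R^{-T} y:", np.linalg.norm(w))
```

Because Q has orthonormal columns, multiplying by Q does not change the norm, which is why the two printed norms agree.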
Well, we'll start by handling the least norm problem and solving it in a more conventional way. If you want to minimize X transpose X, that's of course the norm squared, subject to AX=Y, the standard method, I guess since the early 19th century, actually earlier than that, is to do the following. You take the objective and to that you add a Lagrange multiplier times the constraint. So here is a vector of constraints and we take a vector multiplier lambda. By the way, I don't mean for it to be obvious how all these Lagrange multipliers work. To tell you the truth, I never understood it myself. In fact, it's generally taught as a set of behaviors, right, that a monkey can do. I guess it's generally taught, like, in high school. No one has a clue what it means, what the pictures are or anything. Is that correct? Did anyone, like, draw pictures of this that anyone understood? Actually, how many people have seen Lagrange multipliers for constrained optimization? For how many was it taught as simply a set of behaviors, this is what you do? Wait, does that mean that the rest of you actually understand it? No, it's possible. Maybe things have changed since I was subjected to this. Okay. All right. Anyway, I don't mind saying, I never understood it until, well, a while ago. So here I'm not going to go into it. We're just gonna say, here's what you do. You form this Lagrangian like this, and then the optimality conditions are that the gradient of this with respect to both X and also with respect to lambda should vanish. If I take the gradient of this with respect to lambda, I get AX minus Y, and I find that should vanish. Well, that was really super duper useful because it tells me that the optimal solution must satisfy AX=Y. 
Well, I knew that because that was a constraint. Okay. So this was not exactly informative. Over here, though, it's actually very interesting. If I take the gradient with respect to X, I find out that it's 2X, that's the gradient of this. And that's why, by the way, a lot of people will just put in a 1/2 here just to clear the twos out of formulas and things. So you'll see that. You get the gradient of that, and the gradient of this thing with respect to X is actually A transpose lambda. So we get 2X plus A transpose lambda is zero. Well, that's interesting. So you solve that. And it says that X is minus 1/2 times A transpose lambda. Let's take this and plug it into this, which was hardly a revelation, AX=Y, and you get a formula for lambda. So lambda is minus 2 times (A A transpose) inverse times Y. Now, I take this lambda and I plug it right in there and I have my final solution, which is this. So we've re-derived by a mysterious method the same thing we derived by direct algebra three pages ago. Okay.
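For reference, the Lagrange multiplier steps just described can be written out as below; this only transcribes the spoken derivation into symbols.

```latex
\begin{align*}
L(x,\lambda) &= x^T x + \lambda^T (Ax - y),\\
\nabla_x L &= 2x + A^T\lambda = 0
  \;\Longrightarrow\; x = -\tfrac{1}{2} A^T \lambda,\\
\nabla_\lambda L &= Ax - y = 0
  \;\Longrightarrow\; -\tfrac{1}{2} AA^T \lambda = y
  \;\Longrightarrow\; \lambda = -2\,(AA^T)^{-1} y,\\
x &= -\tfrac{1}{2} A^T \bigl(-2\,(AA^T)^{-1} y\bigr) = A^T (AA^T)^{-1} y .
\end{align*}
```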
So this is just to do this because we're gonna use Lagrange multipliers to look at the general case. So let's do some examples of least norm. This is a stupid and silly one, but that's a good way to start. We go back to our mass, and we're gonna apply forces on it for ten consecutive one-second periods. And we're interested in the position and the velocity at the end of ten seconds. So you have Y=AX where A is 2 by 10, and A is fat. I think you even should remember some of the entries in A. I think in the top row of A the entries are shrinking as you go along it, and in the bottom one they're all ones or something like that. Okay. And we're gonna find the least norm force that transfers the mass unit distance with zero final velocity. So it's got to take the mass, accelerate it, and then decelerate it over here. Although we leave open the possibility that the right thing to do would be to take the mass and move in the other direction first. I mean, that doesn't sound too plausible, and it's actually not the case, but anyway we're leaving that open. We don't require it to simply move the... although it does. Okay. Now, when you work out the solution (in fact, this one has an analytic solution), it turns out you should apply a force that's an affine function of the time, or of the discrete time. So basically in the first second you push it hard, then less hard, less hard, and right around T equals five you switch from pushing to pulling, very neatly. So for five seconds you accelerate the mass, although you push less hard later. And we can anthropomorphize this easily. Why is the least norm solution doing this? Why would you push harder at first than later? What's that?
Instructor (Stephen Boyd): Just a vague... this is gonna be a hand-waving answer, but you just need a vague one. Why would you push harder at first? Why shouldn't it just be like this? Why shouldn't you just push hard and then pull?
Instructor (Stephen Boyd): That's it. That's it exactly. Okay. So it is more efficient in terms of meters per Newton to push early on. That's what it is. So this weights the force with the efficiency. You're pushing harder at first because you get more meters per Newton of push at the beginning, okay. And then it's symmetrical, so you accelerate and you decelerate like that, and that's the picture. Okay.
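A minimal sketch of the mass example. The entries of A are an assumption: a unit mass starting at rest at position zero, with force x_j held constant over the j-th second, which gives a top row of 9.5, 8.5, ..., 0.5 (position at t = 10) and a bottom row of ones (velocity at t = 10). That matches the "shrinking top row, all-ones bottom row" recollection above but is not stated explicitly in this part of the lecture.

```python
import numpy as np

j = np.arange(1, 11)                       # the ten one-second periods
A = np.vstack([10.5 - j, np.ones(10)])     # assumed model: row 0 -> final position, row 1 -> final velocity
y = np.array([1.0, 0.0])                   # move one unit, end at rest

# Least norm force sequence x_ln = A^T (A A^T)^{-1} y.
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

for t, f in zip(j, x_ln):
    print(f"second {t:2d}: force {f:+.4f}")
print("reaches target:", np.allclose(A @ x_ln, y))
```

Under this model the printed forces decrease in equal steps, are positive for the first five seconds and negative for the last five, which is the affine, symmetric picture described in the lecture.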
As long as we're on this one topic, I'm gonna ask you a couple of other questions just for fun. I think once before I admitted publicly that least squares type objectives, and in particular the sum of the XI squared here, the sum of the forces squared, generally speaking are of no particular practical relevance. It's generally not what you want to do, right? So thrusters don't come with a box on the label or a tag hanging off the side that says, "No matter what you do, do not apply a signal whose sum of squares is more than this." They don't come that way. Okay. The way they really come is they have things like this: there's a maximum force you can apply, or there's an amount of fuel you use. Now, by the way, these have names. This is just for fun. All right. But just to let you know a little bit about this. The infinity norm, I think we encountered this once. It's the maximum of the absolute values. So in fact the way you would say this, for example, in electrical engineering is, it's the peak of the vector. If that is a signal, that's the peak of the signal, in absolute value. That's a norm. And there's also the one norm, which is the sum of the absolute values. Now, this one here tells you essentially how big a thruster you actually need to apply the forces. This norm actually is a very good first order approximation. For example, if you really were using thrusters to position this mass, this would be something related to fuel use, because that's generally how it works. Fuel use is generally proportional to the force that you apply. Okay. You can have more complicated things, but for a thruster, that's a pretty good approximation. Okay. Now, by the way, our good old friend the Euclidean norm in this context inherits a two at the bottom so that you can distinguish it. These are norms. These are all three norms. 
They all three measure how big a force program is. This one measures it by the peak, this measures it by essentially the sum of the absolute values, which you can think of as fuel usage. This measures it by the sum of the squares, which we often say is energy, and that's mostly to hide the fact that in fact we don't really care about this. This is what's easy to do mathematically. Okay. That's the real reason.
Now, I have a question for you. I would like to know the following. Suppose, instead of minimizing this over moving a mass one meter, I'd like to know what happens if you minimize the maximum, and I want you just to guess. What do you think is the optimal thing to do? We can call this the gentlest transfer because I'm applying the smallest maximum force to the mass. This one you could call the minimum energy transfer. That's what we just worked out here. And I want to know, what's the gentlest transfer?
Instructor (Stephen Boyd): Exactly. I don't know the level, but it's whatever it has to be. It's gonna be this. You're gonna apply a constant force up to five. You're gonna constantly accelerate until five seconds, at which point you will decelerate like that, with the same force. Okay. There's a name for this. This is very famous. It's called bang-bang control, for obvious reasons. It's always up at the limit each time. And let me ask you this. You all use disk drives constantly, and in a disk drive, what happens is the little thing is sitting there, track 23, and a command comes in to seek track 125, and you have to move it there. Okay. I got news for you. That's this problem, okay? And you have to do it, by the way, in a handful of milliseconds. Once you get there, you have to get rid of all the shaking and stuff like that. You have to be tracking something within microns or less. This is serious stuff. Okay. What do you think the current signal in a disk drive head positioning system looks like? Does it look like this, or does it look more like that? Just guess. What's that? Yeah, the answer is, it looks much more like this. Actually, it's not sharp like that. It's got a little bit of a rounded thing there because it's a little bit more complicated, and it's taking into account all sorts of other vibration modes and stuff like that. But basically it looks like that. Why? Because the amplifier will source or sink a maximum amount of current, and the goal is to seek as fast as possible. So your goal is not to minimize the sum of the squares of the currents in your thing. By the way, if you're worried about power, the power is closer to this in a disk drive, so okay.
Now, let me ask you this. How about this one? We worked out what the gentlest... well, I don't know if you'd call that gentle. But the gentlest transfer, in terms of the maximum force you ever apply on the mass, is this. What about the most fuel-efficient? Again, just go ahead and take a guess. People in aero-astro could probably guess this. If you've actually studied how satellites are, for example, moved back on orbit, then you might know. Any guesses? You have a guess. What's your guess?
Instructor (Stephen Boyd): What's that?
Instructor (Stephen Boyd): You got it. So the optimal here is a giant force there. And... oh, that's not right. There. So the X that minimizes the sum, which would be something like the fuel use, is gonna be this. It's an impulse. Well, I mean, this is silly. It's not an impulse. It lasts for a second. You do a fuel burn at the beginning, and what that does is it just accelerates the mass. And then this is actually called the ballistic phase in the middle. Ballistic means it's just moving with no forces on it other than gravity or whatever, and in this case there is no gravity. So it's just floating along. And then right in the last second, you apply a counteracting braking force. And this minimizes the fuel. Okay. And by the way, you'll see this if you actually look at a satellite or something like that positioning itself. You'll see little puffs come out. You'll see, like, little puff, puff, puffs come out one side and then a little bit on the other side and stuff like that. That's exactly this. I can tell you have absolutely no idea what I'm talking about, but that's fine. Okay.
So all of this was an aside. If you want to learn about these things, then you'd learn about this stuff in 364, which is probably not exactly the top thing on your mind at this moment. But it turns out that you can actually solve these things, not with analytical formulas, but it's totally straightforward to actually work them out. Okay. Any more questions about this? Okay.
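To make the three ways of measuring a force program concrete, here is a sketch that scores the three force shapes described in this aside (least norm, bang-bang, and burn-coast-brake) under the peak, one norm, and two norm. The 2 by 10 matrix is an assumed model (unit mass from rest, force constant over each second), and the code only checks that each candidate reaches the target and compares their norms; it does not prove that the bang-bang or burn-coast-brake shapes are optimal, which is what the lecture asserts.

```python
import numpy as np

j = np.arange(1, 11)
A = np.vstack([10.5 - j, np.ones(10)])   # assumed model: final position row, final velocity row
y = np.array([1.0, 0.0])                 # unit displacement, zero final velocity

# Candidate 1: least norm (minimum sum of squares).
x_l2 = A.T @ np.linalg.solve(A @ A.T, y)

# Candidate 2: bang-bang, constant push for 5 s then constant pull, scaled to reach position 1.
shape_bang = np.r_[np.ones(5), -np.ones(5)]
x_bang = shape_bang / (A[0] @ shape_bang)

# Candidate 3: burn in the first second, coast, brake in the last second.
shape_imp = np.zeros(10)
shape_imp[0], shape_imp[-1] = 1.0, -1.0
x_imp = shape_imp / (A[0] @ shape_imp)

candidates = {"least norm": x_l2, "bang-bang": x_bang, "burn-coast-brake": x_imp}
for name, x in candidates.items():
    print(f"{name:17s} feasible={np.allclose(A @ x, y)}  "
          f"peak={np.abs(x).max():.4f}  fuel(1-norm)={np.abs(x).sum():.4f}  "
          f"2-norm={np.linalg.norm(x):.4f}")
```

Among these three candidates, each one comes out best under exactly one of the three measures, which is the contrast the aside is drawing.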
So the next thing I want to do is make some connections between least squares, actually regularized least squares, and least norm solutions. And the way they connect is this. Suppose we have a fat full rank matrix. Let's imagine now a two-objective problem, and it looks like this. J1 is norm of AX-Y squared, and J2 is norm X squared. Well, the least norm solution basically requires that you be a solution, so it requires AX=Y. It says minimize J1 absolutely to the limit, and among those, it minimizes J2. So in a tradeoff plot, the least norm solution is one endpoint on the tradeoff curve between these two. The other endpoint, by the way, is X equals zero, which is not very interesting, but still, it's the other point. Okay. Now, let's imagine doing this. Let's take a weighted sum objective, which is J1 + μJ2, like this, and let's minimize it. This is norm of AX-Y squared plus μ times norm X squared, and the solution to that, Xμ, is (A transpose A + μI) inverse times A transpose Y. Now, by the way, when A is fat and someone writes A transpose A, first of all, your heart rate should increase slightly. You should start breathing sort of shallow breaths and things like that, and why is that? If you have a fat matrix and someone writes "A transpose A," your vocal cords should get ready to cry out in protest. Your autonomic response should be triggered. What am I talking about? Do you know what I'm talking about? Yeah, good. Okay. That's all. Because when someone has a fat matrix and writes... yeah, is that right? Yes. Then this is the product that passes the syntax scan, but you're just waiting. Especially if you see that left bracket there, that's when you should be like, [Makes Noise], like that. But everything is fine here because of this, the plus μI. Okay. So that's all I'm saying. Okay. 
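A quick numerical look at the regularized formula, on assumed random data with illustrative names. For positive μ the matrix A^T A + μI is invertible even though A is fat, and the sketch also checks the standard identity (A^T A + μI)^{-1} A^T = A^T (A A^T + μI)^{-1}, which moves the inverse onto the small, well-behaved matrix. The distance between the regularized solution and the least norm solution is printed for a few values of μ.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 7
A = rng.standard_normal((m, n))   # fat, full rank (illustrative data)
y = rng.standard_normal(m)

x_ln = A.T @ np.linalg.solve(A @ A.T, y)   # least norm solution

def x_reg(mu):
    """Minimizer of ||Ax - y||^2 + mu ||x||^2 for mu > 0."""
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ y)

def x_reg_small(mu):
    """Same vector via the m x m system: A^T (A A^T + mu I)^{-1} y."""
    return A.T @ np.linalg.solve(A @ A.T + mu * np.eye(m), y)

mus = [1.0, 1e-2, 1e-4, 1e-6]
gaps = [np.linalg.norm(x_reg(mu) - x_ln) for mu in mus]
for mu, gap in zip(mus, gaps):
    print(f"mu = {mu:g}: distance to least norm solution = {gap:.2e}")
print("two forms agree:", np.allclose(x_reg(0.1), x_reg_small(0.1)))
```

The second form makes sense even at μ = 0, where it is exactly the least norm formula; the first form does not, since A^T A is singular for a fat A.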
So this is actually cool, although it's very close to something that's not cool. And it's only cool if μ is positive. It's really not cool if μ is zero here. I mean, really not. Okay. So now what happens is we're gonna let μ go to zero. That says I care less and less about the size of x. Now, when μ is zero, if someone just walks up to you and says, "Please minimize J1," someone can legally hand you back any solution of Ax = y. So if someone hands you back two solutions of Ax = y and the specs only call for minimizing J1, that's absolutely valid. Because someone says, "Boy, that's crazy. Someone else gave me this solution of Ax = y where x is much smaller." And you go, "Sorry. I checked the specs. I didn't see any mention of the norm of x." So minimizing just J1, there are lots of solutions and, in fact, any solution of Ax = y does the trick, big, small, or otherwise. The minute you put in μ here, for example, even if it's 10 to the -8, now there's a difference between the two. So if you now find a solution of Ax = y with a big norm x, you're gonna pay slightly more, and therefore, as long as μ is positive, it's gonna show up in the composite objective. So what that tells us is that as μ goes to zero, x sub μ should go to x least norm. And, in fact, that's exactly what happens. Now, you want to be super careful here, because as μ goes to zero, this matrix becomes singular. So you want to be very careful. That's essentially a denominator going to zero. That's what it is. So you're gonna have to be very, very careful here. And it turns out it's not that hard to show. It turns out that for a full rank fat matrix A, as μ goes to zero, A transpose A plus μI, inverse, times A transpose goes to A transpose times A A transpose inverse. So it actually converges to that. And it's not too hard, but it's a little bit tricky in the sense that you don't simply plug in μ equals zero. Because if you plug in μ 
equals zero, the left-hand formula doesn't even make sense, because you're inverting something which is not invertible, okay? Nevertheless, this is the case. So that's the connection between those two. That explains one of the points on those tradeoff curves.
And now we're gonna go to the parent of both least squares and least norm, because it's not bad to know it. So here is the common parent. The common parent is minimize norm Ax minus b subject to Cx = d. So minimize the norm of an affine function subject to a linear equality constraint. So that's the parent of both of them. And let's see. So in this problem, how would I recover, for example, least squares? Well, you just forget the constraint. How do I make this into least norm? What would I choose to make this a least norm problem? This thing. I'd take A = I and b = 0. If I take A = I and b = 0, that's a general least norm problem, because then I'm minimizing just norm x subject to some linear equations. Okay.
So how do we solve this? Well, as usual we square the norm, because minimizing the norm is the same as minimizing the square. And when you minimize the square, it's nice because we have a nice formula for the square in terms of inner products. Then that ½ goes in front. Why? Because it makes all the formulas prettier, because we're gonna differentiate, basically, a square, and we didn't want the two polluting all our formulas. So this is what we do. You form a Lagrangian now. That's the objective plus λ transpose times Cx minus d. That's this Lagrangian. And then we expand everything out and it looks like that. So this term is from that first term there, that cross term is from here. This term is the third term from here, and then these are the two terms there. Now, one of these, the gradient with respect to λ being zero, just recovers our equality constraints. It's not interesting. 
The other one says that the gradient with respect to x of the Lagrangian, that's A transpose A x minus A transpose b plus C transpose λ, is zero. That's actually a real equation right there. Now, you can actually solve all of these equations. I'm gonna do it on the next page, but it's not pretty. And it turns out there's a better way to do this. It's to write it as a joint equation in both x and λ. So we're gonna do that. This top equation is A transpose A times x plus C transpose times λ. That's this term and this term. Equals, and then this goes over to the right-hand side, and you get A transpose b. This equation, Cx minus d = 0, well, that's really just the constraint. I write that down here this way, as C times x plus 0 times λ equals d. So you get this equation here. That's a square matrix, and it's a very famous matrix that comes up in lots and lots of contexts all over the place. It comes up in, like, economics and, oh, tons of areas. I mean, this form of matrix. Now, if this matrix is invertible, we get the solution immediately, and that's this. It's x and λ. So both the optimal x and the optimal λ, you get them simultaneously, and it's given by simply, well, obviously it's the inverse of this matrix times that. Okay. And now, I actually strongly recommend that this is the one you should keep in mind. It's the right one. By the way, some people call this a primal dual formulation, and I can say why. x is thought of as a primal variable here, and this Lagrange multiplier is a dual variable. And so in this formulation, you're really jointly finding both the primal and the dual variables. I mean, that doesn't matter, but I'm just saying that's what this is. Now, this will recover all of our forms. So this is the common parent of both least squares and least norm. And you can recover all of our formulas. So for example, if A transpose A is invertible, that means, of course, that A has to be skinny and full rank. 
Then you can actually block solve these equations here. So what you do is, if A transpose A is invertible, I multiply this equation by A transpose A inverse and I get x equals, you know, A transpose A inverse times A transpose b, and so on. That's here. You get this formula for x in terms of λ. Now, you take this x and you plug it back into Cx = d, and you get this equation. And now you can get λ. λ is this. It's C times A transpose A inverse times C transpose, all inverse, times this thing. And now finally you go back to this formula. It gives you x in terms of λ, and you get that. So actually, really, it's your choice. You can remember this one here or that. I mean, of course they're the same thing. This is just working out in detail what solving a block two by two system gives you. Okay. So this is the picture. You can check, by the way, if you go back to the original parent problem here. It recovers everything, absolutely everything. So for example, if A is I and b is 0, you can go down here and plug this into the horrible formulas down here. If you have b is 0, a lot of things simplify, right? That goes away, that goes away. If A is I, all these things go away. And I think, yeah, sure, it looks like, except I'm seeing a... no, I'm not seeing a minus sign. There's a minus here and a minus there that cancel each other, and you're recovering it. So it does kinda recover all the equations. This is useful. I think we made a terrible mistake and didn't assign any homework problems that required this. Is that true? I think it's true that we failed to assign any homework problems that use this. We just kept to least norm and least squares type things. But you should know this. Okay. So that finishes up all the material that will be on the mid-term. And it finishes up, in fact, the first, I don't know, 40 percent of the course or something like that. 
So that finishes up a whole block. I'm gonna start the next material because we're actually in a very good position. Sometimes we don't finish the material until, like, Thursday.
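The limit described above, regularized least squares approaching the least norm solution as μ goes to zero, can be checked numerically. The following is a minimal sketch, not part of the lecture: the matrix A and vector y are made-up illustration data, and the helper name x_reg is hypothetical.

```python
import numpy as np

# Hypothetical fat, full-rank matrix (2 equations, 4 unknowns) and right-hand side.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, -1.0]])
y = np.array([1.0, 2.0])

# Least-norm solution: A^T (A A^T)^{-1} y
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

def x_reg(mu):
    """Minimizer of ||Ax - y||^2 + mu * ||x||^2, valid for mu > 0."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ y)

# Distance between the regularized solution and the least-norm solution
# for a decreasing sequence of mu values.
mus = [1.0, 1e-2, 1e-4, 1e-6]
gaps = [np.linalg.norm(x_reg(mu) - x_ln) for mu in mus]
for mu, gap in zip(mus, gaps):
    print(f"mu = {mu:.0e}   ||x_mu - x_ln|| = {gap:.2e}")
```

Note that x_reg is never called with mu = 0: for a fat A the matrix A transpose A is singular, so the formula only makes sense for positive μ, which is exactly the caution given in the lecture.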
Student:[Inaudible] least squares problem?
Instructor (Stephen Boyd):Oh, how do you recover the least squares problem? Well, there's actually a couple of ways to do it. So the simplest way is to just not have C there. And I believe it will actually work in that case. So you make C an empty matrix, whatever that is. So, yeah, it works. Look. I actually can't pretend C is zero. That won't work, because this matrix won't be invertible, because it will have rows down here that are all zero. So what we have to do is C is null, so it's not even there. If C is not here, you do get this, right? You get this thing inverse times A transpose b, and it looks good to me. I mean, it's not totally straightforward, but that's the right thing to do when C is null as opposed to being zero. Are you buying that? No, you're not. What part of it are you not buying?
Instructor (Stephen Boyd):Sorry. How does what? Oh, you mean up above? Oh, yeah, that's easy. Let's go back to that. Oh, how did... I've lost it. There it is, okay. So here, if you want to make this the least squares problem, all we do is we eliminate that. That's least squares. Okay. Now are you buying my other one? Okay. Good, great. Any other questions about this material?
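The block system for the parent problem, and the way it recovers the least norm formula, can be sketched as follows. This is an illustration under assumptions, not the lecture's own code: the function name constrained_least_squares and all the numerical data are hypothetical.

```python
import numpy as np

def constrained_least_squares(A, b, C, d):
    """Minimize ||Ax - b|| subject to Cx = d by solving the block system
    [A^T A  C^T; C  0] [x; lam] = [A^T b; d] for x and lam jointly."""
    n, p = A.shape[1], C.shape[0]
    K = np.block([[A.T @ A, C.T],
                  [C, np.zeros((p, p))]])
    rhs = np.concatenate([A.T @ b, d])
    sol = np.linalg.solve(K, rhs)  # raises LinAlgError if K is singular
    return sol[:n], sol[n:]

# Hypothetical data: A is skinny and full rank, so A^T A is invertible.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0],
              [2.0, 0.0, 1.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])
C = np.array([[1.0, 1.0, 1.0]])
d = np.array([1.0])

x, lam = constrained_least_squares(A, b, C, d)

# Same answer from block elimination:
#   lam = (C G C^T)^{-1} (C G A^T b - d),  x = G (A^T b - C^T lam),
# where G = (A^T A)^{-1}.
G = np.linalg.inv(A.T @ A)
lam_elim = np.linalg.solve(C @ G @ C.T, C @ G @ A.T @ b - d)
x_elim = G @ (A.T @ b - C.T @ lam_elim)

# Special case A = I, b = 0 recovers the least-norm solution of C2 x = d2.
C2 = np.array([[1.0, 1.0, 1.0],
               [1.0, -1.0, 0.0]])
d2 = np.array([1.0, 0.0])
x_special, _ = constrained_least_squares(np.eye(3), np.zeros(3), C2, d2)
x_least_norm = C2.T @ np.linalg.solve(C2 @ C2.T, d2)
```

In the special case the two minus signs cancel exactly as described: λ comes out as minus (C C transpose) inverse d, and x is minus C transpose λ.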
Student:Just one more question.
Instructor (Stephen Boyd):Yep.
Student:Isn't there a diagram about [inaudible]?
Instructor (Stephen Boyd):Yep, that's back here somewhere. I'll find it. I've lost it. Here it is. There you go.
It was what?
Instructor (Stephen Boyd):Oh, sorry. Did you mean this for the mass? No. Oh, do you mean the geometric picture? Okay. I'll draw it again because it's gonna be faster than my finding it. Okay. So, you know, the pictures are somewhat unexciting, right, because they're generally in R2. So here's the set of x such that Ax = y, I guess we use here. Like that. Okay. All these points satisfy Ax = y. I mean, this is silly because capital A is actually little a transpose, and there is little a. Okay. So that's what it looks like. So any point on here satisfies Ax = y. This point right here is the point of closest approach to the origin. That point actually has least norm. And this point would be x least norm for this problem.
Student:And how did you get the null space of A?
Instructor (Stephen Boyd):Oh, and how did I get the null space of A? Well, the null space of A in this case is this. And I can do that several ways. Capital A is little a transpose. A is a row vector here. And little a is the normal of this hyperplane. So if you look at all the points that are orthogonal to a, it's this line right here, okay. Now, there's another way to see it. This is the solution set of Ax = y. And the point there is that if one person has an x, and another person has an x, and they both satisfy Ax = y, the one thing you can be absolutely sure of is that the difference is in the null space. And, in fact, that's if and only if. You know, so in other words, if one person has a solution and you have an element of the null space, you add it, you get a new solution.
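The geometric picture can be checked on a small made-up instance. The sketch below assumes a hypothetical row vector A = [1 2] and right-hand side y = 4; the vector z is chosen by hand so that Az = 0.

```python
import numpy as np

A = np.array([[1.0, 2.0]])  # hypothetical row vector: one equation, two unknowns
y = np.array([4.0])

x_ln = A.T @ np.linalg.solve(A @ A.T, y)  # least-norm solution
z = np.array([2.0, -1.0])                 # A z = 0, so z is in the null space of A
x_other = x_ln + 3.0 * z                  # solution plus a null space element

print("A x_ln    =", A @ x_ln)
print("A x_other =", A @ x_other)
print("norms     =", np.linalg.norm(x_ln), np.linalg.norm(x_other))
print("x_ln . z  =", x_ln @ z)
```

Both points solve Ax = y, their difference lies in the null space, and the least norm point is orthogonal to the null space direction, which is why it is the point of closest approach to the origin.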
So that sort of makes sense here, because it says that when you're moving in this direction, you're really moving in the null space. And so that's another way to understand why the null space would be the same thing but translated to the origin. Okay. So my claim is you know quite a lot now. And it's not that much math in it, but it's not trivial. You know a fair amount. And these methods, maybe you're convinced, maybe not, you can already do serious things with them. You can do all sorts of stuff that you could not do by some heuristic or hacking method. Just with least norm, least squares, throw in a little regularization, a little multi-objective, throw in a smoothing parameter, you'd be surprised what you could do. That's you, of course, and computers and high quality open source software, I might add. Because people did least squares before they had computers. It was not pretty. Okay. It was basically you would do these things with a calculator, I mean, with a mechanical calculator, and that's if you were really lucky and had the mechanical calculator. So it was done. It's a lot easier now. You should be glad you weren't born 80 years ago, something like that, longer, a hundred. Okay.
If there's no more questions about that, we'll move on and actually cover just kinda some of the boring stupid stuff for the next topic, which is autonomous linear dynamical systems. Which is, I guess, what the class is nominally about, so we got to it finally. Okay. So what we'll do is I'll just go over some of the nomenclature. I'll talk about some of the basic ideas and get that over with.
So autonomous means that it goes by itself, and that means, in fact, that there's no input here. So what we're missing from the general formulation is this. That's just gone for a while. So we'll first understand just what happens if you have xdot = Ax. It looks very simple. It's a first order vector differential equation. And we should probably, just as a warm-up, answer the following question. If A is one by one, that is to say if x is scalar, let's get this out right now. What's the solution of xdot = ax in that case? Well, it's an exponential, right? It's something like this. It's x of t equals e to the ta times x of 0. Something like that. No, no, it's not something like that. It is that. Okay. That's the solution when a is lower case, which is to say it's a number. Okay. So you can expect something like this to come up. By the way, the qualitative behaviors of the scalar differential equations are kind of boring. Let's talk about them now. If a is 0, it just says xdot is 0, so x is a constant. If a is positive, you get a growing exponential. And if it's negative, you get a shrinking exponential, okay? So that's it. That's my discussion of xdot = ax where x is scalar, okay? There's basically three qualitative types of behavior. They're all kind of boring. You can't really have anything that interesting, okay? So just file that away. Because what we're gonna do now is, you'd think if you overload this idea to vectors, how much more interesting can it be? And you'll find out very soon. Actually, it's pretty much as interesting as any dynamical system can get, almost. There's another level, but we'll get to that later.
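The scalar case is small enough to tabulate directly. A minimal sketch, with the function name x_scalar and the sample values of a chosen only for illustration:

```python
import math

def x_scalar(t, a, x0):
    """Solution of the scalar equation xdot = a * x with x(0) = x0."""
    return math.exp(t * a) * x0

# The three qualitative behaviors: shrinking (a < 0), constant (a = 0), growing (a > 0).
for a in (-1.0, 0.0, 1.0):
    values = [round(x_scalar(t, a, 1.0), 3) for t in (0.0, 1.0, 2.0)]
    print(f"a = {a:+.1f}: x(0), x(1), x(2) = {values}")

# Finite-difference check that xdot is approximately a * x at one sample point.
a, x0, t, h = 0.7, 2.0, 1.5, 1e-6
deriv = (x_scalar(t + h, a, x0) - x_scalar(t - h, a, x0)) / (2 * h)
```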
Okay. Now, here x of t is called the state. n is the state dimension, or informally it's the number of states. So it is slang to refer to x sub i as the i-th state. However, it's widely used slang. You wouldn't write that, I think, but you would say it. Now, of course, in a lot of applications, like in dynamics of structures or aircraft or something like that, the x's actually have names or they have meanings, in which case you would actually talk about that. You know, what is the yaw and what is the yaw rate and what is your angle of attack and all this kind of stuff, your altitude. So in that case, of course, well, it's still slang, but you would talk about those as individual states. Okay. So n is the state dimension or the number of states. A is called the dynamics matrix. By the way, in lots of different fields it's got a different name. Let's see. I was just talking about aeronautics. So what is A called in aero? There's a bunch of people here in aero-astro. What is A called when this is a perturbational model of a flight, some steady state flight? You know, I mean, A has a name and the entries have names.
Instructor (Stephen Boyd):That's it. So in that case A is called the matrix of stability derivatives. I am not sure where that came from, except that indeed the entries in that matrix will determine whether that flight mode is stable or not. So they're called the stability derivatives. And I guess it's obtained from linearization of a non-linear system, so that would explain the derivatives, so okay. And other fields have other names for it. In circuit design it's called the small signal dynamics matrix or, I don't know, who knows. But anyway, lots of fields have different names for it. Okay.
So here's a picture. It's very stupid and extremely useful. It's this. So here's your state, x of t. And it's very useful to do the following. Of course A times x of t is just linear, and basically A maps where you are into where you're going, because x is essentially where you are in state space. xdot is where you're going. So A maps x into xdot. Oh, by the way, what are the physical units of A? Assuming, let's say, all the x's are in, you know, some common units. Let's just leave it that way. So all the x's have some units which are irrelevant. What are the units of A?
Instructor (Stephen Boyd):It's inverse seconds, exactly. It's a frequency. It's a rate. That's what it is. I mean, this is kind of obvious, but A is a rate. A is in inverse seconds. I mean, it depends on the units in x, but generally it's inverse seconds. By the way, that means that big A is a fast system and small A is a slow system. I guess this is kind of obvious, so I'm gonna move on. Let's go back over here.
So xdot, which is Ax, is where you're going. And it's extremely useful to take that vector and to glue its base to x. And so you have a picture like that. So if you're over here, Ax might point in that direction, okay? And if you're over here, Ax might point in that direction, okay? And it does not mean, of course, that x is gonna be traveling along this line. What it means is that the solution through x at this point, whatever that curve is, is tangent to this line. And the length of that line gives you the actual speed at that point, okay? This is kind of obvious. All right.
Now, if you draw a picture of xdot for a whole bunch of points x in a plane, you get a picture, and there's a name for this. It's a vector field, okay? So that's, by the way, also a mathematical description. It describes something which, on some set, at each point gives you a derivative on the set. That's formally a vector field. So, in fact, xdot = Ax you would actually call, in mathematics, a vector field, okay? But it's also used informally to mean something like this, where you have a field of points and, at each point conceptually, of course you don't draw it at each point, you draw a little arrow that gives you a rough idea of where you're going and how fast. Okay. So this is the example for xdot = Ax where A is the two by two matrix with rows (-1, 0) and (2, 1). We can check things. But the cool thing about this is, when you see this vector field, you can actually start visualizing the trajectories. That's actually very important to understand really what's going on. So let's see what it says. It says if you're here, you're moving up and to the left, and you're moving at a pretty good clip, at least compared to over here. So you don't know where you're gonna end up, but it might be like here. And you can see now that you'll actually keep moving up. It looks to me like it's even accelerating. So you can imagine a point starting here as actually kind of moving up like that, okay? On the other hand, if you start over here, you can sort of imagine now various things. You know, you might slow down. You're not gonna actually hit zero. You'd slow down a lot, and then it looks like you might actually start accelerating as you go along there, okay? So these two are just the kinds of things you would get. And by the way, if you ever have a system and you want to quickly figure out what it does, you need to look at pictures like this. It only works in two dimensions. 
Actually, it depends on your visualization skills. You could probably do this in three, but it would be tricky, I guess. Okay.
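To make the vector field idea concrete, here is a rough sketch that evaluates xdot = Ax on a coarse grid and follows one trajectory with a simple forward Euler step. It assumes the example matrix has rows (-1, 0) and (2, 1), which is a reading of the garbled entries in the transcript; the step size, the starting point, and the helper names are arbitrary choices for illustration.

```python
import numpy as np

# Assumed example matrix (rows (-1, 0) and (2, 1)).
A = np.array([[-1.0, 0.0],
              [2.0, 1.0]])

def xdot(x):
    """Where you are going, given where you are."""
    return A @ x

# The arrows of the vector field on a coarse grid of points.
for x1 in (-1.0, 0.0, 1.0):
    for x2 in (-1.0, 0.0, 1.0):
        v = xdot(np.array([x1, x2]))
        print(f"x = ({x1:+.0f}, {x2:+.0f})  ->  xdot = ({v[0]:+.0f}, {v[1]:+.0f})")

def euler(x0, h=0.01, steps=200):
    """Approximate trajectory: repeatedly step a short distance along the arrow."""
    traj = [np.array(x0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + h * xdot(traj[-1]))
    return np.array(traj)

traj = euler([1.0, 1.0])
print("start:", traj[0], " end:", traj[-1])
```

Starting from (1, 1) the arrow points up and to the left, and the approximate trajectory keeps moving up, which matches the qualitative reading of the picture. Forward Euler is only a crude approximation, used here because it mirrors the "follow the arrows" intuition.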
Here's another example. Another little baby two by two matrix, and in this case it's this. So far, you look at just a matrix and you know what it means in terms of its input-output behavior, right? If I write a matrix down and there's zeros, you know what it means. If there are large entries, you know what it means. If there's negative numbers, you know what it means. In terms of just how the input affects the output. So that much you have. That should be wired into you by now. You will later come to look at a dynamics matrix the same way. You will actually develop something like that for dynamics matrices. So certainly for two by twos and three by threes you'll start getting a very good idea. You'll look at that and get a rough idea. There's gonna have to be some computation to really know what happens, but that's the idea. So here's the vector field here, and you can kind of get a pretty good idea for it. Here it looks like the trajectories are kind of elliptical. Now, I'll tell you what you can't tell by your eyeball here, unless you were super duper careful. You can't tell if the trajectories are winding in or winding out. You'd have to really kind of trace this very carefully and figure out if, when you come around one cycle, you're bigger or smaller than you were before. Okay. So that's, I think, not obvious from here. It will be very obvious to you in a week how to do that. But that's the idea. Okay.
Now, another very useful thing is a block diagram. So you can write xdot = Ax this way. By the way, it's done not with differentiators but with integrators. And there's historical reasons for it. Actually, does anyone know the historical reasons for it? It's entirely likely that you're all too young to have any... this is in the deep... we're talking before slide rules here. Anyone here ever use a slide rule? Cool, zero... you did. That is so cool. Did you do it as a joke or, no, you really used it?
Student:Well, it was my dad's.
Instructor (Stephen Boyd):It was your dad's, well, there you go. So all right. But still it's cool, though. Do you actually know how to use it?
Student:It's probably [inaudible].
Instructor (Stephen Boyd):Cool. That's about how it should stay. Okay. So I'll tell you the historical reason for this. So first let me just say what this is. These are vector signals, and it's sometimes common, I guess this is from digital circuit design, in a signal flow graph to put a little note with a line through it. I don't know why this tradition came up. And this tells you the dimensions. So that's a vector signal with n components, x of t. It goes into A, so what comes out here is Ax. And that goes into 1/s. You really should write that as I/s, because it's a vector signal in and a vector signal out. The slang for this on the street should be a bank of integrators. That would be the slang for this. Because if I exploded this out and showed the individual components, it would really look like this. It would look like that. Let's say if it's two by two. So that's if I clicked on that box and asked for the detail, I would get this, okay? So it looks like that. So it would be a bank of integrators. These are now scalar integrators here. Okay.
And now, let me get to why integrators. So you will soon see how to actually solve the equation xdot = Ax. It won't be surprising to you that you can work out the whole trajectory for x with a thousand, two thousand states. I mean, these are just enormous systems. Just immediately on a laptop. I mean, 2000 is not immediately, all right. But a thousand, even 500 is extraordinary, okay? So a 500 state model will model a lot of things. I mean, that's actually a fairly detailed structural model of a lot of things. You can actually just solve xdot = Ax. It's nothing. It's gonna be two lines of code, something like that. It'll run on a laptop, not yet a phone, but that's coming. So it's sort of like least squares. For you it's nothing, it's a backslash, right? For your parents, it was much more complicated. It was a half day of coding Fortran. Don't even ask what your grandparents had to do to do least squares. Maybe not your exact grandparents, but somebody's grandparents did it, and it wasn't that cool. It was mechanical calculators or sheets or slide rules. Lots of people in rooms. So it was done. So all right. Back to this. In the, I don't know, in the 20s, 30s, actually even earlier than that. Anyone know? You want to do a Wikipedia on differential engine, Vannevar Bush, differential engine, differential analyzer, there you go. I believe it might even be late 19th century. So in the late 19th century it was already recognized that if you understood what the solutions of xdot = Ax did, you could actually say a lot about how a machine or something like that was gonna work. 19 what?
Instructor (Stephen Boyd):Well, I was off as usual. It's a good thing I'm not in the history department. But I got it vaguely right. It was a long time ago, so 1927. So in 1927... oh, but is that the mechanical one? Okay. So this guy built a mechanical system that will actually give you the approximate solution of xdot = Ax, okay? Not long after that, people built vacuum tube computers, like analog computers. This means nothing, thank God, actually, to anyone. Nothing, no one has even heard of this. That is so good. Usually... you've heard of it? That's so good. Did you actually see one? No, okay. I should bring in some pictures just so you know how lucky you are now. Yeah, what's that?
Instructor (Stephen Boyd):Yeah, they're typically in basements now or storage closets. Yes, that's right. You saw...
Instructor (Stephen Boyd):Yeah, sure, they really used them. Okay. So what it was was this. You had a big patch panel and you had electronic integrators. I guess anyone here in electrical engineering knows how to do that: with an op-amp and a capacitor in the feedback loop, you get an integrator. And you had a whole bunch of integrators, and then you had little, like, banana plug things, and you could wire them up. They had little gain units that you would dial in. They're really quite beautiful. Actually, I never touched one, just so you know. So you dial in little gains and things like that. So how would you actually program this analog computer? You'd do it by actually physically hooking wires up between these things, okay? And then there'd be a big button and you'd press start. And also the red lights would come on, meaning that things just overflowed their ranges, right? And that either means you messed up the programming, which in this case literally means plugging wires in, or it means you probably shouldn't build that aircraft. It means one or the other. You'd have to figure out which it was. Oh, and if you had, like, a class, like a homework exercise on the analog computer, the way it would work was actually kind of cool. Your program would be this big thing like this with all your wires on it. And you would detach the whole thing and then walk around with it. And then another student would come in and plug theirs into the analog computer.
Are you at least a little bit grateful now about when you were born and stuff? I mean, I hope so. Yeah, I'll take that as a small sign of gratitude. Watch out, because provoke me and homework eight, analog computer. And if you don't think I can find one on eBay, you are wrong. Okay. So all right. What's that?
Instructor (Stephen Boyd):Oh, take a bite. I'm gonna take that as an open challenge. You...
Instructor (Stephen Boyd):Okay, cool. No, you know what we'll do. What's that?
Instructor (Stephen Boyd):Oh, okay. Yeah, actually the T.A.s have to take this course on, you know, ethical actions and this, so Jacob is exercising his right to not be involved in such an escapade. But that would be great. No, maybe we'll do it with, like, white proto-boards and op-amps and capacitors. That would be great. Okay, all right. But it's noted that there's been a challenge. All right. Back to this.
So the reason that you find integrators here is because of this. Oh, and by the way, I guess someone mentioned it, you knew what the story was. So this was used in 1939 through 1945 at M.I.T. They put these things together. They weren't quite linear. They actually had other terms in. In fact, it was just basically, you'd fire a shell and you wanted to work out firing tables. And a firing table was, for a certain shell, if you fired at this angle and there's a wind in a certain direction, the question is where does it land? And you would solve a differential equation like this. It wasn't quite linear. It had one non-linear term in it. And this was done, in fact, in secret in a basement at M.I.T. with analog computers. And then the results, you just tabulate them. It just ran all the time, and they'd worked out the things, and when you wanted to use it, you checked the wind, go in the table, find the range, and find out you should elevate it 22.63 degrees. Okay. So that's what this was. Okay. All right. So that's just a historical comment about why you see integrators and why this block diagram would strike fear into the hearts of your parents and grandparents if they did this kind of thing. But not you. And that's why you should be grateful. Okay. All right. Okay.
So if you draw a block diagram out, if you explode the block diagram of A, you can actually get interesting information. Hereís an example. Suppose you have Xdot is AX where A is block upper triangular. Well, by now if you just see this ó if you see Y=AX and A looks like that, you know exactly what it means. Without even thinking you would say, hey, how interesting. The bottom half of ó letís suppose thatís Y. Youíd say that the bottom half of Y doesnít depend on the first half ó it doesnít have to be half, of course ó but the first part of X. Thatís what youíd say when you see that. But now, thatís the derivative which is actually more interesting. So you read this equation this way. Youíd say something like where the bottom half of X is going, thatís English for X2dot, doesnít depend on X1, thatís the zero. Okay. And when you draw the block diagram itís super ó itís totally obvious because you draw it this way. Hereís X1. X1dot is A11X1+A12X. Oh, I didnít say something here. The rule here is this. You want to know how do you get X1dot if on ó if all you have are integrators. You look at the output of an integrator and you ask, well, what went ó if thatís an integrator and what came out is X, what had to go in was Xdot. So thatís how you do. So you simply ó you go backwards through the integrator if you want to get Xdot. So the inputs to integrators are derivatives. So this is X1dot and this is X2dot. And this says X1dot is A ó itís a sum of two things, thatís what the summing junction does. Itís A11X1+A12X2. Now, when you stare through this block diagram, something exceedingly obvious comes up and thatís this. If I draw a dashed line like that, you see something really interesting and that is that information flows from the bottom to the top but not vice versa. So nothing that happens up top ever has an effect on what goes on down here. Okay. And well, and basically it says X2 affects X1 but X1 has no affect whatsoever on X2. 
That means all sorts of interesting things. We've concluded things like this. It says that you can actually calculate the solution for X2 separately, because it is in no way affected by X1. That's what this says, and that's what you get out of looking at that equation. Well, we'll see lots of other ways to do it, but this is kind of the idea of the way to get the intuition for how this works. Everybody see this? So that's the picture here. So let's see.
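The one-way information flow in the block diagram can be checked numerically. The following is only a rough sketch, not anything from the lecture: the 2x2 matrix with scalar blocks, the step size, and the function name `simulate` are all made up for illustration, and forward Euler is used as a crude stand-in for the integrators.

```python
def simulate(A, x0, h=1e-3, steps=2000):
    """Forward-Euler simulation of xdot = A x; returns the list of states."""
    x = list(x0)
    traj = [x[:]]
    n = len(x)
    for _ in range(steps):
        # The input to each integrator is the derivative xdot = A x.
        xdot = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Each integrator accumulates its input over one small time step.
        x = [x[i] + h * xdot[i] for i in range(n)]
        traj.append(x[:])
    return traj

# Hypothetical example with scalar blocks: A = [[A11, A12], [0, A22]].
A = [[-1.0, 2.0],
     [0.0, -0.5]]

run_a = simulate(A, [1.0, 1.0])
run_b = simulate(A, [-7.0, 1.0])  # different X1(0), same X2(0)
run_c = simulate(A, [1.0, 3.0])   # same X1(0), different X2(0)

x2_a = [x[1] for x in run_a]
x2_b = [x[1] for x in run_b]
x1_a = [x[0] for x in run_a]
x1_c = [x[0] for x in run_c]

print(x2_a == x2_b)  # True: X2 never sees X1
print(x1_a == x1_c)  # False: X1 is driven by X2
```

Changing the initial X1 leaves the X2 trajectory untouched, while changing the initial X2 changes X1, which matches the bottom-to-top flow across the dashed line.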
Let's look at a couple of examples. I think I'll look at just one, which is a linear circuit. So here I have a linear static circuit. Now, that means it's a circuit that can contain things like resistors, transformers. It can have, oh, let's see. Well, it depends on your model of a transformer. If it's an inductive model you have to put it out here. So we'll skip transformers. But it can have things like dependent sources and things like that. So that's what's in here. And I pulled the capacitors out to the left and the inductors off to the right. And the equations here are very simple. It doesn't matter if you're not in E.E. and don't know these equations. It's just an example. So the equations here for each capacitor are this. It's C dv/dt equals the charging current, and I've drawn the charging current to go into the capacitors like that. For the inductors, it's the same thing. It's L di/dt equals the charging voltage. So for an inductor (again, I'm addressing people who do E.E., right?) you think about voltage as charging it. When you apply voltage to an inductor, it ramps up the current. When you apply a current to a capacitor, it ramps up its voltage. Okay. So you get these equations. And then this thing is some horrible complicated thing. But the point is, it's linear. So it's a set of linear equations that relate the port variables, the voltage and the current at each of these ports. That's called a port: when you hang two wires out of a circuit, it's a port. Okay. And there's a linear relation among the voltages and currents at the ports, the port variables. And we're gonna write that in this way. We're gonna say that the capacitor currents and the inductor voltages, these are basically the charging variables, are some matrix times v_c and i_l. So we're gonna write it that way. All right.
And we'll let C be a diagonal matrix with these capacitances, and L the same thing for the inductances, so that I can write these out as matrix equations. And you have the state, v_c and i_l. So the state is the capacitor voltages and the inductor currents. Then you can write out everything here; it's very simple. This is C vdot here is i_c, and this is L idot (uh-oh, that's hard to make a dot and make it clear) equals v_l. And you simply put those equations into here, take C inverse on the left-hand side, and you get a set of equations like this. Okay? So this tells you that you can write it out, and this is, of course, an autonomous linear system. That's Xdot equals AX, and A is this matrix here. Okay. So, by the way, this is already of huge interest. Again, this is addressed to people in E.E. It says, for example, if you have an interconnect circuit in some leading-edge, you know, 45 nanometer design, and you want to analyze the interconnect in some digital circuit, in some high performance circuit, which you can certainly model with some inductance, capacitance and resistance, it says you write that as Xdot equals AX, period. That's what it does. So that means it's already of extreme interest in practice to know what the solutions of this do. Okay. So we'll quit here.
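The steps just described can be written out as follows. This is a sketch under one labeled assumption: the lecture only says "some matrix" for the linear static circuit, so the name F below is chosen here for illustration.

```latex
% Element equations, with C and L diagonal matrices:
C \dot v_c = i_c, \qquad L \dot{i}_l = v_l
% Linear static circuit relating the port variables (F is an assumed name):
\begin{bmatrix} i_c \\ v_l \end{bmatrix} = F \begin{bmatrix} v_c \\ i_l \end{bmatrix}
% With state x = (v_c, i_l), substitute and invert C and L:
\dot x = \begin{bmatrix} C^{-1} & 0 \\ 0 & L^{-1} \end{bmatrix} F \, x = A x
```

Here A is the block-diagonal matrix of inverses times F, which is the autonomous linear system the passage arrives at.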
[End of Audio]
Duration: 74 minutes