**Instructor (Stephen Boyd)**:Our main screen is not working today. I’ll — for the first ten minutes while they’re desperately trying to get our big screen up and running, I’ll say some things about the mid-term to give them as much time as possible.

You can go down to the pad here. Let me say a couple things about it. You’re welcome to move to a seat where the monitor is more visible or something like that. There’s probably plenty back there. If you can go down to the pad, I could make a couple of announcements.

I mean, the first thing is I’m — well, I’m sure there is no one who doesn’t know that the mid-term is coming up. In fact, there’s — it’s even possible we’re gonna have an alpha tester take it tonight, which will be interesting. So it’s coming along very well. It’s of course this — end of this week. Let me say what it covers just to remind you. It covers group homework four, that’s the one you’re working on now, and that will be printed on Thursday, and it will include through lecture eight, that’s today’s lecture. In fact, we’re gonna finish lecture eight probably before the class is over. So we’ll finish lecture eight and that — and it’ll cover all material in all lecture up to there including even materials that we accidentally forgot to exercise you on in the homework. So there were some glaring omissions. That was just our fault but we’ll — we still — it’s still valid material — it’s fair game for the mid-term. Okay. Let’s see. I’m gonna hold extra office hours from Thursday 1:00 to 3:00. I could do it also today from 1:00 to 3:00 if anyone was gonna come by or something like that. It’s not an — oh, a hand went up. Okay. Sure, I’ll do it today too. Why not? There we go. So I’ll be around both today and on Thursday from 1:00 to 3:00. Watch, I probably have some meetings scheduled today but we’ll see. If — maybe I’ll be there. No, I’ll probably be there. Let’s see. I was — those who are taking the course remotely via S.C.P.D., we would strongly encourage you if you’re local to come and pick up the exam like everyone else and drop it off. That’s what we’d really prefer to do. If, however, that’s inconvenient or something like that, we will send you a PDF of the exam, but please send email to the T.A.s to let them know — or, sorry, well, to the staff address to let us know when you would like to take the exam so that we can do that. Don’t just sit there, wherever you are, waiting for it to arrive. So — and make sure you get a response from us saying, “Acknowledged. 
We’re sending you the exam on this time — at this date at this time.” Let’s see. Homework 4 — we’ve posted homework three solutions last night. We’ll post homework four solutions; those are the ones you’re working on now. And what we’ll do is this. We’ll post those Thursday evening. So Thursday you’ll hand in homework four. We’ll post homework four solutions. Now in the past, we’ve always let a few people with generally speaking very, very good excuses, such as joining the class late or whatever, turn in a homework a bit late. Unfortunately, we won’t be able to do that for homework four. So homework four, you hand them in, within hours we’re gonna post the solutions. That’s the — yeah, Thursday evening. Okay. I don’t know if anyone is — we did post last years mid-term just so that you get to see what a mid-term looks like. I think as part of that you found out where homework problems come from or how homework problems are born. They’re born general — often as mid-term and final exam problems. I also have a question for you. And the question is, when should we post the solutions for last year’s midterm?

**Student:**Now.

**Student:**Now.

**Instructor (Stephen Boyd)**:Now, okay. I bel — this — it includes, like, one or two problems on homework four, right? Something — or maybe one — is it just gonna have one on it? It — one overlaps? Okay, fine, no problem. We’ll post it now. Great. So that’s fine. So that means that — that suggests that people have actually looked at it.

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Good. Well, we know some people have looked at it because we got requests to post the M-files required to do it. That’s not absolute proof that people looked at it, but we’ll take it as a good sign.

Okay. Any more questions?

**Student:**What time will the exam start?
**Instructor (Stephen Boyd)**:I think we’ve posted that on the website. The question was, “When will the exam start?” I think you pick it up between 5:00 and 5:30 or something like that; is that right? But, again, you should never trust me. You should trust the website.

Oh, I do want to say thank you. We got several — we rearranged the website a little bit last week. And I guess I was caught in the — I was in the middle of doing 50 things and didn’t come back and messed a few things up. And actually, we’re very happy that — people that — people caught my mistakes very quickly and fixed them. So thank you for those of you — so if you ever find anything that’s off on the website like a missing link or something like that, please do let us know because often it’s just because, well, we messed up. So thanks to those who corrected that last week.

Okay. Any other questions? If not, we’ll continue our discussion of least norm solution. Now, come on, there’s no way anybody can read that. Can you actually read that? No, okay. So there’s a couple things you could do. If you can’t read it, you can move closer to a monitor, or you can extract just enough information out of this little TV to get a rough idea of where I am in the notes and follow along that way. That’s your other method. But, anyway, your choice. You’re also free to just move somewhere where you can read it. So you can either crowd up here or sit in back at one of those. I guess they’re working on trying to get the big screen routed. Okay.

So, least norm solution. As I said last time, this is something like the dual of least squares approximate solution. In least norm solution we’re studying the equation AX=Y, but in this case A is fat, and we’re assuming it’s full rank. So that means you have M equations that constrain a variable X, but you have fewer equations than unknowns, so you have extra degrees of freedom. What that means is that AX=Y actually has lots of solutions. It means the null space of A is more than just the zero vector. In fact, the null space is exactly N minus M dimensional. So there’s a lot of freedom in choosing X. One particular X that satisfies AX=Y is the vector of least norm. That’s the least norm solution, XLN, and it’s given by the following formula: A transpose, times AA transpose inverse, times Y. It’s easy to see it’s a solution, because if you multiply this by A, you get AA transpose times AA transpose inverse times Y, and AA transpose and its inverse annihilate each other and you get Y. So it’s clear you get a solution.

This relies on the fact that if A is fat and full rank, AA transpose is invertible. That’s a basic fact, and actually one you can now show easily using QR factorization. In fact, for all practical purposes, we’re gonna do that ourselves in a few minutes.
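A minimal numerical sketch of the formula above, in plain Python with exact fractions. The 2-by-3 matrix, the right-hand side, and the helper names (`least_norm`, `solve_2x2`, and so on) are made up for illustration and do not come from the lecture. The sketch only handles a fat matrix with two rows, so that AA transpose is 2 by 2.

```python
from fractions import Fraction


def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def solve_2x2(G, b):
    """Solve G z = b for a 2x2 matrix G; fails if G is singular."""
    (g11, g12), (g21, g22) = G
    det = g11 * g22 - g12 * g21
    if det == 0:
        raise ValueError("matrix is singular (A is not full rank)")
    return [(g22 * b[0] - g12 * b[1]) / det,
            (g11 * b[1] - g21 * b[0]) / det]


def least_norm(A, y):
    """x_ln = A^T (A A^T)^{-1} y for a full rank A with two rows."""
    gram = [[sum(p * q for p, q in zip(ri, rj)) for rj in A] for ri in A]
    z = solve_2x2(gram, y)           # z = (A A^T)^{-1} y
    return matvec(transpose(A), z)   # x_ln = A^T z


def norm_sq(v):
    return sum(t * t for t in v)


# Illustrative data (an assumption, not from the lecture).
A = [[Fraction(1), Fraction(2), Fraction(2)],
     [Fraction(0), Fraction(1), Fraction(-1)]]
y = [Fraction(9), Fraction(2)]

x_ln = least_norm(A, y)

# Adding any null space vector gives another solution, with larger norm.
null_vec = [Fraction(-4), Fraction(1), Fraction(1)]   # A @ null_vec == 0
x_other = [a + b for a, b in zip(x_ln, null_vec)]

print("x_ln     =", x_ln, " norm^2 =", norm_sq(x_ln))
print("x_other  =", x_other, " norm^2 =", norm_sq(x_other))
```

Both vectors satisfy AX=Y, which illustrates that a fat system has many solutions and that the formula picks out the shortest one.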

Okay. So this is the least norm solution. It’s a solution. Now, watch out, because it looks very similar to the least squares approximate solution. The formulas look the same. Everything looks similar. The main thing you want to do with this material is be careful to sort out in your mind which is which, just because they look so dangerously close. So this X least norm is actually a solution of AX=Y, whereas XLS, which is A transpose A, quantity inverse, times A transpose Y, is in general not a solution of AX=Y. And that formula is only for a skinny full rank matrix A. XLS is the X that minimizes the distance, or the error, or the residual, and it is generally not a solution of AX=Y, whereas this one certainly is. Okay.

So this point, X least norm, solves this optimization problem. It says among the vectors that satisfy AX=B (I don’t know where the B came in, but AX=Y), you should take the one of minimum norm, and that’s this optimization problem. The solution is unique and it is given by X least norm. Now, we can show this by a direct argument. That’s easy. Let X be any other solution of AX=Y. Well, then A times X minus X least norm is zero, because AX is Y and so is AXLN. They’re both Y, so you subtract them and get zero. And now let’s calculate the inner product of X minus X least norm with X least norm. You simply plug this in and do some matrix manipulations. Here you have this thing transposed times A transpose. But the product of two transposes is the same as the product in reverse order, quantity transposed. So I write it this way. Now, this is gonna be zero because AX minus AXLN is zero. And so the right-hand side doesn’t even matter. This vector is zero, so the whole thing is zero. That says that X minus X least norm and X least norm are perpendicular. Now, when two vectors are perpendicular, the norm squared of the sum is very simple to calculate. It’s the sum of the norms squared of the individual components. Some people call that the generalized Pythagorean theorem or something. Anyway, it’s nothing. You write out the formula for the norm squared of a sum and the cross term goes away. So if we write X in a strange way, as X least norm plus X minus X least norm, no one could argue with that. But these two are orthogonal, and therefore the norm squared of the sum is the sum of their norms squared separately. So you get this thing plus that. And this thing, of course, is going to be non-negative.
And you can see immediately that the norm squared of X is at least as big as the norm squared of X least norm. Since X was any solution of AX=Y, that tells you that any solution of AX=Y is gonna have a norm at least as big as X least norm. And this is the proof that X least norm, in fact, minimizes the norm among all solutions of AX=Y. So that’s just sort of a direct argument. And the geometry is pretty easy to see.
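Written out in symbols, the direct argument reads as follows. This is a sketch that uses lowercase $x$, $y$ for the transcript's X, Y and $x_{\mathrm{ln}} = A^T(AA^T)^{-1}y$ for X least norm; $x$ is any solution of $Ax = y$.

```latex
\begin{align*}
A(x - x_{\mathrm{ln}}) &= Ax - Ax_{\mathrm{ln}} = y - y = 0, \\
(x - x_{\mathrm{ln}})^T x_{\mathrm{ln}}
  &= (x - x_{\mathrm{ln}})^T A^T (AA^T)^{-1} y
   = \bigl(A(x - x_{\mathrm{ln}})\bigr)^T (AA^T)^{-1} y = 0, \\
\|x\|^2 &= \|x_{\mathrm{ln}} + (x - x_{\mathrm{ln}})\|^2
   = \|x_{\mathrm{ln}}\|^2 + \|x - x_{\mathrm{ln}}\|^2
   \;\ge\; \|x_{\mathrm{ln}}\|^2 .
\end{align*}
```

Equality in the last line holds only when $\|x - x_{\mathrm{ln}}\| = 0$, that is, when $x = x_{\mathrm{ln}}$, which is why the minimizer is unique.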

Consider the set of vectors that satisfy AX=Y. Now, I mean, this is silly because it’s in R2, and here this is a one-dimensional set. It’s an affine set. In general, it’s an affine set with dimension N minus M. And so you can imagine it as a plane if this is in R3 with just one equation. It’s a plane. And then you’re asked to find the point of least norm. That’s the point on that plane or hyperplane or affine set which is closest to the origin. And that’s this one here. And you can see if you shift this, you get the null space of A. That’s the part that’s parallel to the solution set of AX=Y, shifted to the origin. And you can see, just visually here, that X least norm is gonna be orthogonal to the null space of A, and that’s this orthogonality condition. And of course, you can have a projection interpretation. X least norm is the projection of the point zero on the solution set of AX=Y. So that’s it. Okay.

Now, this formula, A transpose times AA transpose inverse, that’s also the pseudo-inverse. But this is the pseudo-inverse of a full rank fat A. So far the symbol dagger has two overloadings. It’s overloaded and it applies in two contexts. A dagger applies when the matrix A is skinny and full rank, in which case A dagger means A transpose A inverse, times A transpose, and it’s associated with least squares approximate solutions. You also now have a definition of A dagger when A is fat and full rank, in which case it’s A transpose times AA transpose inverse. And it’s something that gives you the least norm solution. So that’s A dagger. By the way, in about three weeks we will complete the overloading of dagger. I think the machine just turned all the way, okay, gonna reboot it. Or does that mean you’re giving up? Okay. No, sounds like it’s, yeah, it’s reboot minus H, that’s hard. Okay. So in a couple of weeks we’re gonna complete our overloading of A dagger, and we’re actually gonna assign a meaning to A dagger for any matrix except the zero matrix. So all non-zero matrices will actually have a pseudo-inverse. Only zero will not.

Hey, great. So — and, yeah, great. Thank you. Okay. Great, all right. Okay. So we’ll get to that. But for the moment, the only contexts in which you know about the pseudo-inverse are full rank matrices. So all full rank matrices have a pseudo-inverse. They have different formulas that apply in different contexts. That’s what overloading means. Okay.

Now, this matrix, A transpose times AA transpose inverse, is a right inverse of A; we know that. I minus A transpose AA transpose inverse A gives projection onto the null space of A. By the way, this matrix alone, A transpose AA transpose inverse A, gives projection onto the orthogonal complement of the null space of A. Okay. Now, the analogous formulas for a full rank skinny matrix are not the same. They’re something like this: A dagger, or the pseudo-inverse, or I guess in the U.K. the Moore-Penrose inverse, is A transpose A inverse, times A transpose. And that’s a left inverse, and interestingly in this case A times A transpose A inverse times A transpose is projection on the range of A. So the anomaly you see is the ‘I minus’ here. That’s the anomaly essentially, so that’s it. Okay. So do watch out for these. I always check. My mnemonic is real simple. If you see this (let me see if I can draw it right), everything is cool. You know what I mean by that? So skinny times fat, inverse. Well, sorry, it’s not cool but it’s not obviously uncool. Okay. This is always trouble. See that? That is never cool ever. Okay. Oh, and by the way, of course if these multiplied out and became non-square, that’s super uncool because that’s a syntax error. Okay. So my mnemonic is this. And you might ask, really? You mean when I’m working and doing stuff I actually do this? Yes, I do. So I draw this picture. I don’t let anyone see it, you know, because it’s a little bit embarrassing. But this is what I do, okay. That’s cool. That is totally uncool. Not totally. Totally uncool is this times, see if I can get it right. There we go. See that? That’s really uncool. Okay. By the way, I think now you should be able to read the little note on the course website that’s called Crimes Against Matrices, so you should just read it. Should make sense. Okay.
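As a side-by-side summary of the two full rank cases discussed so far (a sketch in standard notation, with $\mathcal{N}(A)$ for the null space and $\mathcal{R}(A)$ for the range; the symbols are a notational choice, not taken from the slides):

```latex
\begin{align*}
&\text{$A$ fat, full rank:}
  & A^\dagger &= A^T (AA^T)^{-1},
  & A A^\dagger &= I \quad \text{(right inverse)}, \\
& & A^\dagger A &= A^T (AA^T)^{-1} A
  && \text{projects onto } \mathcal{N}(A)^\perp, \\
& & I - A^\dagger A &= I - A^T (AA^T)^{-1} A
  && \text{projects onto } \mathcal{N}(A), \\[4pt]
&\text{$A$ skinny, full rank:}
  & A^\dagger &= (A^T A)^{-1} A^T,
  & A^\dagger A &= I \quad \text{(left inverse)}, \\
& & A A^\dagger &= A (A^T A)^{-1} A^T
  && \text{projects onto } \mathcal{R}(A).
\end{align*}
```

In both cases the matrix being inverted is the small square one: $AA^T$ is $m \times m$ when $A$ is fat, and $A^T A$ is $n \times n$ when $A$ is skinny. That is the check the skinny-times-fat picture encodes.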

Well, let’s see how the solution connects to QR factorization. It does. A is fat and full rank, therefore A transpose is skinny and full rank. And that means that when you do the QR factorization of A transpose, a skinny full rank matrix, here is what it looks like. You get Q and then you get R, and R is square and invertible in this case. Okay. So R is non-singular. And you work out the formulas. You just plug in QR for A transpose, so A is R transpose Q transpose, and you plug that into the formulas and carefully let things cancel. So you should do this yourself. I’m not gonna do it now. Carefully let things cancel, watching out for the usual things. Like Q transpose Q, that’s I, but QQ transpose is not. When you do this carefully, you find out, not surprisingly, that this A dagger works out to be nothing but Q times R minus transpose, that is, Q times R inverse transposed. So that’s what it works out to be. And I forget what the formula is for the least squares one, but it’s very similar. Does anyone remember? Maybe it’s R inverse Q transpose. It’s something like this. So this is from a few lectures ago, in the context of least squares. Is that it? You have the notes there. Is that right? Yeah, so [inaudible] close. Okay. So after a while you’re gonna get used to these things, where they look similar but the order is different and some things are transposed and all that sort of stuff. So that’s why you have to be careful. Okay. Oh, and the norm of the least norm solution is simply the norm of R minus transpose Y. So that gives you the norm. Okay. So that’s the idea.
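The cancellation that is left as an exercise above can be sketched as follows, assuming $A^T = QR$ with $Q^T Q = I$ and $R$ square and invertible:

```latex
\begin{align*}
A &= R^T Q^T, \\
AA^T &= R^T Q^T Q R = R^T R, \\
A^\dagger = A^T (AA^T)^{-1} &= Q R \,(R^T R)^{-1} = Q R \, R^{-1} R^{-T} = Q R^{-T}, \\
x_{\mathrm{ln}} &= Q R^{-T} y, \\
\|x_{\mathrm{ln}}\|^2 &= y^T R^{-1} Q^T Q R^{-T} y = \|R^{-T} y\|^2 .
\end{align*}
```

The second line also shows why $AA^T$ is invertible for a full rank fat $A$: it equals $R^T R$ with $R$ invertible. Only $Q^T Q = I$ is used; $QQ^T$ never appears, which is the point of the warning in the lecture.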

Okay. Now, what we want to do is go up in abstraction to the parent of both least norm and least squares. It’s useful to know because they’re obviously deeply related. Let’s see how they’re related.

Well, we’ll start by handling the least norm problem and solving it in a more conventional way. If you want to minimize X transpose X, that’s of course the norm squared, subject to AX=Y, the standard method, I guess since the early 19th century, actually earlier than that, is to do the following. You take the objective and to that you add a Lagrange multiplier times the constraint. So here is a vector of constraints and we take a vector multiplier lambda. By the way, I don’t mean for it to be obvious how all these Lagrange multipliers work. To tell you the truth, I never understood it myself. In fact, it’s generally taught as a set of behaviors, right, that a monkey can do. I guess it’s generally taught, like, in high school. No one has a clue what it means, what the pictures are or anything. Is that correct? Did anyone, like, draw pictures of this that anyone understood? Actually, how many people have seen Lagrange multipliers for constrained optimization? So for how many was it taught simply as a set of behaviors, this is what you do? Wait, does that mean that the rest of you actually understand it? No, it’s possible. Maybe things have changed since I was subjected to this. Okay. All right. Anyway, I don’t mind saying, I never understood it until, well, a while ago. So here I’m not going to go into it. We’re just gonna say, here’s what you do. You form this Lagrangian like this, and then the optimality conditions are that the gradient of this with respect to both X and lambda should vanish. If I take the gradient of this with respect to lambda, I get AX minus Y, and that should vanish. Well, that was really super duper useful because it tells me that the optimal solution must satisfy AX=Y.
Well, I knew that because that was a constraint. Okay. So this was not exactly informative. Over here, though, it’s actually very interesting. If I take the gradient with respect to X, the gradient of X transpose X is 2X. And that’s why, by the way, a lot of people will just put in a ½ here, just to clear the twos out of formulas. So you’ll see that. And the gradient of the multiplier term with respect to X is A transpose lambda. So we get 2X + A transpose lambda is zero. Well, that’s interesting. So you solve that, and it says that X is -½ A transpose lambda. Let’s take this and plug it into the other condition, AX=Y, which was hardly a revelation, and you get a formula for lambda. So lambda is -2 times AA transpose inverse times Y. Now, I take this lambda and plug it right in there, and I have my final solution, which is this. So we’ve re-derived by a mysterious method the same thing we derived by direct algebra three pages ago. Okay.
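The Lagrange multiplier steps just described, written out as a sketch in lowercase notation ($x$, $y$, $\lambda$):

```latex
\begin{align*}
L(x, \lambda) &= x^T x + \lambda^T (Ax - y), \\
\nabla_\lambda L &= Ax - y = 0, \\
\nabla_x L &= 2x + A^T \lambda = 0
  \;\;\Longrightarrow\;\; x = -\tfrac{1}{2} A^T \lambda, \\
Ax = y &\;\;\Longrightarrow\;\; -\tfrac{1}{2} AA^T \lambda = y
  \;\;\Longrightarrow\;\; \lambda = -2 (AA^T)^{-1} y, \\
x &= -\tfrac{1}{2} A^T \bigl(-2 (AA^T)^{-1} y\bigr) = A^T (AA^T)^{-1} y .
\end{align*}
```

The last line matches the least norm formula obtained earlier by the direct orthogonality argument.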

So we did this because we’re gonna use Lagrange multipliers to look at the general case. Now let’s do some examples of least norm. This is a stupid and silly one, but that’s a good way to start. So we go back to our mass, and we’re gonna apply forces on it for ten consecutive one-second periods. And we’re interested in the position and the velocity at the end of ten seconds. So you have Y=AX where A is 2 by 10, so A is fat. And I think you should even remember some of the entries in A. I think in the top row of A the entries are shrinking as you go along it, and in the bottom one they’re all ones or something like that. Okay. And we’re gonna find the least norm force that transfers the mass unit distance with zero final velocity. So it’s got to take the mass, accelerate it, and then decelerate it over here. Although, we leave open the possibility that the right thing to do would be to take the mass and move it in the other direction first. I mean, that doesn’t sound too plausible. It’s actually not the case, but anyway we’re leaving that open. We don’t require it to move only in the forward direction, although it does. Okay. Now, this one has an analytic solution, and when you work it out, it turns out you should apply a force that’s an affine function of the discrete time. So basically in the first second you push it hard, then less hard, less hard, and right around T equals five you switch very neatly from pushing to pulling. So for five seconds you accelerate the mass, although you push less hard later. And we can anthropomorphize this easily. Why is the least norm solution doing this? Why would you push harder at first than later? What’s that?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Just a vague — this is gonna be a hand waving answer but you just need a vague one. Why would you push harder at first? Why shouldn’t it just be like this? Why shouldn’t you just push hard and then ex — and then pull?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:That’s it. That’s it exactly. Okay. So it is more efficient in terms of meters per Newton to push early on. That’s what it is. So this weights — this weights the force with the efficiency. So you’re pushing harder at first because you get more meters per Newton of push at the beginning, okay. And then it’s symmetrical so you — the — you accelerate and you decelerate like that and that’s the picture. Okay.
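A small sketch of this example in plain Python with exact fractions. It assumes a unit mass that starts at rest at position zero, so that a force applied during second j adds 1 to the final velocity and (10 - j + 1/2) to the final position per unit of force. Those entries match the description of A above (top row shrinking, bottom row all ones), but the exact model and all variable names here are assumptions for illustration.

```python
from fractions import Fraction

N = 10  # ten one-second periods

# Rows of the 2 x 10 matrix A under the assumed unit-mass model.
pos_row = [Fraction(2 * (N - j) + 1, 2) for j in range(1, N + 1)]  # 9.5, ..., 0.5
vel_row = [Fraction(1)] * N
y = [Fraction(1), Fraction(0)]  # unit distance, zero final velocity


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Gram matrix A A^T (2 x 2, symmetric) and z = (A A^T)^{-1} y.
g11, g12, g22 = dot(pos_row, pos_row), dot(pos_row, vel_row), dot(vel_row, vel_row)
det = g11 * g22 - g12 * g12
z1 = (g22 * y[0] - g12 * y[1]) / det
z2 = (g11 * y[1] - g12 * y[0]) / det

# Least norm force program x_ln = A^T z.
force = [z1 * p + z2 * v for p, v in zip(pos_row, vel_row)]

for j, f in enumerate(force, start=1):
    print(f"second {j:2d}: force = {float(f):+.4f}")
```

Under these assumptions the printed forces fall by the same amount each second and change sign between the fifth and sixth seconds, which is the affine, push-then-brake shape described in the lecture. The largest weights in the top row belong to the early seconds, which is the "meters per Newton" efficiency argument.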

Let me ask you, as long as we’re on this topic, a couple of other questions just for fun. I think once before I admitted publicly that least squares type objectives, and in particular the sum of the XI squared here, the sum of the forces squared, generally speaking are of no particular practical relevance. It’s generally not what you want to do, right? Thrusters don’t come with a label or a tag hanging off the side that says, “No matter what you do, do not apply a signal whose sum of squares is more than this.” They don’t come that way. Okay. The way they really come is with things like this: there’s a maximum force you can apply, or there’s an amount of fuel you use. Now, by the way, these have names. This is just for fun, but just to let you know a little bit about this. The infinity norm, I think we encountered this once, is the maximum of the absolute values. So the way you would say this in electrical engineering, for example, is that it’s the peak of the vector. If that is a signal, that’s the peak of the signal in absolute value. That’s a norm, and there’s also the one norm, which is the sum of the absolute values. Now, the infinity norm tells you essentially how big a thruster you need to apply the forces. The one norm is a very good first order approximation of fuel use. For example, if you really were using thrusters to position this mass, it would be something related to fuel use, because that’s generally how it works. Fuel use is generally proportional to the force that you apply. Okay. You can have more complicated things, but for a thruster that’s a pretty good approximation. Okay. Now, by the way, our good old friend the Euclidean norm in this context inherits a two at the bottom so that you can distinguish it. These are all three norms.
They all three measure how big a force program is. This one measures it by the peak, this one measures it by the sum of the absolute values, which you can think of as fuel usage, and this one measures it by the sum of the squares, which we often say is energy, and that’s mostly to hide the fact that we don’t really care about this. It’s just what’s easy to do mathematically. Okay. That’s the real reason.

Now, I have a question for you. Suppose that instead of minimizing the sum of squares for moving a mass one meter, I asked what happens if you minimize the maximum, and I want you just to guess. What do you think is the optimal thing to do? We can call this the gentlest transfer, because I’m applying the smallest maximum force to the mass. The one we just worked out here you could call the minimum energy transfer. And I want to know, what’s the gentlest transfer?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Exactly. So the minimum — I don’t know the level but it’s whatever it has to be. It’s gonna be this. You’re gonna apply a force, a constant force up to five. You’re gonna constantly accelerate until five seconds at which point you will decelerate like that, with the exact — with the same force. Okay. But there’s a name for this. This is very famous. It’s called bang-bang control for obvious reasons. It’s always up at the limit each time.
And let me ask you this. You all use disk drives constantly and those are — in a disk drive, what happens is the little thing is sitting there, track 23, and a signal — a command comes in to seek track 125 and you have to move it there. Okay. The — I got news for you. That’s this problem, okay? And you have to do it, by the way, in a handful of milliseconds. Once you get there, you have to get rid of all the shaking and stuff like that. You have to be tracking something within microns or less. This is serious stuff. Okay.
What do you think the current signal in a disk drive head positioning system looks like? Does it look like this or does it look more like that? Just guess. What’s that? Yeah, the answer is, it looks much more like this. Actually, it’s not sharp like that. It’s got a little bit of a rounded thing there because it’s a little bit more complicated, and it’s taking into account all sorts of other vibration modes and stuff like that. But basically it looks like that. Why? Because the amplifier will source or sink a maximum amount of current, and the goal is to seek as fast as possible. So your goal is not to minimize the sum of the squares of the currents. By the way, if you’re worried about power, the power is closer to this in a disk drive. Okay.

Now, let me ask you this. How about this one? What if I asked you — so we worked out what the gentlest — well, I don’t know if you’d call that gentle. But the gentlest in terms of the maximum force you ever apply on the mass transfer is this — what about the most fuel-efficient? Again, just go ahead and take a guess. People in aero-astro could probably guess this. If you’ve studied satellite — if you’ve actually studied how satellites are, for example, moved back on orbit then you might know — any other — any guesses? You have a guess. What’s your guess?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:What’s that?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:You got it. So the optimal here is a giant force there. And, oh, that’s not right. There. So the X that minimizes the sum of the absolute values, which would be something like the fuel use, is gonna be this. It’s an impulse. Well, I mean, this is silly. It’s not an impulse. It lasts for a second. You do a fuel burn at the beginning, and what that does is it just accelerates the mass. And then this is actually called the ballistic phase in the middle. Ballistic means it’s just moving with no forces on it other than gravity, and in this case there is no gravity. So it’s just floating along. And then right in the last second, you apply a counteracting braking force. And this minimizes the fuel.
Okay. And by the way, you’ll see this if you actually look at a satellite or something like that positioning itself. You’ll see little puffs come out. You’ll see, like, little puff, puff, puffs come out one side and then a little bit on the other side and stuff like that. That’s exactly this so — I can tell you have absolutely no idea what I’m talking about but that’s fine. Okay.
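To make the three force programs concrete, here is a rough comparison in plain Python. It assumes the same unit-mass model as the mass example (a force in second j adds 10 - j + 1/2 to the final position and 1 to the final velocity per unit of force). The specific levels, 1/25 for the bang-bang program and 1/9 for the burn-coast-brake program, are worked out here from the two constraints under that assumed model; they are not values quoted in the lecture. The code only checks that each program completes the transfer and reports the three norms; it does not solve the peak or fuel minimization problems.

```python
from fractions import Fraction

N = 10
pos_row = [Fraction(2 * (N - j) + 1, 2) for j in range(1, N + 1)]  # 9.5, ..., 0.5


def final_state(force):
    """Return (final position, final velocity) under the assumed model."""
    return (sum(p * f for p, f in zip(pos_row, force)), sum(force))


def peak(force):      # infinity norm: largest force magnitude
    return max(abs(f) for f in force)


def fuel(force):      # one norm: sum of force magnitudes
    return sum(abs(f) for f in force)


def energy(force):    # squared Euclidean norm: sum of squares
    return sum(f * f for f in force)


programs = {
    # Affine least norm program for this model, in closed form.
    "least norm": [Fraction(11 - 2 * j, 165) for j in range(1, N + 1)],
    # Constant push for five seconds, then constant braking.
    "bang-bang": [Fraction(1, 25)] * 5 + [Fraction(-1, 25)] * 5,
    # One burn, eight seconds of coasting, one braking burn.
    "burn-coast-brake": [Fraction(1, 9)] + [Fraction(0)] * 8 + [Fraction(-1, 9)],
}

for name, force in programs.items():
    assert final_state(force) == (1, 0)  # unit distance, zero final velocity
    print(f"{name:17s} peak={float(peak(force)):.4f} "
          f"fuel={float(fuel(force)):.4f} energy={float(energy(force)):.4f}")
```

Among these three candidates, each one is best in its own measure: the bang-bang program has the smallest peak, the burn-coast-brake program uses the least fuel, and the affine program has the smallest sum of squares.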

So all of this was an aside, just to say — or if you want to learn about these things, then you’d learn about this stuff in 364, which is probably not exactly the top thing on your mind at this moment. But that’s where they — so it turns out that you can actually solve these things not with analytical formulas but it’s totally straightforward to actually work out these things. Okay. Any more questions about this? Okay.

So the next thing I want to do is connect — is make some connections between regularized least squares — actually connect least squares and least norm solutions. And the way they connect is this. Suppose we have a fat full rank matrix. Let’s imagine now a two objective problem and it looks like this. J1 is AX-Y norm squared and J2 is norm X squared. Well, the least norm solution basically requires that you be a solution so it requires AX=Y so it says, plea — it says minimize J1 absolutely to the limit and you get — and it minimizes J2. So in a tradeoff plot that’s one of — the least norm solution is one point on the tradeoff curve between these two. The other point, by the way, is X equals zero, which is not very interesting but still, it’s the other point. Okay. Now, let’s imagine doing this. Let’s take a weighted sum objective which is J1 + ?J2 like this and let’s minimize it. That’s A — this is AX-Y norm squared + ? norm X and we’re gonna let — the solution to that is A transpose A+BY. Now, what — by the way, when A is fat and someone writes A transpose A, your — first of all, your height — heart rate should increase slightly. You should start breathing sort of shallow breaths and things like that and why is that? If you have a fat matrix and someone writes, “A transpose A,” your vocal cords should get ready to cry out in protest. Your autonomic response should be triggered. What am I talking about? Do you know what I’m talking about? Yeah, good. Okay. That’s all. Okay. You should — because when someone takes — writes — has a fat matrix and writes — yeah, is that right? Yes. Then this is actually — this is the product that passes the syntax scan but is — you’re just waiting. Especially if you see that left bracket there, that’s when you should be tot — you’re like — you should be like, [Makes Noise], like that. But everything is fine here because of this. Okay. So that’s all I’m saying. Okay. 
So this is actually cool, although it’s very close to something that’s not cool. And it’s only cool if μ is positive. It’s really not cool if μ is zero here. I mean, really not. Okay. So now what happens is we’re gonna let μ go to zero. That says I care less and less about the size of X. Now, when μ is zero, if someone just walks up to you and says, “please minimize J1,” actually someone can hand you back legally any solution of AX=Y. So if someone hands you back two solutions of AX=Y and the specs actually only call for minimizing J1, that’s absolutely valid. Because someone says, “Boy, that’s crazy. Someone else gave me this solution of AX=Y where X is much smaller.” And you go, “Sorry. I checked the specs. I didn’t see any mention of the norm of X.” So minimizing just J1, there are lots of solutions and, in fact, any solution of AX=Y does the trick, big, small or otherwise. The minute you put in μ here, for example, even if it’s 10 to the -8, now there’s a difference between the two. So if you now find a solution of AX=Y with a big norm X, you’re gonna pay slightly more, and therefore, as long as μ is positive, it’s gonna show up in the composite objective. So what that tells us is that as μ goes to zero, X sub μ should go to X least norm. And, in fact, that’s exactly what happens. Now, you want to be super careful here because as μ goes to zero, this matrix becomes singular. That’s essentially a denominator going to zero. That’s what it is. So you’re gonna have to be very, very careful here. And it turns out, it’s not that hard to show that for a full rank fat matrix A, A transpose A + μI, inverse, times A transpose goes to A transpose times A A transpose inverse. So it actually converges to that. And it’s not too hard but it’s a little bit tricky in the sense that you don’t simply plug in μ equals zero. Because if you plug in μ equals zero, the left-hand formula doesn’t even make sense because you’re inverting something which is not invertible, okay? Nevertheless, this is the case, so okay. So that’s the connection between those two. That explains one of the points on those tradeoff curves.

And now we’re gonna go to the parent of both least squares and least norm because it’s not bad to know it. So here is the common parent. The common parent is minimize the norm of AX-B subject to CX=D. So minimize the norm of a general affine function subject to a linear equality constraint. So that’s the parent of both of them. And let’s see. So in this problem how would I reconstruct, for example, least squares? Well, you just forget the constraint. How do I make this into least norm? What would I choose to make this a least norm problem? This thing. I’d take A=I and B=0. If I take A=I and B=0, that’s a general least norm problem because I’m minimizing then just norm X subject to some linear equations. Okay. So how do we solve this? Well, as usual we square the norm because minimizing the norm is the same as minimizing the square. And when you minimize the square, it’s nice because we have a nice formula for the square in terms of inner products. Then that ½ goes in front. Why? Because it makes all the formulas prettier because we’re gonna differentiate, basically, a square and we didn’t want the two polluting all our formulas so this is what we do. You form a Lagrangian now. That’s the objective plus λ transpose times CX-D. That’s this Lagrangian. And then we expand everything out and then it looks like that. So that first term there, that cross term is from here. This term is the third term from here and then these are the two terms there. Now, there are two optimality conditions. One of these, the gradient with respect to λ being zero, just recovers our equality constraints. It’s not interesting.
The other one says that the gradient with respect to X of the Lagrangian, that’s A transpose AX minus A transpose B plus C transpose λ, is zero. That’s actually a real equation right there. Now, you can actually solve all of these equations. I’m gonna do it on the next page but it’s not pretty. And it turns out there’s a better way to do this. It’s to write it as a joint equation in both X and λ. So we’re gonna do that. This top equation is A transpose A times X plus C transpose times λ. That’s this term and this term. And then this goes over to the right-hand side and you get A transpose B. This equation, CX-D=0, well, that’s really just the constraint. I write that down here this way as C times X plus 0 times λ equals D. So you get this equation here. That’s a square matrix, and it’s a very famous matrix that comes up in lots and lots of contexts all over the place. It comes up in, like, economics and, oh, tons of areas. I mean, this form of matrix. Now, if this matrix is invertible, we get the solution immediately and that’s this. It’s X and λ. So you get both the optimal X and the optimal λ simultaneously, and it’s given by simply, well, obviously it’s the inverse of this matrix times that. Okay. And now, I actually strongly recommend that this is the one you should keep in mind. It’s the right one. By the way, some people call this a primal dual formulation, and I can say why. X is thought of as a primal variable here and this Lagrange multiplier is a dual variable. And so in this formulation, you’re really jointly finding both the primal and the dual variables. I mean, that doesn’t matter but I’m just saying that’s what this is. Now, this will recover all of our forms. So this is the common parent of both least squares and least norm. And you can recover all of our formulas. So for example, if A transpose A is invertible, that means, of course, that A has to be skinny and full rank.
Then you can actually block solve these equations here. So what you do is if A transpose A is invertible, I multiply this equation by A transpose A inverse and I get X equals A transpose A inverse times A transpose B minus C transpose λ. That’s here. You get this formula for X in terms of λ. Now, you take this X and you plug it back into CX=D and you get this equation. And now you can get λ. λ is this. It’s C times A transpose A inverse times C transpose, all inverse, times this thing. And now finally you go back to this formula. It gives you X in terms of λ and you get that. So actually, really, it’s your choice. You can remember this one here or that. I mean, of course they’re the same thing. This is just working out in detail what solving a block two by two system gives you. Okay. So this is the picture. You can check, by the way, if you go back to the original parent problem here. You can check. It recovers everything, absolutely everything. So for example, if A is I and B is 0, you can go down here and plug this into the horrible formulas down here. If you have B is 0, a lot of things simplify, right? That goes away, that goes away. If A is I, all these things that say A transpose A inverse, they all go away. And I think, yeah, sure, it looks like, except I’m seeing a... no, I’m not seeing a minus sign. There’s a minus here and a minus there that cancel each other and you’re recovering it. So it does kinda recover all the equations. This is useful. I think we made a terrible mistake and didn’t assign any homework problems that required this. Is that true? I think it’s true that we failed to assign any homework problems that use this. We just kept to least norm and least squares type things. But you should know this. Okay. So that finishes up all the material that will be on the mid-term. And it finishes up in fact the first, I don’t know, 40 percent of the course or something like that.
So that finishes up a whole block. I’m gonna start the next material because we’re actually in a very good position. Sometimes we don’t finish the material until, like, Thursday.
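The limit described above, that the regularized solution approaches the least norm solution as μ goes to zero, can be checked numerically. The following is only a sketch under assumptions: the matrix `A`, the vector `y`, and the helper name `x_reg` are made up for illustration and are not from the lecture.

```python
import numpy as np

# Hypothetical data: a fat (3 x 6) matrix. The identity block guarantees full rank.
rng = np.random.default_rng(0)
A = np.hstack([np.eye(3), rng.standard_normal((3, 3))])
y = np.array([1.0, -2.0, 0.5])
n = A.shape[1]

# Least norm solution: x_ln = A^T (A A^T)^{-1} y.
x_ln = A.T @ np.linalg.solve(A @ A.T, y)

def x_reg(mu):
    """Minimizer of ||Ax - y||^2 + mu * ||x||^2, valid only for mu > 0."""
    # A^T A alone is singular for fat A; adding mu * I makes it invertible.
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ y)

mus = [1.0, 1e-2, 1e-4, 1e-6]
gaps = [np.linalg.norm(x_reg(mu) - x_ln) for mu in mus]
for mu, gap in zip(mus, gaps):
    print(f"mu = {mu:g}: ||x_mu - x_ln|| = {gap:.2e}")
```

The printed gap should shrink as μ shrinks. Note that the code never sets μ to zero, for the reason given in the lecture: the matrix being inverted is singular at that point.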

**Student:**[Inaudible] least squares problem?

**Instructor (Stephen Boyd)**:Oh, how do you recover the least squares problem? Well, there’s actually a couple of ways to do it. So the simplest way is to just not have C there. And I believe this will actually — it will actually work in that case. So you make C an empty matrix, whatever that is. So, yeah, it works.
Look. If I just pretend C is — I actually can’t pretend C is zero. That actually won’t work because this matrix won’t be invertible because it will have rows down here that are all zero. So what we have to do is C is null, so it’s not even there. If C is not here, you do get this, right? You get this thing inverse times A transpose B and it looks good to me. I mean, it’s not totally straightforward but that’s the right thing to do when C is null as opposed to being zero. Are you buying that? No, you’re not. What part of it are you not buying?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Sorry. How does what? Oh, you mean up above? Oh, yeah, that’s easy. Let’s go back to that. Oh, how did — I’ve lost it. There it is, okay. So here if you want to make this least — the least squares problem all we do is we eliminate that. That’s least squares. Okay. Now are you buying my other one? Okay. Good, great. Any other questions about this material?
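The block equation in X and λ, and the way it recovers least norm and least squares, can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the course’s code: the function name `constrained_least_squares` and all the data are hypothetical, and the block matrix is assumed to be invertible.

```python
import numpy as np

def constrained_least_squares(A, b, C, d):
    """Minimize ||Ax - b|| subject to Cx = d by solving the block system
    [[A^T A, C^T], [C, 0]] [x; lam] = [A^T b; d] (assumed invertible)."""
    n, p = A.shape[1], C.shape[0]
    K = np.block([[A.T @ A, C.T],
                  [C, np.zeros((p, p))]])
    sol = np.linalg.solve(K, np.concatenate([A.T @ b, d]))
    return sol[:n], sol[n:]

rng = np.random.default_rng(1)
A, b = rng.standard_normal((8, 4)), rng.standard_normal(8)
C, d = rng.standard_normal((2, 4)), rng.standard_normal(2)
x, lam = constrained_least_squares(A, b, C, d)

# Block elimination, valid when A^T A is invertible (A skinny and full rank).
G = np.linalg.inv(A.T @ A)
lam_elim = np.linalg.solve(C @ G @ C.T, C @ G @ A.T @ b - d)
x_elim = G @ (A.T @ b - C.T @ lam_elim)

# Least norm is the special case A = I, b = 0.
x_ln, _ = constrained_least_squares(np.eye(4), np.zeros(4), C, d)
x_ln_formula = C.T @ np.linalg.solve(C @ C.T, d)

# Least squares is the special case where C has zero rows (not C = 0).
x_ls, _ = constrained_least_squares(A, b, np.zeros((0, 4)), np.zeros(0))
x_ls_ref = np.linalg.lstsq(A, b, rcond=None)[0]
```

Passing a `C` with zero rows matches the point made in the answer above: an all-zero `C` with actual rows would make the block matrix singular, while an empty `C` simply leaves the normal equations.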

**Student:**Just one more question.

**Instructor (Stephen Boyd)**:Yep.

**Student:**Isn’t there a diagram about [inaudible]?

**Instructor (Stephen Boyd)**:Yep, that’s the — back here somewhere. I’ll find it. I’ve lost it. Here it is. There you go.

**Student:**[Inaudible].
**Instructor (Stephen Boyd)**:It was what?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Oh, sorry. Did you mean this, for the mass? No. Oh, do you mean the geometric picture? Okay. I’ll draw it again because it’s gonna be faster than my finding it. Okay. So, you know, the pictures are somewhat unexciting, right, because they’re generally in R2. So here’s a set of X such that AX=Y, I guess we use here. Like that. Okay. All these points satisfy AX=Y. I mean, this is silly because A is actually a row vector here. A is lowercase a transpose, and there is a. Okay. So that’s what it looks like. So any point on here satisfies AX=Y. This point right here is the point of closest approach to the origin. That point actually has least norm. And this point would be X least norm for this problem.

**Student:**And how did you get the null space of A?

**Instructor (Stephen Boyd)**:Oh, and how did I get the null space of A? Well, the null space of A in this case is this. And I can do that several ways. A is lowercase a transpose. A is a row vector here. And lowercase a is the normal of this hyperplane. So if you look at all the points that are orthogonal to a, it’s this line right here, okay.
Now, there’s another way to see it. This is the solution set of AX=Y. And the point there is that if one person has an X and another person has a different X, and they both satisfy AX=Y, the one thing you can be absolutely sure of is that the difference is in the null space. And, in fact, that’s if and only if. You know, so in other words if one person has a solution and you have an element of the null space, you add it, you get a new solution.
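The picture in R2 can be checked with a small example. This is a sketch with made-up numbers: the row vector `a`, the value `y`, and the null space direction `z` are assumptions chosen for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0]])   # A is a single row, a^T, so it is 1 x 2 (fat)
y = np.array([5.0])

# Least norm solution x_ln = A^T (A A^T)^{-1} y: the point on the line
# {x : a^T x = y} that is closest to the origin.
x_ln = a.T @ np.linalg.solve(a @ a.T, y)

z = np.array([2.0, -1.0])    # orthogonal to a, so it spans the null space of A
x_other = x_ln + 3.0 * z     # solution plus a null space element

print(x_ln, a @ x_ln)        # a solution of Ax = y
print(x_other, a @ x_other)  # still a solution, but with larger norm
```

The difference `x_other - x_ln` is a multiple of `z`, which is the "difference of two solutions lies in the null space" statement, and `x_ln` itself is orthogonal to `z`, which is why it is the point of closest approach.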

So that said, that sort of makes sense here because it says that when you’re moving in this direction, you’re really moving in the null space. And so that’s another way to understand why the null space would be the same thing but translated to the origin. Okay. So my claim is you know quite a lot now. And it’s not that much math in it, but it’s not trivial. You know a fair amount. And these methods, maybe you’re convinced, maybe not. You can already do serious things. You can do all sorts of stuff that you could not do by some heuristic or hacking method. Just with the least norm, least squares, throw in a little regularization, a little multi objective, throw in a smoothing parameter, you’d be surprised what you could do. That’s you, of course, and computers and high quality open source software, I might add. Because people did least squares before they had computers. It was not pretty. Okay. It was basically you would do these things with a calculator, I mean, with a mechanical calculator, and that’s if you were really lucky and had the mechanical calculator. So it was done. It’s a lot easier now. You should be glad you weren’t born 80 years ago, something like that, longer, a hundred. Okay.

If there’s no more questions about that, we’ll move on and actually cover just kinda some of the boring stupid stuff for the next topic which is autonomous linear dynamical systems. So if you can go to — which is I guess what the class is nominally about so we got to it finally. Okay. So what we’ll do is I’ll just go over some of the nomenclature. I’ll talk about some of the basic ideas and get that over with.

So autonomous means that it goes by itself and that means, in fact, that there’s no input here. So what we’re missing from the general formulation is this. That’s just gone for a while. So we’ll first understand just what happens if you have Xdot=AX. It looks very simple. It’s a first order vector differential equation. And we should probably just as a warm-up, answer the following question. If A is one by one, which is to say X is scalar, let’s get this out right now. What’s the solution of Xdot=AX in that case? Well, it’s an exponential, right? It’s something like this. It’s X of T equals E to the TA times X of 0. Something like that. No, no, it’s not something like that. It is that. Okay. That’s the solution when A is lower case, which is to say it’s a number. Okay. So you can expect something like this to come up. By the way, the qualitative behaviors of the scalar differential equations are kind of boring. Let’s talk about them now. If A is 0, it just says Xdot is 0 so X is a constant. If A is positive, you get a growing exponential. And if it’s negative, you get a shrinking exponential, okay? So that’s it. That’s my discussion of Xdot=AX where X is scalar, okay? There’s basically three qualitative types of behavior. They’re all kind of boring. You can’t really have anything that interesting, okay? So just file that away. Because what we’re gonna do now is you’d think if you overload this idea to vectors, how much more interesting can it be? And you’ll find out very soon. Actually, it’s pretty much as interesting as any dynamical system can get almost. There’s another level but we’ll get to that later.
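The scalar case can be written out directly. The sketch below uses the formula from the lecture, X of T equals E to the TA times X of 0; the function name `scalar_solution` and the sample values are assumptions for illustration.

```python
import numpy as np

def scalar_solution(a, x0, t):
    """x(t) = exp(t * a) * x(0), the solution of xdot = a * x for scalar a."""
    return np.exp(t * a) * x0

t = np.linspace(0.0, 2.0, 5)
decaying = scalar_solution(-1.0, 3.0, t)   # a < 0: shrinking exponential
constant = scalar_solution(0.0, 3.0, t)    # a = 0: x stays at x(0)
growing = scalar_solution(1.0, 3.0, t)     # a > 0: growing exponential

# Finite-difference check that the formula satisfies xdot = a * x at one time.
a, x0, t0, h = -1.0, 3.0, 0.7, 1e-6
xdot_fd = (scalar_solution(a, x0, t0 + h) - scalar_solution(a, x0, t0 - h)) / (2 * h)
residual = abs(xdot_fd - a * scalar_solution(a, x0, t0))
```

The three arrays correspond to the three qualitative behaviors described above, and `residual` should be close to zero.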

Okay. Now, here X of T is called the state. N is the state dimension or informally it’s the number of states. So it is slang to refer to X sub I as the Ith state. However, it’s widely used slang. Basically, you wouldn’t write that, I think, but you would say it. Now, of course in a lot of applications like in dynamics of structures or aircraft or something like that, the Xs actually have names like, you know, X1, or they have meanings, in which case you would actually talk about that. You know, what is the yaw and what is the yaw rate and what is your angle of attack and all this kind of stuff, your altitude. So in that case, of course, well, it’s still slang but you would talk about those as individual states. Okay. So N is the state dimension or the number of states. A is called the Dynamics Matrix. By the way, in lots of different fields it’s got a different name. Let’s see. I was just talking about aeronautics. So what is A called in aero? There’s a bunch of people here in aero-astro. What is A called when this is a perturbational model of some steady state flight? You know, I mean, A has a name and the entries have names.

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:That’s it. So in that case the entries of A — in that case A is called the matrix of stability derivatives. I am not sure where that came from except that indeed it will depend — the entries in that matrix will determine whether the flight is — that flight mode is stable or not. So they’re called the stability derivatives. And I guess it’s obtained from linearization of a non-linear system, so that would explain the derivatives, so okay. And other fields have other names for it. In circuit design it’s called the small signal dynamics matrix or I don’t — who knows. But anyway, lots of fields have different names for it. Okay.

So here’s a picture. It’s very stupid and extremely useful. It’s this. So here’s your state at X of T. And it’s very useful to do the following. Of course A X of T is just a linear function of X, and basically A maps where you are into where you’re going, because X is essentially where you are in state space. Xdot is where you’re going. So A maps X into Xdot. Oh, by the way, what are the physical units of A? Assuming let’s say all the Xs are in, you know, some common units. Let’s just leave it that way. So all the Xs have some units which are irrelevant. What are the units of A?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:It’s inverse seconds, exactly. It’s a frequency. It’s a rate. That’s what it is. I mean, this is kind of obvious, but A is a rate. A is in inverse seconds. I mean, it depends on the units in X, but generally it’s in inverse seconds. By the way, that means that big A is a fast system and small A is a slow system. I guess this is kind of obvious, so I’m gonna move on. Let’s go back over here.

So Xdot which is AX is where you’re going. And it’s extremely useful to take that vector and to glue its base to X. And so you have a picture like that. So if you’re over here, AX might point in that direction, okay? And if you’re over here, AX might point in that direction, okay? And what it says — it does not mean, of course, that X is gonna be traveling along this line. What it means is that along the solution of X at this point whatever that curve is, it’s tangent to this line. And the length of that line gives you the actual speed at that point, okay? This is kind of obvious. All right.

Now, if you draw a picture of Xdot for a whole bunch of points X in a plane, you get a picture called, oh, so there’s a name for this. Actually, it’s a vector field, okay? So that’s both, by the way, a mathematical description. That describes something which on some set at each point gives you a derivative on the set. That’s formally a vector field. So, in fact, Xdot=AX, you would actually call in mathematics a vector field, okay? But it’s also used informally to mean something like this where you have a field of points and sort of at each point conceptually, of course you don’t draw it at each point, you draw a little arrow that gives you a rough idea of where you’re going and how fast. Okay. So this is the example for Xdot=AX where A is the two by two matrix with rows (-1, 0) and (2, 1). We can check things. But the cool thing about this is when you see this vector field, you can actually start visualizing the trajectories. That’s actually very important to understand really what’s going on. So let’s see what it says. It says if you’re here, you’re moving up and to the left and you’re moving at a pretty good clip at least compared to over here. So although you’re not gonna end up here, you know, you don’t know where you’re gonna end up but it might be like here. And you can see now that you’ll actually keep moving up. It looks to me like it’s even accelerating. So you can imagine a point starting here as actually kind of moving up like that, okay? On the other hand, if you’re sort of over here, if you start here, you can sort of imagine now various things. You know, you might slow down. You’re not gonna actually hit zero. You’d slow down a lot, and then it looks like you might actually start accelerating as you go along there, okay? So these two are just the kinds of things you would get. And by the way, if you ever have a system and you want to quickly figure out what it does, you need to look at pictures like this. It only works in two dimensions.
Actually, it depends on your visualization skills. You could probably do this in three, but it would be tricky, I guess. Okay.
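A vector field like the one described can be tabulated with a few lines of NumPy. This sketch takes the example’s dynamics matrix to be the two by two matrix with rows (-1, 0) and (2, 1), which is an assumption about the slide; the grid and the helper name `vector_field` are also made up for illustration.

```python
import numpy as np

A = np.array([[-1.0, 0.0],
              [ 2.0, 1.0]])   # assumed example matrix

def vector_field(A, xs, ys):
    """Return grid points (N x 2) and the velocity xdot = A x at each (N x 2)."""
    X1, X2 = np.meshgrid(xs, ys)
    pts = np.stack([X1.ravel(), X2.ravel()])   # 2 x N, one column per point
    return pts.T, (A @ pts).T

grid = np.linspace(-2.0, 2.0, 5)
pts, vel = vector_field(A, grid, grid)
speed = np.linalg.norm(vel, axis=1)   # arrow length = speed at that point

# At x = (1, 1) the arrow points up and to the left.
v_example = A @ np.array([1.0, 1.0])
print(v_example)
```

If matplotlib is available, `plt.quiver(pts[:, 0], pts[:, 1], vel[:, 0], vel[:, 1])` would draw the arrows with their bases glued to the points, as in the lecture’s picture.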

Here’s another example. Another little baby two by two matrix, and in this case it’s this. So far, you look at just a matrix and you know what it means in terms of its input, output entries, right? If I write a matrix down and there’s zeros, you know what it means. If there are large entries, you know what it means. If there’s negative numbers, you know what it means. In terms of just how the input affects the output. So that much you have. That should be wired into you by now. You will actually develop something like that for dynamics matrices, so later you will come to look at that matrix the same way. So certainly for two by twos and three by threes you’ll start getting a very good idea. You’ll look at that and get a rough idea. There’s gonna have to be some computation to really know what happens but that’s the idea. So here’s the vector field here and you can kind of get a pretty good idea for it. Here it looks like the trajectories are kind of elliptical. Now, I’ll tell you what you can’t tell by your eyeball here, unless you were super duper careful. You can’t tell if the trajectories are winding in or winding out. You’d have to really kind of trace this very carefully and figure out if, when you kinda come around one cycle, you’re bigger or smaller than you were before. Okay. So that’s, I think, not obvious from here. It will be very obvious to you in a week as to how to do that. But that’s the idea. Okay.

Now, another very useful thing is a block diagram. So you can write Xdot=AX this way. By the way, it’s done not with differentiators but with integrators. So that’s — and there’s historical reasons for it. Well, I’ll tell you what the historical reason — actually, does anyone know the historical reasons for it? It’s entirely likely that you’re all too young to have any — this is in the deep — this is — we’re talking slide — we’re talking before slide rules here. Anyone here ever use a slide rule? Cool, zero, you did. That is so cool. Did you do it as a joke or, no, you really used it?

**Student:**Well, it was my dad’s.

**Instructor (Stephen Boyd)**:It was your dad’s, well, there you go. So all right. So but still it’s cool, though. Do you actually know how to use it?

**Student:**It’s probably [inaudible].

**Instructor (Stephen Boyd)**:Cool. That’s about the right — that’s about how it should stay. Okay. So I can tell you the — I’ll tell you the historical reason for this. So first let me just say what this is. This is a vector. These are vector signals and it’s sometimes common — I guess this is from digital circuit design — to take in a signal flow graph to put a little note with a line through it. I don’t know why this tradition came up.
And this tells you the dimensions. So that’s a vector signal with N components, X of T. It goes into A so what comes out here is AX. And that goes into 1/S, which you really should write as I/S. Because it’s a vector signal in and vector signal out, the slang for this on the street should be a bank of integrators. That would be the slang for this. Because if I exploded this out and showed the individual components, it would really look like this. It would look like that. Let’s say if it’s two by two. So that’s if I clicked on that box and asked for the detail, I would get this, okay? So it looks like that. So it would be a bank of integrators. These are now scalar integrators here. Okay.
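The block diagram can be imitated in software by stepping a bank of integrators forward in time. The sketch below uses forward Euler, which is a swapped-in approximation and not the solution method the course develops later; the matrix, step count, and the name `simulate` are assumptions for illustration.

```python
import numpy as np

def simulate(A, x0, t_final, steps):
    """Approximate xdot = A x with forward Euler, mimicking the block diagram."""
    h = t_final / steps
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(steps):
        xdot = A @ x       # output of the A block, which feeds the integrators
        x = x + h * xdot   # each scalar integrator accumulates its own input
        traj.append(x.copy())
    return np.array(traj)

# Hypothetical two by two example whose exact solution is known:
# with x(0) = (1, 0), x(t) = (cos t, -sin t).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
traj = simulate(A, [1.0, 0.0], t_final=1.0, steps=10000)
exact = np.array([np.cos(1.0), -np.sin(1.0)])
err = np.linalg.norm(traj[-1] - exact)
print(err)
```

Here the loop body is the diagram read literally: what goes into each integrator is a component of Xdot, and what comes out is the corresponding component of X.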

And now, let me get to why integrators. So nowadays you will soon see how to actually solve the equation Xdot=AX. It won’t be surprising to you that you can work out the whole trajectory for N equal to 1,000, 2,000. I mean, these are just enormous systems. Just immediately on a laptop. I mean, 2,000 is not immediately, all right. But a thousand, even 500 is extraordinary, okay? So a 500 state model will model a lot of things. I mean, that’s actually a fairly detailed structural model of a lot of things. You can actually just solve Xdot=AX. It’s nothing. It’s gonna be two lines of code, something like that. It’ll run on a laptop, not yet a phone but that’s coming. So it’s sort of like least squares. For you it’s nothing, it’s a backslash, right? For your parents, it was much more complicated. It was a half day of coding Fortran. Don’t even ask what your grandparents had to do to do least squares. Maybe not your exact grandparents but somebody’s grandparents did it and it wasn’t that cool. It was mechanical calculators or sheets or slide rules. Lots of people in rooms. So it was done. So all right. Back to this. In the, I don’t know, in the 20s, 30s, actually even earlier than that. Anyone know? You want to do a Wikipedia on differential engine, Vannevar Bush, differential engine, differential analyzer, there you go. I believe it might even be late 19th century. So in the late 19th century it was already recognized that if you understood what the solutions of Xdot=AX did, you could actually say a lot about how a machine or something like that was gonna work. 19 what?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Well, I was off as usual. It’s a good thing I’m not in the history department. But I’m allowed to — I got it vaguely right. It was a long time ago, so 1927. So in 1927 — oh, but maybe that’s the — is that the mechanical one? Okay. So this guy built a mechanical system that will actually give you the approximate solution of Xdot=AX, okay? Not long after that people built vacuum tube computers like analog computers. This means nothing, thank God, actually, to anyone. Nothing, no one has even heard of this. That is so good. Usually — you’ve heard of it? That’s so good. Did you actually see one? No, okay. That’s too — I should bring in some pictures just so you know how lucky you are now. Yeah, what’s that?

**Student:**[Inaudible.]

**Instructor (Stephen Boyd)**:Yeah, they’re typically in basements now or storage closets. Yes, that’s right. You saw —

**Student:**[Inaudible.]

**Instructor (Stephen Boyd)**:Yeah, sure, they really used them. Okay. So what it was was this. It was an electric — you had a big patch panel and you had electronic integrators. I guess anyone here in electrical engineering knows how to do that with an op-amp and a capacitor in the feedback loop, you get an integrator. And you had a big — you had a whole bunch of integrators and then you had a little like banana plug things and you could plug these up and you could wire them up. They had little gain units that you would dial in.
They’re really quite beautiful. I — actually, I never touched one so — just so you know. So — and you dial in little gains and things like that and you’d have a whole — so how would you actually program this analog computer? You’d do it by actually physically hooking wires up between these things, okay? And then there’d be a big button and you — a big button and you’d press start. And also the red lights would come on meaning that things just overflowed their ranges, right? And that either means you messed up the programming which in this case literally means plugging wires in or it means you probably shouldn’t build that aircraft. It means one or the other. You’d have to figure out which it was.
Oh, and the way you would — the way if you had like a class like a homework exercise on the analog computer, the way it would work was actually kind of cool. You’d have gra — your program would be this big thing like this with all your wires on it. And you would detach the whole thing and then walk around with it. And then the other — another student would come in and plug theirs into the analog computers.

Are you at least a little bit grateful now about when you were born and stuff? I mean, I hope so. Yeah, I’ll take that as a small sign of gratitude. Watch out because provoke me and homework eight, analog computer. And if you don’t think I can find one on E-Bay, you are wrong. Okay. So all right. What’s that?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Oh, take a bite. I’m gonna take that as an open challenge. You —

**Student:**It is.

**Instructor (Stephen Boyd)**:Okay, cool. No, you know what we’ll do. What’s that?

**Student:**[Inaudible].

**Instructor (Stephen Boyd)**:Oh, okay. Yeah, actually the T.A.s have to take this course on, you know, ethical actions and this, so Jacob is exercising his right to not be involved in such an escapade. But that would be great. No, maybe we’ll do it with like white proto-boards and op-amps and capacitors. That would be great. Okay, all right. But it’s noted that there’s been a challenge. All right. Back to this.

So the reason that you find integrators here is because of this. Oh, and by the way, I guess you mentioned, someone, you knew what the story was. So this was used in 1939 through 1945 at M.I.T. They put these things together. They weren’t quite linear. They actually had other terms in. In fact, it was just basically, you’d fire a shell and wanted to work out firing tables. And a firing table was for a certain shell, if you fired at this angle and there’s a wind in a certain direction, the question is where does it land? And you would solve a differential equation like this. It wasn’t quite linear. It had one non-linear term in it. And this was done, in fact, in secret in a basement in M.I.T. with analog computers. And then the results, you’d just tabulate. It just ran all the time and they’d work out the tables, and when you wanted to use it, you checked the wind, you know, go in the table, find the range and find out you should elevate it 22.63 degrees. Okay. So that’s what this was. Okay. All right. So that’s just a historical comment about why you see integrators and why this block diagram would strike fear into the hearts of your parents and grandparents if they did this kind of thing. But not you. And that’s why you should be grateful. Okay. All right. Okay.

So if you draw a block diagram out, if you explode the block diagram of A, you can actually get interesting information. Here’s an example. Suppose you have Xdot is AX where A is block upper triangular. Well, by now if you see Y=AX and A looks like that, you know exactly what it means. Without even thinking you would say, hey, how interesting. Let’s suppose that’s Y. You’d say that the bottom half of Y doesn’t depend on the first half of X. It doesn’t have to be half, of course, but the first part of X. That’s what you’d say when you see that. But now, that’s the derivative, which is actually more interesting. So you read this equation this way. You’d say something like where the bottom half of X is going, that’s English for X2dot, doesn’t depend on X1, that’s the zero. Okay. And when you draw the block diagram it’s totally obvious because you draw it this way. Here’s X1. X1dot is A11X1+A12X2. Oh, I didn’t say something here. The rule here is this. You want to know how do you get X1dot if all you have are integrators. You look at the output of an integrator and you ask, well, if that’s an integrator and what came out is X, what had to go in was Xdot. So that’s how you do it. So you simply go backwards through the integrator if you want to get Xdot. So the inputs to integrators are derivatives. So this is X1dot and this is X2dot. And this says X1dot is a sum of two things, that’s what the summing junction does. It’s A11X1+A12X2. Now, when you stare at this block diagram, something exceedingly obvious comes up and that’s this. If I draw a dashed line like that, you see something really interesting and that is that information flows from the bottom to the top but not vice versa. So nothing that happens up top ever has an effect on what goes on down here. Okay. And basically it says X2 affects X1 but X1 has no effect whatsoever on X2.
That means all sorts of interesting things. We can conclude things like this. It says that you can actually calculate the solution of X2 separately because it is in no way affected by X1. That’s what this says and that’s what you get out of looking at that equation. Well, we’ll see lots of other ways to do it, but this is kind of the idea of the way to get the intuition for how this works. Everybody see this? So that’s the picture here. So let’s see.
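The claim that X2 can be solved on its own can be checked numerically. What follows is only a minimal sketch, not something from the lecture: the matrix entries, the initial conditions, and the plain forward-Euler integrator in the hypothetical helper `simulate` are all assumptions chosen for illustration. It runs Xdot = AX twice with different X1(0) but the same X2(0), and then runs X2dot = A22X2 by itself.

```python
# Sketch: with A block upper triangular, x2 evolves independently of x1.
# All numbers here are made up for illustration; forward Euler is just a
# simple stand-in for solving xdot = A x.

def simulate(A, x0, h=1e-3, steps=1000):
    """Integrate xdot = A x from x0 with forward Euler; return the final state."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        xdot = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + h * xdot[i] for i in range(n)]
    return x

# State is (x1, x2): x1 is the first two entries, x2 is the last two.
A = [
    [-1.0,  2.0,  0.5,  0.0],   # [ A11 | A12 ]
    [-2.0, -1.0,  0.0,  1.0],   # [ A11 | A12 ]
    [ 0.0,  0.0, -0.5,  1.0],   # [  0  | A22 ]
    [ 0.0,  0.0, -1.0, -0.5],   # [  0  | A22 ]
]
A22 = [row[2:] for row in A[2:]]   # lower-right block only

x2_0 = [1.0, 2.0]
run_a = simulate(A, [1.0, 0.0] + x2_0)    # one choice of x1(0)
run_b = simulate(A, [-5.0, 7.0] + x2_0)   # a very different x1(0)
x2_alone = simulate(A22, x2_0)            # x2 solved with no x1 at all

print("x2 from run a:", run_a[2:])
print("x2 from run b:", run_b[2:])
print("x2 solved alone:", x2_alone)
print("x1 from run a:", run_a[:2])
print("x1 from run b:", run_b[:2])
```

The three X2 results should agree, while the two X1 results differ, since X1 does depend on where it started. That is the dashed line in the block diagram: information crosses it in one direction only.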

Let’s look at a couple of examples. I think I’ll just look at one, which is a linear circuit. So here I have a linear static circuit. Now, that means it’s a circuit that can contain things like resistors, transformers. It can have, oh, let’s see. Well, it depends on your model of a transformer. If it’s an inductive model you have to put it out here. So we’ll skip transformers. But it can have things like dependent sources and things like that. So that’s what’s in here. And I pulled the capacitors out to the left and the inductors off to the right. And the equations here are very simple. It doesn’t matter if you’re not in E.E. and don’t know these equations. It’s just an example. So the equations here for each capacitor are this. It’s C dv/dt is the charging current, and I’ve drawn the charging current to go into the capacitors like that. For the inductors, it’s the same thing. It’s L di/dt is the charging voltage. So for an inductor, and again, I’m addressing people who do E.E., right? For an inductor, you think about voltage as charging it. When you apply voltage to an inductor, it ramps up the current. When you apply a current to a capacitor it ramps up its voltage. Okay. So you get these equations. And then this thing is some horrible complicated thing. But the point is, it’s linear. So it’s a set of linear equations that relate the port variables, the voltage and the current at each of these ports. That’s called a port when you hang two wires out of a circuit. It’s a port. Okay. And there’s a linear relation that relates the voltages and currents at the ports, the port variables. And we’re gonna write that in this way. We’re gonna say that the capacitor current and the inductor voltage, these are basically the charging variables, is some matrix times (VC, IL). So we’re gonna write it that way. All right.
And we’ll let C be a diagonal matrix with these capacitances, and L the same with the inductances, so that I can write these out as matrix equations. And you have the state, (VC, IL). So the state is the voltage on the capacitors and the inductor currents. Then you can write out everything here. It’s very simple. This is CVdot here is IC, and this is LIdot, uh-oh, that’s hard to make a dot and make it clear, equals VL. And you simply put those equations into here, multiply by C inverse and L inverse on the left, and you get a set of equations like this. Okay? So this tells you that you can write this out, and this is, of course, an autonomous linear system. That’s Xdot = AX, where A is this matrix here. Okay. So by the way, this is already of huge interest. It says that, for example, and again, this is addressed to people in E.E., if you have an interconnect circuit in some leading edge, you know, 45 nanometer design, and you want to analyze the interconnect in some digital circuit, in some high performance circuit, which you can certainly model with some inductance, capacitance and resistance, it says you write that as Xdot equals AX, period. That’s what it does. So that means it’s already of extreme interest in practice to know what the solutions of this do. Okay. So we’ll quit here.
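For reference, the circuit equations described above can be written out as follows. The symbol F for the matrix describing the static circuit is an assumed name (the lecture only says "some matrix"), and the lowercase subscripted symbols are a notational choice made here.

```latex
% Element equations (C and L are diagonal matrices of capacitances and inductances):
C\,\dot v_c = i_c, \qquad L\,\dot i_l = v_l
% Static linear circuit relating the port variables (F is an assumed name):
\begin{bmatrix} i_c \\ v_l \end{bmatrix} = F \begin{bmatrix} v_c \\ i_l \end{bmatrix}
% Substituting, with state x = (v_c, i_l):
\dot x = \begin{bmatrix} C^{-1} & 0 \\ 0 & L^{-1} \end{bmatrix} F\, x = A x,
\qquad x = \begin{bmatrix} v_c \\ i_l \end{bmatrix}
```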

[End of Audio]

Duration: 74 minutes