ConvexOptimizationII-Lecture16

Instructor (Stephen Boyd):That's okay though. I see, it puts a nice emphasis on -- I mean, it's probably okay to say here's our problem, actually, it's an approximation. Here's the relaxation. Then when we run the relaxation, of course, on the real thing you get one thing, but then -- actually, when you run it on the true system the relaxation does better than the other one. That's not uncommon.

Student:Yeah.

Instructor (Stephen Boyd):Perfectly okay.

Student:Yeah.

Student:It's Krylov.

Instructor (Stephen Boyd):It's what?

Student:Krylov.

Instructor (Stephen Boyd):Krylov. I just spent five days with a bunch of Russians. I guess I'll have to remember that for next year. The left monitor's going to be right to you that I'm on. Does that mean I'm on? Something about this room, I don't know. At least I haven't said anything too horrible while being taped or whatever. Okay.

Today we're jumping around a different order of topics, but this is maybe, I think, the next topic that some people are working on. It's obviously too late for them for their projects, but we can at least cover the material, and for people who are doing this at least it'll make a lot of sense. For other people, it's actually very, very good stuff to know about. It's widely, widely used. So it's called Model Predictive Control. In fact, I've been reading a lot about it the last couple of days. Sitting through very long airplane flights, I read a couple more books on it. It has got tons of different names, all different. Basically all the different areas doing this don't know about the others.

Often the notation and stuff is absolutely horrible. So, anyway, here's the basic idea. We'll get to this in a minute. It's basically, I mean, there's a bigger area. Model Predictive Control is the name that comes from a relatively small community. In fact, I read one book where, I guess by some member of this community, it was referred to as the control community. That was as opposed to some others. Other names would be used. One that is a very good name in a sense is embedded optimization, real time optimization. Actually, the way that fits into the future, I think, is very, very interesting. A lot of people have their view of optimization locked into a model that's from 1969.

Either because they learned it in 1969 or they learned it from someone who learned it in 1969. They think of optimization as this big, complicated, very big thing that requires several Ph.D.'s, a bunch of super computers, and things like that. You really wouldn't use optimization except maybe to, I don't know, design a nuclear weapon, design the newest airplane, schedule United Airlines for tomorrow, or who knows what else. But these are the kinds of things you -- or price a derivative in the basement of Goldman Sachs. These are the things people think of when you think of optimization. You already know that a lot of these things actually are extremely reliable.

They're reliable enough to just work always, period, end of story. They're also very, very fast. By the time you say fast and reliable, it means you're automatically talking about something that's a candidate for embedding in real-time systems, right? Because these, in fact, are the only two qualities you need to embed in a real-time system. It has to be something that's fast, has bounded run time, and it has got to be extremely reliable. It can't require a Ph.D. to come over and say, "Oh, your algorithm's jammed. Let me change the number here. Let me restart it from a different point." You can do that if you're doing some big, big problem where it's important to run. That's fine. But if you're making a decision every 2.6 milliseconds you obviously can't do that.

The bigger name for this lecture would be embedded or real-time optimization. Okay. So let's start. I think it's a very interesting area. I think you're gonna see a whole lot more of it in the future. Let's start with an optimal control problem. There are some very nice subtleties here and it's quite complicated. Unfortunately, when it's taught, many times the subtleties just aren't seen, but they're actually very important subtleties. So I'm gonna go over them a little bit because it actually takes some time to really understand what it means. So here's what we're gonna have. We're gonna have a linear dynamical system, so that's x(t+1) = Ax(t) + Bu(t), and this u is gonna be our input or control or action.

We start from a state z. These inputs or controls have to be in some set U. The x's have to lie in some set script X, and our objective is a sum over the entire trajectory of functions, this is called the stage cost or something like that, of the state and the control. Now, this is very traditional and you'd find this in like a vehicle control if you've taken a course on control or something like that, but, in fact, this describes tons and tons of problems, including supply chain management, things like scheduling and networking in networks and computer systems. This just goes on and on and on. A lot of problems in finance look to be exactly like this. I mean, some don't, but many do. Even though I'm using the traditional symbols that you would find, for example, in a course in AeroAstro describing vehicle control, this can just as well be a finance or economic system or something else or a supply chain system.

So don't let the notation of, as this book I was reading said, the control community fool you. This is much more general than controlling your local robot or something. Actually, it doesn't actually include your robot because the dynamics are linear. Okay. So the variables are an infinite number of inputs and we can also treat the states as variables too. In which case, we think of these as equality constraints. Now, the other option is to not think of the states as variables and to think of this as merely a recursion, which makes the x's expressions, in fact affine expressions, of the variables, but these are actually equivalent. Okay.
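For reference, here is the problem just described, written out (this is only a restatement of what was said, in the notation above):

```latex
\begin{array}{ll}
\mbox{minimize}   & \sum_{t=0}^{\infty} \ell(x(t), u(t)) \\
\mbox{subject to} & x(t+1) = A x(t) + B u(t), \quad t = 0, 1, \ldots \\
                  & u(t) \in \mathcal{U}, \quad x(t) \in \mathcal{X}, \quad t = 0, 1, \ldots \\
                  & x(0) = z,
\end{array}
```

with variables u(0), u(1), ... (and, if you choose to treat them as variables, x(1), x(2), ... as well).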

Now, the data to the problem are this dynamics matrix A, this input matrix B, and this stage cost function, and we'll assume that it's zero at zero. So, in other words, if you're at zero that's some costless point or something like that. We'll assume that these are convex and they include zero. You can actually change all of these assumptions pretty much and it just makes things more complicated. Okay. So the first thing one can have is something called greedy control and it's sort of the most obvious thing. It works like this.

What you do is you simply pick the input u to minimize the current stage cost here, that's what this does, over your allowed control actions, and the only part of the future you take into account is the following: because of your action w, the next state is gonna be Ax(t) + Bw, and the only accounting of the future you'll do is that this must be in the state constraint set. So that's greedy control. The basic idea is that it minimizes the current stage cost and it ignores the effect on the future. Well, except for this. It says that you can't do anything that would get you into serious trouble, namely violating the state constraint, on the next step.

As for whether you violate the constraint after that next step, that's entirely possible and too bad. Now, this method typically works very poorly. Of course, we can make it work very well, no problem. If A here, for example, has a very small spectral radius or very small norm, it means that the state at the next step doesn't depend much on the current state, and then that tells you that basically the dynamical system has what people would call something like strongly fading memory. That means that, in fact, greedy is not too far from optimal, because although your current action does have an effect, as it propagates into the future the effect goes down fast. For example, if the norm of A is point one that would certainly be the case here. The greedy control would probably work well. Okay.
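A minimal sketch of one greedy step, using cvxpy; the box forms of the constraint sets and the quadratic stage cost are illustrative assumptions here, not something fixed by the lecture:

```python
import numpy as np
import cvxpy as cp

def greedy_step(A, B, Q, R, x, u_max, x_max):
    """One step of greedy control: minimize the current stage cost over
    allowed inputs, accounting for the future only by requiring that the
    next state land in the state constraint set."""
    m = B.shape[1]
    w = cp.Variable(m)
    stage_cost = float(x @ Q @ x) + cp.quad_form(w, R)      # l(x, w)
    constraints = [cp.norm(w, "inf") <= u_max,               # w in U (box)
                   cp.norm(A @ x + B @ w, "inf") <= x_max]   # Ax + Bw in X (box)
    cp.Problem(cp.Minimize(stage_cost), constraints).solve()
    return w.value
```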

If you haven't seen optimal control and dynamic programming -- actually, how many people have seen this somewhere? Good, that's good. That's how many people should have seen it. I mean, this is sort of like Fourier transforms. You need to see it maybe three times minimum, maybe four. Maybe it's more important actually, but I guess that's open for argument or something like that. You should see it three or four times. So this will be your, whatever, third or fourth time for some of you. If it's your first time that's fine too. Here's the way it works. You define V(z) as the optimal value of this problem right here. It is the optimal value of this problem. That's an infinite dimensional problem.

The variables here are the u's and the x's. It is infinite dimensional. It has an optimal value and it's a function of z. That's an infinite dimensional problem, but the optimal value of a convex optimization problem, as a function of the right-hand sides of inequalities or equalities, is convex in those parameters. So the optimal value of this problem is a convex function of z. By the way, you know that rule very well. You know that rule in specific contexts, for example, the minimum of a convex quadratic function over some variables is another convex quadratic function. It's given by the Schur complement. So that's a very specific case where you can say much more about it.

In general, you know it's just true. Okay. So we formed this function V and that's called a value function. I should say that there seems to be a weird -- once again, MIT and other places split in typical notation. So some people use J as the value of the -- I'll have to hunt down the source of this. J is the objective, and then a lot of people use V for the value function, and at MIT it's reversed. They don't use V; this is J. So if you look in books from MIT or something, the value function is called J. However, it is always called the value function or the Bellman function or, I forget what the Russian name for it is, but I think the Russian symbol is V, so anyway. Just a comment on notation.

This is convex. It's actually extremely easy just to show it directly. The important part is it satisfies the Bellman or dynamic programming or Hamilton-Jacobi equation, many other names, the Pontryagin equation. It says this, it's actually quite interesting. I'm not gonna go into a lot of the details, but it makes perfect sense and, in fact, it's extremely easy to show. It looks deep. A lot of people want you to think it's very complicated and deep, and it's actually not. It basically says the following: it says that if you want to make the right choice right now from state z, here's what you do. Well, when you take an action w you're gonna run up the bill l(z,w). That's immediate.

So the extremely greedy control would say this: choose w to minimize this. That would be the extremely greedy control. Well, that's not good because you have to at least take an action that will land you in an allowed state next step. That's the first plausible greedy method. As a result of choosing w, here's where you land: Az + Bw. So far we've already said that's feasible, that's okay. But the question is how should you take into account the effect of your current action on where you land? Well, in fact, if you imagine where you land, if you then simply carried out the optimal control from then on, it couldn't get any better than that.

It turns out this is what you should do. You should add to the current cost, the current bill you run up, the bill you would run up for landing in this state and thereafter doing exactly the right thing. So that's what this says. It says if you add these two together and minimize over w, then this will actually equal V. It's the optimal thing, and not only that, but the argmin of this thing is the optimal u, right? It looks complicated, it's really not. If you take an infimum, you can minimize in any order and it doesn't make any difference, right? So that's really what it is.

So this is the dynamic programming equation. By itself it says absolutely nothing, doesn't help in the slightest, because then you'd say so what's V? What is interesting is that in some cases you can actually find V. Once you've found V you actually have the solution here. And there's one thing I want to point out about this: this optimal input is, in fact, a function of the state x(t) because, although it's complicated, that's a function of the current state. So that's what it is. In other words, the optimal input has the form of some function phi of x(t). State feedback form would be the common way to say that. That's quite interesting.
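Written out, the dynamic programming (Bellman) equation just described is:

```latex
V(z) = \inf_{\substack{w \in \mathcal{U} \\ Az + Bw \in \mathcal{X}}}
       \bigl( \ell(z, w) + V(Az + Bw) \bigr),
```

and the optimal input is the argmin: u*(t) = phi(x(t)), where phi(z) is any w achieving the infimum (state feedback form).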

And here, as I said, the interpretation is this: this term V(Ax(t) + Bw) is very important. It's basically the accounting term needed to completely accurately account for the effect on the future cost of your current action, so that's exactly what this thing is here. Okay. So one famous case where this actually leads to a numerical solution, but basically a solution, is this: you have linear quadratic optimal control, so the stage cost is x transpose Q x plus u transpose R u. It's easy to add a cross term between x and u and it's easy to add a linear term and all that kind of stuff, and this is just like exercises. Q is traditionally positive semidefinite and R is positive definite. That's just to keep the formulas simple. Okay.

Now, in this case you can actually solve the problem by dynamic programming. It turns out the value function's quadratic. That's hardly a surprise, right? Because you basically have an infinite dimensional quadratic problem, an infinite number of variables. You know that for a finite dimensional quadratic function, if you minimize over some variables, the result is quadratic in the others. In fact, the result is a Schur complement in the others. You know that already. This merely extends that to the infinite dimensional case and, in fact, the Schur complement is the right thing because it's very, very close. In fact, if you stare at this long enough you'll see a Schur complement here. I should have written it that way. I don't know why I didn't. You could take the A transpose and the A out of these two terms and you will see, I don't know how good your Schur complement visualization skills are, but you should see a Schur complement.

Anyway, it's plausible, right? Because a Schur complement is blah minus something, something inverse, something transpose, where the third something was the same as the first something or something like that. Anyway, you know what I mean. So, anyway, that's a Schur complement, so it shouldn't be surprising. The value function is quadratic in this case and, in a sense, it is a Schur complement of an infinitely big positive quadratic function; you minimize over infinitely many variables leaving just n left, and the result is, in fact, a giant infinite dimensional Schur complement, but with an explicit solution. This is the Bellman equation now written out. Because they're quadratic forms, it just says the two matrices are the same and you get this.

Now, by the way, there's no reason to believe that you can solve this. It's a quadratic matrix equation, but there's lots of ways to solve this, one is just by iterating, and things like that, so I'm not gonna get into that, but this equation can be solved quite effectively. In this case, the optimal policy is a linear state feedback. It just looks like this: u*(t) = Kx(t), where K is this thing here. You can actually see all sorts of various interesting things here about what happens when P gets big and when it's small and all that kind of stuff. You can actually work out a lot of the behavior. This thing, I think a bunch of people have -- actually, how many people have seen this? I'm just sort of curious. Okay.
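A minimal numerical sketch of solving that quadratic matrix equation for P and forming the gain K; the simple fixed-point iteration is one way, and scipy's discrete Riccati solver is another (this is an illustration, not the code used in the lecture):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def lqr_via_dare(A, B, Q, R):
    """Solve the Riccati (Bellman) equation for the quadratic value
    function V(z) = z' P z, and return the state feedback gain K,
    so that u*(t) = K x(t)."""
    P = solve_discrete_are(A, B, Q, R)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def lqr_via_iteration(A, B, Q, R, iters=500):
    """The same P, found by simply iterating the Bellman recursion."""
    P = Q.copy()
    for _ in range(iters):
        P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(
            R + B.T @ P @ B, B.T @ P @ A)
    return P
```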

So, if you've seen this you probably know this is extremely widely used, even in cases where systems aren't linear and so on, and it is unbelievably effective. I mean, state feedback gains like this just work really well. In particular, it's sort of like least squares. I mean, it's almost like magic in the sense that when you synthesize state feedback gains like this -- I mean, this is not interesting if the state dimension is small and u is small, but if you've got 20 dimensions and five inputs that's 100 numbers in there, and 100 numbers just come out, just like calculating a pseudoinverse and then doing image processing. It would take you a long time to adjust those hundred numbers by hand to get something that even works, let alone works so well. This is widely used. Okay.

Now we're gonna look at this finite horizon approximation and this is quite straightforward. What we do is instead of solving this infinite dimensional problem, what we're gonna do is we're simply gonna run it out to time capital T, that's some horizon. What we'll do is we'll insist that when you arrive at the end of this horizon you drive the state to zero. Now, once the state is zero you have the option of putting in input zero, zero, zero after that. We're essentially looking at a finite dimensional subspace of possible inputs. The inputs go out to a certain amount of time, they drive the state exactly to zero, that's this. Thereafter you apply zero to the whole thing.

So this would give you a valid input on the infinite horizon. That's what this is. Okay. Now, then you have a finite dimensional convex problem. We can solve that easily and this will give you a sub-optimal input for the original optimal control problem. Why? Because basically you're looking over a finite dimensional subspace. It's the subspace of signals, of inputs and states, that vanish after time T. That's a finite dimensional subspace. You solve the problem over the subspace, you certainly get an approximation. You get an upper bound. Let's just do an example of a system with like three states and two inputs. It really doesn't matter.

It's just completely random. Just quadratic stage cost, and we'll have a limit on the state: none of the components can be bigger than one. Same for the inputs, they have to be less than point five. Who knows why. And then here's our initial state, and the optimal cost turns out to be 8.83. If you're curious, I can tell you why that's not an approximation. That's the exact number. I'll explain it in a minute. Not that it matters, but let me get back to that in a minute. So here's the cost versus horizon. In this problem it turns out it takes ten steps. Notice if you say get to the origin, it's infeasible for fewer than ten steps, because you have some state, it propagates forward in time, you have an input, which is limited between plus/minus point five. There's a limited amount you can do.
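A minimal cvxpy sketch of this finite-horizon problem, with the terminal constraint x(T) = 0 and box limits like the ones in the example (a sketch under those assumptions, not the exact code used for the plots):

```python
import numpy as np
import cvxpy as cp

def finite_horizon_plan(A, B, Q, R, z, T, u_max, x_max):
    """Plan T steps, force x(T) = 0 (so u = 0 can be applied forever after).
    Returns the planned inputs and the cost, which is an upper bound on the
    infinite-horizon optimum; the cost is +inf if T is too short."""
    n, m = B.shape
    x = cp.Variable((n, T + 1))
    u = cp.Variable((m, T))
    cost = 0
    constr = [x[:, 0] == z, x[:, T] == 0]
    for t in range(T):
        cost += cp.quad_form(x[:, t], Q) + cp.quad_form(u[:, t], R)
        constr += [x[:, t + 1] == A @ x[:, t] + B @ u[:, t],
                   cp.norm(u[:, t], "inf") <= u_max,
                   cp.norm(x[:, t], "inf") <= x_max]
    prob = cp.Problem(cp.Minimize(cost), constr)
    prob.solve()
    return u.value, prob.value
```

Sweeping T and plotting prob.value against it is exactly the cost-versus-horizon curve being discussed.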

By the way, those of you who are in 263, I should say that I made fun of it then, but I'll make fun of it even more now -- this very traditional idea of controllability and all of this. I don't know if you remember this. If you've blocked it out that was the right thing to do because, I mean, it's perfectly okay. It's the first thing you ask. Can you wiggle the input to take the state to zero or something, and you get these kind of silly things involving ranges and all that, and that's all fine, but the real question is not can you maneuver the state to zero in capital T steps? The real question is, can you do it with an input signal that's less than point five in magnitude? That's actually a real question. That one's of interest and no stupid theory of controllability is gonna tell you that. Okay?

In my opinion, this is the real theory of control. This is the first theory that's actually useful. That stuff is with B, AB, A squared B and all that -- the Krylov subspace, as we now know it -- being full rank and everything. I mean, I don't know. It's classical and, as far as I know, it doesn't really have any actual use, but okay. So here's a real controllability analysis. It tells you can you hit the origin. It says that even in nine steps you can't do it. It's infeasible, so the cost is plus infinity. In ten steps you can do it at a cost of 11.5. Then in, I don't know what this is, in 11 you can do it down this way. You see something here, that this is converging very, very quickly. This is very common.

It's not universal. I should say, obviously this is monotone decreasing, but, in fact, there's a lot that would tell you that this thing is gonna converge very quickly. In other words, it makes sense because of the optimal inputs; I'll show you what they look like. Here's the actual infinite dimensional optimal input and you can see very carefully what happens here. The input looks like this, but then after a while it's dying down like this, and then that's dying down like that and, of course, this is falling off exponentially. Here it's not gonna take -- if I tell you, in fact, you have to be zero at T=30, it takes very little perturbation of this input signal to drive this state to zero here and not have it be merely small. Takes just a minor perturbation.

That's why these things converge very, very quickly. Okay. So this is the cost versus horizon. Actually, people who have had these courses, I don't know, 207, 210, anyone know how we calculated that line? This is just for those who have the background. I mean, look, it would be perfectly okay to just do this, actually. Perfectly normal would be to solve this for T=50 or 100 and numerically say that's it. That's perfectly acceptable. But, in fact, you can calculate it exactly. I mean, up to numerical round off here. Do you know how we did it? Do you know?

Student:[Inaudible].

Instructor (Stephen Boyd):This is with constraints. Yeah. So here's how you do it, and this is just an aside, but just for fun if you have the background. You can show in a problem like this that you can compute an ellipsoid in state space such that if you're in the ellipsoid the optimal control, in fact, coincides with the linear one, because you'll never saturate, never in the future. Therefore from then on it's given by an explicit formula. In fact, it's this formula right here. All right. This is an advanced topic, so just ignore me if you don't know what I'm talking about. Also, I'm going the wrong way here. There we go. So it's given by this formula, and once you have that you can actually evaluate the cost to go by solving a Lyapunov equation. Okay.

If you knew what I was talking about, fine. If you don't, just ignore what I just said. Okay. And if you look here you can see all sorts of things. This is the smallest amount of time in which you can possibly drive the state to zero, and you can see this is really quite different. The input is quite different, although, actually, curiously it doesn't start off that differently here, right? I mean, it basically says -- I'm just showing you u, but I think maybe there's several u's. I forget, and I don't know why we're showing two inputs. Okay. So this is one of the u's. I don't know which it is. We'll fix this typo. It's either u1 or u2. I mean, not that it matters, right?

You see it does something and then it pulls back. This is all because you have to drive the state to zero. There are two other states as well that get to zero up here at ten. Okay. All right. Now this brings us to Model Predictive Control. This is something everybody should know about. Unfortunately, it's considered part of control. It's not even mainstream control because it's considered part of industrial control. I think it's really stupid. It's incredibly effective for all sorts of things. It goes like this. Here's the way it's gonna go. There are some subtleties, so I'm gonna take some time to do it right.

It's at time t and here's what you do. You solve the following problem. You have a planning horizon capital T. From time t forward, capital T periods, you are going to assess a cost; you will -- all of this is essentially a planning exercise and it's gonna work like this. You're gonna propagate forward the dynamics of the system. You are gonna start at the current state. So you're in a current state x(t). You will do a planning exercise, which will actually propagate the state forward capital T samples. You will insist that capital T samples in the future you must hit zero. By the way, there are variations on this so-called terminal constraint, but the easiest is just to put that there.

You'll do this planning exercise. Literally, at time t you will work out an optimal control that would take the state to zero in capital T steps starting from there, and so on. By the way, at this point the u's here, they constitute a plan for the future. If you were to execute that plan, then the state would indeed go to zero at time t plus capital T. The clever part is this: you're gonna do a complete planning exercise and you're simply gonna use the first input. Then you're gonna step t forward one unit and you're gonna re-plan, again. Your horizon has just moved and that's how this works. It's always like capital T is 30. You're always planning to drive the state to zero in 30 steps at all times. That's the way it's working.

By the way, another name for this is receding horizon control, finite look-ahead control, dynamic linear programming, because in a lot of supply chain problems these are linear or piecewise linear, so it's got lots and lots of names, and it probably comes up in lots of places where it doesn't have a name because it's kind of so obvious. Okay. So that's the idea. Now, the fact is this is, actually, also a state feedback. It's a fantastically complicated state feedback. Well, it's not that complicated. To tell you the truth it's piecewise linear. We'll get to that in a minute. It is quite complicated because it basically takes a bunch of data, namely x(t), and solves a big QP in order to calculate a full trajectory.
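A minimal sketch of the receding-horizon loop; it reuses the finite_horizon_plan sketch from above, and the names here are mine rather than from the lecture:

```python
import numpy as np

def run_mpc(A, B, Q, R, z, T, u_max, x_max, steps=50):
    """At every step, plan T steps ahead with the terminal constraint,
    apply only the first planned input, then re-plan from the new state.
    Assumes T is long enough that each planning problem is feasible."""
    x = np.asarray(z, dtype=float)
    xs, us = [x.copy()], []
    for _ in range(steps):
        u_plan, _ = finite_horizon_plan(A, B, Q, R, x, T, u_max, x_max)
        u = u_plan[:, 0]          # use only the first input of the plan
        x = A @ x + B @ u         # step the true system forward
        xs.append(x.copy()); us.append(u)
    return np.array(xs), np.array(us)
```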

So I'm just trying to think if I should draw a picture. Well, why not. I mean, it's the basic idea. I guess it's kind of obvious, but just to make sure. The way it works is here's time and you're right here. What you do is you make yourself a plan of input, something like that, that will take the state to zero at this time. So this is the plan that you execute at this time step. You then simply execute this. That's the u that you actually apply. You then step forward one to this point, but now your horizon has extended. There are other versions where it doesn't. Your horizon has extended here, and now there's no reason to believe that you'll do this. If you did this, you would actually take it to zero, but now, actually, what's happened is your requirements have relaxed. You had 30 steps to take the state to zero before.

You step forward now, and now instead of 79 it's 80 you have to hit, and, in fact, this one might relax. It might be I won't be so aggressive. It'll be slightly lower, so you can actually watch this, and then it would take it to here. Okay. So this is the basic idea. It's complicated. I mean, it's really interesting. In this case, you would solve a convex problem every step. For that baby example we have, here's the way it works. Well, this is the MPC compared to the optimal. You see something that's totally insane here, which is basically if you tell this thing plan ten steps in the future -- by the way, if you ask it to plan nine steps in the future it can't even hit it. It's not even feasible.

So you'd say: at all times I would like you to act as if you're filing a full, legitimate plan that ten steps in the future would take the state to zero. Here's how well it works compared to optimal. But if you say like 11 or 12 it's basically optimal. Yeah?

Student:So the way the graph is drawn it seems a little bit unfair to put them both on the same horizontal scale because one is [inaudible] than the other, isn't it?

Instructor (Stephen Boyd):Yes. Right, right. Yeah. We do have to explain a few things. This one was calculated, as I said, by solving a QP big enough and then adding on an end cost from a Lyapunov equation. That's correct. And this one here, each point on that curve was done by solving one QP per step. Very good point. This thing was done by solving a QP for whatever, 40 steps or 50 steps or something. You are right. The computation involved here is much more. That's correct. Yeah. Actually, even though it's a baby example and it's not always this dramatic, there's a hint here that things like this work really, really well. I mean really well. The reason is actually pretty simple.

It does make sense. It goes something like this. As you pointed out, you're solving 50 QPs here basically. You're solving 50 different quadratic programs or something like that. All the time, the QP you are solving is that you're making a plan of action for T steps in the future. Let's make it 12, so you're like out here. Basically, you're doing flawlessly. Basically, at all times, you're filing a plan of action to take the state to zero in 12 steps. What happens next? You then execute only the first input. You step forward and now, instead of saying, well, it has to be zero at that time, that time when it has to be zero has shifted. Okay?

Let me explain why this works well. There is a sense in which at each step you basically compute a whole action plan. Basically it's a whole action plan is what it is. It means that if you were to get the signal all over, done, quit, stop, everything like that, then you would execute that plan and in 12 steps or whatever it is, capital T steps later, you'd bring the thing to zero. So at all times you have sort of a viable plan that would halt the system in 15 steps, or whatever capital T is, right? So, in fact, we have T=15. You're calculating 15 values of the input, but only one you use. Only the first one. The other 14 values of the input are nothing but a planning exercise, and their only reason for being there is so that you don't do something stupid now that messes you up for the future.

That is the only reason for that planning to be there. By the way, once you realize that that's the basic idea, that the reason you're making a full plan is just so you don't do something now that's stupid with respect to the future, then it starts making sense that you could do a crappy job of planning and you would still expect it to actually work. I mean, this sort of makes sense. Imagine this in some kind of a cash flow problem, right? Where you're managing some cash flow, and that'll actually be something for next lecture, which is stochastic MPC, but in a cash flow problem you would have inputs, and now I've got expenses, I've got income, and stuff like that. What you do is you simply do some planning now. What you do now has a huge effect on the future, right?

You invest in some fixed income security that matures in three months; it's gonna be a cost now, but it'll be a benefit in three months. You run your cash balance down real low now to pay for something, then you're gonna have to borrow and that'll be expensive when you have expenses come up. Everybody see how -- I mean, it's the exact same problem. So MPC there would do just what you would imagine. You'd basically do, in fact, what you're required to do if you're a CFO or something like that. You actually just have a plan. You make a plan for the next 12 quarters of cash flow.

Now, so far everything's deterministic, so we, sort of, assume nothing's gonna change. We'll see that the real advantage comes when, in fact, things are not what you predict them to be. That's stochastic control. That's the next topic.

Student:Is there a way to solve this problem with [inaudible]?

Instructor (Stephen Boyd):Uh, if they are changing, but linear, yes. That's no problem. But if they are non-linear it's not a convex problem, and you know what my recommendation would be then? I don't know. Lecture whatever it was, sequential convex programming. It'll work quite well. Yeah?

Student:If the dynamics are changing, do you have to keep solving more feasibility problems to figure out your horizon? It seems like before you'd do this problem you have to figure out the feasibility problem to get your minimum --

Instructor (Stephen Boyd):Right.

Student: -- or you can overshoot it, right?

Instructor (Stephen Boyd):Right. We'll get to that. Generally speaking, if you're running into feasibility issues your horizon is too short. The other thing is that when you run this, believe it or not, it'll work perfectly well if you violate feasibility. So if instead of saying x(t+T) is equal to zero, I put a big charge on x(t+T), like ten times its absolute value or one norm or something like that, this would work well now even, believe it or not, or tolerably it would work out here, because it would just try its best to get that state to zero. I mean, it wouldn't get great performance, but it wouldn't just like stop, right? So, yes. When you do these methods, you have to deal with infeasibility gracefully, right?

Because if you do a plan for the future and someone says let me see your plan and there it is. It's like nan, nan, nan, nan, nan, right? That's, yeah. So you don't -- you have to relax all these things. This just makes it easiest to present.

Student:Just a question. Has anybody looked at like different [inaudible] forms so you could reuse some of the computations you had done?

Instructor (Stephen Boyd):Excellent question. Okay. When this is running there should be something that's very clear. That at each step, if you're the CFO, you worked out basically how to wind up operations in two and a half years, right? But do so, by the way, minimizing your total expense or maximizing your lifetime or your time to live or whatever it was. I mean, it's a very simple thing. Now you step forward and you're actually solving almost the same problem, right? That's your point. That can be exploited unbelievably well by using a warm start method. Then that gets back to your question, which is the computational complexity.

It is true that you're solving 50 QPs here, but you're solving 50 QPs and each is only a very minor perturbation on the other and, in fact, if you use a warm start method I guarantee it'll be two Newton steps to go. Basically, what happens is if someone says, "Okay. Now t is 17. Let's figure out what to do." And you say, "Okay. Well, what I have to do is I have to do a plan from t=17 to t=29." That's if capital T is 12. Okay? But you just did a plan from 16 to 28. So you take that plan and that's your initial point for your new plan. Two Newton steps, I promise you, three, I don't know. It's all over, done. So, in fact, these things are fast. Okay? We'll get to that.

Student:[Inaudible] about the [inaudible]?

Instructor (Stephen Boyd):Yeah.

Student:This is very deceiving because what you're saying when the finite horizon control problem relates in 15 steps you are going --

Instructor (Stephen Boyd):Oh, you mean compare this one to this one? Yeah. Right.

Student:Another one is a strategy to compute, like, a somewhat different strategy to compute the optimal --

Instructor (Stephen Boyd):I agree.

Student:And you are actually going to reach near zero, or some region near zero, in terms of the radius near zero, in some, like, 100 steps or 50 steps, not really in like 10 steps --

Instructor (Stephen Boyd):Right. No, I, no, but technically -- okay, that is true. But let me point something out. Technically all three of these are showing the same thing. This one doesn't depend on horizon because it's the exact global optimal.

Student:Right.

Instructor (Stephen Boyd):Okay? Now, this one says bring that state to zero in 15 steps and then stop.

Student:Right.

Instructor (Stephen Boyd):That's a viable u. I mean, it's a u, it brings the state to zero, and it just sits there. It runs up a bill.

Student:But if you look at --

Instructor (Stephen Boyd):It runs up this bill. What's that?

Student:If you look at your first MPC and you actually look you will see that as if you are incurring the cost of 11.5.

Instructor (Stephen Boyd):Right.

Student:And then it would suddenly realize that I only used the most important then my cost is going to use --

Instructor (Stephen Boyd):Right. Well, what happened was at this point, what happened for this MPC thing, on the first step, it's actually quite rushed, right?

Student:Yeah.

Instructor (Stephen Boyd):Because it's basically saying you minimize this cost, take this state to zero in 11 steps. In nine steps it's basically impossible. It's not even feasible to do, so it's feeling quite rushed and it kind of has got some input that just barely does it. Then what happens is, in the next step the horizon moves another step forward, right?

Student:Right.

Instructor (Stephen Boyd):So I think they're perfectly okay. I don't know. I think I'm not -- they're showing different things, but it's okay. Here are the trajectories generated. This is MPC with T=10 and, I guess, the point here is that at least visually, I mean, you know already, you've seen the plot, they're kind of the same.

Student:Do you build anything like games like this where you have two players --

Instructor (Stephen Boyd):Yes.

Student: -- and both of them --

Instructor (Stephen Boyd):Yeah. Yeah. So people have done games too. That's right. Yeah. Yeah, infinite versions of this and all sorts of other stuff. Okay. Here, if you look carefully, I mean, you can see that these are -- well, they have to be close, right? Because the costs were close, but, I mean, they're shockingly close. But the point is that this is calculated in a very different way. This is calculated by solving one QP, essentially infinite horizon, let's leave it that way. Here. That's what this does. This says that this point was found by saying wind up in ten steps. Then it moves forward. It creates a whole plan for winding up in ten steps, moves forward one step, now it's t=2 and it says now, given the state you're in, plan to wind up in ten steps.

So what's shocking is that each planning exercise is stressful, in the sense that it's not even feasible for nine steps to get there. Well, it might actually start being feasible once the state has been reduced, but at first it's not even feasible. Each planning exercise is stressful, but the resulting input generated by this sequence of stressful planning exercises is actually superb. I mean, this obviously doesn't always happen, but this is pointing out something which is correct in practice, and that is that these things work, like, spookily well. I mean, if you have to solve things like this you should do this. Okay.

Now, I've already said some of this. It's got lots of names. In the 70's this was propagated all across chemical process plants and it had the name dynamic matrix control. You might want to ask why. The matrix means that they wrote a big matrix that related the future inputs to the future outputs, and I'm not kidding. That's what it is, so that was heavy technology, but widely used. There were like four companies. They're used just everywhere now. So if you go to a big chemical processing plant it's almost certain this is what they're using. Let's see. And they can get quite fancy. By the way, I remember talking to someone from Shell Canada and he said that they actually have in these things that are -- so I'm talking about big giant process plants, right? By the way, this is very impressive because these are also very dangerous, right? You send the wrong signals to a giant cracking tower and you can be very, very sorry.

Also, they're extremely high value assets, as you can imagine, right? So he said that, actually, in the QP, there's coefficients that go into that QP, into the cost function, that come from analysts and things like spot and futures prices and things like that for the future. Then I pointed out to him, I said that means that some guy in Chicago who raises a card -- it's not done that way, but, anyway, raises a card on something -- actually, more or less directly ends up affecting valves and pumps in Ottawa. He thought about it for a minute and he said yes, that's true. So I think it's kind of cool actually. I think it's cool anyway. Okay.

It has lots of names: rolling horizon planning. A lot of people do it and don't even give it a name, so that's the other thing. Now, actually, what's interesting is that it's propagated into industries where the dynamics are slow. So, in fact, I just recently found out it's what's used, although not called MPC, in so-called revenue management. Do you know what that is? That's basically the sleazy ways companies get you to part with as much of your money as possible, and you'll be happy to know this is what's doing it. This is how they release the number of airline seats at this price, and then they change it, and then more, and then more, and then they sense desperation, like you say, "No, I need to be at this meeting in Chicago." They go like, "Oh, it's really expensive. I'm sorry about that." And you sit next to someone who paid 1/12 the price. The euphemism for that is revenue management. They're managing it all right. At least according to my friend at the business school, these are the methods used. Although no one doing it would recognize the term Model Predictive Control. They'd say never heard of it. They would call it revenue management and it would be this. Okay. All right. Let me say a few more things about it. Actually, there is something very important about this that I should say and I think it's extremely relevant. I think you hit it right. You went right to the key point. The key point is that in these systems, in these methods, if you're running control you're solving a convex problem at every step.

Now, if you think about control and things like that, things that run in real-time, that is very non-traditional, right? The way controls were implemented in many industries up until recently, and in many now, is that the control which runs in real-time is just a very small amount of calculation, right? You spend a billion dollars on all sorts of stuff to figure out some control gains, but then the actual control code is like 14 lines of C. Got a couple of switches in it, couple of flops, and things like that, and then it can be checked for correctness like a zillion times or whatever. Then it's installed in a flight controller. Not, by the way, because it's updated so fast. These things are only updated at like 50 hertz or something like that, so you have 20 milliseconds. It's just for tradition and various other things. Everybody see what I'm saying?

So the most common type of control is like PID, which is Proportional Integral Derivative. Basically requires about three multiplies per step or something, right? You get the new output compared to your target. That's your error. You do have to add that to the previous integral error, right? And you multiply by three numbers and that's what you do. Everybody see what I'm saying? When computers were slow this all made a lot of sense. People who did traditional control ignored this entirely through the 70's and 80's because they said it's a special thing. I mean, it works because chemical process plants have very slow dynamics that go for like 24 hours, and if you have 15 minutes you can go ahead, solve a QP every 15 minutes. What do we care?

If you're doing revenue management, solve a QP every 15 minutes. What do we care? LP in that case, right? Go right ahead. So people who did sort of the real-time control said, "Oh, no, I work on robots. I work on jet engines. I do networks. Our decisions have to be made at the millisecond or microsecond level. I have to schedule processes. I don't have 15 minutes to figure out what to schedule next. I've got to make decisions every millisecond." So there's a whole bunch of areas like that where, whether those people know it or not, this is coming for them. Mostly they don't know it, but this is coming. Okay.

So let's look at some variations. I've already mentioned this: instead of insisting that the final state be zero, you can add a final state cost -- make up some V hat that approximates V. By the way, if the final state cost were exactly the cost-to-go function then, no matter how short the horizon, you would actually get the exact optimal input. In other words, even if it was one step -- well, that's literally the Bellman equation. The Bellman equation says let's do some planning, ready? Let's plan only what you're gonna do right now, but then you say, well, how do I account for the future cost? And if you go by the exact effect on the future cost, that would be V. If you put that accounting in and optimize greedily, locally in time, you get the exact optimal. Everybody see what I'm saying?

So you might imagine, for example, the end point condition x(t+T) is zero. That is effectively a V hat. It's the following V hat: it's zero when x is zero, and plus infinity otherwise. Not exactly what you'd call a superb estimate of the cost to go. Anyway, if you can put in an approximation of the cost to go here it'll work really well. For example, if a problem is like linear quadratic and zero is in the interior of the thing, you can actually solve the Riccati equation and put x transpose P x as this thing, and you'll just get amazingly good controls at that point. So that's just one thing. It also deals with the non-feasibility issue quite gracefully.
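As a sketch of that variant: in the finite-horizon planner from earlier you would drop the terminal equality constraint and add an approximate cost-to-go term instead, with P taken from the Riccati equation (again an illustration under the quadratic/box assumptions used in those sketches):

```python
# instead of the hard terminal constraint
#   constr = [x[:, 0] == z, x[:, T] == 0]
# keep only the initial condition and add V_hat(x(T)) = x(T)' P x(T):
P, _ = lqr_via_dare(A, B, Q, R)        # P from the Riccati equation
constr = [x[:, 0] == z]
cost += cp.quad_form(x[:, T], P)       # approximate cost-to-go at the horizon
```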

In other words, that way when someone says can I see your plan, it's not gonna be nan, nan, nan, nan, nan as inputs, right? Okay. Again, on that topic, you convert hard constraints to violation penalties, because you don't want to just say I can't make the plan. What you do is you just violate some constraints and pay the penalties. What's really weird about this is you'd think things would work poorly. You can actually run MPC where all of your planning exercises are, actually, infeasible. If you freeze it and you say, "What are you doing?" You say, "I'm planning my actions for the next 14 steps." You go, "Great, what are you gonna do?" "Well, I wanted to make the final state zero, but it's infeasible. Also, I'm violating a few things. I'm violating the dynamics equations and things like that." And you're like, "You call that a plan?" And you go, "Hey, it's real-time."

Shockingly, those things actually work really well. It kind of makes sense, because if you think of it this way, what's the plan for? The plan is there only so that you don't do something stupid now with respect to the future. So if your planning is not perfect -- well, I guess if it's infeasible it's a little bit worse than not perfect. I mean, at least it's some kind of rough plan. You couldn't execute it, but anyway. All right. Another variation is this. You can solve the MPC problem every capital K steps. Basically what happens is that at time t you plan out capital T steps, you plan out a whole trajectory, and you execute the first five. You just do those. Then after five steps you see what state you're in; there's no disturbances here, so the truth is you know what state you're in.

Your horizon also shifts five steps and you re-plan, so that's another variation on these. There are lots of variations. Okay. Now, let me talk about explicit MPC. This is a special case of these. The stage cost is quadratic and these are polyhedral -- the state and input constraint sets are polyhedral. Then it turns out that phi_mpc is actually piecewise affine. Okay. It looks shocking or cool or something, but it's not. It turns out that the solution of any quadratic program is piecewise affine in the right-hand sides of the constraints. That comes straight from the KKT conditions.

By the way, this comes up in machine learning also: when you do an L1 regularized problem and you put your lambda there, you can say lots of things about the x, like that it's piecewise affine in lambda, and you can say various other things about it, but it's a good thing to know and it's not hard to see. Theoretically, it says that if you're gonna solve a QP with many instances of the right-hand side, you can actually pre-compute an explicit lookup table for the solution. Now, you can do this for any QP for any application. Now, what happens then is you simply calculate these regions and then your code looks like this. Let's call your parameter p.

p comes in and the first thing you do is you find out which region p is in, and then you simply apply Kp + g, where the K and g are the correct ones for that region. Okay? So that's how that works. This is practical for very small problems only. I mean, certainly in MPC. I mean, these are dimensions like two and three and things like that. Small number of constraints. Now, people have pioneered in the last, like, five or six years methods that actually analytically construct these things and they make very pretty pictures. You can type explicit MPC into Google or something and you'll see beautiful pictures with a two-dimensional state space and some kind of tessellation and 180 regions, and they actually use these things.
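A minimal sketch of what that run-time lookup looks like, assuming the regions have been precomputed offline as polyhedra {p : H p <= h}, each with its own affine law (K, g); the data layout is my assumption for illustration:

```python
import numpy as np

def explicit_mpc(p, regions):
    """regions is a list of tuples (H, h, K, g). Find the polyhedral
    region containing the parameter p and apply its affine control law."""
    for H, h, K, g in regions:
        if np.all(H @ p <= h):
            return K @ p + g
    raise ValueError("parameter outside the precomputed regions")
```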

They've used them in anesthesia. They've used them also in switching power supplies. Obviously, these things can get slow very quickly, which is kind of the -- the whole point was to make it fast or something. Again, this comes from the 1969 mentality. The 1969 mentality was that solving a QP is a big deal. That's 1969 mentality. Big thing. You need a big super computer, you need expensive software, a couple of Ph.D.'s to sit around, big 208 volt three phase, 50 amps. People just think of it like it's a big deal and, as you all know, it is nothing. You can write one in Matlab in 20 minutes. You could actually. It's just not that big a deal.

You can write one in C in half an hour or something like that. So a lot of it is sort of mismotivated by that. Now, you can do things to simplify these things and get quite a good control. There is another advantage to an explicit form, and the explicit form is actually quite traditional in control. That would be called gain scheduling or something like that. The person who is traditionally trained to make sure that, before some control code is downloaded into something dangerous or expensive, it's correct -- they'll be very familiar with a switch, right? With a big switch statement that lists a bunch of cases and in each case dials in a gain. That's completely standard. Okay.

But, in fact, again, if you think past the 1969 mentality here's what you would see. The MPC problem is highly structured, so there's a whole section on this, but the Hessian is block diagonal. I'm not gonna go into the details. The equality constraints are block banded. It's weird, because the equality constraint matrix isn't square, so it's not clear what banded means, but if there were a meaning for a banded non-square matrix this would be it. But, actually, you know why you can guess this, and all you needed was your intuition to know this? This is one of the things I would hope you would eventually get. Probably by the end of this year you'll have it.

It has to do with going from structure in the actual problem to structure in the numerics, and what the implications are, so let's just talk about that for a minute. Let's say you're solving an optimal control problem and let me just ask you about the state x(t). What does it connect to? Which other variables is it, sort of, directly constrained with or connected to?

Student:T plus one.

Instructor (Stephen Boyd):That's what?

Student:T plus one and T plus --

Instructor (Stephen Boyd):It's x(t) is connected to u(t). Let's say it's connected to u(t-1), because the previous decision is what landed you in that state, and it's connected to x(t+1) and x(t-1). That's it. Whenever you can make a statement like that -- now, of course, x(t) is related to x(t+50). Okay? But indirectly, because it's got to propagate through all these dynamics and things like that. Otherwise, it's just a separate problem, but the key is whenever you can identify blocks of variables and say that -- in fact, what we just argued means it's banded. If you stack the x's and u's, and x(t) and u(t) only depend, kind of, on the previous and the next x and u, that smells like banded. That says banded.

I mean, you have to go through and work out all the matrices and all that, but the point is at least you know what you're looking for, and you're looking for banded, and what does banded mean about solving it? It means blazing fast. It means linear in the number of time steps. Okay? So that's the other thing you should know. Banded -- how long does it take to do a Newton step if you have a bandwidth of 20 and it's a million long?

Student:NK squared.

Instructor (Stephen Boyd):Hmm?

Student:NK squared times --

Instructor (Stephen Boyd):Yeah, yeah, linear time is the answer. Super-fast. And it's not easy solving a million variables, a million equations generally. Okay? I just want to talk about the ideas. You should be used to this. Structure comes up in other ways. In image processing what structure comes up? What are pixels related to typically?

Student:The surrounding.

Instructor (Stephen Boyd):Yeah, some neighbors. Okay. So did you get banded?

Student:Yes.

Instructor (Stephen Boyd):No. You get banded across a line, but you're also related vertically, so you get something called the Kronecker product of banded. You're related to your left and right neighbors for sure. Things like signal processing, control, anything with time in it: banded. Images, you actually get a Kronecker product of banded. You get weird bands because you're related to the guy to your left, the guy to your right, but you're also related to your neighbor above and your neighbor below. Yeah, okay. Make it a more complicated nine-point stencil. That hardly changes anything, but it means that you're related here and that's, by the way, an offset of n, right?

So the sparsity pattern there, you would expect to see, would be bands. Actually, it's a banded matrix of banded matrices. By the way, it doesn't matter. The main thing you should recognize is structure. That's structure, and that means fast if you know what you're doing. Okay. So you can actually solve this. I guess we've gone all over this. I don't have to tell you this. It's T(n+m)^3. So to do this it's linear in T. Here's a fast MPC. In fact, it was just put together by Yang. We've already talked about some of the things here. First of all, you don't have to do a good job in the planning, so you might as well do an interior point method and fix the barrier parameter, and then you can use warm start, which I think -- who asked? You asked about that, right?

So you could do warm start, and then it turns out you only need to do like two Newton steps or something like that. You can limit the number of Newton steps, and this is just a simple C implementation, not even optimized, and you can compile it with whatever it was, O3 or whatever it's called. These would be the numbers involved here. Here would be one. Here's a problem with, I don't know, four states, two inputs, a horizon of ten, and that goes down in 300 microseconds. You can have a problem with 30 states, eight inputs, a horizon of 30. That's a problem with 1,000 variables and 3,000 constraints. That goes down in 23 milliseconds. Okay?
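The "banded means linear in the horizon" claim from a few minutes ago is easy to see with a generic banded solve; this is not the actual MPC KKT system, just an illustration of the scaling that makes millisecond timings like these plausible:

```python
import numpy as np
from scipy.linalg import solve_banded

# A banded system with bandwidth k solves in roughly O(N k^2) time,
# versus O(N^3) for a dense solve of the same size.
N, k = 100_000, 20
ab = np.random.rand(2 * k + 1, N)   # diagonals stored as rows (banded storage)
ab[k, :] += 4 * k                   # make the system diagonally dominant
b = np.random.rand(N)
x = solve_banded((k, k), ab, b)     # fast: cost grows linearly in N
```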

So I don't mind -- I mean, I just came back from Zurich, where there's people who have been doing MPC for a long time, and those who had the 1969 mentality successfully convinced other people that solving a QP is a big deal. By the way, these are also the same kind of people who just use tools and don't know how they work inside. If you fire this up in CVX you'll get, with some CVX overhead time, 3.4 seconds. All right. This is actually why I think it's very important, if you're gonna use things like convex optimization. Maybe you'll go into an area where it doesn't matter, and you can use high-level tools and you can use other people's solvers. Good for you. Very good for taste. Very good judgment to go into a field like that.

It requires a combination of a modest number of variables and no real-time constraints. If you're considering fields, go into a field like that. However, the point is if you go into something that is huge you're gonna have to write your own solver. We went over that a couple of weeks ago. If you do anything that's real-time you're also gonna have to write your own, but it's really easy. I mean, it is not hard provided you know what you're doing. If you've taken 364a and b you know what you're doing, hopefully. You should know what you're doing or something like that. By the way, I do want to make a big deal over this. There's not a lot of people outside this room who actually know these numbers. Even though everybody should know them. I know all the experts who write all the big codes; if I asked them how fast would you solve -- actually, no one even thinks about solving small problems.

So even the experts in the world, when you ask them how fast can you solve that? They'll say a couple of seconds. Right there. That won't help you if you're controlling a robot or something like that or doing navigation updates or whatever it is. I personally happen to think there's a lot of opportunity for MPC running at rates like 50 hertz, 60 hertz, and so on. By the way, there's other areas where this comes up. There's a student who did a project in 364 three, four years ago that had to do with optimizing guard bands in wireless transmission. He used an SOCP with, I don't know how many variables, 200 variables, and during his project he got a C program to run it down to 500 microseconds, and then it ultimately went to hardware and it was right in there at packet time. It was in the microsecond range. I forget what it was.

These real-time things are actually quite interesting and fun. I think, actually, they're gonna have a lot of impact. Okay. So let's look at examples. This example is essentially supply chain management because it's completely generic, so here's what it is. I have a bunch of nodes, and these are warehouses or buffers, depending on whether you're storing goods or storing packets or fluid or whatever. This is gonna be single commodity. It's very easy to extend this once you understand how to do it. Totally trivial to make it multi-commodity.

Then you have m links that go between the nodes, the warehouses. These could be communication links. They could be airfreight or something like that. It doesn't matter. Pipes. x_i(t) is gonna be the amount of commodity at node i in period t, and u_j(t) is the amount of commodity transported along link j. Then you split the traditional graph node incidence matrix into its negative and positive parts, and we'll call those A_in and A_out, and this basically tells you whether link j enters or exits node i.

It's something like A_in minus A_out would be the traditional graph node incidence matrix. That's the one that's got the one and minus one on every edge. The one tells you where it points into and the minus one tells you where it comes from, or the other way around depending on which convention you use. The dynamics is this, and it's linear. It says that the amount you have at the next time is equal to x(t), plus A_in u(t), the amount that came in, minus A_out u(t), the amount that flowed out. And let me just for fun ask a question. Suppose you had spoilage. In fact, suppose I told you that this is a perishable good and at each step 10 percent of the commodity goes bad. Please modify the equations.

Student:[Inaudible].

Instructor (Stephen Boyd):Yeah. Thank you. Would it be linear then still? Yeah, sure. Okay. I'm just pointing out there's a lot of stuff you could do. I guess you did that, right? In your thing. If you have some spoilage in airplanes, that's not good.
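Written out, the base dynamics and the spoilage-modified version look like this (with the 10 percent per-period spoilage from the question); the modified version is still linear in x and u, so nothing about the problem class changes.

    x(t+1) = x(t) + A_in u(t) - A_out u(t)             (no spoilage)
    x(t+1) = 0.9 x(t) + A_in u(t) - A_out u(t)         (10 percent spoils each period)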

Student:No.

Instructor (Stephen Boyd):No, okay, okay. All right. All right. Here are the constraints and objective. These are just sort of generic. One says that the x's, the amounts, go between zero and some x max. You may not have to have this x max. That would be a buffer limit or something like that. By the way, it's also very common in some cases to let x go negative, in which case it represents a backlog. It's the same idea as having a negative investment in an asset. It means you have the obligation to produce it or something like that. If a customer makes an order you can actually take the order and now you have a backlog, which just says how many units you have in stock. You can say -12 means 12 are backordered. I mean, you're gonna have to add a penalty for backlog if you do that. You have to add a steep penalty for backlog, but that's another version of this.

Link capacities go between zero and some maximum value. This is a vector of the amount removed from each node or warehouse, and that's gotta be less than x(t). Some people call this actually a – do you remember what they call it? In fact, I just saw some presentation where this had some name among people who do supply chains. It means that what you can't do is ship ten units to a warehouse and then ship them out in the same time period. Everybody know what I mean? So if you didn't have this it would mean you could actually do that. Like a digital circuit, like a two-phase clock, or something like that basically. On the rising edge all the trucks leave and this is that.

This says the truck can't leave with stuff that's not there, and then on the following edge of that clock they all arrive. That's my model of it. Okay. You can have a shipping or transportation cost that's S(u(t)). This could be positive or negative. It can include sales revenue, manufacturing cost, all sorts of stuff in it. Whereas the storage cost is just a charge W on what you store. If you allow x's to go negative, you would have, in fact, a backorder charge, which could be real or just internal accounting. Your objective would be to minimize this. I should mention one thing. There are some very common shipment and storage cost charges which are not convex. That would be like a fixed charge. We're not gonna look at those, although you could.

They would be things like this. The cost to run a truck from here to there is fixed, so if you're gonna put two little boxes in the truck it's gonna cost you the same as filling it. Okay? That would have a cost that would go up like that, and then, actually, it would jump up again. These would be highly non-convex. Obviously it jumps every time you go from zero to one truck, one to two trucks, or something like that. We're not gonna look at those, although one could. Now we'll just look at an example. It's an example with five nodes, nine links here, and you start with some amount of commodity at the different nodes. Presumably, this is sold or something like that, so we'll get revenue from shipping stuff out on these two links.

At each step I can ship whatever I like along these edges. I mean, it has to be there for me to ship it somewhere. I'll have a buffer limit of one. The most I can ship out is 0.05. My storage cost will look like this. It's linear plus a quadratic term. The shipping cost will just be linear. I pay linearly to ship among warehouses. Now, my shipping cost, I don't know if it's fair to call it shipping, but when I pull things out of those bottom two nodes I actually get paid for it. That's why this is negative. That's revenue from delivering product here. The initial stock is (1, 0, 0, 1, 1), and we'll just run MPC with five steps or something like that. The optimal cost, by the way, is 68.2 and the MPC cost is 68.9, so, once again, just five steps is enough planning horizon to actually do very well.
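Here is a sketch of the planning problem for this example in CVXPY. The graph matrices and the exact cost coefficients aren't spelled out here, so those are placeholders, but the buffer limit, link capacity, initial stock, and five-step horizon are the ones just described, and the storage cost is linear plus quadratic with a linear shipping cost, as stated.

    import numpy as np
    import cvxpy as cp

    n_nodes, n_links, T = 5, 9, 5
    A_in = np.zeros((n_nodes, n_links))    # A_in[i, j] = 1 if link j enters node i
    A_out = np.zeros((n_nodes, n_links))   # A_out[i, j] = 1 if link j leaves node i
    # (fill A_in / A_out from the actual graph)

    x_max, u_max = 1.0, 0.05               # buffer limit and link capacity
    s = np.ones(n_links)                   # per-unit shipping cost; negative entries = revenue links
    w_lin, w_quad = 1.0, 1.0               # placeholder storage cost: linear plus quadratic

    x0 = cp.Parameter(n_nodes, nonneg=True)
    x0.value = np.array([1., 0., 0., 1., 1.])   # initial stock from the example
    X = cp.Variable((n_nodes, T + 1))
    U = cp.Variable((n_links, T))

    constr = [X[:, 0] == x0,
              X[:, 1:] == X[:, :-1] + A_in @ U - A_out @ U,   # flow conservation
              X >= 0, X <= x_max,                             # buffer limits
              U >= 0, U <= u_max,                             # link capacities
              A_out @ U <= X[:, :-1]]                         # can't ship what isn't there
    cost = cp.sum(s @ U) + w_lin * cp.sum(X) + w_quad * cp.sum_squares(X)
    prob = cp.Problem(cp.Minimize(cost), constr)
    prob.solve()

In the MPC loop you would set x0.value to the current stock each period, re-solve, and apply only the first column of U, exactly as in the earlier warm-started loop.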

And let's just see what this looks like here. What I'll do, I guess, is I'll start shipping these out immediately. That's clear. But then I won't have much space here, so I'll actually spread the product around and then ship it. Actually, the point is that even in baby examples like this, what to do is not obvious. Not remotely obvious. And you can imagine what would happen if you have 40 links and 50 nodes and it goes on and on. Okay. So here's the example.

So the optimal was obtained just by solving a very long problem, and this just shows you a couple of things. I guess the first one shows you – this is X1, which starts at one. You can see stuff is shipped out, then held, and then it's shipped down until about t = 30. This is X3. It goes from zero. It goes up and then down again. Sorry, that's – well, it's the same thing. There's the optimal and there's the MPC. They're not quite the same, but they're very close, and you can see here these are the shipments made. The optimal says that, for example, U3 should ship a bunch and then, sort of, stop, something like that, and then U4 should ship nothing and then somewhere around t = 12 start shipping stuff out like that.

And then this is the MPC one. You can see, actually, it's not the same, but it's good enough to get you within two percent of the optimal cost. In fact, if the horizon were any longer the two would just be indistinguishable. By the way, just the pictures of these tell you this is not obvious. It is not remotely obvious what to do here. Maybe we'll make a movie. Did you make a movie? I can't remember.

Student:Yeah, I did, but it's not for –

Instructor (Stephen Boyd):Yeah. Presumably, you could make a movie or something that would kind of show you the right thing to do, where after several steps you'd spread your product all around and then, after one thing is done, the stuff starts pumping out the bottom or something like that, and that minimizes your various costs. Okay. Let me just mention a couple variations on the optimal control problem. We've already said this. You can have time-varying costs, dynamics, and constraints.

One extremely common thing, especially in finance, is a discounted cost, where the cost in the future is discounted by a discount rate, like you divide by 1.05 to the power t for cost t periods in the future. What's funny about that is that the people who do control discount with a negative rate, right? They actually put higher cost on the future, which induces the control to be more aggressive and bring something to zero faster. No one in control would ever, ever use a discounted cost, and everyone in finance would use a discounted cost, if the whole thing is going on over a long enough period. We've already seen this. You can have coupled state-input constraints. You can add slew rate constraints. I'll just ask you one question, then we'll quit for today, and then that's our next topic, but let me ask you this.

If I added constraints like |u(t+1) - u(t)| less than some limit, that's a slew rate constraint. It says that your input can't change by more than some amount. Tell me what that does to the problem structure.

Student:[Inaudible].

Instructor (Stephen Boyd):It what? It smooths out the solution, sure, because it says that you can't change very radically. But what does it do to the problem structure that you need to solve? It preserves locality in time. In other words, it's another constraint, but it affects only things close in time. That should go in your brain. The whole center of your brain that controls sparsity patterns and things like that should be lighting up like banded, banded, banded, banded, banded, and the other one should be going fast, fast, fast, fast. Okay. We'll quit for today.
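In a formulation like the CVXPY sketch above, the slew rate limit is one extra constraint; du_max is a made-up bound. It only couples adjacent columns of U, that is, u(t) and u(t+1), which is exactly why the locality in time is preserved, the KKT sparsity stays banded, and the solve stays fast.

    du_max = 0.1   # hypothetical per-period limit on how much each input can change
    constr += [cp.abs(U[:, 1:] - U[:, :-1]) <= du_max]   # slew rate constraint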

[End of Audio]

Duration: 81 minutes