Instructor (Stephen Boyd): Looks like we've started. Let me start by making – there were a couple of announcements I should have made last time and forgot to. Actually, some of it involves stuff we didn't know then. The first is we have to apologize. The readers that you have – it's a very minor error. The homework exercise numbers after lecture ten have the wrong numbering. We apologize for that. We caught this a week ago, silently updated all the PDF files on the website and, had the readers not been made, would of course deny that we ever did this. But with all the readers out there, we can hardly deny it. All the material there is in the same order. It's just the numbering gets weird.
The second announcement I wanted to make is for contacting me or the TAs. Please use the staff email address. It's on the webpage. That goes to all of us, and that's a very good way – that way we can see which emails have been responded to and which have not. Please do not email the TAs' personal email addresses. That means that other people in the teaching staff haven't seen your email. Please use the staff address.
I think I did ask if people had seen MATLAB, and it seemed like a lot of people had. That's great. I do want to say that in fact you are perfectly welcome to use Octave. We'll do whatever it takes, which should be, theoretically, nothing, to make anything that you have to do in this class run under Octave. The difference is Octave, of course, is a GNU open source thing. MATLAB runs on a bunch of Stanford computers. By all means, please do not go out and buy MATLAB for this class.
You're also welcome, by the way, if you're crazy enough or whatever, you're also welcome to use Python or something else. Jacob will help you with that. Is there anybody who is tempted? By the way, the "programming" you'll be doing in this class will be limited to one screenful. We're talking 15 lines at most. Any Python – really? No one's gonna take us up on that? Usually there's one person. We never hear from them again, of course.
There was one more thing I wanted to say. The last thing – I wanted to say a little bit about how this class compares with CME 200. You may never have heard of CME 200. You should know about it. It is a class. It's taught also this quarter. If you were to compare the material listed in the abstract for that course and this one, you would see a bunch of overlap, enough to make you have some questions. I think I can explain a little bit about what the difference is.
CME 200 actually focuses on computation. It actually focuses on – it's a good complement. The classes are meant to be complementary. CME 200 will tell you how to actually carry out the numerical calculations. This class will focus more on modeling and applications of the material. They're complementary. Is there anyone here taking CME 200? It goes the other way in the other class.
Now some administrative things. We have progress. The first is we have a section. That's good. Let me repeat that the most valid information you'll always find on the website, but now that we have a section, which is going to be Mondays 4:14 to 5:05, it will be televised. That's the section, and that sets the whole timing or phasing diagram for the whole class.
That says that the homework will actually be due Thursdays, and that includes homework one. It's true, we did at one point say homework one was due on Friday. We changed our mind. It's going to be next Thursday, a week from now, so we don't feel guilty pulling the homework due date up one day. The homework will be due Thursdays.
The section is on Monday, and then immediately following section, there will be TA office hours, and then on both Tuesday and Wednesday, there will be TA office hours from 4:00 to 6:00. Where is the section? It's Gates B3. Since Jacob said that, it's likely to be correct.
Another thing is that we coordinated with some of the other classes many of you are taking and we worked out a tentative date. Let's just say it's the dates for the midterm. The midterm is gonna be either October 26-27 or 27-28. That's your choice. In other words, midterm and final will be something like that. Each one will be a 24-hour take-home and you'll have a choice of two days, so whichever fits better in your schedule.
I should also mention that if you have a moderately good reason why neither of these dates works for you – it doesn't have to be a superb reason or anything like that, but if it's a moderately good one or it's funny or something like that, we'll make arrangements for you to take it another time, generally speaking beforehand. That means you become a beta tester. You might want to mark these dates in your calendar.
The final is going to be – this is to protect me legally so the registrar's police unit doesn't come pick me up. This would be the final, but you can't hear in my voice the quotes. If anyone asks you, it's not the final. I think it's the last homework assignment and it just counts for a lot. Actually, I think that's illegal, too, because you're not supposed to assign homework on the last day of class. I'll figure something out. I should probably put in my cell phone an attorney's number so when I do get arrested by the Stanford registrar's office I know who to call.
That's traditionally the last day of class. These things usually go out at 5:00 in the afternoon and then people come back a day later, tired but alive. We haven't lost anyone yet. That's gonna be the last day. Once again, if these don't work for you, we can work something out.
By the way, if there's some weird thing where you have to take it after by some significant number of days, that's something you need to let us know really early so that we can work around that. We try to release solutions as soon as possible, but we can't release any solutions until everyone has taken the exam. If there are any quite strange scheduling things, people should get in touch with us soon.
Now I think I have covered all of the administrative stuff, including what I forgot to mention last lecture. Sorry. Today and for the next week or so, what we're covering should be review. What I'm mainly going to focus on – if two weeks from now you find that we haven't covered anything that you don't know, don't worry. We will.
What I want to focus on is really the meaning of all this, because that's something that's conspicuously absent in your traditional linear algebra class. Last time, we looked at the definition of a linear function. A function that takes as its argument an n-vector and returns an m-vector is linear if you can form a linear combination of the inputs and apply the function, and it's the same as if you had applied the function to each of the arguments separately and formed the same linear combination.
This formula, which looks very innocent, in fact involves – it says a lot and it involves a lot of overloading. What I mean by that is that all sorts of symbols are doing multiple duty here. This plus is technically not the same as that plus. This alpha X is not the same as this alpha X. Those are basically in different dimensions and things like that. There's a lot of overloading there.
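The formula being pointed at is not reproduced in the transcript. Going by the definition just given, it is presumably the superposition property. A sketch, with lowercase symbols as assumed notation (here y is just a second input vector, not the output):

```latex
f(\alpha x + \beta y) = \alpha f(x) + \beta f(y),
\qquad f : \mathbf{R}^n \to \mathbf{R}^m,\quad x, y \in \mathbf{R}^n,\quad \alpha, \beta \in \mathbf{R}
```

On the left side, the addition and scalar multiplication act on n-vectors; on the right side, they act on m-vectors. That is the overloading being described.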
At the very end of the last class, we got to the matrix multiplication function. It's a very important function. Very simple. It's two ASCII characters. It's this. You take an argument X, a vector, and you multiply it by matrix A and that's what you return. That's the matrix multiplication function. It's linear. You just have to check these things. It's easy to do.
It turns out actually it's more than just an example of a linear function. That is actually the generic form. All linear functions can be represented as F of X equals AX. People make a big deal about this. It's actually not that big a deal, although it's useful to think about how, given F – if F were a black box or just a function or operator which you could call but cannot see the internals of, the question is how would you reconstruct A, a model? That's essentially the question. That's why you're going to be seeing F of X equals AX a lot.
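One way to answer the black-box question, sketched here in plain Python: call F on each unit vector, and the result is the corresponding column of A. The names `reconstruct_matrix`, `black_box` and `_hidden` are made up for illustration, and the sketch assumes F really is linear.

```python
def reconstruct_matrix(f, n):
    """Recover A column by column from a linear black box f acting on n-vectors."""
    columns = []
    for j in range(n):
        e_j = [1.0 if k == j else 0.0 for k in range(n)]  # j-th unit vector
        columns.append(f(e_j))                             # f(e_j) is column j of A
    m = len(columns[0])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


# Hypothetical black box: internally a 2 x 3 matrix, but callers only see the function.
_hidden = [[1.0, 2.0, 0.0],
           [0.0, -1.0, 3.0]]


def black_box(x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in _hidden]


A = reconstruct_matrix(black_box, 3)
print(A)
```

This takes n calls to the black box, one per input.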
Now let me say something broad about what does Y equals AX mean. It turns out there's lots and lots of applications of this. In fact, really much of the course is going to be about banging practical problems into a form like this or a variation on it that might be more complicated, but something like that. Here, there are huge, broad categories of interpretation. One is this. In one interpretation, X is a vector of unknowns or parameters that you want to estimate – things you don't know.
Y is a vector of measurements or things you do know. Or, in communications, X would be something like a transmitted signal and Y is a received signal. That's one very broad category of linear functions. In this case, there will be all sorts of interesting things you want to do. The things you want to do would be something like this.
Suppose you're given Y. You'd like to say as much as you can about X, ideally saying what X is. That would be undoing this linear operation. In another broad category, X represents an input or action. In this interpretation, X is something we can mess with. It's something we can change. X could be all sorts of things. It could be a signal we transmit. Then this Y is an outcome in this interpretation.
X could be all sorts of things. It could be the thrust that you put on your engines. That could be X1. X2 could be the deflection of your [inaudible] and your elevators and your control surfaces and things like this. So in that case, it's an action. The general idea is we control it. Y is now a result. Y might be a vector of climb rates and things like that. We'll see more specific examples soon.
Another very useful interpretation is that you think of this abstractly as a function or a transformation. It's just something that takes a vector, does something and gives you another vector. An example of this would be something like a Fourier transform, or it could be a rotation. Something that takes a vector, rotates it 27 degrees clockwise around some axis. That would be – there, you'd be thinking in terms of a transform. They're all going to look like Y equals AX.
What I'm about to say is very simple and totally obvious, but I think it still needs to be said. Here's Y equals AX written out component by component. I'm going to call Y the output and X the input. I'm neutral on whether it's an input we can mess with or it's some parameter we want to estimate or something like that. Input and output are just neutral here. This says that the ith output is a sum over all inputs, each weighted by Aij. That's what it says. What that means is you can interpret Aij immediately. It's the gain factor from the jth input to the ith output.
Again, what I'm saying is not exactly complicated, but it's very important to understand. A lot of these are very important things to know. That means that the ith row of a matrix – that's Ai1, Ai2, Ai3 – you're going across the row. All of those things concern the ith output. Those are the gains from the inputs to the ith output when you scan across a row. If you look at a column of a matrix in Y equals AX, a column concerns an input. So the jth column – the third column concerns the third input. In fact, as you scan down that column, these are the gains from the third input to all the outputs. That's the idea.
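The component-by-component form referred to above is not shown in the transcript. For an m by n matrix A it can be written as follows, with a_{ij} standing for the entry spoken as Aij:

```latex
y_i = \sum_{j=1}^{n} a_{ij}\, x_j = a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n,
\qquad i = 1, \ldots, m
```

Reading across row i gives the gains from every input to output i; reading down column j gives the gains from input j to every output.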
That's fine to say, but now what it means is when you see a matrix and you see something in it, you should interpret things immediately. So for example, if you see that A27 equals zero – that's the 2,7 entry in the matrix. It's zero. That has a meaning, and the meaning is the second output, Y2, doesn't depend at all on the seventh input. Again, this is obvious, but it needs to be said.
If, for example, you see that the 3,1 coefficient of the matrix is much bigger than all other entries in that row, that has a very specific meaning. What it says is that Y3 depends mainly on X1. I'm assuming here that the components of X are scaled, so they're all about the same. You could have something weird where the X1 entries are on the order of units and the X3 entries have numbers like 10 to the minus 9, but I'm assuming they're all about the same. Then you could say this.
You can see all sorts of other stuff. For example, this goes on and on. If you have a column – if you find that A52 is a lot bigger than the other Ai2, that tells you that the second input affects mainly Y5 among the outputs. You can just go on and on and on. For example, if A is lower triangular, that means Aij is zero for i less than j. It means that Yi depends only on X1 up to Xi.
By the way, this means – when you have lower triangularity, you're going to want to make a vague symbolic link in your mind to causality. If the index here represented time, this would be something like causality, and that's what lower triangularity means. You shouldn't just dispassionately look at a lower triangular matrix and say yeah, it looks pretty. It has a very specific meaning, and it talks about what inputs affect what outputs and it's very specific.
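The link between lower triangularity and causality can be checked on a small made-up example. This is only a sketch: the matrix entries and the helper names `is_lower_triangular` and `matvec` are invented for illustration.

```python
def is_lower_triangular(A):
    """True if A[i][j] == 0 whenever i < j (every entry above the diagonal)."""
    return all(A[i][j] == 0
               for i in range(len(A))
               for j in range(i + 1, len(A[i])))


def matvec(A, x):
    """Compute y = A x for A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]

x = [1.0, 1.0, 1.0]
x_later = [1.0, 1.0, 100.0]  # same as x except for the last ("future") input

y = matvec(A, x)
y_later = matvec(A, x_later)
print(y[:2] == y_later[:2])  # the first two outputs do not see the change
```

If the index is time, changing a later input leaves all earlier outputs untouched, which is the causality reading.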
Here's a very special case. Let's say the matrix is diagonal. If it's diagonal, what that means is that the ith output only depends on the ith input. In general, you have the idea of a sparsity pattern of a matrix. The sparsity pattern is the list of zero and non-zero entries. In a lot of cases, a matrix has no interesting sparsity pattern.
All the entries are non-zero, in which case you call it a dense matrix. In a lot of interesting cases, a matrix will have a non-trivial sparsity pattern. So lots of entries will be zero, and that will have a meaning. You should never, ever look at a matrix that has blocks of zeros and not think, what does that mean? A lot of this is simple, but this is the idea.
How would you find, for example, if I gave you Y equals AX and I asked you something like this – which input makes the greatest contribution to the seventh output, what would you do? Yeah, so you'd look at the seventh row, you'd scan along it and if the entries were all about the same, you'd say they all affect it about the same. If one stood out, you'd say that one affects it most.
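Scanning a row for the entry that stands out can be sketched as below. The matrix and the name `dominant_input` are hypothetical, indices are 0-based (so the seventh output would be row index 6), and the answer is only meaningful under the earlier assumption that the inputs are all scaled about the same.

```python
def dominant_input(A, i):
    """Index j of the largest-magnitude gain in row i (0-based indices)."""
    row = A[i]
    return max(range(len(row)), key=lambda j: abs(row[j]))


A = [[0.1, -0.2, 0.05],
     [0.3, 0.1, -4.0]]

j = dominant_input(A, 1)
print(j)
```

The absolute value is used because a large negative gain is still a large contribution; the sign only says which way the output moves.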
What does the sign mean? What does it mean to say that A35 is positive? It means the following. If A35 is positive, what it means is that if you increase X5, the fifth input, the third output will increase. These are obvious things. Don't think about them too much, because there's not that much here to think about. When we see it in action, it'll make more sense.
Now what I'm going to do is go through a bunch of examples to give you a rough idea – just so you have some pictures in your mind of this thing. So the first one is from mechanical engineering. We look at a linear elastic structure. So I have a structure like the steel frame of a building or something like that. Here, Xj is gonna be an external force applied to this structure at some point and in some direction. It could also be a torque. For example, these could be all the various forces here.
Here I have the first four forces – we can imagine these are wind loadings on the building. These are wind loadings. These might be dead loading in the building. Yi is an output. That's going to be the deflection of a point in a building in a certain direction. It might be the deflection of this floor here. It has to be oriented, so for example, I might say that downward is positive deflection. Upward is negative.
There are names for these. If you get into civil engineering or structural engineering, the difference in horizontal displacement from one story to the next, for example, is called interstory drift. Any time you have some specific area, you'll find all sorts of colorful language for it. We'll say Y is one of these deflections.
It's a basic fact. It's roughly true. It says that if these forces are small – of course, in a steel frame building, small does not mean five newtons. It can mean quite a lot. If they're small, which depends on the application, then it says that the deflections are approximately a linear function of the applied forces. That's actually quite interesting. Now I can ask you some questions. I have to show some displacement.
I'm going to make the displacement here Y1, the displacement here Y2 and so on. That's Y4. Now I'm going to ask you to tell me about some entries of the matrix. Tell me what you can say about A11. It's positive. What's your intuition behind that? You push here with a unit of force. The building will deflect this way, and then whatever the ultimate equilibrium deflection is – it looks like Y1. By the way, what can you say about A21 compared to A11? It's smaller. In fact – what sign do you imagine it has? Positive. Okay.
By the way, only those of you who have had a class on structures are actually allowed to say that. The rest of you, like me, have to say we guess it's positive. Tell me about the column associated with X3. Tell me about the third column of the matrix. What do you think it might look like? So Y3 would be big. How about Y1 and Y2? I think they'd also be big, because you'd shift it over and there'd be nothing up there. You get the idea.
By the way, the units of this matrix – in general, it's sometimes useful to think of the units – the units here are meters per newton, if those are the units I'm using for displacement and force. It's got a name. It's called the compliance matrix. If the deflections and the forces are measured at exactly the same places, then the matrix is actually square, and its inverse is called the stiffness matrix. That's just one simple example here.
Second one is we have a rigid body. For example, let's say a satellite or something like that. Here's the center of gravity. Let's say that I'm going to apply forces and torques on it. I'll apply a torque here. X1 is a force in a fixed direction. X1 measures the size of it. X1 negative means that I have a force in the direction opposite to whatever my reference direction is. X is gonna be my vector of these external forces and torques. If I give you the forces and torques, there is a net force and a net torque applied to this rigid body. In fact, those are linear. You have Y equals AX.
A depends on geometry. You don't actually often write it this way. All of that is obscuring the fact that the net force and torque on this rigid body is a linear function of these applied forces and torques. This comes up. This would be, for example, a vehicle. These could be thrusters or a control surface – anything that creates a force or torque on the vehicle. This would be a very interesting matrix because it would actually tell you how what you do maps into the total force and torque. There's lots of questions you can imagine you might want to answer here.
So for example, here, the jth column of A is a pattern, and it tells you – let's imagine the jth input is a force applied. Then the first three components – A13, A23, A33 – what are the units of those? If the third input actually is a force, what's the unit of these? These are unitless, and the reason is they map newtons to newtons. These are unitless. How about A43, A53 and A63? What are they? Meters. They have units of meters because they map newtons to newton-meters, to torques.
If that input is an actuator or a thruster and you want to know what the thruster does, you get a rough idea by scanning that column. Looking at the geometry here, tell me something about the second column that corresponds to this input. Let's go right down the second column. What can you say about A12? That's gonna be the X component of the net force contributed by this thruster. What is it? The magnitude is less than one.
Let's be more specific. Zero. Is it really zero? It's not perpendicular. It tilts a little bit to the right. So it's small and it's positive. Let's go for a number. Let's say it's 0.15. It's very important to know that the number I made up was not arbitrary. If I'm off, I'm not off by too much. How about A22? It's large. In fact, how large? It's close to one. It's 0.85, 0.95. Something like that. That's enough on that.
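The lecture does not write A out for the rigid body. One standard way to build a column for a force input, sketched below under assumptions, is to stack the unit force direction on top of the torque it produces about the center of gravity (position cross direction). The geometry and the names `cross` and `force_column` are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def force_column(position, direction):
    """Column of A for a unit force along `direction` applied at `position`.

    First three entries: net force per unit input (newtons per newton, unitless).
    Last three entries: net torque about the origin per unit input (meters).
    The origin is taken to be the center of gravity.
    """
    return list(direction) + cross(position, direction)


# Hypothetical thruster: mounted 2 m along +x from the center of gravity, pushing along +y.
col = force_column([2.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(col)
```

The top half of the column is a direction, so its entries have magnitude at most one, which matches the guesses of 0.15 and 0.95 above. The bottom half scales with the lever arm, which is where the units of meters come from.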
Let's look at a linear static circuit. Here, I have a linear circuit. It's an amplifier, but it doesn't really matter what it is. It's got linear resistors here. It's got an external current source here, an external voltage source here and it's got a dependent current source here. That's how that works. That's a current-controlled current source here. Here, I'm going to let X be the value of the independent sources. In this case, that's X1 and X2. Y is going to be any circuit variable. For example, it could be the voltage across a leg. It could be the current through a device. It could be the potential – the voltage at a point.
It doesn't matter what it is. In this case, it's definitely linear. You have Y equals AX here, and that's interesting. By the way, depending on what Yi is and Xj, all of these entries have interesting units and names. For example, if the Xs are all currents and the Ys are voltages, then A is called the impedance or resistance matrix. In this circuit – this is again only for people in electrical engineering.
What is A11? A11 has a meaning here. I want the street name. It's the input resistance of the amplifier. Again, if you're not in EE, don't worry about it, but that's what A11 is. It says when you pump current in here, you develop a voltage here. The gain is A11. It's in ohms and it's the input resistance of the amplifier.
Next one is a simple dynamic system. It's your basic frictionless table here. You have a frictionless table and you have your standard issue one kilogram mass, and what's gonna happen is this. It's gonna be at position zero and it will be at rest at T equals zero. What we're going to do is we're going to subject it to a piecewise constant force. So for one second, we apply a force, which is X1. Then we apply a force X2, then X3 and so on. So now the interpretation of X is like a force program. Program means it's your plan. It's a force. You could call it a force program, a force plan, a force trajectory.
You can call it all sorts of things. That's what it is. That's what X is. We'll make this up to N. There are going to be two outputs or outcomes, and it's simply going to be the final position and the velocity of that mass. The point is that's linear. That's Y equals AX. What is the size of A? Two by N, because it's got two outputs and N inputs. I think I mentioned this last time, but I do encourage you to read some things on the course website that are very elementary but set up the notation of matrix multiplication. Just scan it.
You'll also find another PDF file there, which has a title like "Crimes Against Matrices." You might want to scan bits and pieces of it. I'll come back to that later. It means you should always at an instant's notice be able to identify the dimension of something. If you can't – if someone can point to something you're doing or looking at and say what's the dimension of that and you don't know, that means – it's very easy with this stuff to sit back and go yeah, sure.
It's very easy to get complacent here, but you should be checking yourself whether you know exactly – that's related to these things. Y equals AX of course only makes sense if the number of columns of A is equal to the dimension of X. If you ever write down Y equals AX and that's not the case, we call this a syntax error, and it's bad. This is not a good thing. Don't do that.
Here you can interpret stuff. The first row gives you the influence of the applied force at different times on the final position. Actually, I'd like to ask you about that row. What does it look like? I don't want the actual number. I just want you to tell me intuitively about that row. It's ascending. What you're saying is a newton applied at the beginning has less of an effect than a newton applied at the end. If you say that first row increases as you go down the row, for me, that's this way. If you say it's increasing along the row, it has a meaning.
It basically says – what are the units of the entries of the first row? Is it even linear? Yeah, it wouldn't be in the notes if it weren't. By the way, that's not entirely true. There have been things in the notes that were mistakes for three years at a time. Let's just say I'm pretty sure this is okay. If not, you don't have to do the homework problem where you work out A, but it is.
Assuming that it is linear, the entries of the first row are in meters per newton, because they tell you meters of final displacement per newton applied. If you say that that row is increasing, it has a meaning. Do you believe it now? The initial velocity is zero here. The initial position is zero. What do you think? It's decreasing, somebody said. It's decreasing. What does that mean? It says that if you apply a newton for the first second, you will get more final displacement than you will if you apply a newton for the last second. Is that true? Why?
There's no friction here, so what that means is something like this. When you apply a newton in the first second, you accelerate the mass to one meter per second. There's no friction. It now coasts for N minus one seconds with that velocity you just gave it. You cover a lot of distance. When you apply one newton in the last one-second period, what happens is in fact you're merely accelerating it from whatever speed it has to one meter per second more. In fact, the total displacement you get is half a meter in that case.
The details of this don't matter. The point is to think about these things. What about the second row of A? Yeah, the second row of A is going to be constant because if you apply a newton for one second to a mass on a frictionless table, it has the exact same net effect. You will increase – we can even say what that row is. It's all ones, because if you apply a newton for one second to a one-kilogram mass, it will be moving at the end one meter per second faster.
These are simple examples, but you should – this is the kind of thing you should do without even thinking about it. We can put a negative force, no problem. The matrix A doesn't care and has nothing to do with whether you push to the left or right. When you push to the left or right in your force program, what you're doing is changing X. In fact, let me be very specific. Here's an X. Let's make N equal 4 and here's my X: 1, 0, 0, -1. That has a meaning. That means you push one newton, you coast for two seconds and you pull. What is Y for this, roughly? It's about three.
In fact, it happens to be exactly three. Y is a two-vector, so its first entry is three. What's the second? Zero. What does that mean? This means that the final position is three meters to the right, and the final velocity is zero. So this is, in fact, a force program that will transfer the mass three meters to the right. It will take it from stationary to stationary three meters.
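The numbers in this example can be checked with a short sketch. The matrix below is not read off the slides; it is derived from the reasoning above, assuming a 1 kg mass starting at rest at position zero with one force value per second. The names `mass_matrix` and `matvec` are made up for illustration.

```python
def mass_matrix(n):
    """2 x n matrix mapping a force program (newtons, one value per second)
    to [final position in m, final velocity in m/s] for a 1 kg mass at rest at 0."""
    # The j-th force (1-based) adds x_j m/s of velocity. That contributes
    # 0.5 * x_j meters during its own second, then x_j meters in each of the
    # n - j seconds that follow, so the position gain is n - j + 0.5.
    position_row = [n - j + 0.5 for j in range(1, n + 1)]
    # Each newton-second adds exactly 1 m/s to the final velocity.
    velocity_row = [1.0] * n
    return [position_row, velocity_row]


def matvec(A, x):
    """Compute y = A x for A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


A = mass_matrix(4)
y = matvec(A, [1.0, 0.0, 0.0, -1.0])
print(A[0])  # position gains: decreasing along the row
print(y)     # [final position, final velocity]
```

The first row decreases along the row and ends at half a meter per newton, the second row is all ones, and the force program 1, 0, 0, -1 moves the mass three meters and leaves it at rest.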
This is all very simple, but this is, for example, a very simple version of exactly what your disk drive head does when you initiate a seek. This is what happens. It moves from track 25 to track 150. It does exactly this. By the way, the force program is not quite this, but it's roughly like this. Later, we'll find out what it really is.
This one is from geophysics. It's gravimeter prospecting. It works like this. This is underground. It could be seawater or include some air. These are little voxels, and these voxels have a density rho, and we'll make Xj the excess mass density over the average earth density. I forget what that is. If one of these has gas in it then X is negative. If this has got some really dense rock, X is positive. When you have an array of things at different density and you actually want to know what is the gravitational acceleration here, you get it by integrating all these things and using the gravitational force business.
Mostly, it's going to be pretty [inaudible] as it's going to be pointing down and it will be about 9.8 meters per second squared. It turns out it actually can be deflected. It's very subtle. It's out there in some digit – something like the third or the fourth digit or something like that, but it will point somewhere else. Its magnitude will change. If you're right next to a giant mountain over there, down is now ever so slightly deflected that way. If you're standing over flat ground and there's a giant, giant cavity under there filled with gas, then in fact the same thing might happen.
This would be deflected slightly that way, and the magnitude would be slightly different. What's often measured is the difference of G from the average around there. You calibrate it somewhere and then you move it around, and in fact, this is approximately linear. Here, A is quite complicated. The formula for it is horrendous and involves all sorts of things. It's got positive and negative things. Very complicated. This is going to involve all kinds of crazy things with sines and cosines and distances and all sorts of other stuff. That will just obscure this fact.
Now you can say lots of stuff. You can say, for example, that the jth column of that matrix shows you the sensor readings that would be caused by a unit density anomaly at voxel j. That's literally what it means. The row shows the sensitivity of a sensor. If you make a measurement here, it tells you as a function of voxel what the sensitivity is. You'd expect this to be – the matrix wouldn't be sparse, but it might have a lot of small entries. You'd expect that the gravitational anomaly here would have not a whole lot to do with voxels that are way, way off over here. There would still be some structure, if not exact zeros in the sparsity pattern.
Next one is a thermal system. Here, I have a system, and I have some sources of heat in the system. These could be heaters where, in a control problem, I want the temperature to do something I want, so these are little resistive heaters. Or these could be processors on a multiprocessor chip where they're just operating and then the temperature is whatever it is. Lots of applications. I inject heat at these sources, and Xj will simply be the heat source, and that's measured in watts. I assume that the thermal transport here is linear, so it's via conduction. It could be by convection, but it would be a very simple linear model of convection.
It's not by radiation, which is going to involve these temperature to the fourth terms. Again, if you don't know what I'm talking about, it makes no difference. It's fine. It's linear. There'll be some appropriate boundary conditions, like they'll be isothermal or they'll be isolated or something like that. Then this whole thing will come to some thermal equilibrium temperature distribution and I'm going to let Yi be the temperature at a certain point i. That's the idea.
If you look up thermodynamics, you'll find out – people will tell you that's a [inaudible] equation. It's very complicated. PDEs will come out and you'll pretty quickly – you have to do that if you really want to understand this. But in fact after all the PDEs go away and everything else, you will find that, in fact, Y equals AX. The vector of equilibrium temperatures will be equal to a matrix times the vector of the power dissipation in these things. It's that simple. Let's talk about A. What can you say about the sign of the entries of A? They're all positive. What does that mean?
It means that if you pump heat into something, at some point, the only thing you can do is increase the temperature everywhere. Here, what can you tell me about A41? Small. The idea is this – A41 is the gain, and it's in degrees C per watt, from this heat source to that location. It does have an effect, but it's small. That's kind of the idea. You can imagine all sorts of cool things you might want to do with this. These might be things under your control, and you might say find me an X that makes the temperature distribution something I want.
Let's say for some kind of experiment you want a nice, uniform temperature gradient or you want it uniform. You want AX to be about equal to some desired temperature distribution, to the extent possible. We'll be able to answer questions like that. Another one would be estimation, which is I give you 57 measured temperatures. I want you to estimate the power being deposited at these five locations. That's an estimation problem. You want to deduce it from the measured temperatures. There are all sorts of things.
Next one is illumination with multiple lamps. Here I have some surface like this and I have a bunch of lamps at these points up here, and each lamp has a power XJ. These things, they go down to this patch here, and here I can actually say what AIJ is. Here, I say YI is going to be the illumination on patch I, and XJ is the power in lamp J. One thing to notice when you look at matrices and things like this, and you get used to it after a while, is that when you have A sub IJ, I indexes the output or effect. J indexes the input. So the pair IJ is really indexed output, input.
That's weird because, if you just walked up to someone on the street, they would probably index things by input, output. Matrices are organized as output, input. Too bad, that's the way it worked out. You'll get used to it. By the way, these are dummy variables. I could switch them and be sick and just turn it around and make it JI. That will happen. But right now, I'm just trying to stick to a reasonable convention. They can be anything you like.
Here, the illumination of a lamp on a patch is given by the inverse square law: it's one over the square of the distance, and then it's multiplied by this cosine factor, which is basically how much of the light is caught by the angle. By the way, if it's all the way over and the light is below here, you get nothing because you're obscured. That's what the max is here.
Once again, the vector of illumination levels is a linear function of the vector of lamp powers, and that tells you it has the form Y equals AX. Again here, A is non-negative. That's clear intuitively, but it's also clear from this. You get a rough idea. For example, if you look at a column of A, what you are looking at is the illumination pattern generated by that lamp. You're looking at the third column. That column gives you the illumination pattern. If you look at a row, you're looking at a patch. You're focusing on a patch and you're asking what the different gains from the different lamps to that patch are.
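A minimal sketch of the illumination matrix just described, with AIJ equal to max(cos theta, 0) divided by the squared distance from lamp J to patch I. The 2D layout, the coordinates, and the assumption that every patch is horizontal with its normal pointing straight up are all made up for illustration; they are not from the lecture.

```python
import math

def illumination_matrix(lamps, patches):
    """Return A with A[i][j] = max(cos(theta_ij), 0) / r_ij**2.

    Row i is a patch (output), column j is a lamp (input).
    Assumption: each patch is horizontal, so its normal is (0, 1).
    """
    A = []
    for (px, py) in patches:
        row = []
        for (lx, ly) in lamps:
            dx, dy = lx - px, ly - py
            r = math.hypot(dx, dy)      # distance from patch to lamp
            cos_theta = dy / r          # cosine of angle between normal and lamp direction
            row.append(max(cos_theta, 0.0) / r**2)
        A.append(row)
    return A

def matvec(A, x):
    """Plain matrix-vector product y = A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Hypothetical layout: the third lamp sits below the surface, so it is obscured.
lamps = [(0.0, 2.0), (3.0, 1.0), (1.0, -1.0)]
patches = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]

A = illumination_matrix(lamps, patches)
x = [1.0, 2.0, 5.0]                         # lamp powers
y = matvec(A, x)                            # illumination on each patch
pattern_of_lamp_1 = [row[0] for row in A]   # a column: pattern from one lamp
gains_to_patch_1 = A[0]                     # a row: gains from every lamp to one patch
```

The column for the obscured lamp comes out as all zeros, which is what the max in the formula is for.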
We'll look at another couple of these. This one is from communications. Here I have N transmitter/receiver pairs. The idea is transmitter J wants to transmit to receiver J. Unfortunately, it inadvertently also transmits to the other ones. We don't want that. PJ is going to be the power of the Jth transmitter. SI is going to be the received signal power of the Ith receiver. ZI is going to be the received interference power of the Ith receiver. GIJ is going to be the path gain from transmitter J to receiver I. That presumably will depend on how far they are apart. It may depend on all sorts of other things in between.
G is non-negative. You have S equals AP and Z equals BP, where A is in fact a diagonal matrix where you just take the diagonal part of G. That gives you the vector of signal powers. You take the rest of G, the off-diagonal part of G, and you shove that into matrix B, and if you multiply that matrix by the power vector, you get the vector of total interference powers. I'm assuming that the interferences are going to add incoherently. The powers are going to add. It's not coherent addition.
Ideally, what you want is A to be large and B to be small or zero. That means you want this matrix G to have a very strong diagonal and lots of little off-diagonal entries. If I asked you questions like this: the third receiver is most susceptible to interference from which transmitter? How do you find that out given G? What do you do? The answer is you walk across the third row of the matrix G. I guess for you, that's kind of like this.
You walk across the third row of G. The 3,3 entry in G is very important. That's actually the gain from the transmitter you want to listen to. You look at the other entries, and in the other entries, you look for the largest entry, and that tells you which transmitter you are most susceptible to. These are simple things, but this is the kind of thinking you need to do.
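As a rough sketch of the split described above, the following takes a path gain matrix G, puts its diagonal into A and its off-diagonal part into B, and then walks across the third row to find the worst interferer. The numbers in G and P are invented for illustration, and `worst_interferer` is a hypothetical helper name.

```python
# Hypothetical path gains: G[i][j] is the gain from transmitter j to receiver i.
G = [
    [1.00, 0.10, 0.05],
    [0.20, 0.90, 0.02],
    [0.03, 0.30, 0.80],
]
p = [1.0, 2.0, 0.5]   # transmitter powers (made-up values)
n = len(G)

# A keeps the diagonal of G (wanted signal); B keeps everything else (interference).
A = [[G[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]
B = [[G[i][j] if i != j else 0.0 for j in range(n)] for i in range(n)]

def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

s = matvec(A, p)   # received signal powers
z = matvec(B, p)   # received interference powers (powers add incoherently)

def worst_interferer(G, i):
    """Index of the transmitter j != i with the largest gain into receiver i."""
    return max((j for j in range(len(G)) if j != i), key=lambda j: G[i][j])

# Zero-based index 2 is the third receiver.
worst_for_third_receiver = worst_interferer(G, 2)
```

Here the third receiver is most susceptible to the second transmitter (index 1), because 0.30 is the largest off-diagonal entry in the third row.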
The next one is from economics. It involves things like cost of production. Here you have a bunch of production inputs like materials, parts, and labor. You combine these to make a bunch of products. We'll let XJ be the price per unit of production input J. AIJ is gonna be the units of production input J that you need to manufacture one unit of product I. So that means if you go across a row of that matrix, it corresponds to a product.
If you go across the third row, it basically tells you what inputs you need, and how much of each, to make one unit of the third product. If there's a zero there, it says you don't need that. If it is a large entry, it says you need a lot of whatever that is. Could there be a negative number? Strangely, it depends. Generally speaking, no, but in fact, yes, you could have a negative number. What would it mean if A23 were equal to minus one? What could it mean? It's a byproduct. Exactly.
It says that when you make one unit of product two, not only will you not need input three, you will actually, as a byproduct of making that, generate one unit of input three. A lot of these things where normally you think of it as something that's positive, there usually is a really interesting interpretation of what happens when it's negative.
I'll show you an example where that's not the case. Transmitting negative powers: people have tried to do it, but so far, it just hasn't worked out. How about this one? What is it? You could have active cooling. They have these things and yes, you can pump one watt out. I just mention this because it's good to keep these in mind.
If YI is the production cost per unit of product I, you have Y equals AX. This is beautiful. This tells you how the cost of making your family of products depends on the vector of input prices. For example, I could ask a question like this. Among all the products you make, which is most sensitive to the price of energy? How would you answer it? You look for the energy column. Say energy is X3. You scan down the third column and you look for the biggest entry and you say that product is the one most sensitive to a change in the price of energy.
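A small sketch of the production cost example, assuming three inputs and three products with invented numbers, and assuming (as in the lecture's example) that energy is the third input. The minus one in the second row illustrates the byproduct case discussed above.

```python
# A[i][j] = units of input j needed to make one unit of product i (made-up values).
A = [
    [2.0, 1.0, 0.5],    # product 1
    [1.0, 0.0, -1.0],   # product 2: generates one unit of input 3 as a byproduct
    [0.5, 3.0, 4.0],    # product 3
]
x = [10.0, 4.0, 2.0]    # price per unit of each input

# Production cost per unit of each product: y = A x.
y = [sum(a * xj for a, xj in zip(row, x)) for row in A]

# Scan down the energy column (third column, zero-based index 2)
# and pick the product with the biggest entry.
energy = 2
most_sensitive_product = max(range(len(A)), key=lambda i: A[i][energy])
```

With these numbers the third product (index 2) is the most sensitive to the energy price, and the second product gets cheaper when energy gets more expensive.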
Let's move on to the next example. These do get a little bit boring, but it means that when we do talk about stuff that's abstract, at least it has meaning in all of these contexts. The next one is from networking. I have N flows in a network. A flow is something that passes from one node across an edge to another node to another node. These are going to have rates F1 through FN. It doesn't matter. These could be in bits per second. They could also be, for that matter, in liters per second. It can also be electricity. It could be anything. It could be goods. These could be transported by trucks. These could be packets.
It doesn't matter. They pass from a source node to a destination over some fixed route. The traffic on a link: some of the routes will go over each link. If no route goes over a link, then it's utterly unused. But every link will have some routes go over it. It may be one, two or 100. The total traffic on that link is the sum of the flows of the routes that pass over it. You can write this as exactly the same thing. If you have T, that's the traffic vector, and that is a vector whose index refers to a link, and it gives the traffic on all the links. It's a linear function of the flow rates. It looks like AF.
A is, in fact, a very simple matrix. People call it a zero-one matrix or something like that. It basically encodes which flows pass over which links. Now, I can ask you a question. A bottleneck is a link that has a large number of flows going over it. If I gave you the matrix A, how would you find a bottleneck? You look for a row, and a row corresponds to an output, in this case, a traffic. A row corresponds to the contribution to a link from all the flows. You look for a row that has a lot of ones in it. There we go. That's what you said, right? You're right. That's it.
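A minimal sketch of reading a zero-one routing matrix, using a made-up network with four links and four flows. Counting ones across a row finds a bottleneck link; counting ones down a column gives the length of a flow's route.

```python
# A[i][j] = 1 if flow j passes over link i, else 0 (hypothetical routing).
A = [
    [1, 0, 1, 0],   # link 1
    [1, 1, 1, 1],   # link 2
    [0, 1, 0, 0],   # link 3
    [0, 0, 1, 0],   # link 4
]
num_links, num_flows = len(A), len(A[0])

# Row sums: number of flows going over each link.
flows_per_link = [sum(row) for row in A]
bottleneck_link = max(range(num_links), key=lambda i: flows_per_link[i])

# Column sums: number of links on each flow's route.
links_per_flow = [sum(A[i][j] for i in range(num_links)) for j in range(num_flows)]
longest_route_flow = max(range(num_flows), key=lambda j: links_per_flow[j])
```

In this example the second link (index 1) carries all four flows, and the third flow (index 2) has the longest route.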
By the way, what's the meaning of a column with a lot of ones in it? It's a flow with a long route. That's the idea. Interestingly, we can do this. Let's say that each link has a delay on it. In other words, when you're going to go over a certain link, you actually arrive at that node. There's a queuing or transport delay. It doesn't matter. These are just applications to give you some context for all of this. Each link has a delay, and that's the delay it takes. It might be waiting to get queued up and transmitted. It might be the transmission delay. It doesn't matter. It's whatever the delay is across it.
So D1 through DM are the delays on each link. That's the link delay vector. The latency of a flow basically is the sum of the delays along the route. If these were packets, it literally tells you, if I inject a packet here, how long it will take before it emerges at the other end. That's the latency, and it's simply the sum of the delays along the route. It turns out this is very easy to write down. It's L equals A transpose D, where this is the transpose of A. You can work it out.
There are some very interesting things here. For example, if you work out what F transpose L is, F is a vector of flow rates. Let's just say it's in bits per second or packets per second. L is a vector of the same size. For each flow, it tells you the latency, which is basically the delay: when you inject a packet, how long it takes before it emerges at the destination. FI times LI is exactly the number of packets in transit for flow I. That's what it is. F transpose L, which is the sum over I of FI times LI, is exactly the total number of packets in the network. That's what this is. This you can write out lots of different ways.
You can also write it, by the way, as F transpose A transpose D, because L is A transpose D. Just doing some simple matrix arithmetic, I can rewrite this as (AF) transpose times D, but AF is the traffic, so it turns out it's the same as T transpose D. This is an inner product. Let's talk about what this is. What is TI times DI? It is exactly the number of packets in transit or waiting on link I. This is the sum this way. These are two ways to get the total number of packets in the network. You either sum over the flows or you sum over the links and you get the same thing.
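The identity above can be checked numerically. This is a small sketch with an invented routing matrix, flow rates, and link delays; it computes the traffic T = AF and the latency L = A transpose D, then compares F transpose L with T transpose D.

```python
import math

# Hypothetical network: 4 links (rows), 3 flows (columns).
A = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
f = [2.0, 1.0, 3.0]          # flow rates, e.g. packets per second
d = [0.5, 0.1, 0.2, 0.4]     # link delays, e.g. seconds

m, n = len(A), len(A[0])

# Traffic on each link: t = A f.
t = [sum(A[i][j] * f[j] for j in range(n)) for i in range(m)]

# Latency of each flow: l = A^T d (sum of delays along the route).
l = [sum(A[i][j] * d[i] for i in range(m)) for j in range(n)]

# Total packets in the network, summed over flows and summed over links.
packets_by_flow = sum(fj * lj for fj, lj in zip(f, l))   # f^T l
packets_by_link = sum(ti * di for ti, di in zip(t, d))   # t^T d

same_total = math.isclose(packets_by_flow, packets_by_link)
```

Both sums equal F transpose A transpose D, so they agree for any choice of A, F, and D, not just for these numbers.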
By the way, if there's any of this you didn't get, you should go back and make sure you believe it. Don't spend too much time on this, because I've had people come back to me later and say yeah, but just the way you were saying it, it sounded like it was very deep. Nothing we've said today is complicated or deep. If you think you don't get it, you do. I've had people come and say things like I understand everything, but I think there are subtleties I'm not getting, to which I'd respond there are no subtleties. What we covered today was trivial: interesting and important, but trivial.
Here's a generic source of linear mappings. It is linearization, which you've seen. There's another name for it. It's sometimes called calculus. Here it is. I have a function that accepts an N vector and returns an M vector. You say it's differentiable at a point. Whenever X is near X0, the function value is very near F of X0 plus DF of X0 times X minus X0. You have to be careful here. That is gonna be an M by N matrix. This is DF. That's the derivative of F evaluated at X0. Its entries are these partial derivatives. Very near is a technical term. If I did this, this is the definition of F being continuous.
It says basically if you're near one point and evaluate F, you're near the image of that point. That's the definition of continuous. The definition of differentiable is this. Very near, by the way, means that the error here is like the square of the error here. That's what the very near is. Very, very near means it goes like the cube of the error, by the way, and it goes on from there. You can read this informally or formally. Lots of contexts have different names for this. For example, in a circuit, X0 would describe the bias or operating point.
You'd have something like this: these would be the bias voltages, the bias inputs or whatever, and then the deviations would be called the small signal values. You've seen that. In aeronautics, you would have the so-called trim condition. The trim condition is that your thrust is at such and such a level. Your elevators are at this level and so on and so forth.
You're in level flight at 40,000 feet at such and such a speed. That's your trim condition. Delta X represents a small change from that, say a change in your elevator deflection. Delta Y would be a difference in, for example, the net moment or torque on the airframe. Different fields have different names for a base operating condition and then wiggling around it.
A lot of people introduce a notation like Delta Y is Y minus Y0. Delta X is X minus X0, and then you can write it this way, and this basically says the deviations in a response or output are a linear function of the deviations in the input. That's the generic example. You've seen this. That is literally calculus. The problem is all these stupid multivariable calculus classes make everything complicated by bringing up things like gradients and curls and things like that. This material was super useful in the late 19th century and maybe up until the 1930s.
It's not really much anymore. You need a few people to know all those things, but not really. The problem is that if you look back at your multivariable calculus, it's unbelievably simple. It's just DF. In fact, this is an approximation. Just by syntax: this is an N vector. On this side, what you're approximating is F of X, which is an M vector, so the only thing you could multiply an N vector by and get an M vector is an M by N matrix. So that's just an M by N matrix. For example, in the case of the gradient, it's actually a row vector, which is the right way to write it.
This is something like this. This is the derivative of F evaluated at X0. That's a matrix, and then this is its IJ entry. That's what the parsing is here. The left and right hand sides of this are numbers. I and J actually are the indices. I indexes the component of F. F returns an M vector, so F3 or F of X sub three or something represents the third component of it. J indexes into the input.
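A rough numerical sketch of linearization, assuming a made-up function from 2-vectors to 3-vectors. The derivative matrix is approximated by central differences, which is a stand-in chosen for this sketch rather than anything from the lecture, and `jacobian` and `linearization_error` are hypothetical helper names. The last lines check that halving the deviation cuts the approximation error by about a factor of four, which is the "very near" behavior described above.

```python
import math

def f(x):
    """A made-up differentiable map from R^2 to R^3."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2**2, math.exp(x2)]

def jacobian(f, x0, h=1e-6):
    """Central-difference estimate of Df(x0), an m-by-n matrix.

    Entry [i][j] approximates the partial derivative of f_i with respect to x_j.
    """
    m, n = len(f(x0)), len(x0)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x0), list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def linearization_error(f, x0, J, dx):
    """Norm of f(x0 + dx) - (f(x0) + J dx)."""
    x = [a + b for a, b in zip(x0, dx)]
    fx, f0 = f(x), f(x0)
    approx = [f0[i] + sum(J[i][j] * dx[j] for j in range(len(dx)))
              for i in range(len(f0))]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fx, approx)))

x0 = [1.0, 0.5]
J = jacobian(f, x0)                  # 3 rows (outputs) by 2 columns (inputs)

e_full = linearization_error(f, x0, J, [1e-2, 1e-2])
e_half = linearization_error(f, x0, J, [5e-3, 5e-3])
ratio = e_full / e_half              # close to 4 when the error is quadratic
```

Row I of the result corresponds to the Ith component of the output and column J to the Jth input, matching the output, input indexing convention.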
We'll look at a specific example of this. I won't do that. I'll just wrap up a little bit. The things we looked at again: this is the last day where you'll be subjected to me looking at stupid examples of Y equals AX. On the other hand, we didn't cover any actual material today. We just looked at a bunch of examples. Don't worry. The rest of the class is not going to go this way, but still, it's an important thing to do. I guess we'll wrap up next time. Remember, there is actually a section on Monday.
[End of Audio]
Duration: 81 minutes