**Instructor (Brad Osgood)**: I wonder how long I can keep this up. All right. So the first thing – any questions? Any comments? All right. The first thing I want to do is fix a little mistake I made at the end of last time. I was upset about it for days afterwards, so I just want to correct it. This was the central limit theorem, which I hope you will agree was a real treat. I mean, it’s just an amazing fact, and the way that it brings in convolution, the Fourier transform of the Gaussian and so on is really, I think, a real highlight of the course – to see those things emerge. The whole spooky nature of repeated convolutions approaching the Gaussian, I think, is just an amazing thing.

But I did screw it up at the end, so I just wanted to correct that. Everything was fine except for one final formula that I wrote, so I’m not going to redo it. I just want to point out where I wrote the wrong thing. It’s correct in the notes. I just wrote the wrong thing on the board. This was the setup – so X1 through XN are, as they say in the biz, independent and identically distributed random variables. You don’t have to write this down. I’m just recalling what the notation was. And P of little X is the distribution that goes along with all of them because they have the same distribution.

There’s a single function, P of X, which describes how each one of them is distributed. So it’s a distribution for each. And then I formed P of N of X, the distribution for the sum scaled by square root of N. There was an assumption we made on the Xs – a normalization, that is. We assumed they had mean zero and we assumed they had standard deviation, or variance, one, and then if you form the sum, the mean of the sum is zero but the standard deviation of the sum is the square root of N, so it’s scaled by the square root of N.

You divide by square root of N to have this sum, SN, I called it, have mean zero and standard deviation one, and P of N of X was the distribution for this. What we found was that the Fourier transform – here we go – the Fourier transform of P of N at S was the Fourier transform of P at S over the square root of N, raised to the Nth power. And then the trick of the analysis – the way the proof worked – was to compute the Fourier transform of P at S over the square root of N just using the definition of the Fourier transform. That uses a very sneaky thing of looking at the Taylor series expansion of the complex exponential, integrating the terms and so on, and it was really quite clever.

What I found was that the Fourier transform of P at S over the square root of N was, if N is large, approximately one minus two Pi squared S squared over N. That was fine. That was fine. And then I raised this to the Nth power, so the Fourier transform of P at S over the square root of N, to the Nth power, was then approximately this thing raised to the N: one minus two Pi squared S squared over N, to the N. And then, inexplicably, I had the wrong approximation for this in terms of an exponential – in terms of a power of E.

That is, this one minus two Pi squared S squared over N, raised to the Nth power, is approximately E to the minus two Pi squared S squared. I’m going to look that up again to make sure I have it right this time. Sure enough. All right. For some reason, I wrote two Pi squared over S squared last time, but it’s two Pi squared times S squared. That’s an elementary fact from calculus that you learned probably a long time ago when you were first learning about properties of the exponential function.
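As a quick numerical sketch of that elementary calculus fact – the test frequency s = 0.5 and the values of N below are arbitrary choices, not from the lecture:

```python
import numpy as np

def clt_approx(s, N):
    """The N-fold product (1 - 2*pi^2*s^2/N)**N from the proof sketch."""
    return (1.0 - 2.0 * np.pi**2 * s**2 / N) ** N

def gaussian_limit(s):
    """The limiting spectrum e^{-2*pi^2*s^2}, a Gaussian in s."""
    return np.exp(-2.0 * np.pi**2 * s**2)

s = 0.5  # arbitrary test frequency
errs = [abs(clt_approx(s, N) - gaussian_limit(s)) for N in (10, 100, 10_000)]
```

The error shrinks as N grows, which is exactly the limit (1 - a/N)^N → e^{-a} being invoked.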

And then from there you take the inverse Fourier transform and you get the result. You take the inverse Fourier transform of this Gaussian, E to the minus two Pi squared S squared, to get the final result of the central limit theorem – which I am afraid to write down, because I’m afraid I’m just going to make one more mistake, so I’m not gonna do it. There. I feel better now that I’ve corrected it. I’m not sure everybody else here feels better, but I feel better. Okay.

Now to get on to the day’s events. Brothers and sisters, today we have to come to a reckoning with our failings. Today we have to confront things we have not confronted but must. Now first, the good news, and there is always good news. The good news is the formulas we have derived, the applications we have done, are all correct. So relax. Nothing that we have done, strictly speaking, is wrong. However, much has been left unsaid and today, we have to say it. We have to confront the convergence of integrals. It is a sin to consider integrals that do not converge, and I’m afraid there are times when we have sinned.

But in the end, nothing that we have done is wrong, so relax. But much has been left unsaid. So it is time now. Are you ready? Are you ready? Okay. Now we need to have a better, more robust definition of the Fourier transform in order to consider – I think I gotta stop this. There’s only so much of this I can do. We need a more robust understanding of the Fourier transform. We need a more robust definition of the Fourier transform that will allow us, verily, to work with the signals that society needs to function – sines, cosines, even the function one – constant functions.

And the Fourier transform as we have defined it will not do the trick. A more robust definition – and that’s really what it amounts to, a more robust definition of the Fourier transform – to deal with common signals, ones for which the classical definition does not work or is not adequate. The issue is exactly the convergence of the integral or, if not the convergence of the integral for the function, then applying Fourier inversion. The classical definition, let me just say, will not do.

There are two issues, really. One is the definition of the Fourier transform itself – the convergence of the integral defining the Fourier transform – when I say convergence of the integral, I mean just some way of evaluating, some way of making the integral make sense – convergence of the integral defining the Fourier transform. And the second and just as important, certainly for applications – it doesn’t do you any good to take the Fourier transform if you can’t undo what you’ve done. That is to say, you want Fourier inversion to work, also. You want Fourier inversion to apply.

It shouldn’t be surprising that if there’s a problem with the definition of the Fourier transform, there may be a problem also with Fourier inversion, because the formulas are very similar. One involves the integral of E to the minus two Pi ISX, the other the integral of E to the plus two Pi ISX, but really, the issues of convergence, as far as that goes, are pretty much the same. Now there are two ways to approach this problem. This is a genuine problem that really has to be dealt with, and it has been dealt with in various ways over various periods of time.

Fourier analysis is a highly developed subject, and it has gone through various stages of development. There are two ways of doing this. In the early stages – and this is probably a fair way of characterizing it – it was done on sort of an ad hoc basis. That is, special techniques were developed for special problems that would explain away a special function or a special signal. That’s one way. Two ways of dealing with these problems: one is, so to speak, ad hoc special techniques. Got a particular problematic integral? Okay, let me just work with that integral.

Having trouble with convergence for a particular signal? Having trouble applying Fourier inversion in a particular case? Okay, let me deal with that by some trick, by some method that applies not generally but to that one problem that you’re having difficulty with, and then you can advance until the next problem comes up. This was okay. This was all that could be done, because it was all that the understanding of the time allowed. The second approach, and that’s the approach that we’re gonna take, actually, is to rework the foundations. That’s the second approach.

You have to rework the foundations to give, really, a different definition of the Fourier transform, one that applies more robustly, more equally, somehow, to all signals that come up all at once. You have to rework the foundations and the definition. This came much later. This came probably in the 1940s, and by that time, as I say, Fourier analysis was highly developed, both in its theory and its applications. It was a highly developed subject. This gets into mathematically deep waters, which we will not completely tread, but I do want to give you the confidence enough to swim in them a little bit, just so you know what the issues are and know how they were dealt with.

Furthermore, the other thing that I want you to get out of this is a certain confidence that you can compute. The method wouldn’t be any good, really, outside of mathematical departments and circles if it didn’t offer a way of computing with confidence when that couldn’t be done so much before. You can actually use this method to find specific Fourier transforms in a way that’s airtight and is actually quite nice. We’ll do that. Don’t fret over mathematical details. What I’m going to do and I hope to do is take you to the brink of the abyss and then back off.

Say what the issues are, say what the problems are but don’t deal with the mathematical proofs and don’t deal with all the fundamentals that have to be analyzed, because it’s quite daunting. There’s a huge amount that has to be done. We don’t need that. But I think what you can see is the general structure, the general overall direction to the argument. Let me give you an example to show where the issues are. Let’s go back to the very first example we considered, which already shows the problem. I may have mentioned this before, actually, but now I want to say it a little bit more sternly somehow.

The problem is evident in the very first example. The very first example we had was the rectangle function and its Fourier transform. So F of T is the rectangle function, Pi of T. There is no problem at all taking the Fourier transform of that. That’s not the issue. The integral converges. It’s a very simple function. The Fourier transform as classically defined is fine. The Fourier transform of the rectangle function is the sinc function: sinc of S is sine of Pi S divided by Pi S. That’s fine. The problem is inversion. The problem is going the other way around. We did this by duality.
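The unproblematic half of this example is easy to confirm numerically. A sketch – the quadrature parameters are arbitrary choices – approximating the defining integral of the Fourier transform of the rectangle function and comparing it against sinc:

```python
import numpy as np

def ft_rect(s, n=100_001):
    """Trapezoid-rule approximation of the Fourier transform of the
    rectangle function: integral of e^{-2*pi*i*s*t} dt over [-1/2, 1/2]."""
    t = np.linspace(-0.5, 0.5, n)
    dt = t[1] - t[0]
    w = np.full(n, dt)
    w[0] = w[-1] = dt / 2  # trapezoid end weights
    return np.sum(np.exp(-2j * np.pi * s * t) * w)
```

The values agree with sin(pi s)/(pi s), i.e. NumPy's `np.sinc`, as the lecture says.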

There are really two things – there’s either Fourier inversion or finding the Fourier transform of the sinc function. The problem is concluding that the inverse Fourier transform of the sinc function is the rectangle function. We did this via what we called duality, and that was all fine. Nothing we did was wrong, but it wasn’t fully justified. By duality – what is the same thing – the Fourier transform of the sinc function is the rectangle function. Same kind of issue. The same issue is involved in showing the one as in showing the other. The problem is writing down that integral.

The Fourier transform of the sinc function, or the inverse Fourier transform of the sinc function – let me just stick with the problem of Fourier inversion. The inverse Fourier transform of the sinc function is the integral from minus infinity to infinity of E to the plus two Pi IST times sinc of S. Let me write out the sinc: sine of Pi S divided by Pi S. And I’m integrating with respect to S, so what results is supposed to be a function of T. As a matter of fact, what results is supposed to be the rectangle function. Now, I said at the time: you want to try to evaluate that integral? Good luck, because it’s not so easy to do.

So again, according to what we expect to be true, this integral turns out to be one when the absolute value of T is less than a half, and it turns out to be zero when the absolute value of T is greater than a half. It turns out to be the rect function. Now even this actually is not quite correct because – well, one thing at a time, here. In fact, by ad hoc techniques, special techniques – and there are actually a variety of techniques that can be brought to bear on this, some more complicated than others – you can actually evaluate this integral. But it’s not obvious. No simple integration by parts, no simple integration by substitution is going to get you this sort of result.

This result is almost okay but requires special techniques. In fact, there’s a problem at the end points. There’s a problem at plus and minus one half with that integral. It’s equal to one half, I think, when T is equal to plus or minus one half, so that’s another reason for defining the rectangle function that way, so it doesn’t jump from zero to one but has the value one half at the discontinuities. Never mind that. Special technique. So in fact, a special problem at the end points – end points T equals plus or minus one half. But anyway, the point is that it can be dealt with.
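You can see all of this numerically by truncating the inversion integral. A sketch – the truncation limit L and the grid size are arbitrary choices, and convergence in L is slow because sinc only decays like 1/s:

```python
import numpy as np

def inv_ft_sinc(t, L=200.0, n=400_001):
    """Truncated inverse Fourier transform of sinc:
    integral of e^{+2*pi*i*s*t} * sinc(s) ds over [-L, L].
    Only the real (cosine) part survives, by symmetry."""
    s = np.linspace(-L, L, n)
    ds = s[1] - s[0]
    # np.sinc(s) is sin(pi*s)/(pi*s), exactly the lecture's sinc
    integrand = np.cos(2 * np.pi * s * t) * np.sinc(s)
    return float(np.sum(integrand) * ds)
```

Inside the rectangle the truncated integral is close to one, outside it is close to zero, and at the endpoint t = 1/2 it is close to one half, just as claimed.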

It’s a little bit disconcerting that the simplest example in the entire subject already poses this problem. The most basic function already requires you to do special arguments just so you know the Fourier inversion works. That’s a little bit irritating. Oh, Jesus. Yes. There. Good. Don’t do that again. Same thing because it’s an even function. There. Now, let me take a second example. There’s a problem where it’s fine to compute the Fourier transform, but the problem is with Fourier inversion. Computing the Fourier transform of the rectangle function gives you a sinc function. That’s fine.

But Fourier inversion does not apply directly without some extra work. As a second example of troubles, there are very simple functions for which you can’t even get started. I believe I may have mentioned these before, but let me say them again now in this context of repentance and trying to lead a more virtuous life. A second example – consider F of T equals one, the constant function. Then there is no way to make classical sense of the integral defining its Fourier transform: the integral from minus infinity to infinity of E to the minus two Pi IST times one DT.

There’s no way in which this integral can be made to make sense, period. It just won’t work. I know you know that you write down this integral and you write, oh, that’s just the delta function. That comes later, and that comes later really only as a convention because to write down this integral is to sin in the sense of expecting this to do anything. This integral does not make sense – can’t be made to make sense. And slightly more generally but equally importantly, other signals do not have classical Fourier transforms in the sense that the integral defines something – the integral is well defined.

For instance, sine and cosine have the same sort of problem. Likewise for F of T equals the sine of two Pi T, or G of T equals the cosine of two Pi T – there’s no way to make sense of the integral from minus infinity to infinity defining the classical Fourier transform. You have E to the minus two Pi IST times, say, sine of two Pi T, DT. The integral oscillates. The thing doesn’t die out as you’re tending to infinity. Nothing good is gonna happen here. No secret combination of cancellations, no conspiracy of plusses and minuses is going to make this thing work. It just won’t work.
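You can watch the failure happen. A sketch – looking at the simplest slice of the would-be transform of sin(2πt), the integral over [0, T], whose closed form is (1 − cos(2πT))/(2π); the partial integrals just oscillate forever instead of settling down:

```python
import numpy as np

def partial_integral(T, samples_per_unit=1000):
    """Riemann-sum approximation of the integral of sin(2*pi*t) dt
    over [0, T] -- the partial integrals of a would-be Fourier
    transform integrand that never dies out."""
    t = np.linspace(0.0, T, int(T * samples_per_unit) + 1)
    dt = t[1] - t[0]
    return float(np.sum(np.sin(2 * np.pi * t)) * dt)

# Even at very large T, the value depends on the fractional part of T:
vals = [partial_integral(T) for T in (1000.0, 1000.25, 1000.5)]
```

The three values are roughly 0, 1/(2π), and 1/π: no limit as T tends to infinity, so the improper integral does not converge.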

These are the signals that society needs. What is to be done? You remember that the situation is somewhat analogous, philosophically, spiritually, heavenly. The situation is somewhat similar to what we confronted with Fourier series. That is, Fourier series was supposed to be a method that applied in great generality. You wanted to be able to write down a very general periodic function in terms of its Fourier series and then compute with it. The problem was those series didn’t converge in any sense. You couldn’t make those series converge, or maybe you had to make a very special argument to show that they converge and so on.

And then finally – the subject didn’t so much collapse of its own weight, but a fundamental reworking had to be done. In order to make Fourier series apply as generally as you would like it to apply, you had to really change the notion of convergence. You had to rework the foundations. So it’s analogous. I don’t want to make the analogy too sharp or too precise, but in some sense it is analogous to Fourier series. In the foundations, we needed a new definition, a new conception of convergence. There’s a lot that went on with that.

You had to sort of abandon – if you wanted to advance the subject and if you wanted to talk about things in greatest generality, you couldn’t talk about pointwise convergence. You had to talk about convergence in the mean and so on. You had to talk about the square integrable functions and all the rest of that jazz, quite far removed from the initial development of the subject. It’s what had to be done, and it ultimately led to a very fine theory and a very robust and applicable theory. Now we’re gonna have to do a similar thing here. Again, Fourier analysis – Fourier series and then the work on the Fourier transform – is a very highly developed subject.

You say to yourself: I’m forced, somehow, because the thing is collapsing – too many ad hoc arguments, too many one-off examples that have to be explained – so I really have to rework the foundations. But it’s such a highly developed subject that you are confronting the problem that any scientist or engineer or mathematician confronts in a situation like that: nature provides you with so many phenomena that have been so thoroughly studied – how do you choose the right ones upon which to base your theory?

How do you choose a relatively small number of phenomena upon which or based on which the others can be explained? So much has been done in Fourier analysis and applications of Fourier analysis and everything. Where do you start? How to choose the basic phenomena which explains the others which can be used to explain others – all others. That’s the problem. It’s a problem that confronts anybody doing experiments. Think of this as viewing a series of experiments. What are you measuring? Is that significant? Is it not significant? What do I pay attention to? What do I ignore?

The results are there in front of me. What do I do? Which ones do I choose? I want to explain how this was resolved. It was resolved quite brilliantly and quite decisively in the 40s. The first step, and I’ll say a little bit more about the history of this as we go along. The first step in understanding this is to somehow back away from the troubles and concentrate on what works well. That’s the first step. So back away from the problems. What is the best situation? Identify the best situation or what characterizes the best situation. In this case, what that means is you want to identify a class of signals for which everything you want is true.

You want to identify – when I say what is the best situation, I mean identify the best signals, the best class of functions for Fourier analysis – let me just say Fourier transforms. Now the requirements are these. What we want is two properties. Let’s call this class of signals S. So think of S as standing for signals, but actually S is gonna stand for Schwartz, the person who isolated this particular class of functions as the best one upon which to build a theory. So then we want – one: if F of T is in S, then the Fourier transform of F is defined and it’s also in S. The Fourier transform of F is defined classically by the integral.

The Fourier transform of F is also of the same class. Now that already rules out, for example – that already tells us – well, I haven’t defined S yet. You can imagine that it tells us that the rectangle function shouldn’t be in that class and the sine and the cosine shouldn’t be in that class or the constant one shouldn’t be in that class because for the rectangle function, there’s obviously a problem – the Fourier transform of the rectangle function is the sinc function. The sinc function is not as good as the rectangle function in some sense because I can’t plug it back into the formula for the Fourier transform or the inverse Fourier transform.

Likewise with sine and cosine, they’re not good, because the Fourier transform of the sine and cosine isn’t defined, or the Fourier transform of the constant function one isn’t defined. This already rules out the exact signals that I want to deal with. It doesn’t seem like the best class to me, but in fact, it is. It’s gonna be a restricted class, but it’s gonna serve as the foundation for a broader definition. That’s the first property: if a function is in this class S, then so is its Fourier transform. The second property is that Fourier inversion works.

Two. Fourier inversion holds, as given by the classical formula. And it only makes sense to talk about two in light of one, because one says if you take the Fourier transform, you wind up back in the class, so you can take the Fourier transform of this function again, or the inverse Fourier transform of this function, because you haven’t left the class of functions for which you can define the integral. So that is to say, the inverse Fourier transform of the Fourier transform of F is equal to F, and the same way going the other way: the Fourier transform of the inverse Fourier transform of F is F as well.

That’s what we want. This is the best class – if we can find a class of functions S that satisfies these two properties, then that’s the best class of signals that we can hope for for the classical theory. For the classical theory, this is the best thing you can hope for. There’s one other property that comes up that always comes up in the discussion, and I want to just state it now. It’s not a requirement of the theory, but it comes up as sort of part of the theory and is actually extremely useful for the further developments.

There’s a further property – let me just say that – a further property, an identity that I haven’t written down yet. I’m going to write it down now, and we’ll make a certain amount of use of it as we go along. It’s what’s sometimes called Parseval’s identity for Fourier transforms. We had Parseval’s identity, also called Rayleigh’s identity, for Fourier series, and there’s a corresponding one for the Fourier transform. Parseval’s identity is also sometimes called Plancherel’s formula. It’s even sometimes called Rayleigh’s identity for Fourier transforms, and it says this.

It says that the integral from minus infinity to infinity of the square of the magnitude of the Fourier transform is the same thing as the integral from minus infinity to infinity of the square of the magnitude of the function. This is not a requirement of the theory. This is not a requirement of the best class of functions to consider, but it tags along with it. It’s somehow almost inseparable from it. I’m going to derive this for you a little later on today. This has a physical interpretation in various fields, depending on where it’s applied.

This sort of identity comes up a lot in distinct but related fields, and it says something like you can calculate the energy. You think of the integral of the square of the function as total energy, so you can calculate the energy in the frequency domain or you can calculate the energy in the time domain. Or sometimes you think of this as spectral power, total spectral power. It can be computed in the spectrum, in frequency, or considered in time and so on. There are various interpretations. There are various terms that go along with this depending on who you’re talking to. Mathematically, it’s always known as either Parseval’s identity or Rayleigh’s identity – something like that.

It’s analogous to what we had for Fourier series. For Fourier series, we had the sum of the squares of the Fourier coefficients. That’s what we get on the left hand side. That’s the spectral information, and then on the right hand side we had the integral of the square of the function, but we only had it over one period. Here, it’s integrated from minus infinity to infinity because the signals are supposed to live on minus infinity to infinity. I’ll come back to that a little later on. As I said, that’s not a requirement of defining the best class of signals, but it always comes up. Now, here’s the question – it’s not such a stretch to say this is what we want.

These are the functions that we want. These are the two properties that we really want, our best functions to have. The question is what is such a class and is there a simple enough way of describing it that’s gonna be useful that I can actually work with and how do you find it? How do you find such a class? Again, imagine the difficulties. Imagine you are a scientist or a mathematician or an engineer confronting this problem and as I say, the field is highly developed. Nature has provided you with so many examples, so many different kinds of signals and so on. How do you choose?

What properties of the Fourier transform, what properties of the signals do you choose upon which to base your theory? It’s hard. That’s a hard intellectual challenge. It was met by Laurent Schwartz, who was a very famous French mathematician who taught for many years at the Ecole Polytechnique in Paris. So how to choose, how to define S? That’s the question. How do you isolate a small set of phenomena along all possible phenomena that nature has provided in order to build a reasonable theory? This is solved by an absolutely brilliant and inspired work. Genius does what genius does. Who can account for it?

This is solved by Laurent Schwartz back in the 40s, and I think it’s significant, actually, that Schwartz taught at the Ecole Polytechnique. The Ecole Polytechnique is the famous technical school of engineering in France. It was founded in the Napoleonic era to train technically competent Frenchmen – probably not so many French women at the time – to become engineers. And Laurent Schwartz taught there, and Laurent Schwartz, although a mathematician by training and by published work, was thoroughly grounded in and thoroughly familiar with the applications.

And as a matter of fact, in his monumental book on the subject, which is called the Théorie des Distributions – the theory of distributions; these generalized functions that we’re gonna talk about are often called distributions, not to be confused with probability distributions. Anyway, in his book on this – I think in the preface, the first sentence of the preface or something like that – he credits not mathematicians, not even scientists, but he credits Oliver Heaviside with really laying the basis for the whole theory. And what he’s thinking of there is what we now call delta functions and things like that – sort of what then was called operational calculus.

So it was in a very applied context that Schwartz was thoroughly familiar with from which he extracted his theory of what was a more modern foundation of the Fourier transform. It’s absolutely brilliant and inspired. It’s very complicated. If you go through all the details, it’s very complicated. We’re not gonna do that. But if you allow yourself to sort of go along for the ride, it’s actually not so complicated and it’s a pleasure to see it work, and it’s something that you can compute with. I keep coming back to that because I want you to believe me and I want you to realize that I’m not just doing this for the hell of it.

That is, we can actually compute in a very convincing way the Fourier transforms that you really want – sines and cosines, the delta function, constant function, all that stuff. It really comes out of the theory in a beautiful way, but you do have to accept a certain amount going into it. You have to accept a certain change of point of view and change in perspective. I’m going to give you the property right now, because it really is fairly simple to state. I’ll tell you sort of where it comes from and why it comes from, but I’m not gonna go through the details even at that, although more details are provided in the notes.

Here’s the definition. S – so S is now not just standing for signals but standing for Schwartz – is the best class of functions for Fourier analysis. It is the so-called class of rapidly decreasing functions. And it’s defined by two relatively straightforward properties. First of all, any function in the class is infinitely differentiable. You don’t have to worry about smoothness. It has arbitrarily high smoothness. You want to take 1,000 derivatives? Fine. You want to take a million derivatives? Fine. You want to take just one or two derivatives? Fine. Everything is just fine.

So F of X is infinitely differentiable. That’s one property. That doesn’t talk about the rapid decrease. The second property is the rapid decrease, and what it says is that for any integers M and N greater than or equal to zero – let me write the definition down and then tell you what it means – X to the N times the Mth derivative of F of X tends to zero as X tends to plus or minus infinity. I don’t know whether I switched the roles of M and N here, on the board, compared with what I wrote in the notes, but it doesn’t matter. What this says in words is not hard.

In words, this says that any derivative of F tends to zero faster than any power of X. M and N are independent here. The third derivative tends to zero faster than X to the 100. Also faster than X. Also faster than X to the one million. The 50th derivative of F tends to zero faster than X to the five millionth or whatever. M and N are independent. Once again in words – and that’s the way to think of it – any derivative of F tends to zero faster than any power of X. And independently so – the order of the derivative and the power. Okay. Where this comes from I’ll explain in just a second.

First question you might ask is are there any such functions? I mean, is it terribly restrictive? This certainly does rule out sines and cosines and constant functions. Sorry, tends to zero faster than any power of X. This already seems to rule out the functions we want, or a lot of functions we want. So it’s easy to write down, but it seems highly restrictive. And again, it’s a little bit surprising that one could build such a powerful theory, such a general theory on such a restricted class. But you can, and that’s the story that’s yet to be told. Are there any such functions? Well, I’m happy to write down at least one of them.

A Gaussian is such a function. Any Gaussian – I’ll take the Gaussian that we’ve worked with so many times. It doesn’t matter how it’s normalized. Any such Gaussian will work, actually. As it turns out, of course the function tends to zero as X tends to plus or minus infinity, but in fact, the exponential is going down so fast that any derivative of the exponential is also gonna tend to zero faster than any power of X. You know what happens here: if I start taking derivatives of E to the minus Pi X squared, I’m gonna get powers of X out front, but they’re always gonna be multiplied by this E to the minus Pi X squared, which is gonna kill them off.

And in fact, any derivative – any power of X times any derivative of this function will tend to zero. Check this out. Do some calculations. Do some experiments. Do some graphs. Work it out. Just try it. I’m not gonna prove that for you, but it’s true. So that’s – it’s not a vacuous class. At least there are functions in the class. That’s good news. Another important class of functions, as it turns out, are the smooth functions, which vanish identically as you get close to infinity. Not just tend to zero, but actually which are equal to zero beyond a certain finite interval.
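Here is one such experiment. A sketch – the choices of the fifth power and the second derivative are arbitrary; the second derivative of e^{-pi x^2} is worked out by hand below:

```python
import numpy as np

def g(x):
    """The Gaussian e^{-pi x^2}."""
    return np.exp(-np.pi * x**2)

def g2(x):
    """Its second derivative, computed by hand:
    (4*pi^2*x^2 - 2*pi) * e^{-pi x^2}."""
    return (4 * np.pi**2 * x**2 - 2 * np.pi) * g(x)

# x^5 times the second derivative: the polynomial factor out front
# loses the race against the Gaussian decay as x grows.
xs = np.array([1.0, 2.0, 3.0, 4.0])
vals = np.abs(xs**5 * g2(xs))
```

The values shrink toward zero very quickly, even though x^5 is blowing up: rapid decrease in action.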

So C, another class of functions – these are the infinitely differentiable functions which are equal to zero, identically zero, outside some finite interval. Different intervals for different functions. I’m not saying all the functions vanish outside of minus one to one or minus ten to ten or whatever, but a function is in this class if it is infinitely differentiable and it goes, as they say in the biz, smoothly down to zero. So the graph is something like this. It can do whatever it does, but it does it smoothly, and it tends to zero smoothly. It’s identically zero beyond a certain point, outside some interval from A to B.

And again, different intervals for different functions. I use the letter C because the mathematical term for this is functions of compact support. The support of a function is the place where the function is nonzero, and here that’s contained in a bounded, closed interval, which is called in the mathematical biz a compact set or compact interval, and so these functions are said to have compact support. You will see that terminology certainly in the mathematics literature, and you also see it in the engineering literature. It’s pretty widely adopted. That’s what the C stands for: compact support.
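One concrete member of this class is the classic smooth bump function; it is not written down in the lecture, but it is a standard example of an infinitely differentiable function supported on a finite interval. A minimal sketch:

```python
import math

def bump(x):
    """A classic C-infinity function of compact support, nonzero only on (-1, 1).

    bump(x) = exp(-1 / (1 - x^2)) for |x| < 1, and identically 0 outside.
    All derivatives go smoothly down to zero at x = +/-1.
    """
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

# Identically zero beyond the interval from -1 to 1 ...
assert bump(1.0) == 0.0 and bump(2.5) == 0.0
# ... but strictly positive inside it.
assert bump(0.0) == math.exp(-1.0) and bump(0.5) > 0.0
```

Different functions in the class have different intervals, as the lecture says; this one happens to vanish outside minus one to one.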

And there are plenty of functions like this, too. It’s not hard to come up with functions like that. So at least the class S is populated. There are functions which are rapidly decreasing. It’s not a vacuous theory. Where does it come from? Why these properties among all possible properties and phenomena that nature has given you? Fourier analysis is a highly developed subject. There are so many things to choose from. Why this? How did this work? Like I say, genius – who can account for it?

I can tell you what it’s based on. It’s based on the relationship between smoothness and rate of decrease – it’s based on the derivative formula. That turned out to be sort of the fundamental property of Fourier transforms upon which this theory could be erected and the other things could be understood. The connection here comes through the derivative theorem. There are various versions of the derivative theorem, but the one that we looked at before was the Fourier transform of the Mth derivative – I used M before, so let me use M here – evaluated at S is two Pi IS, the quantity to the Mth power, times the Fourier transform of F at S.

There’s actually a similar formula for the Fourier transform of the derivative of the Fourier transform and so on. I won’t write that down. Again, these are in the notes. We used this when we talked about the heat equation. I said this was actually a fundamental property of Fourier transforms. It turns differentiation into multiplication. The derivative formula, viewed from a high enough elevation, relates smoothness – that is to say, the degree of differentiability – to multiplication by powers of S.
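The derivative theorem can be seen numerically with a sampled Gaussian and a discrete Fourier transform; the grid and the convention of scaling the DFT by the sample spacing (so it approximates the continuous transform with kernel E to the minus two Pi ISX) are my own choices for the experiment.

```python
import numpy as np

# Sample the Gaussian f(x) = e^{-pi x^2} and its derivative f'(x) = -2 pi x f(x)
# on a grid wide and fine enough that truncation and aliasing are negligible.
dx = 1.0 / 64
x = np.arange(-8, 8, dx)
f = np.exp(-np.pi * x**2)
fprime = -2 * np.pi * x * f

# A DFT scaled by dx approximates the continuous Fourier transform
# F(s) = integral of f(x) e^{-2 pi i s x} dx; ifftshift puts x = 0 first.
s = np.fft.fftfreq(x.size, d=dx)
F_f = np.fft.fft(np.fft.ifftshift(f)) * dx
F_fprime = np.fft.fft(np.fft.ifftshift(fprime)) * dx

# Derivative theorem (M = 1): the transform of f' is 2 pi i s times the
# transform of f, to within numerical error.
assert np.max(np.abs(F_fprime - 2j * np.pi * s * F_f)) < 1e-8
```

Differentiation on one side has become plain multiplication by two Pi IS on the other, which is the whole point of the formula.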

So saying something like this quantity tending to zero – meaning a power of S times, in this case, the Fourier transform tending to zero – says something about how differentiable the function is. It’s a relationship between differentiability and rates of decrease as measured in terms of powers of S. I’m not really gonna say anything more about it than that. That plus about a 250 IQ will lead you to develop a fantastic theory. But anything short of a 250 IQ probably will leave you short.

But this turned out to be it – the derivative theorem, and the relationship it explains between differentiability, the degree of smoothness, and the rate of decay, turned out to be the crucial thing upon which this whole theory hinged. So it was sort of the connection to differential equations, and actually, again, that comes out a lot in the applications. We didn’t do a lot of this, but in Heaviside’s work, for example, where he introduced the so-called operational calculus, delta functions and things like that, it was often in the context of differential equations – of solving differential equations – and he would work on solutions of differential equations by taking transforms.

Differentiation turns into multiplication under many of these transforms. You’ve probably seen, when you were a kid, the Laplace transform. It has the same sort of property. Other integral transforms have a similar kind of property. Again, it was one of the phenomena that was out there, but why this phenomenon and not other phenomena? That’s what it took a genius to see.
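Here is a small numerical peek at the same phenomenon for the Laplace transform, whose derivative rule is L of F prime at S equals S times L of F at S minus F of zero. The test function E to the minus T squared, the grid, and the trapezoid-rule approximation are my own choices for the experiment.

```python
import numpy as np

# Laplace transform L{f}(s) = integral_0^inf f(t) e^{-st} dt, approximated by
# the trapezoid rule on a fine finite grid (the tails beyond t = 30 are tiny).
t = np.linspace(0.0, 30.0, 300001)
f = np.exp(-t**2)
fprime = -2 * t * np.exp(-t**2)

def laplace(values, s):
    y = values * np.exp(-s * t)
    return float(np.sum(y[:-1] + y[1:]) * (t[1] - t[0]) / 2.0)

# Differentiation in t becomes multiplication by s, plus a boundary term:
# L{f'}(s) = s * L{f}(s) - f(0).
s = 2.0
assert abs(laplace(fprime, s) - (s * laplace(f, s) - f[0])) < 1e-6
```

The boundary term F of zero is the one cosmetic difference from the Fourier version; the multiplication-by-S structure is the same.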

I want to do one more thing today, and then I’m gonna show you next time how we’re gonna use this and how we’re gonna proceed from this very restricted class of very smooth functions with very good properties to functions which are very wild and things that are not functions at all, like deltas and things like that and how we’re gonna use this to actually define the Fourier transform for objects which are extremely general. But let me take this opportunity to finish up today with a derivation of Parseval’s identity.

It’s off the direct trajectory that we’re talking about now, but it comes up often enough, and it’s somehow always allied to the theory, that I think it’s worthwhile just going through the derivation. So, Parseval’s identity for functions in the class S – for functions for which there is no issue of convergence or anything. Parseval’s identity says the integral from minus infinity to infinity of the Fourier transform of F at S, magnitude squared, DS is equal to the integral from minus infinity to infinity of F of T, magnitude squared, DT.
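Parseval’s identity can also be checked numerically before deriving it; this sketch uses the Gaussian again, with the same DFT-scaled-by-dx approximation of the continuous transform, all of which are my own choices rather than anything written in the lecture.

```python
import numpy as np

# Numerical check of Parseval's identity for the Gaussian f(x) = e^{-pi x^2}.
dx = 1.0 / 64
x = np.arange(-8, 8, dx)
f = np.exp(-np.pi * x**2)

# A DFT scaled by dx approximates the continuous transform; the frequency
# grid it lives on has spacing ds = 1 / (N * dx).
F = np.fft.fft(np.fft.ifftshift(f)) * dx
ds = 1.0 / (x.size * dx)

energy_x = np.sum(np.abs(f)**2) * dx   # approximates integral of |f(x)|^2 dx
energy_s = np.sum(np.abs(F)**2) * ds   # approximates integral of |F(s)|^2 ds

# The two energies agree, as Parseval's identity promises.
assert abs(energy_x - energy_s) < 1e-10
```

The agreement here is essentially exact because the DFT satisfies a discrete Parseval identity of its own; for a rapidly decreasing function both sums are also excellent approximations of the continuous integrals.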

And by the way, for functions in the class S, there is no question about convergence of these integrals, because the functions are dying off so fast that the integral of F squared converges. And furthermore, because the class has the property that if you take the Fourier transform you stay within the class, this integral also exists. I should have said that, actually. Part of the point here is that not only does the function satisfy the property of being rapidly decreasing, its Fourier transform does, too, and that’s where this relationship also comes in.

Sorry, I should have been a little bit more explicit about that. We’ll come back to it. I’m actually gonna prove a slightly more general version of this, and it’ll take me but a moment. Actually, it’s quite a straightforward derivation. The advantage is that everything is justified. All the manipulation I’m gonna do of integrals is completely justified. So let me prove a slightly more general one, namely that the integral from minus infinity to infinity of the Fourier transform of F at S times the Fourier transform of G at S bar, DS – an inner product of the Fourier transforms – is the same as the inner product of the functions: the integral from minus infinity to infinity of F of T times G of T bar DT.

So I allow for complex-valued functions here, and of course the Fourier transform generally is complex-valued. So how does this follow? Well, let me use X here instead of T, not that it matters. I’m gonna invoke Fourier inversion. That is, to prove this, I’m gonna write G of X as the inverse Fourier transform of the Fourier transform. I’m gonna write G of X as the integral from minus infinity to infinity of E to the plus two Pi ISX times the Fourier transform of G at S, DS. I can do that because I can. In this case, I can.

If I assume that F and G are in the class S, Fourier inversion works. Everything is fine. Okay. Then what about the integral from minus infinity to infinity of F of X times G of X bar DX? Well, G of X bar – sorry, let me do it over here. G of X bar is the integral from minus infinity to infinity – I take the complex conjugate of the inside.

So that’ll be E to the minus two Pi ISX times the complex conjugate of the Fourier transform of G at S, DS. So I take F of X times G of X bar, integrated from minus infinity to infinity: that’s the integral from minus infinity to infinity of F of X times the integral from minus infinity to infinity of E to the minus two Pi ISX times the Fourier transform of G at S bar, DS – and then the result is integrated with respect to X. Now, let the rigor police come through the door.

I don’t give a damn, because everything here converges in every possible good sense. I can interchange the limits of integration. I can whip them suckers around with absolutely no fear of retribution. That is to say, brothers and sisters, that this is equal to the integral from minus infinity to infinity, the integral from minus infinity to infinity – let’s put them all together and then take them all apart – of E to the minus two Pi ISX times F of X times the Fourier transform of G at S bar, DS DX. That’s putting all the integrals together. Now I take the integrals apart and turn that back into iterated single integrals, and I can do that because everything converges in every possible good sense.

You don’t scare me, you rigor police, you. So it’s the integral from minus infinity to infinity of the Fourier transform of G at S bar times the integral from minus infinity to infinity of E to the minus two Pi ISX F of X DX, the result integrated with respect to S. I swapped the order of integration. You see, what’s inside that inner integral is nothing but the Fourier transform of F. So this is equal to the integral from minus infinity to infinity of the Fourier transform of G at S bar times the Fourier transform of F at S, DS, and I am done. Where did we start? Where did we finish?
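The general identity just derived – the inner product of two functions equals the inner product of their Fourier transforms – can also be checked numerically. Both test functions below (a Gaussian and a shifted, modulated Gaussian) and the DFT normalization are my own choices for the experiment.

```python
import numpy as np

# Check the general Parseval identity <f, g> = <f^hat, g^hat> for two
# rapidly decreasing functions, one of them genuinely complex-valued.
dx = 1.0 / 64
x = np.arange(-8, 8, dx)
f = np.exp(-np.pi * x**2)
g = np.exp(-np.pi * (x - 1)**2) * np.exp(2j * np.pi * 3 * x)  # shifted, modulated

# DFTs scaled by dx approximate the continuous transforms on a frequency
# grid with spacing ds = 1 / (N * dx).
F = np.fft.fft(np.fft.ifftshift(f)) * dx
G = np.fft.fft(np.fft.ifftshift(g)) * dx
ds = 1.0 / (x.size * dx)

inner_x = np.sum(f * np.conj(g)) * dx   # approximates integral of f(x) g(x)-bar dx
inner_s = np.sum(F * np.conj(G)) * ds   # approximates integral of F(s) G(s)-bar ds
assert abs(inner_x - inner_s) < 1e-10
```

Note the complex conjugate on G in both inner products, exactly as in the derivation; with a real G the conjugate would be invisible, so the modulated example shows it actually matters.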

We started with the integral of the inner product – the integral of the Fourier transform of F times the Fourier transform of G – no, I didn’t. Here’s where I started. I started with – there I started! Okay. Up there. I started with the integral of F times G bar and then ended with the integral of the Fourier transform of F times the Fourier transform of G bar and that is what I wanted to get.

Ain’t life good? And it all works and it’s all perfectly justified because the class S is the perfect class for all Fourier analysis, and next time you’re gonna see how we get from what seems to be a highly restricted class to all the functions that society needs to operate. All that and more, coming up.

[End of Audio]

Duration: 51 minutes