ProgrammingParadigms-Lecture06

Instructor (Jerry Cain):Hey, everyone. Welcome. I have one very short handout for you today. It is the handout that has two problems that weíll be going over during tomorrow afternoonís discussion section. Remember, we donít have it at 3:15; that was just last week to accommodate what I assumed would be a large audience. Itís at 4:15. Itís actually permanently in the room down the hall, in Gates B03. So Iím not teaching the section, Samaya is, who I think is here, who is handing out the handout. So look for that guy tomorrow at 4:15.

Both of these are all the exam problems, so theyíre certainly good problems to understand the answers to. And if youíre not gonna watch the section or attend, just make sure you at least read the handouts. You may very well be able to do the problems yourself, and if so then thatís fine, but if not, make sure you watch the discussion section at some point.

When I left you on Friday, I was a quarter way, a third a way through the implementation of an int-specific stack. So thereís nothing technically cs107 level about this, except for the fact that weíre being careful to write it in pure C, as opposed to C++.

Now, if you remember the details from Friday, about the interface file, the functions are, I mean, Iíll talk about those in a second. We actually expose the full struct. There are no classes, and thereís no public, and no private. So everything is implicitly public in the dot age. Thatís a little weird. C++, when you learned about classes and objects, it was all about encapsulation and privacy of whatever could be private. We canít technically do that in C; although, we can certainly write tons and tons of documentation saying just pay attention to the type, use these functions to manipulate it, and pretend that these are invisible to you.

You more or less operate as if this is a constructor, a destructor, and methods, but they happen to come in pure, top-level function form, where you pass in the structure being manipulated. So StackNew is supposed to take the struct, one of these things, addressed by s, and take it from a 12-byte block of question marks to be logically representing a stack of depth zero. This is supposed to kill a stack at that address. This is supposed to increase its depth. This is supposed to pop something off.

I wrote StackNew and StackDispose. I certainly wrote StackNew, Iím forgetting about StackDispose, but Iíll write them really quickly right now.

This is the dot c file:

Stack .c

void StackNew

Stack

I add these three things:

Stack logLength = 0 (thatís because the stack is empty)

Iím going to make space for four elements, and then Iím going to go ahead and allocate space for four elements using c as a raw dynamic memory allocator, called malloc. It doesnít take anything related to a data type. It doesnít know that anything related to a data type is coming in as an argument. You have to pass in the wrong number of bytes that are needed for your four integers. And thatís how you do this. All that malloc feels is an incoming 16. It goes to the heap and finds a figure thatís that big, puts a little halo around it saying itís in use, and then returns the base address of it.

I told you to get in the habit of using a certain macro, that was mentioned a little bit in Assignment 1, just to confirm that elems ? null. If malloc fails for some reason it rarely will fail because it runs out of memory, itís more likely to fail because you called free on something you shouldnít have called freed on. So youíve messed up the whole memory allocator behind the scenes.

Thatís a good thing to do right there because it very clearly tells you where thereís a problem, as opposed to it just seg filtering or bus erroring or it crashing in some anonymous way, and you have not idea how to back trace to figure out where the problem started.

As far as stackDispose is concerned, this is trivial. Although, thereís one thing I want to say about it, actually, two things I can think of. I want to go ahead and do the opposite of malloc. This corresponds to operator delete from C++. I want to free s-> elems so that it knows that whatever figure is identified by that address right there should be donated back to the heap. Thatís just what I mean when I talk about free right here. Some people question whether or not I should go and actually free s itself. They ask whether or not that should happen. The answer is no.

Nowhere during StackNew did you actually allocate space for a stack. You assumed that space for the stack had been set aside already, and that the address of it had been identified to the StackNew function. So you donít even know that it was dynamically allocated. In fact, the sample code I wrote last time declared a stack called s as a local variable. So it isnít dynamically allocated at all. So you definitely in this case do not want to do this.

The other thing I want to mention is that regardless of whether or not the stack, at the moment this thing is called, is of depth zero or of depth 4500, the actual ints that are held inside the dynamically allocated rectangle of bytes, thereís no reason to zero them out. And thereís certainly no freeing needs held by those integers. I say that because just imagine this not being an int stack but imagine it being a char * stack, where Iím storing dynamically allocated c strings, Fredsí and Wilmasí and things like that.

If I did have that, in the case of the char *, I would have to for loop over all the strings that are still held by the stack at the time itís being disposed and make sure I properly dispose of all of those strings. I donít have any of that with the int specific version. But the reason Iím saying this is itís going to have to accommodate that very scenario when we go generic on this thing and just deal with blobs as opposed to integers.

The most interesting of the four functions is the StackPush. And itís interesting not because of the algorithm to put an integer at the end of an array, but the algorithm thatís in place to manage the extension of whatís been allocated to be that much bigger because youíve saturated whatís already there.

I chose an initial allocated length of four, so Iím certainly able to push four integers onto the stack and not meet any resistance whatsoever. But if I try to press a fifth in on the stack, it has to react and say, I donít have space for five integers I better go and allocate space for some more, copy over whateverís been pushed, and dispose of the old array to make it look like I had space for 8 or 16 or 1,024 or whatever.

So the implementation, assuming I do have enough space, would be this simple: StackPush. Pushing onto what stack? The one thatís addressed by this variable called s. What number am I pushing on? The one that comes in via the value parameter.

Let me leave some space here for what Iíll inline as the reallocation part. But if thereís enough memory, where down here I can assume that thereís definitely enough memory, I can do this:

s-> elems of s-> logLength equals this value

Think about the scenario where the stack is empty. The logLength happens to be zero, which happens to be the index where you should insert the next element.

Once I do this for the next time, I have to go ahead and say, you know what, the logLength just increased by one. Not only does that tell me how deep the stack is, it also tells me the insertion index for the very next push call. It isnít that simple. Eighty percent of the code that gets written here has to deal with the one in two the end time scenarios where you actually are out of space.

If it is the case that s-> logLength == s-> allocLength, then as an implementation youíre unhappy because youíre dealing with a stack and a value where the value has no home in the stack at the moment.

Iím trying to press on a 7. Letís say that s points to this right here, and the logical length and the allocated length are both 4, and Iíve pushed these four numbers on there, the client doesnít have to know that thereís a temporary emergency. So right here Iím just gonna react by saying, you know what, that 4 wasnít big enough, I want to go ahead and I want to, without writing code yet, I want to basically reallocate this array. Now, I say reallocate because there really is a function related to that word in the standard C library.

In C++ you have to go ahead and allocate an array thatís bigger. You donít have to use a doubling strategy; Iím just using that as a heuristic to allocate something thatís twice as big. You have to manually copy everything over, and then you have to dispose of this after setting elems equal to the new figure.

It turns out C is more in touch with exposed memory than C++ is. So rather than calling malloc yourself, you can call a function thatís like malloc except it takes a value, thatís been previously handed back to you by malloc, and says please resize this previously issued dynamically allocated memory block. That is this:

Let me write just one line right here. I want to take allocLength and I want to double it. I could do plus equals 10; I could do plus equals 4. I happen to use a doubling strategy. And then what I want to do is I want to call this function. Let me deallocate these so I have space to write code.

s-> elems equals this function called realloc

Thereís no equivalent of this in C++. Iíll explain it in a few weeks why there isnít. But realloc actually tries to take the pointer thatís passed in and it deals with a couple of scenarios. It sees whether or not the dynamically allocated figure can be resized in place, because the memory that comes after it in the heap isnít in use. Thereís no reason to lift up a block of memory and replicate it somewhere else, to resize it if the part after the currently allocated block can just be extended very easily in constant time. The second argument to realloc is a raw number of bytes. So it would be:

s-> allocLength times size of integer.

I see a lot of people forgetting to pass in or forgetting to scale this byte size event. Even though that they know to do it with malloc, they forget with realloc for some reason. It takes this parameter right here, it assumes itís pointing to a dynamically allocated block, and it just takes care of all the details to make that block this big. If it can resize it in place, all it does is it records that the block has been extended to include more space and it returns the same exact address.

So that deals with the scenario where this is what you have, this is what you want, and it turns out that this is not in use so it can just do it. So it took this address in and it returns the same exact address.

Well, you may question why does it return an address at all? In this case it doesnít need to. However, it may be the case that you pass that in and you want to double the size or make it bigger, and that space is in use. So it actually does a lot of work for you there. It says, okay, well, I canít use this so I have to just go, and it really calls malloc somewhere else on your behalf. Just assume thatís twice as big as this. Whatever byte pattern happens to reside here is replicated right there. The remaining part of the figure isnít initialized to anything, because itís uninitialized just like most C variables are. It actually frees this for you and it returns this address.

So under the covers, beneath the hood of this realloc function, it just figures out how to take this array and logically resize it, and preserves all the meaningful content thatís in there. If it has to move it it does free this. It turns out realloc actually defaults to a malloc call if you pass in null here. So technically you donít need the malloc call ever. That actually turns out to be convenient when you have entered a processes that have to keep resizing a node, and you donít want a special case it and call it malloc the very first time, you can just call it realloc every single time when the first parameter is null on the very first iteration.

Itís very easy to forget to catch the return value. If you forget to do that then you refer to s-> elems captures the original address, which may have changed, and you now after this call may be referring to dead memory thatís been donated back to the heap manager. So itís very important to catch it like this. If realloc fails itíll return null. So Iím gonna put an assert right here.

The one thing about realloc thatís neat is that if you donít want to just end the program because it failed, it actually, if it returns null it wonít free the original block, it would only return null if it actually had to move it. Actually, thatís not true. If it canít meet the realloc page in request itíll just leave the old memory alone and return null. In theory, you donít have to assert. And in the program here you could just check for null, and say, okay, well, maybe I wonít resize it, maybe Iíll just print a nice little error message saying, I actually cannot extend the stack at this time, my apologies, please do something else.

Weíll learn a lot about how malloc and realloc and free all work together. Theyíre implemented in the same file. Theyíre implemented in the same file because even though that the actual blob of memory doesnít expose its size to you, like in a dot length field like in Java, somehow free knows how big it is. Well, thereís cataloging going on behind the scene so it knows how much memory to donate back to the heap every time you call free. But I donít want to focus on that.

This right here, it doesnít get called very often. This doubling strategy is popular because it only gets invoked as a code block one out of every two to the n calls once you get beyond 4. Because of the doubling strategy [inaudible] only comes up once every power of 2; 512 wasnít big enough, okay, well, maybe 1,204 will be. And you have a lot of time before you need a second reallocation request.

There are a couple of subtleties as to how this is copied. In this example right here, all of these little chicken scratch figures right there, they correspond to byte patterns for integers. So when I replicate them right there, I trust that the interpretation of those integers right there will be the same right there. So realloc is really moving my integers for me.

If these happen to be not four integers but four char *ís, that means the byte patterns that were here would have actually been interpreted as addresses. When it replicates this byte pattern down here, it turns and it replicates the addresses verbatim. This would point to that; that will point to that; this will point to that; this will point to that. Do you understand what I mean when I do that?

When I dispose of this it doesnít go in and free the pointers here. It doesnít even know there are pointers in the first place so how could it free them? So as this goes away, and these all point to where other pointers used to be pointing, you donít lose access to your character strings.

This is certainly the most involved of all four functions.

Student:If you could explain the assertion again.

Instructor (Jerry Cain):All assert is, for the moment just pretend that itís a function. It takes a boolean. Its implementation is to do absolutely nothing and return immediately if it gets true, and if it gets false its implementation is to call exit after it prints an error message saying an assert failed online, such and such, in file stack dot c.

Student:So actually [inaudible].

Instructor (Jerry Cain):S-> elems ? null. You want to assert the truth of some condition that needs to be met in order for you to move forward. If realloc fails it will return null. You want to make sure that did not happen. Thatís why there is a not equals there as opposed to a double equals.

This isnít technically a function. Weíll rely on that knowledge later on. Itís technically whatís called a macro. Itís like a #define that takes arguments. But, nonetheless, just pretend for the moment that it works like a function.

Student:If the realloc has to move the array is it now or anytime?

Instructor (Jerry Cain):Yeah. The question is if realloc has to actually move the array it is time consuming. Itís even more time consuming than o of m. It actually involves not only the size of the figure being copied, but the amount of time it takes to search the heap for a figure that big. So it really can, in theory, be o of m, where m is the size of a heap.

Let me just quickly, just for completeness, write stackPop. It turns out itís not very difficult. The most complicated part about it is making sure that the stack isnít empty.

This continues on this board right here. I have this thing that returns an int stackPop. Iím popping off of this stack right here. No other arguments assert that s-> logLength is greater than 0. If you get here then youíre happy because it has something you can pop off. So go ahead and do the following:

s-> logLength -- and then return s-> elems of s-> logLength

The delta by one and the array access are in the opposite order here. I think for pretty obvious reasons. Youíre effectively control zíing or command zíing the stack to pop off what was most recently pushed on. So if this came most recently, you want to say I didnít even do that. Oh, and by the way, thereís the element that I put there by mistake. Thatís what I mean.

You could say you see this as demanding a reallocation request and going from 100 percent to a new 100 percent, where the old 100 percent was actually 50 percent. You may ask whether or not if you fall below a 50 percent threshold that you should reallocate and say, oh, Iím being a memory hog for no good reason and donate it back to the heap. You could if you want to.

My understanding of the implementation of realloc, because it wants to execute as quickly as possible, it ignores any request to shrink an array. As long as it meets the size request, it doesnít actually care if it allocates a little bit more. So itís like, oh, the size is 100 bytes and you just want 50. Okay, then only use 50 but Iím gonna keep 100 here because itís faster to do that.

With regard to the stackPop prototype right there, it sounds like a dumb observation, but itíll become clear in a second when we go generic. This particular stackPop elects to return the value being popped off. I can do that very easily here because I know that the return value is a four-byte figure. If this were a double stack I would just make that leftmost int up there double and it would just return an eight-byte figure.

When I go generic, and I stop dealing with int *ís and I start dealing with void *ís, Iím gonna have to make the return type either void *, which means I return a dynamically allocated copy of the element, for reasons thatíll become clear in a little bit Iím not gonna like that, or I have to return void and return the value by reference by passing in an int * or a void * right here. That will become more clear once I actually write the code for it. But we take advantage of a lot of the fact that we know that weíre returning a four-byte figure here so the return type can be expressed quite explicitly as in int.

So now what I want to do is I want to start over. And I want to implement all of this stuff very generically. And I want to recognize that weíre trying to handle ints, and bools, and doubles, and struct fractions, and actually the most complicated part of it are char *ís because they have dynamically allocated memory associated with them.

Let me redraw stack dot h. And we are going completely generic right here. Most of the boilerplate is the same. Typedef struct, and I donít want to commit to a data type, elems. Now, think about what I lost. I lost my ability to do pointer arithmetic without some char * casting. I also lost intimate knowledge about how big the elements themselves are. So I canít assume the size of int anymore because it may not be that; it probably wonít be. So Iím gonna require the prototype of StackNew to change, to not only pass in the address of the stack being initialized, but please tell me how big the elements that Iím storing are so that I can store it inside the struct.

The logical length versus the allocated length and the need to store that, that doesnít change. I still want to maintain information about how many of these mystery elements I have. I also want to keep track of how much I am capable of storing, given my current allocation. And thereís gonna be one more element that we store on a byte. So Iíll leave that piece of spencil for the next 15 minutes and weíll come back to it.

The prototype of the functions change a little bit. StackNew, same first argument, but now I take an elemSize.

Void * stackDispose

stack *s (That doesnít need to change. Itís mostly the same, although, weíll have something to say about what happens when itís storing char *ís.)

Void stackPush

stack s

And then Iím gonna pass in void * called elemAddr.

I canít commit to a pointer type thatís anymore specific right there Ďcause it may not be an int *, it may not be a char **, it may not be a structFraction *. It just needs to be some address that I trust because the information that I am holding is pointing to an s-> elemSize figure. So thereís that.

This is the part that freaks people out. This is what Iím going to elect to do, Iíll talk about the alternative in a second. But when I pop an element off the stack, I want to identify which stack Iím popping off of, and I also want to supply Address. This is me supplying an address. Iím actually gonna identify a place where one of my client elements, that was previously pushed on, should be laid down. Itís like Iím descending a little basket from a helicopter so somebody can lay an integer or a double or something in it so I can reel it back up to me. So void * elemAddr. And thatís all Iím gonna concern myself with.

Student:Where is the struct named stack?

Instructor (Jerry Cain):Iím sorry, I just forgot it. Itís right there.

So 70 percent of the code Iím gonna write itís all gonna be the same. Itís gonna be slightly twisted to deal with generics as opposed to ints. But rather than using assignment, which works perfectly well when youíre taking one space for an int and assigning it to another int, weíre gonna have to rely on memcopy and things like that.

StackNew is not hard.

Void StackNew

stack *s int elemSize (Same kind of stuff)

s-> logLength = zero

s-> allocLength = four

s-> elemSize = elemSize

s-> elem = malloc

I donít have a hard data type so I canít use size of here, but I have no reason to. I have been told how big the elements are via the second parameter. So four times elemSize, just use the parameter as opposed to the field inside the struct.

I can do a few things to make my life easier. S-> elems better not be equal to null or else I donít want to continue. And the assert will make sure of that. I also could benefit by doing this: assert that s-> elemSize Ė this one is not as important because it takes a lot of work to pass in a negative value for elemSize. But, nonetheless, itís not a bad thing to put there because if you try to allocate -40 bytes itís not gonna work. That and implementation make sense.

I think even if it makes sense, thereís some value in seeing a picture. It takes this stack right there, with its four fields inside at the moment, and letís say I want to go ahead and allocate a stack to store doubles. That means the elemSize field would have an eight right there. Iíd set aside space for four of them. The logical length is zero. Iím just making sure this is consistent with the way Iíve done this. Thatís right. And then I would set this to Point 2. As far as the stack knows, it has no idea doubles are involved, it just knows that itís a 32-byte wide figure. And it has all of the information it needs to compute the boundary between zero and one, one and two, and two and three.

As far as stackDispose is concerned, stack *s, this is incomplete, but itís complete for the moment given what Iíve talked about. I just want to go ahead and I want to free s-> elems, and not concern myself yet with the fact that the actual material stored inside the blob might be really complicated. Just for the moment think ints, and doubles, and plain characters. But the fact that Iím leaving space there [inaudible] that this will change soon.

Let me write stackPush right here.

void stackPush

stack *s

void * elemAddr

Iím gonna simplify the implementation here a little bit by writing a helper function. I could have written the helper function on the int specific version but just didnít.

Up front, if itís the case that s-> logLength == s-> allocLength, then I want to cope by calling this function called stackGrow. And you know what stackGrowís intentions are. And Iím just gonna assume that stackGrow takes care of the reallocation part. Iíll write an int second, thatís not the hard part. Once I get this far, whether or not stackGrow was involved or not, I want to somehow take the s-> elemSize bytes that are sitting right there and write into the next slot in memory that I know is there for me, because if it wasnít this would have been called.

So what has to happen? Let me refer to this picture. Letís just assume that this picture is good enough, because I obviously have enough space to accommodate this new element. Letís say that I have three elements. So that these have been filled up with interesting material. And I somehow, in the context of that code over there, have a pointer to some other eight-byte figure that needs to be replicated in.

This is gonna be the second argument to memcopy. This right there is gonna be the third argument. The only complexity is computing and figuring out what the first argument is. It has to be that. It doesnít need to know that itís a double *, or a struct with two ints inside *, or whatever. It just needs to actually get the raw address and replicate a byte pattern. No matter what the byte pattern is, as long as itís the same in both places, youíve replicated the value.

So the hardest part about it is doing this:

void * target = char * s-> elems + s->

logLength times s->elemSize

This is why I demanded all of this stuff to be passed to my constructor function so it was available to me to do the manual pointer arithmetic later on.

I donít think I messed this up. LogLength is right there for the same reasons it was in between the square brackets for the int specific implementation of this.

Then what I want to do is I want to go ahead memcopy into the target space whateverís that elemAddr. How many bytes? This many, s-> elemSize. Then I canít forget this, this was present at the other implementation as well. I have to note the fact that the logical length just increased by one. So thatís how I managed that.

Do you guys see just the bytes moving on your behalf in response to this function?

Thereís a question back there.

Student:The char * after void target equals?

Instructor (Jerry Cain):This right here, remember that s-> elems, unless I made a mistake, is typed to be a void *. So you canít do pointer arithmetic on the void *, so the trick Ė thereís actually two tricks. I opt with this one because Iím just more familiar with it, you can either cast it to be a char * so that pointer arithmetic defaults to regular arithmetic. This as an offset itís still the offset, it just happened to involve multiplication. It is itself implicitly multiplied by size of char, which as far as multiplication is concerned, is a no op.

I have also seen people, I think I mentioned this before, Iíve seen people cast that to be an unsigned long, so that it really is just plain math, and then they cast it to be void *. Pointers and unsigned longs are supposed to be the same size on 32-byte systems. A long is supposed to be the size of a word on the register set. So Iíve seen that as well. You can do whatever you want, I just Ė all of my examples use the char * version so thatís why I use that one.

The stackGrow thing. The reason I want to do that is not because of the mem copying or the void *, I just want to explain what a student asked two seconds before class started. Youíre kind of already getting the idea that the word static has like 85 meanings in C and C++. Well, hereís one more.

When you see the word static, decorating the prototype of a C or a C++ function, not a method in a class just a regular function, such as static void stackGrow, and it takes a stack *s. What that means is that it is considered to be a private function that should not be advertised outside this file. So in many ways it means private in the C++ sense. The technical explanation is that static marks this function for whatís called internal linkage.

You know how youíre generating all these dot o files, when you type make all this stuff appears in your directory, some of them are dot o files. Iíll show you a tool later on where you can actually look at whatís exported and used internally by those dot o files.

StackPush, and stackNew, and stackDispose are all marked as global functions, and that the symbols, or the names of those functions, should be exported and accessible from other dot o files, or made available to other dot o files. Something like this is marked as whatís called local or internal. And even though the function name exists, it canít be called from other files. That may seem like it was a silly waste of time, but it really is not.

Because you can imagine in a code base of say one million files, itís not outlandish believe it or not. Think Microsoft Office, the whole thing, probably has on the order of hundreds of thousands of files, maybe tens of thousands, I donít know what it is, more than a few. You can imagine a lot of people defining their own little swap functions. And if theyíre not marked as internal functions at the time that everything is linked together to build Word or Excel or whatever, the linker is gonna freak out and say, which version of swap do I call? I canít tell. Because it has 70 million of them. But if all 70 million are marked as private, or static, then thereís none of those collisions going on at the time the applicationís built.

This is responsible for doing that reallocation. Assume itís only being called if it understands that this is being met as a precondition. So it can internally just do this:

s-> elems = realloc

s-> allocLength times s-> elemSize

Thatís the cleaner way to write it. It makes this focus on the interesting part thatís hard to get, and kind of puts this aside as uninteresting.

Algorithmically, the function thatís most different from the integer version actually is this StackPop call. This is how I want to implement it:

void stackPop.

Iím popping from this stack right here. Iím placing the element thatís leaving the stack and coming back to me at this address. Thereís no stack shrink function to worry about. So what I want to do is declare this void *, not called target but called source = char * of s-> elems plus s-> logLength minus 1 times s-> elemSize. I forgot to do the minus minus beforehand so I recovered by doing a minus 1 right there.

This is where weíre drawing a byte pattern from in the big blob behind the scenes. Weíre drawing byte patterns from there so we can replicate it into elemAddr. ElemAddr is the first argument. Iím copying from that address right there how many bytes. This many. And then do what I should have done earlier, logLength --. This really should have been the first line, and I shouldnít have the -1 there, but this is still correct.

Do you guys get whatís going on here? If you understand the helicopter basket analogy? Weíre on actually identifying a space where itís safe to write exactly one element so that when the function returns I can go, wow, thatís been filled up with an interesting element to me. And I can go ahead and print it out or add it to something or replace the seventh character or whatever I want to do with it.

This used to be int. If I wanted to I could have punted on this right here and just passed in one argument. And I could have returned a void * that pointed to a dynamically allocated element thatís elemSize bytes wide. And I just would have copied not into elemAddr, but into the result of the malloc call. With very few exceptions, malloc and strdup and realloc being them, you usually donít like a function to dynamically allocate space for the call and then make it the responsibility of the person who called the function to free it.

Thereís this asymmetry of responsibility, and you try to get in the habit as much as possible of making any function that allocates memory be the thing that deallocateís it as well. Thereís just some symmetry there and itís just easier to maintain dynamically allocated memory responsibilities.

Itís not wrong itís just more difficult to maintain. It actually clogs up the heap with lots and lots of little void *ís pointing to s-> elemSize bytes, as opposed to just dealing with the one central figure thatís held by the stack, and then locally defined variables of type int and double that are passed in as int * and double *ís recognized as void *ís. But because we laid down the right number of bytes, as long as everything is consistent we get back ints and doubles and things like that.

Student:In this case do you have to with our elemAddr [inaudible] right?

Instructor (Jerry Cain):Yeah. Iím not actually changing elemAddr Iím just changing whatís elemAddr. So let me actually make a sample call to this.

Suppose I have a stack s. And I called StackNew with & of s, and I pass in size of int. That means that I have this thing, thatís my stack and Iím supposed to just take it as a black box and not really deal with it. But I know behind the scenes that it has a 4 there, and maybe my stack has 14 elements in it, and it can accommodate 16 elements. And that it points to this thing that has not 2 or 4 or 8, but 16 elements in it. And Iím like, you know what, I did all this work, and I pushed on 14 elements right there, but Iíd like to now pop off the top element.

When I do this stackPop, I have to pass in that. I have to pass in the address of an integer. So the stackPop call has a variable called elemAddr that points to my variable called top, and it relies on this variable to figure out where to write the element at position 13. It happens to reside right there. The memcopy says where do I write it? I write it at that address right there. You would replicate this byte pattern in the top space, this returns, and I print out top and that corresponds to the number 7,300 or something like that.

If I really wanted to change this void *, I donít change the void * anywhere, I just use it as sort of a social security number or an ID number on the integer. If I really wanted to change this youíd have to pass in a void * *. There are scenarios where that actually turns out to be what you need, but this is not one of them.

Student:For this pop, letís say that this function doesnít work, that itís popping the wrong stuff. How do we test to find out if we pop [inaudible]. Like, for instance, letís say you wanted to pop something that was an integer, and you popped [inaudible], and then what if there was something wrong with the resizing, can we set that in that we just pop to like null or something and then Ė

Instructor (Jerry Cain):You could. If you really want to exercise the implementation of the stack to make sure itís doing everything correctly, youíll write these things, you may have heard this, what are called unit tests. Which are usually implemented Ė thereís two different types of unit tests, thereís those that are written on the dot c end, which know about how everything works, and then those that are written as a client to make sure that itís behaving like you think it should. If that were the case, and you wanted to protect against that, you could write all these very simple tests to make sure that things are working.

Because in unit tests, as a client you might for loop over the integers one through a million. You might actually set the initial allocation length not to be four, but to be one, so that you get as many reallocations as possible. You could for loop and push the number one through a million on top. In a different function, you could actually pop everything off until itís empty and make sure that they are descending from one million down to one. And if it fails, then you know that thereís something wrong. If it succeeds, you never know that something is completely working, but you have pretty good evidence that itís probably pretty close if it isnít.

You can change the implementation to kind of make sure that all of the parts that are really risky, like the reallocation, and the

--, and the memcopy calls are working, one thing you could do, and Assignment 3 takes this approach a little bit, right here I use this doubling strategy. If I really wanted to test the reallocation business, I could actually do a ++ instead of a times two on the allocation size and to reallocate every single time. Not because you want it to work that way, but because you could just make sure that algorithmically everything else works regardless of how you resize them.

Now, another answer to your question, if I didnít take that approach with the unit test answer, is that when youíre dealing with generics in C, and void *ís, and memcopy, and manual guests, you have to be that much more of an expert in a language to make sure that you donít make mistakes.

Itís very easy. Think about it this way, I know youíre not gonna believe this yet because you havenít coded in C that much, but Assignment 3 you will. I know how most of you program. You write the entire program and then you compile it. And you have 5,500 errors. And you get rid of them one by one, and it takes you like three days, and then it finally compiles. And you run it, and even if it doesnít quite run the way you want it to it rarely crashes. Most people donít even know what the word crash means until they get to 107. You will next week, trust me.

As far as C++ is concerned, because itís so much more strongly typed than C is, it is possible for you to write an entire program, to have it compile, because compilation does so much type checking for you in a C++ program that uses templates, itís quite possible that once it compiles that it actually works as expected. It doesnít happen very often but itís certainly possible.

In a C program, where youíre using void *ís, and char *ís, and memcopy, and memmove, and bsearch, and all of these other functions youíre gonna need to use for Assignment 3, itís actually very easy to get it to compile. Because itís like, oh, void *, I can take that, yep, thatís fine. It just does it on all the variables, and so it compiles. And youíre like, good, it wasnít three days, it was one day. And then you run it and it crashes because you did not deal with the raw exposed pointers properly. So thatís what makes certainly the first assignment in pure C with dealing with void *ís difficult.

But thereís an argument that can be made to say that itís very hard to get all this stuff right every single time you program. Iíve already told you that right here I forget this s-> elemSize probably one out of every two times I teach this example. Thatís because itís very unnatural compared to the C++ way of allocating arrays to think in terms of raw bytes. And you think in terms of sizes of the figures, you see the pictures in your head as to how big things should be, so you just remember that number right there but you forget about this. If you forget this right here, it compiles, it runs. If youíre dealing with integers then your array is one-fourth as big as it needs to be to store all the integers you want there.

You will learn this and feel it at 11:59 a week from Thursday. All these types of errors, because itís gonna compile and itís gonna run very often. But occasionally itís gonna crash and youíre not gonna know why, and youíre gonna say, oh, itís the compiler. Itís not, itís your code.

Okay. So we will talk more on Wednesday.

[End of Audio]

Duration: 51 minutes