ProgrammingAbstractions-Lecture03

Instructor (Julie Zelenski):It's good to see you guys. Just a note about the handouts that are going out today so you aren't confused about our ability to count in sequence: six, eight, nine and ten are in the lobby. Five and seven we're going to bring at the end of the lecture, and I'm trying to avoid killing too many trees by what we do with five and seven. Five and seven are the handouts that tell you about installing the compilers and debuggers, and there are two versions of them. There's a Mac version and a Windows version, for Xcode versus Visual Studio.

We printed an estimated amount based on how many people we thought were going to be using PCs and how many using the Mac, so when we bring those in at the end and you're picking them up on the way out, be sure to get the one that you need and not take both. Take the ones for the product you're going to use. If you happen to believe you're really going to use both, it's fine to take both, but I figure that's pretty rare. If you know what platform you're on, take the one for yours and not both.

Those are the ones that tell you how to get your compiler set up, which seems to be something a couple of people this weekend were ready to get a jump on. Assignment 1 is also one of the handouts that's going out today. Just to bring you up to speed, the first assignment is not some big, complicated program. It's actually a set of small programs. Each is less than a page of code. They're designed to exercise some skills in isolation: learning how loops and variables work, learning how our graphics library works, and learning how strings and files work in C++.

The more complicated part is going to come from learning how to express yourself in C++. For those of you who are a little unsure about whether your background is good and whether you're in the right place, this is actually a good testing ground. You should look at those problems and say, oh, if I were solving this in my native language, I would be totally able to write this out without even stopping to think much. In C++, I'm definitely going to have some new things to learn about converting what I know to how it works in C++.

If you do find these challenging to write, that's probably a sign that you have a little bit of work ahead of you. If you find them completely trivial and way too easy, you may want to think about 106X. One way to gauge that is to look at their first assignment. If you feel ready to do something like that, then maybe that's a place that would give you a little bit more of the challenge you're looking for.

We have all the section preference information in and we're doing the big matching. We will email out the section assignments. Our sections mostly meet on Thursday, but some are on Wednesday and Friday. You'll get an email tomorrow that tells you where and when your section is meeting. If you have a conflict that has arisen after the decision making, we have a little bit of an ability to do some late adds and rearrangement, but it's pretty tricky. If you can make the section that you're assigned to, that's going to help us the most. We will try to accommodate you if we can and make a switch for you.

What are we going to talk about today? We're going to talk about libraries in C++ and the string and stream classes. There's some [inaudible] mostly chapter three, and then the handout that went out last week about general C++ has some information that's useful. There's also a library reference I gave out today, which is handout ten, which is nothing but an overview that reminds you of some of the features. One other thing that's going to be the theme for the next three lectures or so is that there's a lot of material in C++. There are these huge libraries that do lots of things.

The goal of these weeks is not to make you an expert on all the minutiae but to make you comfortable with the basic facilities so that you know where to look to find out more. If you're trying to figure out how the substring operation works on a C++ string, what are the arguments? What do they mean? What are the cases I need to be worried about? You'll know where to start and how to find that information, as opposed to memorizing the whole of the library. That's not really a feasible task and not even important.

Think about it as need to know. I'm going to get you the basics so you know what's out there, and then as you start to write code, you'll learn the specific things you need to solve the problem at hand. Any questions administratively? I had one student come and hang out with me on Friday. I'm hoping that was because it was so last minute. This Friday, I don't have a meeting, so put it on your calendars now for Friday hanging out.

C++ libraries: the notion of a library is really nothing more than saying you've got some functionality that you want to provide to all users of C++ or all students enrolled in CS106. There is some reasonable grouping to these things. You have a bunch of operations that operate on strings or that allow you to do graphics work or allow you to do event handling or something. The library is the packaging device in C++ where you say here's a set of routines. Typically, it comes with two pieces. One is the interface or declaration or header file. It tells you about what routines are in there: what are their names? What are the prototypes? How do you use them?

It often contains good comments about the things that you would need to know as a client using that facility: how to use it effectively and correctly. The other piece is the code that really implements it. When you make the call to a substring operation, how does it actually work? Well, there's some code that backs it, that does the operation and gets called at runtime when you make a call to that function. The libraries that we're going to see this quarter form roughly two big groups.

There is the C++ standard library, so things that come with every C++ compiler. No matter where you continue coding in C++, you will always find things like the C++ string, the C++ iostream, and the file stream, which is fstream. There's a math header. There are headers that deal with all sorts of other facilities that are beyond what we're doing here. The ones that we'll see most commonly in the early part are string and stream.

The typical include for them is going to use the angle brackets. That's the sign to the compiler that we're looking for something from the standard header locations. One way to remind yourself how to distinguish these from our special libraries is that you're going to see very terse, lowercase names. Part of the legacy of C and C++ as a professional programmer's tool is that they tended to value terseness over any kind of verbose and descriptive names, making it a little easier to type a little faster to get your point across.

Things like cout, which is the console out stream, getline, the [inaudible] call. There's substr, the name of the substring function. Those tend to be short and throw away vowels where they can. They tend to be all lowercase. Whenever you're looking at a routine, you might wonder where it comes from. If it has this capitalization scheme, it's likely to be something coming out of the standard libraries.

In addition to what we have present in the standard, we also have about seven libraries that we've included as part of CS106, mostly to make our lives a little bit easier. Things like the random library or the simple I/O library actually layer on existing functionality that is already present in the standard libraries, but the way the functionality is expressed in the standard is just a little bit awkward or unhelpful for the task we need to do. We've provided a layer that cleans it up for you.

Graphics is a good example, because there is no graphics library included in standard C++. If you're working on graphics on Windows, you have access to a different toolkit than you do on the Mac or on Linux or some other platform. We have tried to abstract out a very simple graphics library that we can run on both Mac and Windows. We provide one interface, which in turn talks to your platform in its native language to make those windows and drawing things happen.

Our header files are always in double quotes. We typically use a strategy of having capitalized, verbose names. We don't throw away our vowels. We try to make a little bit of sense to describe the action that's being taken. Today, I'm going to look at one of the 106 libraries here at the beginning because it's a nice, easy one to get our heads around.

I'm going to go and look through the C++ string library, and then I'm going to hopefully get a chance to even start talking a little about the C++ stream library. Along the way, there will be two additional 106 libraries that help out with string and stream and provide a little bit more functionality than what's already there.

Randomness comes up in all kinds of simulations and in game playing. You want the computer to simulate some random behavior: flipping a coin, rolling a die or shuffling an array or a deck of cards. Computers actually aren't capable of true randomness in the sense that you might think of in the real world, but they have what's called pseudo-random behavior, where they can generate numbers in a sequence that appears effectively random from the outside even though there actually is some determinism in how it operates.

There are four functions that form the CS106 random library, and they are used for all kinds of random behavior that you want. I note here that they are free functions, and by free functions, I mean functions that aren't on a particular class. They're actually globally accessible. You don't call them by sending [inaudible] particular object. They just exist in the top-level namespace and you can call them anywhere and any time. When you're ready to get some random behavior, you make a call to one of these routines.

There is an initialization routine for this library, the Randomize call. That's called once and exactly once in your program, usually at the very beginning, to set up a new random sequence. That's what's called seeding the generator, to get it started in a new place. Then, once you've made that call, you can intersperse any number of these other calls to simulate certain random events.

The standard random number generator that's in C++ provides all of this through one call. It will generate a random number from zero to the largest number possible, and then you can decide how to map that to other things. If you want the ability to flip a coin, you want to say half the numbers are odd and half the numbers are even. You could do something like generate a number and see if it's odd or even, or see if it's from the bottom half of the range versus the top half.

What these functions do is just provide that functionality and package it up for you neatly. You can say things like RandomInteger, low to high: you say one to five, and it will give you a number from one to five inclusive. If you keep calling that, you should see an unpredictable sequence of one, two, three, four, five coming in jumbled and mixed up. There are no real guarantees about what order you'll see them in, and that is what allows you to simulate random events.

RandomReal is the same sort of idea, but in this case using boundaries that are expressed as real numbers and returning a real number. Again, the bounds are inclusive, so it can actually return the number low or high or anything in between. The last one simulates a true/false value with a given probability. Given the probability of 0.5, half the time it should return true and half the time it should return false. If you give it a probability of 0.25, one quarter of the time it will return true and the other three quarters of the time it will return false. It allows you to simulate coin flips or other random events where you have a [inaudible] distribution.

The point of a library is to take some set of facilities that are needed, package it up, and have a vision of how they work together, with a naming convention and design convention that makes them coherent, so that they provide some convenience and they're complete. They cover all the bases. These three provide a pretty good range of different kinds of random events. There are still other things you might need to simulate, but you can typically do it in terms of one of these.

You could also have left one out and had to simulate the others from it, but each of them has a client use that's pretty handy, so the library actually has all three of them for your use. [Inaudible] comes out of random.h in the CS106 library. The .h is just a convention for the extension for header files. .txt gets used on text files and .cpp gets used on source files. .h is for header files, which are descriptions of routines, but no real code is typically in the .h file. It's an interface file, they call it.

In Java, there isn't that distinction. Everything is all in one file, and the definition of a class serves as both the description for a client using it and the code for the implementer implementing it. C++ has them separated. Let me look at the C++ string as the next example of something we have to make use of in getting things done. The C++ string type is actually defined in a header file, and it's a library that's added onto the language.

Unlike int and bool and double, which are part of the language and can't be separated from it, string is kind of an add-on that's defined through a library. It models a sequence of characters including everything: letters, digits, punctuation. The string is defined as a class. In the same way that in Java you're used to the class being the pattern from which you can declare and initialize objects that you can then message and do things with, string in the C++ world is the same sort of deal.

You have a string class. You initialize string objects. You send messages to those string objects to ask them to do things for you. Asking a string to give you the character at a particular position or the number of characters, or to insert some characters or change some characters within the body of the string, is all done by messaging the string. A couple of simple operations: I put up a little bit of string code to get started.

The type name is actually string itself. There is something a little bit different about string, relative to what we know about primitives, when you declare it and you don't initialize it. When I say string S and I don't say anything else, you might assume, similar to the primitive types, that S is garbage, that it has some arbitrary sequence of characters. In fact, string has what's called a default constructor, one that's invoked when you don't specify otherwise, such that when you declare a string with no other explicit information, it will assume you meant to set it up to be the empty string.

Writing string S actually declares and initializes a string with no characters. If I were to ask it for its length, which is the way we ask for the number of characters in it, as is being done right here with S.length(), with the open and close parens there, it would return zero on the empty string. We can use square brackets, like the array notation you're probably familiar with, to access individual characters of the string.

Applying the square brackets to S, or S sub I as I'll sometimes call it, accesses the ith character within the string. The characters are indexed starting from zero, so if I have a ten-character string, they actually are indexed zero through nine. With the C++ string, the square brackets allow you to access that character both to read it and to write it. A C++ string is mutable. The Java string is immutable. In Java, once you create a string, it has a certain sequence of characters, and although you can make a new string and overwrite that one, you can't go in and just manipulate the string in place and change its contents.

In C++ you can do that. I initialized the string in this case to the string literal or string constant CS106, and then I ran a loop over the proper range of indices for this string, and then I used toupper, which is a function from the standard C library that takes a character and returns its uppercase equivalent, or returns it unchanged if it's not a lowercase letter, and then assigned S of I the result of toupper.

The effect of this was that for each lowercase character in the string, we overwrote it with its uppercase equivalent. Any other existing uppercase or punctuation characters were left unchanged. You can make an assignment into the string like that, which is something you cannot do with the Java string.

Student:Can you insert [inaudible]?

Instructor (Julie Zelenski):I certainly can. I'm going to show you that in about two slides. There is a whole set of member functions that do these things. This one allows you to have the sequence of five characters, but what if I want to put one in the middle? I'll use something called insert. If I want to take one out in the middle, I use something called replace or erase to pull it out and put something else in.

Many of the built-in operators, things like equals and less than or less than or equal to and not equal, have extended meanings that apply to strings when strings are used as the operands. I can assign two strings using equals. If I say string S equals T, as I'm doing right here, then whatever value T has, S becomes a copy of that. S and T have the same value, but they're not related in any important way going forward. We have two copies that both happen to have the same five characters.

For example, the first thing I did after this was change the first character of T to be J, so now T is jello. S is still hello. It was initialized from the same sequence, but they don't retain any kind of aliasing from that point forward. I'm able to compare two strings directly to see whether they're lexicographically equal or less than according to ASCII ordering. I can say if S == T. In Java, that didn't do what you wanted. It did compile, but it didn't test the thing you were hoping for. In C++, it does do what you're expecting, which is to take two strings and ask whether they have the same sequence of characters.

If I have assigned S to T and I do S == T, it's going to say yes, they have the same five characters in the same order. Once I've changed one of them, then they'll come up as not equal. I could do less than and less than or equal to, to see in ASCII ordering which one precedes the other, in order to sort strings. Just like you think of for the integer and double types, those operators have reasonable meanings applied to strings.

The plus and plus equals operators are what's called overloaded, so extended beyond their usual meaning of addition to do concatenation of strings. I can take S and add a space character to the end of it, so now instead of being just hello, it's hello space. I can also add strings to strings, so I can take T and use the shorthand plus equals, which takes jello and turns it into jello jello, attaching another one on the end.

The concatenation for the C++ string only operates on strings and characters, whereas in Java, there's this kind of automatic mechanism where things like doubles and integers are converted to strings and added into the concatenation. That does not happen in C++. Concatenation is just for strings and characters. If you have something that's in numeric form and you want to add it into a string, you'll have to first convert it to a string. I'll show you a routine that does that a little bit later.

I would be happy to do that. Most of the things that I'm talking about actually are in handout four as well as repeated in handout ten and in the reader. There are a million places you can look for information on strings.

Most of the heavy lifting on strings is done via these member functions. These are part of the string class, and so these are operations that apply to string receiver objects. They're not free functions. You can't call them outside of a usage where you're saying on some receiver string, apply this function using these arguments. For example, the length member function is applied to a string.

Just to note here, the term member function is vocabulary-wise the same thing as method. Java programmers tend to call the functions that are defined as part of a class methods. C++ programmers tend to call them member functions. They really mean the same thing, but I do try to use the term member function because we are a C++ class, and that is kind of the convention. I'll probably end up using both accidentally without even noticing it. Hopefully, it won't cause you too much grief.

The member function notation here, str.function(args), is saying apply the function, send the message function to this particular string with these arguments, and then get its answer back or have that operation happen. I can ask a string for its length as an integer. It tells you the number of characters. I can ask a string to look for a particular character or substring within the characters that that string holds right now. It will return the index of the first occurrence found, scanning from left to right, or string::npos. It's a little bit of a funny return value, but it is the return value that says I didn't find it.

It's string::npos. It's an integer value that is distinct from any valid index within the string itself, to tell you it didn't find it. Both versions of find have a default argument on them. We talked a little bit about that last time. If I do not specify that second argument when I'm making a find call, it will assume that I want to start looking from the beginning. If I do specify it, then it will start from that position and scan from there to the end of the string. It's a way of targeting the place you're looking a little more precisely than just starting from the beginning and going to the end.

C++ does allow what's called overloading. In this case, the function find that finds a char and the function find that finds a string both have the same name, and that name can be used for multiple purposes as long as there's a sequence of arguments that distinguishes them, so that when I make a call to find, the compiler knows whether I mean the first version or the second version by virtue of whether the first argument is a character or a string. That can be extended to other types. This is typically used when you have an operation that really has the same behavior but some slightly different sequence of arguments is required to invoke it.

It is not something you want to use a lot to make a bunch of similarly named functions that don't have similar operations. It allows for a convenience when there are two or three variations on the same theme. They might all come under the same name by virtue of overloading.

Substr is something that, given a receiver string and a position and a length, will extract a new substring out of the middle of the receiver string. If I take the hello string and starting from position zero take two characters, I get the string he. It copies them. It's distinct from the original, and so all it did was get its initial sequence by copying characters from there. If I go in to change the hello string into jello, that he string stays he. They're not attached in any long-term way.

Insert, replace and erase are all in the family of something that I call modifiers or mutators, which change the receiver string. You can send these messages to a string to cause new text to get added into the string, text to be removed, or text to be deleted and replaced with something else. Inserting: someone asked, well, how can I put new characters in the middle? Well, I give the position where I'd like them to go. If I say position zero and I say put the string "I said " in there, then it would bump everything down and put "I said " in the front. If it was hello, it would become "I said hello". I inserted the string "I said ".

Replace at a position removes length characters starting at that position and then replaces them with the new text. It's a way to take a chunk out and put something else in instead. Erase does a straight remove at a position. Take this number of characters and throw them away, deleting them from the string and making it shorter. All of these change the receiver string. When you say str.insert, str.replace or str.erase, after that call, str actually has new contents based on what you've asked it to do about changing and mutating its contents.

Here's something I should tell you a little bit about the C++ string relative to the Java string. C++ is kind of an industrial-strength language that's targeted at professional programmers. It does not make any guarantees to you about what happens if you misuse these calls. If you give it a position that isn't valid for this string or a length that isn't valid for the string, there is no contract in the C++ libraries that says this is what will definitely happen. It doesn't say, oh, it's definitely going to throw an exception or raise some sort of error. It doesn't say it's just going to truncate it at the end.

It says that the library is free to do whatever is convenient for it, up to and including just crashing. It does mean that as the programmer using these calls, it is a little bit more on you to be careful that you're using them correctly and keeping the numbers in bounds for the string in ways that will produce correct results. It might be that it will produce a nice error message, but there are no guarantees. You wouldn't want to come to depend on that. You want to just be careful about knowing what the right numbers are.

Unlike Java, which is very attentive to those things and on your case when you're a little bit out of bounds, C++, in the name of efficiency, tends to just breeze through that stuff. I'm going to show you a little bit of coding together just for fun. I like to sit and show you some things. If I wanted to do something like count the occurrences of a particular character within a string, I could write a loop that looks like this.

I could say int count = 0 and then write a for loop, and this is a very ubiquitous loop for operating over a collection, in this case the collection being the characters in the string, from zero up to its length. If S sub I equals the character I'm looking for, we increment the count, and at the end we return it. I put this down here in my code and do a little testing, looking for the character C in Chihuahua cheese crackers.

Let's take a look at that and see if we manage to count the number of Cs in my string. There are four, apparently. Let's go check and see if that comes up. It looks good. We did a little bit of counting. We're feeling okay about that part. Let me do something where, for example, I want to remove all of the occurrences of that character. I'm going to write this two different ways to highlight a little bit about how things work.

I'm going to design a remove occurrences function that, given a character and a string, will return to you a new string where all the occurrences of ch have been removed. Easy enough. It's not going to modify the original string. It's going to return a new one. Here's my strategy. I could go through the manipulations of trying to take the characters out in place and figuring out where I'm at, but often, the easier way to do this is to build up the result: decide when to append or concatenate a character from the original string and when to ignore it and go past it.

I can do something like this, where if the character I've just seen is not the one that I'm trying to avoid, then I can just add it into the result. When I'm done, I have the result. If I do this and I change this call down here to remove occurrences, I'm counting on the fact that result is initialized to the empty string. I didn't actually say anything there. I could, for example, explicitly initialize it to the empty string, and that doesn't change anything about it, and you might feel a little better about seeing that explicit initialization, but C++ programmers are very used to seeing strings declared without an initializer and knowing that that means they got the default initialization to the empty string.

When I didn't find the character I was looking for, I [inaudible] the result. I'm going to switch this up just a little bit. I'm going to change remove occurrences so that instead of making a new string, it actually modifies the string that I have. I'm going to change my code down here to match what's going to happen. I'm going to set it up like this, and then I'm going to call remove occurrences with C and S, and then I'm going to print out S afterwards.

In this case, I don't expect there to be a second string created. I expect us to go in and modify that string in place, removing some of those characters to make this work. I could do this thing where I'm walking down the string character by character and then deciding whether to collapse over it. I'm actually going to change my strategy entirely just so I have a little practice using some of the other routines, and I'm going to end up using the string find.

We'll start with this. I can actually do this: S.find(ch), and if I don't give that second argument, it's going to start from the very beginning and look all the way through and see if it finds it. I'm going to hold that result in a variable, and I'm going to say while found equals the result of calling find, and then I'm going to stick the comparison in. This is a very C++ way of coding. It's tightly combining things. In this case, I have an assignment and a comparison all in the test of the while loop.

I'm making the call to ask the string to find a particular character, storing that result in an integer here so I can use it, and then comparing that result to string::npos. string::npos is a little bit of a funny C++ syntax, but the way to read it is that string:: says within the string class, scoped within string. There's a particular constant called npos, which is used as the return value in cases like find when it's looking for something and doesn't find it. Making npos part of the class is a way of keeping it from conflicting and interfering with any other usages where you might have variables named npos or similar.

It's tied to the string class through the scoping mechanism. I check and see if it's not string::npos, and if it's not, then I go into the loop here and I can do an erase of one character at that position. Erase takes the position and the count and removes the number of characters I specified from that position. Then, it will come back around. I have passed the string by reference coming into here, and that's a very important part of what's happening, because these calls to erase are modifying the string. If I had not passed the string by reference, they'd be operating on a copy.

I'd go through all the trouble of erasing all the Cs in my copy, but when I got back out to the main call, none of those effects would have been permanent. Passing by reference really means that what remove occurrences got was access to the original S. I should change these names. I'll call this one my string out here so that we don't get any confusion about the two names. The my string variable in main is really being accessed by remove occurrences without a copy. It's reaching back out into main and making changes to my string itself.

What does it not like about that? Oh, pos. I called it found. I've achieved the same thing. That's one of the things about the string library: it's so big and has so many different ways of doing things that often two people or ten people writing the same task won't even come up with the same solutions. I could have used a replace, where I replaced the character it found with the empty string.

I can build it up through concatenation. I can take it down with erase and replace. I could insert the other way around. There are a bunch of things I can do that in the end will achieve the same effect but show that there are a lot of ways to accomplish the same things. The library is pretty rich and has a wide variety of tools in it.

I'm going to make one change to this to show you how I can make it slightly more efficient. This is silly because strings are typically very short, so it doesn't really matter, but I'm going to do this and I'm going to use found as my index on subsequent calls. I can start found at zero, so the first search runs from zero, and then pass found to find so that any subsequent calls will pick up where I left off. The next time around through the loop, found is at the place where I found a previous occurrence of that character and it says starting from that position now, look forward and see if you see any more from here to the end.

For a very long string like this, it ends up doing a lot less work. It doesn't start at the beginning each time. It just picks up where the previous occurrence was found and goes from there to the end. It's a small change, but no big deal.
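
A minimal sketch of that change, assuming the same shape of function as before (the name `RemoveOccurrencesFrom` is made up here to keep it distinct): find takes an optional second argument giving the position to start searching from.

```cpp
#include <string>

// Same effect as the simpler loop, but each search resumes at the
// position of the previous match instead of restarting at index 0.
void RemoveOccurrencesFrom(std::string& s, char ch) {
    std::string::size_type found = 0;  // first search starts at the beginning
    while ((found = s.find(ch, found)) != std::string::npos) {
        // After erasing, the next character slides into position found,
        // so searching from found again does not skip anything.
        s.erase(found, 1);
    }
}
```

On "mississippi" with 's', this leaves "miiippi", the same result the simpler version would give.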

Basically, what you're saying is find needs to return something that says I didn't find it. It could return to you zero, one, two, three, all these indices. It actually needs to return something else to you, so it uses a special sentinel value that says I didn't find it. You might think that might be negative one or some other thing. A good programming form would be to have a constant for it so that you don't have any magic numbers embedded in your code. That constant is defined as part of the string class. The syntax for accessing that constant of the string class is the string class name, two colons, then npos: string::npos.

It's basically the syntax for I have a constant that was defined within a class. How do I get to it? I use its class name, two colons and then the name of the constant. It's just C++ for something that in Java looks a little bit more like class name dot. Question?

Student:Sometimes I see a function declaration before [inaudible] and then the definition afterwards. Is that just a matter of preference?

Instructor (Julie Zelenski):It totally is. Probably I'm being a little bit lazy in class, which is if I put the function definition up here, then I can call it down here because it's already been seen. If I put it down here, then I need a prototype up there. The prototype means I have to be a little bit more careful. When I change the name, I have to change it in both places. If I change the argument, I have to change it in both places. The trade-off, of course, is that when you read the code, it probably reads a little better to say here's the main which makes calls to A, B and C.

Some of it has to do with it being a little bit harder to maintain in that form, but I think it's easier to read when you're done. You're totally free to do it either way. You should probably pick a strategy and go with it. Maintaining the prototypes is not really that much work once you get used to it, and I think in the end, it probably is a little bit cleaner. When I'm being lazy in class, I'm much more likely to just throw the definitions up there to save myself some time. It's good to note that there are a lot of things that will slip by me if I'm not being careful.
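
To make the two orderings concrete, here is a small sketch with made-up function names (`Shout` and `Greeting` are illustrative only, not from the lecture). The prototype lets `Greeting` call `Shout` even though the definition of `Shout` appears further down the file.

```cpp
#include <string>

// Prototype: announces the name, parameter types and return type
// so that code above the definition can call the function.
std::string Shout(const std::string& s);

// The caller can now come first in the file.
std::string Greeting() {
    return Shout("hello");
}

// Definition: must agree with the prototype. If the name or the
// arguments change, both places have to be updated.
std::string Shout(const std::string& s) {
    return s + "!";
}
```

Moving the definition of `Shout` above `Greeting` would make the prototype unnecessary, which is the "lazy" ordering mentioned above.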

Let me go back and pick up a few last details about string that I don't want to overlook before I move away from this. There are library functions that are need to know. I have them sketched out in a couple places, and you can look at them and see what they do. Knowing they're there and then learning about them as you encounter them is a fine strategy. There are a couple additions in our [inaudible] which is a 106 specific header file which are just some things that for one reason or another are a little bit harder or more annoying to do using the standard tools than we think is worth putting on your plate for now.

We have two, convert to upper and lowercase, that given a string just convert it to its upper or lowercase equivalent. There are some things that do conversion between string and integer and between string and real when you have it in one form and you need it in the other. Here's something that just does that for you as part of the string library. It's just some simple things that you might find yourself needing and you just want to know they're there.

Here is something that is a little bit of a bummer. Part of the legacy of C++ being built on C means that every now and then, there's a little bit of a history in our deep, dark past that pops its head up in ways that are a little bit surprising. For string, it turns out there is a little bit of a weirdness here that I want to point out before you run into it the hard way. There is a notion of the old style C string.

The original C language didn't have a string class. It actually doesn't have any object [inaudible] features at all. It did have, though, some other more primitive handling of sequences of characters. This is a very common [inaudible] to have something. I've put in parentheses what it actually is. It's [inaudible] an alternative. Don't worry about what that phrase means. That's just for those of you who have seen it a little bit before.

That would be fine. We have this better string object that has all these fancy features, so you'd think we could just use that and ignore the fact that the other one is there. Almost always, 99 percent of the time, that's exactly how it's going to work. It does turn out that there are a few situations where this old style string pops its head up and gets a little bit in our way. One way that may be a little bit of a surprise is that the string literals are actually C strings.

When you see an open quote, some characters and then a closed quote, the compiler interprets that as a C style string. It also has a mechanism by which if you tried to use it in a context where you needed a C++ string, it will automatically convert it for you. It will take the old style string and make a C++ string out of it. That means that basically I can use them wherever I want and it will mostly work out.

There is a way you can deliberately force it, if you use what looks like a typecast here. This is actually calling the string constructor, and you pass a string literal or string constant. It will turn it into a C++ string manually there. It's going to turn out that you might need to know this. There's also the other problem of what if I have it in one form and I want it in the other form? I have the old form and I want the new form, or I have the new form and I want the old form.
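
A short sketch of both directions of "literal becomes C++ string" (the helper names `CountChars` and `MakeGreeting` are invented for illustration):

```cpp
#include <string>

// Expects a C++ string. Passing a literal works because the compiler
// converts the C-style literal automatically.
std::string::size_type CountChars(const std::string& s) {
    return s.length();
}

std::string MakeGreeting() {
    // Explicit: looks like a typecast, but it calls the string constructor.
    std::string forced = std::string("hello");
    // Implicit: the literal is converted when it initializes the variable.
    std::string automatic = "world";
    return forced + ", " + automatic;
}
```

Here `CountChars("abc")` yields 3 and `MakeGreeting()` yields "hello, world".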

There is a member function on the string class that will return to you an old style string from a new style C++ string, and it's called the c_str [inaudible]. They let you convert. Why do you care? It turns out there's one thing you'll definitely run into, which is when you're opening a file stream, you want to say this is the file on disk that you want to identify. It turns out that that library requires the use of a C string as the name. There was a little bit of an issue trying to get all the libraries to come together at the right time, and it turns out the stream library got finalized before the string library was done, and so it depended on what was available at the time, which was the old style string.

Even years later when they're both happily debugged and working, it is still the case that when you use the stream library, you have to describe the file you want by using the old style string. If you had a C++ string variable that held the name you wanted, you'll actually have to convert it. Converting in the other direction comes up in one case, and I'm going to show you this one.
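
A minimal sketch of that conversion, using a hypothetical helper named `CanOpen`: the C++ string holding the file name is turned into an old style string with `c_str()` before it is handed to the file stream.

```cpp
#include <fstream>
#include <string>

// Reports whether the named file can be opened for reading.
bool CanOpen(const std::string& filename) {
    std::ifstream in;
    // open is given the old style C string version of the name
    in.open(filename.c_str());
    return in.is_open();
}
```

Compilers that follow C++11 or later also accept a `std::string` directly in `open`, but the `c_str()` form shown here works in both older and newer environments.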

It has to do with concatenation. The plus operator that does concatenation really wants to work on C++ style strings, so if one of your operands is a C++ string, it's all fine, as long as the left or the right side is a C++ string. The other side can be a string literal, a constant, another string, a character variable. All those things work fine. As long as at least one of the operands really is a true C++ string already, you're good.

That's almost always the case. But in the case where you somehow have two things on either side of the plus, neither of which is already a C++ string, you are not going to get concatenation. Typically, that means you have a C string on one or both sides, or a character on one or both sides. If you try to add two C style strings, it actually won't compile.

The sad thing about these other two cases, taking a string literal and adding either a character constant or a character variable, is that it does compile and it just does not do what you want at all. It does so in a silent but deadly way. I'm not going to tell you what it does, but if you are curious, you can come and talk to me and I'll lay it out for you. What I want you to come away with is this memory: when I'm using concatenation, be sure that one of the two operands is a C++ string. If you have to, force one. If you have a string literal and you want it to be a C++ string, then make it one to avoid running into this.

The mistake that you get from this is actually quite mystical and very confusing. Probably 95 percent of you would never run into this, and so mostly, I've just confused you for reasons that seem unclear, but for the five percent that are going to run into this, I'm really trying to do you a favor by giving you a heads up before it causes you a lot of grief later. Just a little bit of a legacy. C and C++ go back a long way, and as a result, we sometimes have little quirks we have to deal with even in the modern world.
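
As an illustration of the "force one operand" advice (the function names here are invented for the example), these forms all have a true C++ string on at least one side of each plus:

```cpp
#include <string>

// Forces the literal into a C++ string so that + means concatenation.
std::string AddSuffix(char ch) {
    return std::string("item-") + ch;
    // Risky form to avoid:  "item-" + ch
    // It compiles, but neither operand is a C++ string,
    // so the result is not concatenation.
}

// Already safe: the left operand is a C++ string variable, and each
// intermediate result is a C++ string as well.
std::string Label(const std::string& name, char grade) {
    return name + ": " + grade;
}
```

`AddSuffix('7')` yields "item-7" and `Label("Ann", 'A')` yields "Ann: A".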

Student:Is that the only way you can convert a C string into a C++ string?

Instructor (Julie Zelenski):Not exactly. The truth is that a lot of the time, it just happens automatically. In almost all situations, if you had a routine that expected a string argument and you passed the string literal, it will automatically convert. Mostly, you won't need to do this. It happens all the time behind the scenes without any effort on your part. This is the official way to say I've got a string and I really want to force it and I'm not waiting for the compiler to do it on my behalf.

In particular, for example, in a situation like this, it doesn't realize what you really wanted to do was convert this and then do concatenation. It does something kind of goofy based on what the old meaning of taking a C string and adding a character to it was, which was not concatenation. That's a little moment of silence for old language C that comes back to haunt us a little bit like a ghost in the attic.

Student:What do you mean by a string literal?

Instructor (Julie Zelenski):A string literal just means a string constant. It's something in quotes.

Student:Without explicitly declaring it to be a string.

Instructor (Julie Zelenski):Yeah. A string literal is when you see open double quotes, some characters, close quote. That's a string constant or a string literal. In any situation where you see exactly that, it's a literal, not a string variable, is basically what I'm saying there.

How do we do IO? How do we do input/output in C++? Let me first say that input/output is probably one of the more distinctive features of any language. C's IO, for example, looks very different than C++'s IO, which looks kind of different from Java's IO. These are areas where for some reason, even though they all do the same things underneath it all, they differ. They let you print stuff. They let you read stuff. They have some formatting features.

For one reason or another, these are the areas where they're widely divergent in their syntax and the way you express what you want to do. That makes them particularly annoying to learn, to tell the truth. I know a lot of IO systems, and they're all very jumbled up in my head. At any given moment if you asked me how could I print a decimal number with three digits of precision in this language, I'm going to have to go look it up. My motto is look it up. Don't worry about memorizing these details, because they are very tied to any particular language and its formatting system.

That said, we're going to use a little bit of IO. We'll need to be able to read and write things to the console to interact with the user. We're going to do a little bit of file reading, reading numbers and strings from files, maybe even producing some files. We're going to use a very simple set of features. We're not going to go too deep. When you need to know more, there are great resources to go check into for that. I wouldn't in advance go make yourself an expert on any form of IO. Figure it out when you need to.

IO is actually handled in C++ using stream objects. There are stream classes. The ostream is the output stream class that's used for writing. The istream is the class used for reading. There are variants, for example the ifstream and the ofstream, which are the file equivalents of the input and output streams. Cout and cin are two global variables, effectively, that give you access to the console: cout for the console output stream and cin for the console input stream. That means the little text window that pops up where you get to type and print things for the user to see and interact with.

The standard operators for reading and writing to the stream in the default sense are the <<, which is stream insertion, and >>, which is stream extraction. You stick things onto a stream and then retrieve things back from a stream that you're reading from. A very simple example of this would be I have the variables X and Y declared here. I asked the user to enter two numbers, and then I use extraction that says from the console input stream to pull an integer out followed by another integer, and then I repeat back what they said.

In its simplest form, the kind of things you can print out are very related to things you can read in. When I ask cin to read an integer here, it looks for a sequence of digits upcoming in the stream that form a valid integer which it assembles and puts into the value X. Then it looks for another one. It typically uses white space as a delimiter, so any returns, tabs or spaces will be skipped over in between. Anything that led up to X, it will skip over all the white space, look for some digits and then skip over any intervening white space, and look for some more digits to pull Y.

Of course, what's likely to happen here is users are bad typists. They make mistakes, so when I go to read this, what happens if they've typed the letter A or 72A45? This causes a little bit of havoc because when it goes to extract that, it looks for some digits and it finds this thing and it doesn't match its expectation. That puts the stream in what is called an error or fail state, which then requires you to dig around, realize it went into a fail state, clean it up, reset and start over.

It's not that it can't be done, but it's a little bit annoying. We just made this task a little bit less onerous by providing the simpio library, which is our CS106 simple IO. It has get integer, get real and get line. They all read from the console, so they read from cin, and they deal with all that error handling. They make sure that the input given was well formed. If it's not, it reprompts and has the user try again. It does that until they enter an integer.

When you call get integer, you know that eventually, the user will have typed in a well-formed integer and you will get that value back when you make that call. You don't have to be worried about all the machinations to check for errors. Retry is actually bundled up behind that routine for you. Most of our console input will end up using these functions just for convenience. They save us a certain amount of hassle.

If I wanted X and Y, I would say get integer once, then get integer a second time. I'd have to call it twice. There's not a combined form of it. It saves us a lot of trouble. I can't do it the cin way. I'd have to stop it after one anyway, check to see if it failed, and if not, go back in. It's kind of misleading to even show this form, because that form assumes that the user is a perfect typist and never makes mistakes, which is in this day and age not too likely.
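
The CS106 library's own code is not shown in the lecture, so the following is not its implementation. It is only a sketch, under assumed names and an assumed reprompt message, of how a read-with-retry routine of this kind could be built from standard pieces: read a whole line, check that it holds exactly one integer, and ask again otherwise.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a "get integer" style routine.
// Keeps reading lines until one contains a well-formed integer.
int ReadIntegerWithRetry(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream parser(line);
        int value;
        char leftover;
        // Accept only if an integer was read and nothing but
        // white space follows it on the line.
        if ((parser >> value) && !(parser >> leftover)) {
            return value;
        }
        out << "Not a valid integer. Try again: ";
    }
    return 0;  // input ran out before a valid integer appeared
}
```

Reading X and Y then takes two calls, one per value, such as `int x = ReadIntegerWithRetry(std::cin, std::cout);` followed by the same call for `y`. A line like 72A45 is rejected as a whole rather than being read as 72.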

We'll talk more about file streams on Wednesday. On your way out, look for handout five and seven for Mac or PC depending on what you're using and good luck getting your compiler set up. I'll see you on Wednesday.

[End of Audio]

Duration: 48 minutes