Instructor (Mehran Sahami): So welcome back! So yet another fun-filled, exciting day of 106A. After the break, it almost feels like... I came into the office this morning, not that I wasn't in the office during the entire break, but I came into the office this morning and it felt like a new quarter had started. And I was like, oh, it's been a whole week. And I'm sure for you, it feels like you just wish a new quarter was starting because we still have two weeks left.
So a couple quick announcements before we get into things. One is there is one handout, which is your section handout for this week. And kind of one of the themes of this week is bigness. In some sense writing bigger programs, bigger data structures, that's the whole deal. And we'll kind of talk about that as we go along.
Another quick announcement, just wondering how many people tried the Name Surfer demo online and had a problem with it? You folks, yeah, we updated it. So evidently there was some issue that only shows up on Windows XP with Java 1.6. And like, if you had a Mac you didn't see it, if you had Vista, presumably, you didn't see it, if you had Java 1.5 you didn't see it. But in that one case, it would come up, so the Name Surfer web applet demo was updated a few days ago, I think on Friday, maybe. So now it should work for everyone, hopefully. If you still have an issue, let me know. The only thing that you'll see now, though, if you try running this applet, is that the interactors, instead of being on the south border of the screen, are on the north border of the screen. That was just a little hack we had to put in there to get things to work. The functionality is exactly the same.
As a matter of fact, if you want, you can put your interactors on the north border instead of the south. It'll make no difference to the rest of your program other than where you say south for adding your interactors, you say north, that's the only place it makes a difference. But you actually see that in the web applet version, the interactors are just in the north border instead of the south. Otherwise, it doesn't make any difference. But in case you saw that and were like freaked out, there's nothing to worry about. Okay. So also, I hope you had a good break. Just wondering, how many of you actually enjoyed your week break? Good time. And how many were working most of that break? Yeah. Good times. Hopefully, it didn't cause you too much pain, but if it did, hopefully, you're like, all caught up or ahead of the game in all your classes, now, so life is good.
So what I want to spend a little bit more time talking about today, well, actually a lot of time talking about today, is thinking about data structures, building large-scale data structures. And we began to talk about it just a little bit before the break and it's been a while so we're going to review it a little bit and kind of build up even more. But one of the things we talked about, in the past, right, is that what a lot of our computers do is they manage data. They manage lots of data. And in fact, I would venture to guess that there's a whole bunch of applications out there that manage a whole bunch of data about you, but you may not have thought about all the data they actually manage. So some of the things that actually come up, for example, online stores, right. Anyone actually bought anything online, just wondering. Yeah.
There's a huge amount of data that's involved with that. Not only the particular transactions you make when you buy something, but keeping track of accurate transactions, figuring out things like people who buy product X also tend to buy product Y. All of that is data management. And what makes those companies successful is they just do a very good job of managing their data. Okay. There are other things like, I'm almost frightened to ask, but social networks, like Facebook, or MySpace, or Orkut, or Friendster, or LinkedIn, or you could just keep going on. Anyone on a social network? Just wondering. Yeah. That's good because your next assignment is going to be to implement one so you can see what it's actually like. But that will be coming in a couple of days. And they're not that hard, really. But what it is is a data management problem. Right? And it keeps track of things like who you are and information in your profile in the social network, and who your friends are, and all that happy news.
Or you know, even things like our friend, web search, right? There's a huge amount of data you need to be able to keep track of to be able to do web search, right? So all these things are all about managing data well and so part of this class, right, is you've got a whole bunch of experience in terms of building up code, and different kinds of classes, and doing nice things with user interfaces, and the whole deal. And one of the things that we need to spend a little bit more time on is talking about how do you manage lots of data and then do something interesting with that. Okay. So here's some principles to think about, if we think about good software engineering, some of the principles of thinking about data kind of in the large. Okay.
When you think about keeping track of lots of data, one of the things you want to think about is the information you want to keep track of, what are the nouns you want to keep track of. And you're like, I don't want any nouns, what do you mean by the nouns? Let's say I was writing an application that was an online store, to keep track of, oh, let's say, music. And so one of the things I would want to think about is, what are the nouns that are associated with music? You're like, okay, now you're really getting weird. No, it's pretty simple. Things like a song, right, is a noun that's associated with music, or an album, or an artist, right. And so what you want to think about is that the things that are the nouns in the domain that you're dealing with oftentimes end up translating into what your classes are. So you may end up having a class that keeps track of information about a particular song or a class that keeps track of information about a particular album.
So the good linguists out there tell me we not only have nouns but we also have verbs, unless you happen to be talking to my son, who seems to only have nouns, but that's a different story. And he loves jarens by the way. But like, why are you telling me this? Just cause it's fun, because I just spent a whole week dealing with it. In terms of verbs, these are oftentimes the methods that are associated with your classes, right. So when you want to do something, some noun takes some action, which is a verb, which is to say some class has some method that operates on that class. So at an abstract level that's what you want to think about in terms of high-level principles of design. Now, there are some other sort of more concrete things that you might want to think about, things that have to do with what are the characteristics of the data you actually want to store. So one thing that comes up, oftentimes, is the notion of having a unique identifier. What do I mean by unique identifier? All of you have unique identifiers, whether you like it or not, as a result of being at Stanford. Your Stanford University I.D. number is a unique identifier for you at Stanford. Every student has an I.D. number. Okay. So it identifies you and it's unique. No two students share the same I.D. number.
So you get issued this number when you show up here and you have it for life. When you leave it's still with you. I know, I left, I came back, I have the same student I.D. number. It just exists and this uniquely identifies you. And in different cases you might want to think about what are unique identifiers. Right. So in some cases, for example, if you had a social network you might consider the names of people in the social network to be identifiers, or say the names of their profiles, for example. In other cases, you might have something different. If you're managing a store you might have some I.D. number for books, an ISBN number, or if you're keeping track of music, you might say that the combination of the song's name and the band that plays it is a unique identifier for that song. In some cases the unique identifier can be a combination of things. But if you think about your data having a unique identifier, that also gives you some insights about what kind of data structures you might want to use to keep track of certain things. One other unique identifier that some of you've already grappled with is, say, in Name Surfer. Right? If you think about the data in Name Surfer what's the unique identifier there?
Instructor (Mehran Sahami): Name, right? Name is a unique identifier and for every name you have some list of values associated with it, which was the rank of that name over the last century in terms of how popular it was for names. But every name, well, I shouldn't say every name has some value associated with it, but every unique identifier in the system has some value associated with it and only one set of values. And so the important thing to keep track of there is when you actually are doing your Name Surfer assignment, the fact that this thing is a unique identifier can potentially help you keep track of the data that you're using. And we'll sort of go into that as we go along in the class. Okay?
So some other principles we can kind of think about in terms of designing data structures, in terms of actually doing the design: there's some questions you want to ask yourself. And the question you want to ask yourself is, are you keeping track of some collection of objects? Right? So there's some collection of objects in your data that you want to have. And if you have a collection of objects, say in an online music store, you might have a collection of songs that you want to keep track of, this word should be a tip-off too, that perhaps there's an interesting collection that exists in Java that would be a way of keeping track of that information. It may not be in Java if you're programming in some other language. But the fact is that Java has something called a collection, and the reason why they gave the name of collections to a certain group of stuff is because they're used to keep track of a collection of objects. And the question that you ask yourself then is what collections do you actually want to use? Okay.
So with that said, what we can do is spend a moment, and it will be a brief moment, revisiting the collection hierarchy. Right? You've seen this picture before but I'm just showing it to you again, because the last time you saw it was like two weeks ago, which is a lifetime in a quarter. Right? I think it is, too; it's about a fifth of the quarter. You're like, oh, what was I doing two weeks ago? Was it Breakout, was I learning println? No, no it wasn't that long ago. But what you were learning about, a little bit, was collections. And so there are some collections, for example, like an ArrayList, that going all the way up the chain of the hierarchy is itself a collection. Or there are other things, for example, like a HashMap. And a HashMap, if you said hey, I have some HashMap, the set of keys in that HashMap ends up actually being a set, which happens to be a collection. Okay. And so what you want to think about is, do I have different things that I can use to keep track of things? Like an ArrayList is one way to keep track of things. A HashMap may be another way of keeping track of things. When is the appropriate time to use one thing versus another? And so when you want to think about the appropriate times for one versus the other, you want to think about what are the methods that a collection provides to you. And it turns out all collections that implement the Collection interface, like the ArrayList or the key set of the HashMap, have all of these properties.
And some of you have seen them before, but just to review. You can add a value, right? So this is a parameterized value type. Like you can have an ArrayList of strings and you can add some value to it, and it adds it to the collection, and little did you know, or maybe you did know, but at the time we didn't really care about it, it returned a boolean. Most of the time it just returned the boolean and we didn't care about it. But it actually returned true if the collection changed. So in an ArrayList, it always returned true because when you were adding a value, it didn't care about duplicates, it would always just add them to the end and always return true. Some collections, like sets, actually don't allow you to have duplicates. So if you try to add something to a set that already has the value you're trying to add, it will not change the set and return false, because it says, hey, I already have that value and nothing changed.
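The return value of add can be seen in a small sketch like the following. It is only an illustration, not code from the lecture: the class name AddReturnValueDemo and the helper addTwice are made-up names, and it relies only on the standard java.util.ArrayList and java.util.HashSet classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical demo class (not from the lecture).
class AddReturnValueDemo {

    // Adds the same value twice and returns what add() reported the second time.
    static boolean addTwice(Collection<String> collection, String value) {
        collection.add(value);        // first add changes any empty collection
        return collection.add(value); // second add: did the collection change again?
    }

    public static void main(String[] args) {
        // An ArrayList keeps duplicates, so the second add still changes it.
        System.out.println(addTwice(new ArrayList<String>(), "hello")); // prints true

        // A HashSet refuses duplicates, so the second add changes nothing.
        System.out.println(addTwice(new HashSet<String>(), "hello"));   // prints false
    }
}
```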
A couple other things that you should know about, most of these you've seen. Remove removes the first instance of an element as it appears and returns true if a match is found or returns false if it didn't find anything to actually remove. And clear, basically, just sort of nukes the whole collection. It just says get rid of everything in the collection. I'm done with that and the collection is dead. Actually, the collection is not dead to you, it still exists, it's just an empty shell of what it was before. There are violins playing in the background. And then size, you can get the size of the collection. You've seen this, you've probably used a lot of these before in your programs. Contains, that's an important one, right? You actually want to see if a collection contains some particular value, or if a collection is empty. And here's one that's sort of interesting that we talked about a little bit, but we didn't actually talk about the fact that all collections can give you one of these. All collections can give you an iterator.
So we talked about, for example, having an iterator over the key set of a HashMap. That's one thing we did before. We said we had some HashMap that's a map from strings to some other strings. And we say, hey, what I want is to get a set of all the keys and I want to iterate over all those keys. That's great! You can do that and that's perfectly fine. When we used ArrayList we always had like a for loop and said, oh, from zero up to the size of the ArrayList do something. But if we actually wanted to, we could have an iterator over the ArrayList and then this would give us the elements of that ArrayList one at a time. So because an ArrayList is a collection it can also give us an iterator. And that's just something to keep in mind, that there's common patterns that get used in programming. One of the common patterns that gets used is known as an iteration pattern, which again, is an iterator over some collection and you just go through and do something like print out the values for every element of that collection. And if you want to write it in the most general case, you don't care if that collection happens to be the key set of the HashMap, or an ArrayList, or whatever, you just say, hey, you're a collection, give me an iterator and I can go through all your elements one at a time and, for example, print them out. Okay. So there's just simple patterns that we get into.
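As a rough sketch of that iteration pattern, the method below accepts any Collection and walks it with an iterator, without caring whether it was handed an ArrayList or the key set of a HashMap. The class name IterationPatternDemo and the method joinAll are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;

// Hypothetical demo class (not from the lecture).
class IterationPatternDemo {

    // Works for any Collection: ask it for an iterator, then visit
    // the elements one at a time, joining them into a single string.
    static <T> String joinAll(Collection<T> collection) {
        StringBuilder result = new StringBuilder();
        Iterator<T> it = collection.iterator();
        while (it.hasNext()) {
            result.append(it.next());
            if (it.hasNext()) {
                result.append(", ");
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        ArrayList<String> bands = new ArrayList<String>();
        bands.add("Peter Gabriel");
        bands.add("Led Zeppelin");
        System.out.println(joinAll(bands)); // an ArrayList is a Collection

        HashMap<String, String> songToBand = new HashMap<String, String>();
        songToBand.put("In Your Eyes", "Peter Gabriel");
        System.out.println(joinAll(songToBand.keySet())); // so is a key set
    }
}
```

Note that an ArrayList iterates in insertion order, while a HashMap key set makes no promise about order.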
Now, you're like, okay, Mehran, that's fine, you told me some design principles over here, you told me about some collections over here. Show me something concrete, like put it all together. So let's actually put it all together. Okay. And we'll do a little example, which is going to be an online music store. And because many names for online music stores are already taken, our music store is going to be called Flytunes cause they're tunes that will fly. All right. Yeah, man, when you're like in your mid-30s you just can't be that cool. But trust me, it is. Okay? So we're going to make a little store that just keeps track of music and albums, and actually lets us keep track of information and prices. And so what we want to think about is what are the things that we actually are going to do in that store, okay? So one of the nouns of that store is going to be a song, okay? So a song is some basic thing that we're going to sell. This is what we want to be able to do with the song. Now, you could say, well, what does that mean, do I have some method called sell? If we're doing inventory management we might not actually have a method called selling a song, but we might, for example, want to do things for inventory like add songs.
And similarly, songs, oftentimes, are put together into albums. Okay, so we may also want to keep track of albums and do things like add albums to our inventory. Now, the interesting thing with an online music store that differentiates it from say a physical music store, is you can do interesting things, right? You can actually have songs that are not on any albums. And that works, right? It's kind of like thinking of a single, right. When you go and buy a single somewhere. In the days of yore, you could actually buy a little record single that had two sides on it so you got two songs, so it wasn't really a notion of a real single, single. I guess now, there are like CD singles. But who wants a CD single when it comes down to it? You can get songs that are on albums. At the same time, you can have the same song be on multiple albums, right? That always happens. There's a band, I won't mention their name, but I remember from the early '80s, they had two albums. They had their first album and they had their best-of album, which was half the songs from their first album. Just anything you can do to milk the consumer. But basically, what that meant was songs can show up on multiple albums. Okay, so we want to begin to think about how that might actually affect our design.
Now, if we think about putting the information together, right, nouns become our classes. So if we're going to have song as a noun, we're probably going to have some class song that's going to keep track of all the information associated with a song. And so just for the sake of brevity, I'll tell you what information's going to be associated with songs that we care about in our store. There's a notion of the name of the song, the band or artist that performs that song, and then a price, because we're going to allow for songs to be sold individually, so individual songs, as opposed to whole albums, have prices. Okay. And you can think about these things and think about, oh, what data types do you want to have for them. Right, so what data type makes sense for a name, for example? String type, or if you want to have a band name, this would probably be a string. Price is always an interesting one. You could sort of say, well, now, there's multiple things I could have it be. I could have it be an [inaudible], for example, if I was going to have it be the number of cents. In the simplest case, I'm just going to have it be a double. Even though we know there's no fractional money unless you're a banker, in which case there is fractional money. But we won't talk about that right now.
It was just like Superman III. Anyone see that movie? No. It's not worth watching, trust me. But fractional money does exist outside of movies, bad movies in Hollywood. So that's the information we want to keep track of for a song and then we want to think about what are some of the things that we want to be able to do in relation to those songs. The other thing we also want to think about is our friend the unique identifier. Is there some unique identifier for a song? And this is one of those things where you really need to think about the application that you're using, what assumptions you can make. We might like to say that the name is a unique identifier for a song, but unfortunately, there are many songs that have the same name. Okay.
But I would venture to guess that the combination of the name and the band would perhaps be a unique identifier for a song. The only thing is we don't have one string that we keep track of that keeps name and band in it. So that's another thing that we need to think about, and we'll get into code when we get into code, when we think about the design of that particular object. The other thing we need to think about is what changes in an object during its lifetime and what doesn't change. Like, so if I have a song, its name and the band that made it stay fixed for that particular song. Like, some other band can go and cover the song, but that's a different song; the name and the band name don't change for a song. But hey, it can go on sale and you know, I can jack up the price at the holidays and all that kind of stuff. So the price is something that's malleable.
So another thing you think about in terms of the principles of design is, of the data that I have associated with a particular object, what's going to remain static when that object's created and what's going to be potentially changed? And that's what gives you some insight about what's some of the data, for example, that you only get from an object, what's some data that you can potentially set in the object, and if you think about what potentially uniquely identifies that object, what data do you actually need at the time that you construct the object, right. To say this object is actually some particular unique thing that I care about. Okay. So let's turn that into a little bit of code just to make it a little more concrete. So we'll get rid of our friend, PowerPoint, and we fire up our friend, Eclipse. Ah, and look, a song, how convenient. So here's the information to keep track of a song. It's just a class called song. And what we want to do is keep track of the song's name, the band name, and the price.
So when we create the song, one of the things we might do is say, hey, give me all that information to start with. Because if you're going to put some song in your store and you're going to sell it, it better have some song name and band name that I can use to refer to it by, because that's going to be its unique identifier, and give me some initial starting price. Now, we might not necessarily require an initial starting price, because it's something that's going to change during the duration of the program, and isn't in support of our unique identifier. But in this case, we're just going to ask for an initial price. The thing we do care about, in terms of the malleability of what's actually in this data structure, is thinking about song name, band name, and price. So song name, we only have a getter for there. There's no setter. Once the object's created, you can't change the band name for that song. You can't say, oh yeah, you know, that was "In Your Eyes," by Peter Gabriel and now it's going to be like, "In Your Eyes," by Kanye West. Like, that's a different song. And I don't know if that's happened. It's probably not a good idea. But the song remains the same, if you're a Led Zeppelin fan, right? And the band name, actually, the band name is also going to remain the same for that particular song. But the price has both a getter and a setter. Right? Because it's something that's malleable. After we create that song, yeah, we might change its price. And because we know that, we provide both of those things in the definition of the class.
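A minimal sketch of a song class along these lines is shown below. The actual code on the screen is not part of the transcript, so the method names (getName, getBandName, getPrice, setPrice), the use of final fields, and the exact wording produced by toString are all assumptions made for illustration.

```java
// Sketch of a Song class: name and band are fixed at construction,
// while the price can change over the object's lifetime.
class Song {
    private final String title;  // part of the unique identifier, never changes
    private final String band;   // part of the unique identifier, never changes
    private double price;        // malleable: can go on sale or be raised

    public Song(String title, String band, double price) {
        this.title = title;
        this.band = band;
        this.price = price;
    }

    // Getters only: there is deliberately no setter for title or band.
    public String getName() {
        return title;
    }

    public String getBandName() {
        return band;
    }

    // The price has both a getter and a setter.
    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }

    // The \" escape puts a literal double quote around the title.
    @Override
    public String toString() {
        return "\"" + title + "\" by " + band + " costs " + price;
    }

    public static void main(String[] args) {
        Song song = new Song("In Your Eyes", "Peter Gabriel", 0.99);
        System.out.println(song);   // "In Your Eyes" by Peter Gabriel costs 0.99
        song.setPrice(1.29);        // only the price can be updated
        System.out.println(song.getPrice());
    }
}
```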
Now, as we talked about, in days of yore, whenever you create any class it should also have a method called toString. And toString just returns a string representation of the data in that class. So this just prints out inside double quotes, which is why we have this backslash, quote, that's a single double quote character, the title of the song in double quotes by the band name, and then it says cost, and it has the price associated with it. So it just returns a string that basically encapsulates the data. And here's the private instance variables of that particular class. Right. There's a title, a band, and a price for the title of the song, the band that made the song, and the price of the song. And that's all the information that's in there. But it captures and encapsulates the notion of having a song and what parts of the song are static or can't change, and what parts of the song are mutable or can change. Okay.
So besides songs, we also have this thing called albums. Any question about the song portion? If you're sort of feeling good with song, nod your head. All right, good times. If you're not feeling good with song, shake your head. If you're awake nod your head. There's a few that are not nodding, but that's okay. That's cool, too. So let's do the class for an album. So the class for an album is another thing we care about. And albums become a little more interesting because an album not only has a name, right, so this is going to be a name, and yeah, the name will probably be some string. And there's also a band, potentially, that produces the album. Now, the interesting thing is the band... you might say, but Mehran, isn't that redundant? Like, don't I have some album and it's going to have a bunch of songs on it, and so I already have names for the band for those songs? So why do I need the name of the band for the album?
Anyone know? Want to venture a guess? Anyone have an album that's like this: '80s compilation? Compilation is the critical word. Right. You can have an album whose band isn't actually a real band name. Its band name could just be something like compilation. And it's going to have a bunch of songs on it, each one of which has a distinct band. Okay. So that's perfectly fine. There's no reason why an album, especially in the online world when you can sort of create mixes all the time, needs to have a single band. And so there would be a need for having bands associated with songs. We still need to have bands associated with the songs. And potentially, at a higher level, we might want to be able to say, is this whole album by one band, or one artist, or is it actually a compilation. Okay? Now, the interesting part though, is that an album not only has a band and name, but it has a list of songs. So how might we keep track of that list of songs? What would be a reasonable data structure we could use?
Instructor (Mehran Sahami): An Array, our friend an Array. Well, the only problem with an Array is, right, it needs to have some fixed size. There's some albums out there that are very short, like "In-A-Gadda-Da-Vida," Iron Butterfly, there's one song that's one side of the album, if you were back in the LP days, and what a fine album it is. And there's other albums that are just like, oh, look there's like 300 songs on here. Okay.
So an Array with just a fixed size might potentially waste a lot of space. What's the more malleable version we could use?
Instructor (Mehran Sahami): Oh, yeah. I love it when it's just all around. All right.
Instructor (Mehran Sahami): [Inaudible] one, I think. Like that post Thanksgiving. It's like the tryptophan, still like working its way. Yeah. You know.
... albums, to begin with. How do we actually add some list of songs on it? We need to have a way to be able to add songs to this album, and once we actually add songs to a list of songs on the album, we need to have some way of being able to list the [inaudible], or perhaps, iterating over them. The thing with an ArrayList is it implements the Collection interface, so it actually provides you an iterator. Okay. So let's look at the code for that, just real quickly and then things will become more interesting, afterwards. Okay.
So here's an album. Inside an album we have an album name and a band, those are the things that we're going to start off with when constructing an album. So we say here's the initial album name and band, and what I want to do is build up the contents of that album. So it lets you get the album name and get the band name but you can't set them. Those things are fixed. Okay.
The other thing that I'm actually going to assume here, which is something I didn't assume for songs, is that the name of the album is a unique identifier for the album. Because if I can potentially have compilation albums that are compilations of multiple bands, so the band name is just something like compilation or maybe the band name is the empty string, the album name by itself should be a unique identifier.
Now, you might say, but Mehran, that's not true in the real world. I have multiple albums that have the same title on them. We're just going to assume that for the purposes of what we're doing here, and it'll be okay.
How do we build up the album? We have a notion of adding a song to an album and getting an iterator over the songs on the album. And so the way we do that is we're going to have something called songs. Let me show you songs down here. Songs is just an ArrayList of songs. Okay. And so if I want to add a song to the album, I pass it an actual song object and it adds it to its ArrayList. And if I want to list out all the songs that are on the album, I ask for an iterator over all the songs on the album. So what I actually get is an iterator over song objects. Okay.
toString just returns the title and the band, it doesn't actually list out all the songs. It just says, hey, it's just this name of this title and this band, and that's all that's in an album. Okay. Again, we think about what's mutable and what's not mutable.
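The album class described here might look roughly like the sketch below. The lecture's actual code is not in the transcript, so the names getAlbumName, getBandName, addSong, and getSongs, and the toString format, are assumptions; a stripped-down Song class is included only so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.Iterator;

// Sketch of an Album: name and band are fixed, the song list grows over time.
class Album {
    private final String title;  // assumed to be the album's unique identifier
    private final String band;   // may be something like "Compilation"
    private final ArrayList<Song> songs = new ArrayList<Song>();

    public Album(String title, String band) {
        this.title = title;
        this.band = band;
    }

    // Getters only: the album name and band cannot be changed later.
    public String getAlbumName() {
        return title;
    }

    public String getBandName() {
        return band;
    }

    // Takes an actual Song object and appends it to the album's list.
    public void addSong(Song song) {
        songs.add(song);
    }

    // ArrayList is a Collection, so it can hand back an iterator over its songs.
    public Iterator<Song> getSongs() {
        return songs.iterator();
    }

    // Reports only the title and band; it does not list the songs.
    @Override
    public String toString() {
        return "\"" + title + "\" by " + band;
    }

    public static void main(String[] args) {
        Album album = new Album("So", "Peter Gabriel");
        album.addSong(new Song("In Your Eyes", "Peter Gabriel"));
        System.out.println(album);
        Iterator<Song> it = album.getSongs();
        while (it.hasNext()) {
            System.out.println("  " + it.next().getName());
        }
    }
}

// Stripped-down stand-in for a song (price omitted to keep the sketch short).
class Song {
    private final String title;
    private final String band;

    public Song(String title, String band) {
        this.title = title;
        this.band = band;
    }

    public String getName() {
        return title;
    }

    public String getBandName() {
        return band;
    }
}
```

Because the album stores references to Song objects rather than copies, the same Song can be added to more than one Album, which matches the earlier point that a song may appear on several albums.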
Now, to put the whole store together, this is where things get a little more interesting. To put the whole store together, you need to think about what's the store going to do. So let me show you a simple store running and this is the basic text interface for a store. It's kind of like an online store circa 1995. Okay. So I can list out all the songs, I can list out all the albums, I can add a song, I can add an album. When the store starts, I have no songs or albums in the store. I need to add them all. I can list all the songs on a particular album and I can also update the price for a song. Okay.
So if I list out all the songs, it says all songs carried by the store and says nothing, because there's no songs that the store currently has. And if I list out all the albums carried by the store, it lists out nothing here, because there are no albums. But I can go ahead and do something like add a song. And let's say the song I want to add is "In Your Eyes," Peter Gabriel. Any Peter Gabriel fans out there? No? A little bit? Come on. Oh, man. I give up. It's all over. I just don't believe it. All right. We'll say the song is, I say, okay, it'll be 99 cents. Go get it. All right.
So we add a song and if we list all the songs, now we have here's the string representation of a song, "In Your Eyes," by Peter Gabriel, cost 99 cents. We still have no albums, right, we just have a particular song that we can potentially sell by itself and we don't have any albums. So we'll come back to this. But this is the basic idea. We want to be able to list all the songs and albums, add songs, add albums, and then list the information for a particular album. Okay.
So if we think about that, what we need is a bigger data structure to keep track of all this information about multiple songs and multiple albums. Okay.
Now, if we want to manage an inventory, the two things we have to keep in mind are also what I mentioned before. A song can exist in our data that is not on any particular album. So as a result it's not sufficient to just say what albums are carried by the store, because some songs may not be on any album, but we still sell them individually. So we need to have some notion of keeping track of a list of songs.
Now there's different things we could think of for a data structure to keep track of songs. One thing is an ArrayList, right. That's what we're using in albums to keep track of a whole list of songs. Another thing we could consider is a HashMap of songs. And so if we think about a map versus an ArrayList, the question that you want to think about gets back to this identifier question, right. Because if you want to have a map, say for example, from some string to song, and you want this string to uniquely identify a song, this string needs to be something that is a unique identifier. But a song doesn't have one string that's a unique identifier, its unique identifier's a combination of a name and a band.
And so there are all kinds of funky things that people consider. Oh, how can I connect those two strings together? People actually do that in real applications. We're not going to do that here. We're just gonna say, there's too much complexity in dealing with this, we're going to go for a much simpler approach and just say we're going to have an ArrayList of all of our songs and not worry about the unique identifier issue. So here we have an ArrayList of type Song, and we'll just call this songs; that's all the songs in our database. And so here we create a new ArrayList of Song and we call its constructor. Okay. Now, life in the album world's a little bit different. Besides just keeping track of a list of songs, we also need to keep track of albums. But in the album world the name is actually a unique identifier. And if we want to be able to look up albums quickly, it might make sense to use a HashMap. So part of doing this whole example is to actually show you both ArrayList and HashMap in one application.
So what we could do is have a HashMap that maps from strings to albums, where this string is in some sense the name of the album and this is the actual album object. And we'll call this albums, and we do all the new, you know, la de da HashMap, to actually create it. Okay. So now we have these two big data structures that actually keep track of stuff for us.
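The two declarations described here might look roughly like the sketch below. The `Song` and `Album` classes are minimal hypothetical stand-ins (their field names are assumptions, not the course's actual classes), and `StoreFieldsSketch` is just a container name invented for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;

class StoreFieldsSketch {
    // Hypothetical stand-in for the lecture's Song class.
    static class Song {
        final String name;
        final String band;
        double price;

        Song(String name, String band, double price) {
            this.name = name;
            this.band = band;
            this.price = price;
        }
    }

    // Hypothetical stand-in for the lecture's Album class.
    static class Album {
        final String name;
        final String band;
        final ArrayList<Song> songs = new ArrayList<Song>();

        Album(String name, String band) {
            this.name = name;
            this.band = band;
        }
    }

    // Every song the store carries, whether or not it is on an album.
    final ArrayList<Song> songs = new ArrayList<Song>();

    // Album name (a unique identifier) mapped to the album object.
    final HashMap<String, Album> albums = new HashMap<String, Album>();
}
```

The list suits songs because no single string identifies a song, while the map suits albums because the album name alone is treated as unique.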
Now, here's where things get a little bit funky. And when things get funky, what you're going to need when you deal with big data structures is a guide. And you'll see this in just a second, because you're going to see some of the code that we write gets very long when we deal with big data structures. So I'll be your guide. All right. So in the days of yore, I almost bought the whole outfit. But it's a little hot in here, under the lights. So in order to actually think about how you get the information and store the information when you have a large data structure, paper and pencil is your friend. Right. If you spend all your time just staring at a computer screen, it doesn't really allow you to internalize what your data structure really looks like and what's going on. So break out some pencil and paper, not right now but when you're working on data structures, and draw out, potentially, what things look like.
So here's songs, and songs is an ArrayList. And it's going to have multiple, let's say at this point, three songs in it. And over here we have albums, and albums is a HashMap that maps from the name of an album to a particular album object. Now, the important thing to keep in mind with objects, and this is kind of the whole key to big data structures, is that all objects, when you refer to them in Java, are references to objects. Remember when we talked about that. When you pass an object to a particular method in some application, you're passing a reference to the object. You're passing where that object lives. Okay.
Which means that when you have an ArrayList of songs, what you really have here are a bunch of references, which we can think of as pointers that refer to the actual objects that contain the songs. Okay. So over here there's "In Your Eyes," by Peter Gabriel, and it was 99 cents. And over here we might have, say, "Ramble On," tell me there's some Zeppelin fans out there. All right, good, good. We will not have to end lecture early. And "Ramble On" is such a great song, it's like $12.99 by itself, single song. That's probably why most people don't listen to it. And over here we have the master, "Stairway to Heaven," Stairway to H, we'll just abbreviate it. Because it's that good, we'll just have a moment of silence. Also by Led Zeppelin, and we'll just say that one should be like 49 cents so everyone can listen to it. It's just kind of like the bonus tune. All right. And so that's what we have in a list of songs. Now, here's the interesting part, right. If I'm going to have some albums, so I add some albums. So let's say I add some album on here like "Soul," by Peter Gabriel, and "Soul" actually has the song "In Your Eyes" on it. Okay.
Now there are two things that come up that we need to think about when we actually do this. We need to say, hey, this has got some ArrayList associated with it, and so I could create a new object that is a song for "In Your Eyes" and set my ArrayList to have a reference to that object. And that's a reasonable thing to do in some cases. The only problem is what happens if I go into my store and say, hey, I want to change my song "In Your Eyes" from being 99 cents, because no one's heard of it before, to 9 cents. Okay. So if I go through my list of songs I say, oh, here it is, I'll change its cost to be 9 cents. Now, unless I go through all of my albums, and for every album go through every song that's listed on the album and see if I can find that same song duplicated, I'm going to create an inconsistency in my data. What I really want is to say, hey, there's only one object that is that song. And whether that song is sold individually, or it's both in my list of songs and on some albums, there's only one object ever that I refer to for that song, which means I never create the second object out here for that same song. What I do is, when I'm creating the album "Soul," and someone tells me, oh, it's got the song "In Your Eyes" on it, I say, hey, does that already exist in my store? If it does exist in my store, I'm going to add that object to my ArrayList. I'm not going to create a new object, which means each song only ever gets created once, but it can potentially get added to multiple ArrayLists. And it's the same single underlying object that has multiple references to it.
Why is that cool? That's cool because now, when I come along and a whole bunch of people start listening to "In Your Eyes," and I'm like, Peter Gabriel, he just deserves a lot more money, we're going to make this $9.99. It's $9.99 everywhere by changing it once. And that's the real key to large-scale software engineering. You remember for a long time we talked about having methods that you reuse and how you generalize your methods; this is about reusing your data. Thinking about your data so that each thing exists in one place, and everything refers to it. Okay. So any questions about that idea? This is what we refer to as a shallow copy, because after you've created that song once, when you want to add that song somewhere else, you're just setting a reference to it. You're creating a shallow copy; there's only one underlying object. The thing we did before, where we actually created a whole separate structure, is referred to as a deep copy. And sometimes deep copies make sense in some particular cases. I won't say most of the time they don't, it depends on the application, but most of the time what you'll actually be using is your friend, the shallow copy. Okay.
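The difference between sharing one object and duplicating it can be seen in a small experiment. This is only an illustrative sketch: `SharedSongDemo` and its bare-bones `Song` class are hypothetical names made up for the example, not the store's actual code.

```java
import java.util.ArrayList;

class SharedSongDemo {
    // Hypothetical minimal song with a mutable price.
    static class Song {
        final String name;
        final String band;
        double price;

        Song(String name, String band, double price) {
            this.name = name;
            this.band = band;
            this.price = price;
        }
    }

    public static void main(String[] args) {
        ArrayList<Song> storeSongs = new ArrayList<Song>();
        ArrayList<Song> albumSongs = new ArrayList<Song>();

        // One object, referenced from both lists (the "shallow copy" of the lecture).
        Song inYourEyes = new Song("In Your Eyes", "Peter Gabriel", 0.99);
        storeSongs.add(inYourEyes);
        albumSongs.add(inYourEyes);

        // A second, independent object holding the same data (a "deep copy").
        Song duplicate = new Song("In Your Eyes", "Peter Gabriel", 0.99);

        // Change the price once, through the store's list.
        storeSongs.get(0).price = 9.99;

        System.out.println(albumSongs.get(0).price); // prints 9.99: same object
        System.out.println(duplicate.price);         // prints 0.99: now inconsistent
    }
}
```

The duplicate is exactly the inconsistency the lecture warns about: it would have to be found and updated separately.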
So what does that actually look like if we try to turn that into some code? Well, what does that mean in the application? Let me show you what that means in the application. So we're going to add some songs. We're going to go through another example. All right, let me add the song, and I'll just abbreviate: "In Your Eyes," Peter Gabriel, $1.99. Then I'm going to add "Ramble On," oops, "Ramble On," Led Zeppelin, and we'll make that, oh, I don't know, $2.99. Okay. Now, at this point I have two songs. Now, I'm going to add an album. So I add a particular album, and the album I'm going to add is "Soul," by Peter Gabriel, and it says enter a song name. It's going to have "In Your Eyes" on it. And because the unique identifier is both the song and the band name, it still needs to ask me for the band name, and the band name I give it is Peter Gabriel. And it says, hey, that song is already in the store. It's just letting you know, hey, I found that song in my store, so when I add it to the album, I'm adding that same object that's also in my store to the album. And then you could say, well, there's other stuff on there, like there happens to be a tune called "Red Rain," which is also by Peter Gabriel, and you know it's a fine tune, but let's just say it's 1 cent, okay. And it says new song added to the store. What did it do here? What it did in this case, it says, hey, you want to have a new song called "Red Rain," by Peter Gabriel. That song costs 1 cent, you want to add it to your album.
Well, if you want to add it to your album, it's also a song that I'm going to see in the store. So it actually adds it to the store and adds it to the album. And there's still only one copy of that object ever. It just needs to make sure that when it creates a new song to add to an album, one that's not already in the store, it adds it to the store as well as to the album. If the song already exists in the store, then it just adds a reference to the album. Okay. That's the critical idea here. All right. So now, I'll hit enter to quit, and if we list all the songs, right, the song "Red Rain" has now been added to the store and costs 1 cent. And if I list all the albums, there's "Soul," by Peter Gabriel, and if I list all the songs on that album, it has the songs "In Your Eyes" and "Red Rain," so it matches the picture that I think I should have.
That's why having a piece of paper, where you draw pictures, is useful. Because you look at what your application is doing and you say, does it match what I actually think should be happening in my picture? And if it doesn't, then you know one of two things is wrong. Either your picture's wrong or your code that's supposed to be dealing with that picture is wrong. But in either case, you've already figured out there's a bug; even though the program hasn't crashed or anything, you just know there's an inconsistency. Okay. And so now, if I update the price for a song, like I update the song "In Your Eyes," by Peter Gabriel, and I change its price to, I just go crazy, no one's going to buy the song anymore, the price is updated. Now, if I list all the songs, that song is $999.99 in the store, and if I also list the songs on any album, five, so lower case, the price is also updated on each of the individual albums, because there's only one object. Okay. That's where the consistency comes in. That's why the consistency's key. Okay.
So what does this actually look like in code? How do we do this? Let me show you what the actual application looks like for our little friend, the Flytune Store. Okay. So there's a bunch of stuff at the beginning that just asks for the user selection; basically it prints some stuff out to allow you to make a selection, and then gets your selection for you. And then there's a big case statement that calls an appropriate method, depending on what selection you made. So I'll go through some of the simple ones pretty quickly. You can list out all the songs carried in the store. In order to be able to do that, we need to keep track of how this information's actually stored; it's exactly in these data structures I just showed you. Songs are kept track of in an ArrayList of songs, and albums are kept track of in a HashMap that maps from the name of the album to the actual album object itself. Okay. Any questions about that? Hopefully, that's all clear. I will take off the hat.
So how do we print these things out? To list all the songs, we just go through our ArrayList up to its size, and this is why you want to think of needing a guide through your data structure, because you're going on a journey. At any given point, when you're dealing with a data structure, you want to think, what is the type I'm dealing with right now? What does that mean? It means, when I want to print something out, what I need is a string to print out. How do I get a string? If I start at songs, songs is an ArrayList. I don't have a string I can print out. But from an ArrayList I can get an individual element. When I get an individual element of that ArrayList, what do I have? I still don't have a string, I have a song. What can I ask the song for? I can ask for the string version of the song, and then I have a string to print out. Okay.
So you always want to think of it as you're going on a journey. Where do you start your journey? Your journey starts at the data structures you have available to you. In this case, we have a data structure called songs and another data structure called albums; that's what's available to us. And what we want to do is go from that starting point through a series of steps to get to the thing that we actually care about at the end, that little piece of data that we want to display or interact with somehow. So here's another example. If I want to list all the albums, how do I list all the albums? Well, albums is a HashMap. So in order to do something with a HashMap I need to say, hey, I want an iterator over all the keys of that HashMap. So albums is the HashMap, I get the key set of the HashMap, which is a collection, and I get an iterator for that collection, which is an iterator over all the keys of the HashMap. And now, as long as my album iterator, which is just my iterator over the keys, has a next element, what do I do? I start at albums. I say I need to get a particular album. Okay. Get. Which album am I going to get? I'm going to get the album whose name is the next element of the iterator. Right, because it's an iterator over all the names of albums. So get gives me a particular album. Then, when I have the particular album, I can call toString on it to get the string form of the album. Okay, any questions about that? Because they're going to get even longer, so ask now if there are any questions about the chain of things we call.
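The two "journeys" just described, from songs down to a string and from albums down to a string, can be sketched as follows. The class names, fields, and `toString` formats are assumptions for illustration, and the methods collect the lines into a list (rather than printing directly as the lecture's program does) so the result is easy to inspect.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;

class ListingSketch {
    // Hypothetical stand-ins; the toString formats are invented for this sketch.
    static class Song {
        final String name;
        final String band;
        final double price;

        Song(String name, String band, double price) {
            this.name = name;
            this.band = band;
            this.price = price;
        }

        public String toString() {
            return "\"" + name + "\" by " + band + " costs $" + price;
        }
    }

    static class Album {
        final String name;
        final String band;

        Album(String name, String band) {
            this.name = name;
            this.band = band;
        }

        public String toString() {
            return "\"" + name + "\" by " + band;
        }
    }

    // ArrayList -> one Song -> its String form.
    static ArrayList<String> songLines(ArrayList<Song> songs) {
        ArrayList<String> lines = new ArrayList<String>();
        for (int i = 0; i < songs.size(); i++) {
            lines.add(songs.get(i).toString());
        }
        return lines;
    }

    // HashMap -> key set -> iterator -> next key -> Album -> its String form.
    static ArrayList<String> albumLines(HashMap<String, Album> albums) {
        ArrayList<String> lines = new ArrayList<String>();
        Iterator<String> albumIt = albums.keySet().iterator();
        while (albumIt.hasNext()) {
            lines.add(albums.get(albumIt.next()).toString());
        }
        return lines;
    }
}
```

Note that a `HashMap` makes no promise about the order in which its keys come back, so the albums may be listed in any order.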
If it's making sense, the chain of things we call, nod your head. All right, and if it's not making sense, shake your head. And if it's kind of making sense, just keep looking and ask a question if a question comes to mind. All right. So how do I find a particular song? This is something where I'm going to use a helper method, so it's private, to find a particular song. For songs, our unique identifier is a combination of both the name of the song and the band name. So how do I check for that? I'm going to go through all my songs; it's an ArrayList, so I can count through all the songs. Here's where things get long. How do I check to see if a song that's actually in my data set matches on its name with the name that's passed in? I start at songs, get the i-th song, and I have one particular song. For that particular object I get the song name. Now, I have a string. I want to check to see if that string is equal to the name that's passed in. Okay.
And I do the same thing with band names. Songs, get the i-th song, get the band name of that song, and then check to see if that's equal to the band. And if both of these are equal, then, hey, I found the song, and so I'm going to record an index, which is the index location of that song in my ArrayList, and I can just break out of the for loop here. Because once I find it, I say, hey, I found that, I don't need to keep looking. So actually this is one of the rare cases where you'll see a break in a for loop: you don't need to finish the loop. You got what you were looking for, so you get out of the loop. If you manage to get through this whole loop without ever finding something that matches on both the name and the band, well, your index remains negative one. So you return negative one to indicate, hey, I didn't find it, because you know negative one's not a valid index for an ArrayList. So if you return it, that means you didn't find a valid element. Okay. How do we use find song? Here's how add song works. Okay. When you want to think about add song, you want to think about this property that we're only ever going to create an object once, and everything else is going to be references to that object. So the way add song is going to work is it's going to return a song object. Okay. And what it's going to do is it's going to ask us for the name of a song. If the user enters a blank line, that means they want to stop adding songs, so it just returns null to say, hey, you want to stop adding songs, I didn't create a new song, here's a null to indicate you are done. But if they don't press enter to quit, I also ask for a band name, and then I ask to find the song. Okay.
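The find song search described above could look something like this sketch. The getter names `getSongName` and `getBandName` are guesses at the Song class's interface, and the list is passed in as a parameter here (in the lecture's program it is an instance variable) so the example stands alone.

```java
import java.util.ArrayList;

class FindSongSketch {
    // Hypothetical stand-in for the lecture's Song class; getter names are assumed.
    static class Song {
        private final String name;
        private final String band;

        Song(String name, String band) {
            this.name = name;
            this.band = band;
        }

        String getSongName() { return name; }
        String getBandName() { return band; }
    }

    // Returns the index of the song matching both name and band, or -1 if none does.
    static int findSong(ArrayList<Song> songs, String name, String band) {
        int index = -1; // -1 is never a valid ArrayList index, so it signals "not found"
        for (int i = 0; i < songs.size(); i++) {
            if (songs.get(i).getSongName().equals(name)
                    && songs.get(i).getBandName().equals(band)) {
                index = i;
                break; // found it, so there is no need to finish the loop
            }
        }
        return index;
    }
}
```

Each condition follows the chain from the transcript: start at `songs`, get the i-th `Song`, ask it for a `String`, then compare with `equals`.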
I call that find song method I just wrote and I say, does that song exist? If the song exists, the song index is not going to be minus one. And that means that song already exists in the store. So you told me to add a song that already existed in the store. So I'm not going to create a new song, because there's already an object in the store that encapsulates all the information for that song. I will return to you a reference to that object, which means I just return from the songs ArrayList whatever song happens to be at the index that that song actually lives at. Okay. So this returns an actual object, or really, it returns a reference. If you like, think of it as returning a pointer to the object. If I didn't find it in there, then, hey, I need to create the new song, right. It's sort of like "Red Rain" at the end. You wanted to add a song. It didn't exist in the store, so let me get the price for that song. I'll create a new song object, and now here's the funky thing: I will add that song to my ArrayList of songs for the whole store, write out that the new song was added to the store, and I'll return that new song to you so you can do whatever you want with it.
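A compact sketch of that add song logic follows. To keep it self-contained, the name, band, and price arrive as parameters instead of being read from the user, so the price is simply ignored when the song already exists; the class and field names are hypothetical.

```java
import java.util.ArrayList;

class AddSongSketch {
    // Hypothetical minimal song.
    static class Song {
        final String name;
        final String band;
        double price;

        Song(String name, String band, double price) {
            this.name = name;
            this.band = band;
            this.price = price;
        }
    }

    // The store's single list of songs.
    final ArrayList<Song> songs = new ArrayList<Song>();

    // Index of the song matching name and band, or -1 if it is not in the store.
    int findSong(String name, String band) {
        for (int i = 0; i < songs.size(); i++) {
            if (songs.get(i).name.equals(name) && songs.get(i).band.equals(band)) {
                return i;
            }
        }
        return -1;
    }

    // Returns the one Song object for (name, band), creating it only if needed.
    // An empty name plays the role of the blank line and yields null.
    Song addSong(String name, String band, double priceIfNew) {
        if (name.equals("")) {
            return null;
        }
        int index = findSong(name, band);
        if (index != -1) {
            return songs.get(index); // already in the store: hand back that same object
        }
        Song song = new Song(name, band, priceIfNew);
        songs.add(song); // a brand-new song always joins the store's list
        return song;
    }
}
```

A caller that only wants the song in the store can ignore the return value; a caller building an album keeps the returned reference.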
And so now, you might ask, okay, Mehran, if I just added a song to the store, I don't really care about doing anything with that song, so why are you returning the song to me? And that's true. If I just add a song to the store, if that's all I care about, I ignore the return value. That's actually what I do up here, which is very funky. Right. If you want to add a song, I just call the add song method, it goes ahead and adds the song to the store if it doesn't already exist, and it returns a reference to that song object. If all I'm doing is adding a song, I don't care, I just ignore it. I don't assign it to anything, I just say, yeah, thanks for returning that object, that was fun, whatever, and just get rid of it. Okay. But the reason why I've written it this way is for adding an album. If I'm adding an album, what do I do? I ask for the name of the album, and I check to see if that album's already in the store. If the album's already in the store, I'm not going to do anything. If the album's not already in the store, then I ask for the band name and I create a new album. And then I put that album in the store. So albums is my HashMap. I put in that HashMap the name of the album as the key and the actual album object as the value. So I add, you know, the album "Soul" to my HashMap.
Now, I'm going to add all the songs. So I have a while loop that goes through and keeps adding songs until I get a null from add song to indicate that the user wanted to stop adding songs. But here's the funky part, every time the user adds a song, right. It comes along and says, hey, you want to create some new album? So let's say I actually want to create some new album over here, the album "Soul," so none of this stuff exists yet. Okay. So to create a new album, I say, hey, I want to create the album "Soul." It says, okay, that's fine, create an object for the album "Soul." It has the name "Soul," it's by Peter Gabriel. And it says, okay, what songs are going to be on there? And it starts asking me for songs, because it's going to add them to my ArrayList in here. And so the first song I say is on that album is "In Your Eyes." It goes and says, hey, find that song; it already exists. It returns a reference to that song, and that reference is what gets added to my ArrayList. Now, it goes and asks for another song. Do you have any more songs? I say, yeah, there's another song. The song is called "Red Rain." When I go to create "Red Rain" it comes up here to add song. Add song comes along, asks for the name and the band, it tries to find the song and says, hey, that song isn't already there, so I'm going to create a new song. It creates a new song called "Red Rain," by Peter Gabriel, with some price associated with it, and adds it to the list of songs for the store. And then it returns this object, which means it returns a reference to this object, and that reference to the object, oops, sorry, this got blocked. All right, this is where it is creating a new song, and it adds the song to the store.
Right, it adds it to songs, which is that ArrayList up there, and then it returns the object. So when it returns the object, I went too far, add song doesn't know about the album; back here I add that song to the album. So on this album, we're going to add a song, and the song we're going to add is that same object. It's "Red Rain." Okay. So that's the important thing to keep in mind. That object we only created once, and we pass around references to it, or we can return references to it, and assign them other places. And that's how you get consistency in a much bigger data structure. Now, there are a bunch of other things we could do in here. I won't go through all the excruciating details down here. But we can list the songs on an album, and we can update a song's price. And to update a song's price, all we do is ask for the song and the band, and we find the song in the data set if it exists. If it doesn't exist we just say, hey, it's not in the store, and if it does exist then we read in its price, and then for the songs in the store, we find the song at that index and set its price. And we know that whatever other albums contain that particular song, if we happen to update the price over here to, you know, $6.99, we only update it once and all the places that refer to it automatically will see the updated version, because they point to the same object. Okay.
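The add album loop and the price update can be put together in one sketch. Several things here are assumptions: a `java.util.Scanner` stands in for the console prompts of the lecture's program, each answer is one line of input, and all class, field, and method names are hypothetical. The search also returns early instead of using the break shown in lecture.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Scanner;

class AlbumSketch {
    static class Song {
        final String name;
        final String band;
        double price;

        Song(String name, String band, double price) {
            this.name = name;
            this.band = band;
            this.price = price;
        }
    }

    static class Album {
        final String name;
        final String band;
        final ArrayList<Song> songs = new ArrayList<Song>();

        Album(String name, String band) {
            this.name = name;
            this.band = band;
        }
    }

    final ArrayList<Song> songs = new ArrayList<Song>();
    final HashMap<String, Album> albums = new HashMap<String, Album>();

    // Index of the matching song in the store, or -1 if absent.
    int findSong(String name, String band) {
        for (int i = 0; i < songs.size(); i++) {
            if (songs.get(i).name.equals(name) && songs.get(i).band.equals(band)) {
                return i;
            }
        }
        return -1;
    }

    // Reads a song name, a band, and (only for a new song) a price.
    // A blank or missing name line means "no more songs" and yields null.
    Song addSong(Scanner in) {
        String name = in.hasNextLine() ? in.nextLine() : "";
        if (name.equals("")) {
            return null;
        }
        String band = in.nextLine();
        int index = findSong(name, band);
        if (index != -1) {
            return songs.get(index); // reuse the existing object
        }
        Song song = new Song(name, band, Double.parseDouble(in.nextLine()));
        songs.add(song); // a new song also joins the store
        return song;
    }

    // Reads an album name and band, then songs until addSong returns null.
    void addAlbum(Scanner in) {
        String albumName = in.nextLine();
        if (albums.containsKey(albumName)) {
            return; // album already in the store: do nothing
        }
        Album album = new Album(albumName, in.nextLine());
        albums.put(albumName, album);
        while (true) {
            Song song = addSong(in);
            if (song == null) {
                break;
            }
            album.songs.add(song); // the album stores a reference, not a copy
        }
    }

    // Sets the price on the single shared object; false if the song is not in the store.
    boolean updatePrice(String name, String band, double newPrice) {
        int index = findSong(name, band);
        if (index == -1) {
            return false;
        }
        songs.get(index).price = newPrice;
        return true;
    }
}
```

Feeding `addAlbum` the lines Soul, Peter Gabriel, In Your Eyes, Peter Gabriel, Red Rain, Peter Gabriel, 0.01, and a blank line reproduces the walkthrough: the first song is reused if it is already in the store, and the second is created and added to both the store and the album.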
Any questions about that? So I know it's a lot of complexity to kind of deal with a big data structure like this. But now it's one of those things like, now you're old enough to kind of see the big honking data structure. Because in the real world, when people think of software engineering in the large, these are the kinds of things they need to worry about, and that's where the complexity comes in. It's keeping track of all your objects and thinking about what objects you actually need to design and build, in order to build an application that successfully keeps track of everything and keeps things consistent with all the data you have. So any other questions? Uh huh.
Instructor (Mehran Sahami): Oh, can you use the mic, please?
Student: Sorry. So in this application, do all the songs and albums, they're also, I guess, singles, or, because the albums are never priced, right? It's just the individual songs.
Instructor (Mehran Sahami): Right. So the albums don't have a price. You could imagine the cost for an album is the total of all the songs on the album. Or you could actually do something funky. Like this is one of those places you can make a policy decision and say, an album is 90 percent of the cost of all the songs on it. And then all the individual song prices can change, and any time you just say, what's the price of the album, total up all the prices of the songs, and take 90 percent of that. So it also allows for very dynamic album pricing.
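The pricing policy suggested in that answer could be sketched like this. Nothing of the sort appears in the program shown in lecture; `AlbumPriceSketch`, `albumPrice`, and the `fraction` parameter are all invented for illustration.

```java
import java.util.ArrayList;

class AlbumPriceSketch {
    // Hypothetical minimal song with a mutable price.
    static class Song {
        double price;

        Song(double price) {
            this.price = price;
        }
    }

    // Computes the album price on demand from the current song prices.
    // fraction is the policy knob, for example 0.9 for "90 percent of the total".
    static double albumPrice(ArrayList<Song> albumSongs, double fraction) {
        double total = 0.0;
        for (int i = 0; i < albumSongs.size(); i++) {
            total += albumSongs.get(i).price;
        }
        return fraction * total;
    }
}
```

Because the price is recomputed from the shared song objects each time it is asked for, a change to any song's price shows up in the album price without storing anything extra.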
Student: Thank you.
Instructor (Mehran Sahami): All righty. If there's any more questions, come on up. Otherwise I'll see you on Wednesday.
[End of Audio]
Duration: 49 minutes