An Introduction to C++ Programming - Part 2/13

Written by Björn Fahller

Classes
Last month we saw, among other things, how we can give a struct well defined values by using constructors, and how C++ exceptions aid in error handling. This month we'll look at classes and make a more careful study of object lifetime, especially in the light of exceptions. The stack example from last month will be improved a fair bit too.

A class
The class is the C++ construct for encapsulation. Encapsulation means publishing an interface through which you make things happen, while hiding the implementation and data necessary to do the job. A class hides data and publishes operations on that data at the same time. Let's look at the "Range" example from last month, but this time make it a class. The only operation that we allowed on the range last month was construction, and we left the data visible for anyone to use or abuse. What operations do we want to allow for a Range class? I decide that 4 operations are desirable:
 * Construction (same as last month.)
 * Find the lower bound.
 * Find the upper bound.
 * Ask if a value is within the range.

The second thing to ask when wishing for a function (the first being what it's supposed to do) is in what ways things can go wrong when calling it, and what to do when that happens. For the three query functions, I don't see how anything can go wrong, so it's easy. We promise that they will not throw C++ exceptions by writing an empty exception specifier.

I'll explain this class by simply writing the public interface of it:

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound

      int lowerBound() throw ();
      int upperBound() throw ();
      int includes(int aValue) throw ();
    private:
      // implementation details.
    };

This means that a class named "Range" is declared to have a constructor, behaving exactly like the constructor for the "Range" struct from last month, and three member functions (also often called methods) called "lowerBound", "upperBound" and "includes". The keyword "public", on the fourth line from the top, tells that the constructor and the three member functions are reachable by anyone using instances of the Range class. The keyword "private", on the 3rd line from the bottom, says that whatever comes after is a secret to anyone but the "Range" class itself. We'll soon see more of that, but first an example (ignoring error handling) of how to use the "Range" class:

    #include <iostream>
    using namespace std;

    int main(void)
    {
      Range r(5);

      cout << "r is a range from " << r.lowerBound() << " to "
           << r.upperBound() << endl;

      int i;
      for (;;)
      {
        cout << "Enter a value (0 to stop) :";
        cin >> i;
        if (i == 0)
          break;
        cout << endl << i << " is " << "with"
             << (r.includes(i) ? "in" : "out") << " the range"
             << endl;
      }
      return 0;
    }

A test drive might look like this:

    [d:\cppintro\lesson2]rexample.exe
    r is a range from 0 to 5
    Enter a value (0 to stop) :5

    5 is within the range
    Enter a value (0 to stop) :7

    7 is without the range
    Enter a value (0 to stop) :3

    3 is within the range
    Enter a value (0 to stop) :2

    2 is within the range
    Enter a value (0 to stop) :1

    1 is within the range
    Enter a value (0 to stop) :0

Does this seem understandable? The member functions "lowerBound", "upperBound" and "includes" are, and behave just like, functions that are in some way tied to instances of the class Range. You refer to them just like you do member variables in a struct, but since they're functions, you call them (by using what C++ lingo names the function call operator, "()".)

Now to look at the magic making this happen by filling in the private part, and writing the implementation:

    struct BoundsError {};
    class Range
    {
    public:
      Range(int upper_bound = 0, int lower_bound = 0)
        throw (BoundsError);
      // Precondition: upper_bound >= lower_bound
      // Postconditions:
      //   lower == lower_bound
      //   upper == upper_bound

      int lowerBound() throw ();
      int upperBound() throw ();
      int includes(int aValue) throw ();
    private:
      int lower;
      int upper;
    };

    Range::Range(int upper_bound, int lower_bound)
      throw (BoundsError)
      : lower(lower_bound), /***/
        upper(upper_bound)  /***/
    {
      // Preconditions.
      if (upper_bound < lower_bound) throw BoundsError();

      // Postconditions.
      if (lower != lower_bound) throw BoundsError();
      if (upper != upper_bound) throw BoundsError();
    }

    int Range::lowerBound() throw ()
    {
      return lower; /***/
    }

    int Range::upperBound() throw ()
    {
      return upper; /***/
    }

    int Range::includes(int aValue) throw ()
    {
      return aValue >= lower && aValue <= upper; /***/
    }

First, you see that the constructor is identical to that of the struct from last month. This is no coincidence. It does the same thing and constructors are constructors. You also see that "lowerBound", "upperBound" and "includes" look just like normal functions, except for the "Range::" thing. It's the "Range::" that ties the function to the class called Range, just like it is for the constructor.

The lines marked /***/ are a bit special. They make use of the member variables "lower" and "upper". How does this work? To begin with, the member functions are tied to instances of the class. You cannot call any of these member functions without having an instance to call them on, and the member functions use the member variables of that instance. Say for example we use two Range instances, like this:

    Range r1(5,2);
    Range r2(20,10);

Then r1.lowerBound() is 2, r1.upperBound() is 5, r2.lowerBound() is 10 and r2.upperBound() is 20.

So how come the member functions are allowed to use the member data, when it's declared private? Private, in C++, means secret for anyone except whatever belongs to the class itself. In this case, it means it's secret to anyone using the class, but the member functions belong to the class, so they can use it.

So, where is the advantage of doing this, compared to the struct from last month? Hiding data is always a good thing. For example, if we, for whatever reason, find out that it's cleverer to represent ranges as the lower bound plus the number of valid values between the lower bound and upper bound, we can do this without anyone knowing or suffering from it. All we need to change is the private section of the class and the bodies of its member functions; the public interface, and thus all code using it, stays the same.

We also have another, and usually more important, benefit: a promise of integrity. Already with the struct, there was a promise that the member variable "upper" would have a value greater than or equal to that of the member variable "lower". How much was that promise worth with the struct? This much:

    Range r(5, 2);
    r.lower = 25; // Oops! Now r.lower > r.upper!!!

Try this with the class. It won't work. The only ones allowed to change the member variables are functions belonging to the class, and those we can control.

Destructor
Just as you can control construction of an object by writing constructors, you can control destruction by writing a destructor. A destructor is executed when an instance of a class dies, either by going out of scope, or when removed from the heap with the delete operator. A destructor has the same name as the class, but prepended with the ~ character, and it never accepts any parameters. We can use this to write a simple trace class that helps us find out the lifetime of objects.

What this simple class does is to write its own parameter string, prepended with a "+" character, when constructed, and the same string, prepended by a "-" character, when destroyed. Let's toy with it! When run, I get this behaviour (and so should you, unless you have a buggy compiler):

What conclusions can be drawn from this? With one exception, the object on the heap, objects are destroyed in the reversed order of creation (have a careful look, it's true, and it's always true.) We also see that the object instantiated with the string "leaky" is never destroyed.

What happens with classes containing classes then? Must be tried, right? What's your guess?

This means that the contained object ("Tracer") within "SuperTracer" is constructed before the "SuperTracer" object itself is. This is perhaps not very surprising, looking at how the constructor is written, with a call to the "Tracer" class constructor in the initialiser list. Perhaps a bit surprising is the fact that the "SuperTracer" object's destructor is called before that of the contained "Tracer", but there is a good reason for this. Superficially, the reason might appear to be that of symmetry, destruction always in the reversed order of construction, but it's a bit deeper than that. It's not unlikely that the member data is useful in some way to the destructor, and what if the member data were already destroyed when the destructor starts running? At best a destructor would then be totally worthless, but more likely, we'd have serious problems properly destroying our no longer needed objects.

So, the curious wonders, what about C++ exceptions? Now here we get into an interesting subject indeed! Let's look at two alternatives, one where the constructor of "SuperTracer" throws, and one where the destructor throws. We'll control this by a second parameter, zero for throwing in the constructor, and non-zero for throwing in the destructor. Here's the new "SuperTracer" along with an interesting "main" function.

Here we can study different bugs in different compilers. Both GCC and VisualAge C++ have theirs. What bugs does your compiler have? Here's the result when running with GCC. Comments about the bug found are below the result:

The first 4 lines tell that when an exception is thrown in a constructor, all member variables constructed so far are destroyed, through a call to their destructors, but the destructor for the object itself is never run. Why? Well, how do you destroy something that was never constructed?

The next four lines reveal the GCC bug. As can be seen, the exception is thrown in the destructor, however, the member Tracer variable is not destroyed as it should be (VisualAge C++ handles this one correctly.)

Next we see the interesting case. What happens here is that an object is created that throws on destruction, and then an object is created that throws at once. This means that the first object will be destroyed because an exception is in the air, and when destroyed it will throw another one. The correct result can be seen in the execution above. Program execution must stop, at once, and this is done by a call to the function "terminate". The bug in VisualAge C++ is that it destroys the contained Tracer object before calling terminate.

What's the lesson learned from this? To begin with that it's difficult to find a compiler that correctly handles exceptions thrown in destructors. More important, though, think *very* carefully, before allowing a destructor to throw exceptions. After all, if you throw an exception because an exception is in the air, your program will terminate very quickly. If you have a bleeding edge compiler, you can control this by calling the function "uncaught_exception" (which tells if an exception is in the air,) and from there decide what to do, but think carefully about the consequences.

An improved stack
The stack from last month was in many ways better than a corresponding C implementation, but it was far from adequate. An easy, C-ish way of improving it is to implement it as an abstract data type, where the functions push, pop, and whatever else is needed are available to the users. The C++ way is, not surprisingly, to write a stack class. Before going into that, though, some thinking is needed regarding what the stack should do.

The minimum for a stack is functionality to push new elements onto it, and to pop the top element from it. The pop function is a classical headache, because it both changes the state of the stack (removes the top element from it) and returns whatever was the top element. This behaviour is dangerous in terms of errors, because you can easily lose data. What if something fails while removing the top element? Should you return the top element value? If you do, does that indicate that it has been removed? It's better to make two functions of it, one that returns the top element, and one that removes it. The one that removes it either returns or throws an exception (remember, either a function fails, or does what it's supposed to do, there's no middle way. If it fails, it exits through an exception, otherwise it returns.)

OK, so, we can see a class that, on the surface, looks something like this:

This looks fair. Normally copying and assignment (a = b) would be implemented too, but we'll wait with that until next month, or this article will grow far too long. Now let's look at what can go wrong in the different operations.
 * top. What if the stack is empty? It mustn't be.
 * pop. Again, what if the stack is empty?
 * push. Out of memory.
 * construction. Nothing really.
 * destruction. Tough one. If the stack is in a bad state, it might be indestructible.

Since top and pop require that the stack isn't empty, we must allow the user to check if the stack is empty, otherwise we don't leave them a chance, so another function is needed.
 * isEmpty. I don't see how anything can go wrong in here.

So, with the problems identified, let's think about what to do when they occur.
 * top and pop on empty stack. Throw exception, stack remains empty.
 * Out of memory on push. Throw exception and leave stack unchanged.
 * Invalid stack state in destruction? Can we find out if we have one? I don't think we can, without adding significant control data, that probably increases the likelihood of exactly the kind of errors we want to avoid. Thus, I *think* the best solution for this problem is to just be careful with the coding, and hope it doesn't happen.

This leaves us with two different errors: Stack underflow (pop or top on empty stack), and out of memory.

We also found, rather easily, the preconditions for operations pop and top (!isEmpty().)

Now to think of post conditions. What are the post conditions for the different operations?

push(anInt): The stack can't be empty after that (post conditions always reflect successful completion, not failure.) Also top() == anInt.

pop(): Currently no way to say, but let's change things a bit. Instead of having the method isEmpty() we add the method nrOfElements(); then nrOfElements() will be one less after pop().

top(): nrOfElements() same after as before.

Construction (from nothing): nrOfElements() == 0.

Destruction? Nothing. There's no object left to check the post condition on! We can state a post condition that all memory allocated by the stack object is deallocated, but we can't check it (try to think of a method to do that, and tell me if you find one.)

So, now we can write the public interface of the stack:

The promise to always leave the stack unchanged when exceptions occur means that we must guarantee that whatever internal data structures we're dealing with must always be destructible. This is tricky, but it can be done. This requirement is also implied by our destructor guaranteeing not to throw anything.

*1*: the structs stack_underflow and bad_alloc are empty, we just throw them, and use the struct itself as the information, nothing more is needed. For really new compilers, the new operator throws a pre-defined class called bad_alloc. If you have such a compiler, remove the declaration of it above.

*2*: This looks odd, perhaps, but what this means is that if there are elements on the stack, the top elements must be the same. Or literally as it says in the code comment, either the stack is empty, or the top elements are equal. You'll get used to this reversed looking logic.

*3*: This is what the assignment operator looks like, if included, and below it, the copy constructor (constructing a new stack by copying the contents of an old one.) I said we wouldn't implement these this month, and ironically that is why they are declared private. The reason is that if you don't declare a copy constructor and assignment operator, the C++ compiler will do it for you, and unfortunately, the compiler generated ones are usually not the ones you'd want. I'll talk more about this next month. By declaring them private, however, copying and assignment are explicitly illegal, so it's not a problem.

So, how do we implement this then? Why not like the one from last month, but with an additional element counter? I think that's a perfectly reasonable approach. Here comes the complete class declaration, with the old "stack_element" as a nested struct within the class.

The only peculiarity here is that the constructor for the nested struct "stack_element" is defined inline (i.e. at the point of declaration.) As a rule of thumb, this should be avoided, but it's OK for trivial member functions, like this constructor, which only copies values.

So let's look at the implementation, bit by bit.

The constructor and destructor are rather straightforward. The guarantee that "delete pTop" doesn't throw comes from the fact that the destructor for "stack_element" can't throw (which is because we haven't written anything that can throw, and the contents of "stack_element" itself can't throw since it's fundamental data types only.)

In "nrOfElements" I admit to being a bit lazy. Strictly speaking, the post condition should be checked, but since all that is done is to return a value, it is obvious that the stack cannot change from this. I leave the post condition, and an explanation for my laziness, as a comment, though, since it's valuable to others reading the sources. It's also valuable if, for some reason, the implementation is changed so that it is not obvious. If that happens, the check should be implemented.

"push" is not trivial, and also contains some news. Let's start from the beginning. "old_nrOfElements" is used both for the post condition check that the number of elements after the push is increased by one, and when restoring the stack should an exception be thrown. The call to "nrOfElements" could throw "pc_error". If it does, the exception passes through "push" to the caller, since we're not catching it. This is harmless since we haven't done anything to the stack yet. On the next line we store the top of stack as it was before the push. This is used solely for restoring the stack in the case of exceptions. This assignment cannot throw since "pOld" and "pTop" are fundamental types (pointers). On the next line a new stack element is created on the heap. Either the creation succeeds as expected, in which case everything is fine, or we're out of memory (the only possible error cause here since the constructor for "stack_element" cannot throw.) For most of you, an out of memory situation will mean that the return value stored in "pTmp" is 0. That case is taken care of on the next two lines.

If you have a brand new compiler, on the other hand, operator new itself throws "bad_alloc" when we're out of memory. If you have such a compiler, it'll most probably complain about the next two lines. If so, just remove them, since they're unnecessary in that case. OK, in either case, if we're out of memory here, "bad_alloc" will be thrown and the stack will be unchanged. Next we start doing things that change the stack, and since we promise the stack won't be changed in the case of exceptions, things that do change the stack go into a "try" block. Setting the new stack top and incrementing the element counter is not hard to understand.

The post condition check is interesting, though. Here we have three situations in which an exception results. The call to "nrOfElements" may throw, the call to "top" may throw, and the post condition check itself might fail, in which case we throw ourselves. All three situations are handled in the catch block. "catch (...)" will catch anything thrown from within the try block above. What we do when catching something is to free the just allocated memory (which won't throw for the same reason as for the destructor.) We also restore the old stack top and the element counter. Thus the stack is restored to the state it had before entering "push", without having leaked memory. Then, what we must do is to pass the error on to the caller of "push", and that is what the empty "throw;" does. An empty "throw;" means to rethrow whatever it was that was caught. A throw like this is only legal within a catch block (use it elsewhere and see your program terminate rather quickly.)

"top" is not so difficult. If we have no elements on the stack, we throw, otherwise we return the top value. As with "nrOfElements", I'm lazy with the post condition check, but careful to document the behaviour should the implementation for some reason change into something less obvious.

The exception protection of "pop" works almost exactly the same way as with "push". The thing worth mentioning here, though, is why "delete pOld" is located after the "catch" block and not within the "try" block. Suppose the deletion did, despite its promise, throw something. If it did, it would be caught, and the top of stack would be left to point to something undetermined. As it is now, if it breaks its promise, we too break our promise not to alter the stack when leaving on exception, but we at least make sure the stack is in a usable (and most notably, destructible) state.

After having spent this much time on writing this class, it's time to have a little fun and play with it, don't you think? I'm staying within the limits of the allowed here, but please make changes to the test program, and the stack implementation, to break the rules and see what happens. It should either work, or say why it fails.

Um, yes, on the catch clauses I don't bind the exception instance caught to any named parameter. The reason is simply that I don't use it. The knowledge that something of that type has been caught is, in this case, enough.

Now I will break a promise from last month. I won't go into more details with references. That'll be dealt with later, because this is where I end this month's lesson.

Exercise

 * 1) Something that is badly missing in the stack implementation above is a good check for the integrity of the stack object itself. For example, what if we somehow manage to get "elements" to non-zero, while "pTop" is 0? That's a terrible error that must not occur, and if it does, it must not go undetected. What I'd like you to do is to see what kind of "internal state" tests can be done, and to implement them. Please discuss your ideas with me over e-mail (this month, if I take a long time in responding, please don't feel ignored. I'll be net-less most of August.)
 * 2) It's generally considered to be a bad idea to have public data in a class. Can you think of why? Mail me your reasons.

Recap
Again a month filled with news.
 * You have seen how classes can be used to encapsulate internals and publish interfaces with the aid of the access specifiers "public:" and "private:"
 * Member functions are always called together with instances (objects) of the class, and thus always have access to the member data of the class.
 * A member function can access private parts of a class.
 * Destruction of objects is done in the reversed order of construction, except when the objects are allocated on the heap, in which case they're destroyed when the delete operator is called for a pointer to them.
 * We have seen that throwing exceptions in destructors can be lethal. (This is not to say that it should never ever be done, but that a lot of thought is required before doing so, to ensure that the program won't die all of a sudden.)
 * You can now iterate your way to a good design by thinking of 1. What the function should do. 2. What can go wrong. 3. What should happen when something goes wrong. 4. How can a user of the class prevent things from going wrong. When you have satisfactory answers to all four questions for all functionality of your class, you have a safe design.
 * You have seen how it is possible to, by carefully crafting your code, make your member functions "exception safe" without being bloated with too many special cases ( "catch(...)" and "throw;" helps considerably here.)

Next
Next month there will most probably be a break, since I'll be on a well needed vacation. After that, however, we'll have a look at copy construction and assignment (together with a C++ idiom often referred to as the "orthodox canonical form.") I promise to explain the references in more detail too.

Please, by the way, I need your constant feedback. Write me! I want opinions and questions. If you think I'm wrong about things, going too fast, too slow, teaching the wrong things, whatever; tell me, ask me.