An Introduction to C++ Programming - Part 11/13

Written by Björn Fahller

Auto Pointer
[Note: the source code for this month is in auto_ptr.h. Ed.]

In the past two articles, we've seen how a simple smart pointer, called simply ptr, was used to make memory handling a little bit easier. That is the core purpose of all smart pointers, and they all have a few things in common: they're templates, they relieve you of the burden of remembering to deallocate the memory, their syntax resembles that of pointers, they aren't pointers, and they're dangerous if you forget that you're dealing with smart pointers.

While ptr served its purpose, it's a bit too simplistic to be generally useful. For example, there was no way to rebind an object to another pointer, or to tell it not to delete the memory (that too can be useful at times, as we will see later in this article).

This article is devoted to the only smart pointer provided by the standard C++ library: the class template auto_ptr.

The problem to solve
I do not know what the core issues were when auto_ptr was designed, but I know what problems the provided implementation does solve:


 * exception safety
 * safe memory area ownership transfer
 * no confusion with normal pointers
 * controlled and visible rebinding and release of ownership
 * works with dynamic types
 * pointer-like syntax for pointer-like behaviour

Let us have a look at each of these in some detail and compare with the previous ptr.

Exception safety
In this respect auto_ptr<T> and ptr<T> are equal. They both delete whatever they point to in their destructor. The only thing that auto_ptr<T> has to offer over ptr<T>, with respect to exception safety, is that we can tell an auto_ptr<T> object that it no longer owns a memory area. This can be used for holding onto something we want to return in normal cases, but deallocate in exceptional situations.

Consider a function that holds a newly allocated object in an auto_ptr object p, calls a function f, and finally returns p.release(). An exception thrown from f results in the destruction of p before the call of the release member function, which means that the object pointed to will be deleted.

If, however, f does not throw any exception, p.release() is called, and the value returned from there is passed to the caller. The member function release releases ownership of the memory area from the auto_ptr object. The value returned is the pointer to the memory area. Since the object no longer owns the memory area, it will not be deallocated.
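The pattern described above can be sketched roughly as follows. This is a minimal sketch and not the article's original fragment: std::auto_ptr was removed from the standard library in C++17, so the block defines its own stripped-down stand-in in a namespace called sketch. The type X, the function create, the fail flag on f and the function demo are all assumptions made up for illustration.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {

// Stripped-down stand-in for auto_ptr (assumption: only what this example needs).
template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}
    ~auto_ptr() { delete p; }                    // deleting the 0 pointer is harmless
    T* release() { T* t = p; p = 0; return t; }  // give up ownership, return the pointer
private:
    auto_ptr(const auto_ptr&);                   // copying is not needed here
    auto_ptr& operator=(const auto_ptr&);
    T* p;
};

// Hypothetical type that counts live objects so deallocation can be observed.
struct X {
    static int alive;
    X() { ++alive; }
    ~X() { --alive; }
};
int X::alive = 0;

// Hypothetical f: throws when asked to, standing in for any operation that may fail.
void f(bool fail) {
    if (fail) throw std::runtime_error("f failed");
}

// Returns a new X if f succeeds; the X is deallocated if f throws.
X* create(bool fail) {
    auto_ptr<X> p(new X);
    f(fail);             // on an exception, p's destructor deletes the X
    return p.release();  // normal case: the caller becomes the owner
}

// Runs both paths; returns true if no X is left alive afterwards.
bool demo() {
    X* x = create(false);
    assert(x != 0);
    bool ok = (X::alive == 1);   // released, so still alive
    delete x;                    // our responsibility now
    try {
        create(true);
    } catch (const std::runtime_error&) {
        // the X allocated inside create has already been deleted
    }
    return ok && X::alive == 0;
}

} // namespace sketch
```

The point of the design is that create has no try block of its own; the cleanup on the failing path comes entirely from the destructor of p.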

Safe memory area ownership transfer
This safety is achieved by cheating in the "assignment" operator and "copy" constructor. The reason I've quoted the names assignment and copy is that the cheat is so bad that it's not really an assignment and definitely not a copy. Rather, both are ownership transfer operations. What happens is that they both modify the right hand side, by relieving it of ownership while accepting ownership itself. After an auto_ptr<T> object has been "copied" or "assigned" from, it no longer owns anything, and the object on the left hand side owns the memory area instead.

An important issue here for those of you who have used early versions of auto_ptr<T> is that in older versions the right hand side did not become a 0 pointer when it no longer owned the memory area, but that is the behaviour set in the final draft of the C++ standard.

Of course, copying and assigning local objects is too simplistic to be useful. The properties of auto_ptr<T> are more useful when working with functions, where it works as both documentation and implementation of ownership transfer. Think of a function creation that returns an auto_ptr<T>, and a function termination that accepts an auto_ptr<T> as its parameter.

One of the headaches of using dynamically allocated memory is knowing who is responsible for deallocating the memory at any given moment in a program's lifetime. auto_ptr<T> makes that rather easy. Any function that returns an auto_ptr<T> leaves it to the caller to take care of the deallocation, and any function that accepts an auto_ptr<T> requires ownership to work. If something goes wrong between calling creation and termination, such that termination ought not be called, we must take care of the deallocation, but since we have the object in an auto_ptr<T>, that is automatically done for us if we return or throw an exception.

This functionality is an advantage that auto_ptr<T> offers over ptr<T>, since the latter doesn't have any way of transferring ownership.
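As a rough illustration of ownership transfer, here is a minimal sketch under a few assumptions. The auto_ptr below is a stripped-down stand-in defined in a namespace called sketch (std::auto_ptr was removed from the standard library in C++17). The type X with its live-object counter, the bodies of creation and termination, and the function demo are made up for illustration. The constructor taking auto_ptr&& is a C++11 addition of this sketch, not part of the design described in the article; it is only there so that creation can return by value on any C++11 or later compiler.

```cpp
#include <cassert>

namespace sketch {

template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}
    auto_ptr(auto_ptr& t) : p(t.release()) {}     // "copy": takes ownership from t
    auto_ptr(auto_ptr&& t) : p(t.release()) {}    // sketch-only helper for temporaries
    auto_ptr& operator=(auto_ptr& t) {            // "assignment": same transfer
        reset(t.release());
        return *this;
    }
    ~auto_ptr() { delete p; }
    T* get() const { return p; }
    T* release() { T* t = p; p = 0; return t; }
    void reset(T* t = 0) { if (t != p) { delete p; p = t; } }
private:
    T* p;
};

// Hypothetical type that counts live objects.
struct X {
    static int alive;
    X() { ++alive; }
    ~X() { --alive; }
};
int X::alive = 0;

// Returning an auto_ptr documents that the caller must take over the ownership.
auto_ptr<X> creation() { return auto_ptr<X>(new X); }

// Accepting an auto_ptr by value documents that the function requires ownership.
void termination(auto_ptr<X> x) {
    (void)x.get();   // a real function would use the object here
}                    // x is destroyed here, and the X with it

bool demo() {
    auto_ptr<X> a(new X);
    auto_ptr<X> b(a);            // a gives up ownership and becomes 0
    bool ok = (a.get() == 0) && (b.get() != 0);
    auto_ptr<X> c;
    c = b;                       // b gives up ownership and becomes 0
    ok = ok && (b.get() == 0) && (c.get() != 0) && (X::alive == 1);

    auto_ptr<X> d(creation());   // we own the new X now
    ok = ok && (X::alive == 2);
    termination(d);              // ownership passes into the function
    ok = ok && (d.get() == 0) && (X::alive == 1);
    assert(ok);
    return ok;                   // c deletes the remaining X on the way out
}

} // namespace sketch
```

Had termination not been called, d would have deleted its X at the end of demo instead, which is exactly the safety net the text describes.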

No confusion with normal pointers
Since the auto pointers have the behaviour outlined above, it is extremely important that they cannot accidentally be confused with normal pointers. This is done by explicitly prohibiting all implicit conversions between pointer types and auto_ptr<T> types. Four kinds of erroneous code illustrate this: initializing a raw pointer from the auto_ptr<T> returned by a function, initializing an auto_ptr<T> from a raw pointer with the = syntax, assigning a raw pointer to an auto_ptr<T>, and passing a raw pointer to a function that requires an auto_ptr<T>.

It is indeed fortunate that the first and last of these are illegal. Imagine the maintenance headaches you could get otherwise. What would the first mean? Would the implicit conversion from auto_ptr<T> to a raw pointer transfer ownership or not? All implementations I have seen where such implicit conversions are allowed do not transfer the ownership, which in the situation above means that the memory would be deallocated when the auto_ptr<T> object returned is destroyed (which it would be immediately after the conversion).

The last is as bad. What about this situation?

    int i;
    termination(&i);

Ouch! The function would attempt to delete the local variable. For both, the result will be something that in standardese is called undefined behaviour, but which in normal English best translates to a crash now, a crash later, or generally funny behaviour (possibly followed by a crash later). Well, since it is illegal, we do not have to worry about it.

The second error, auto_ptr<T> ap=p;, is perhaps a bit unfortunate since the intended behaviour is clear. That it is illegal comes as a natural consequence of banning the third situation, ap=p, which is not clear. ap might be declared somewhere far, far away, so that in the code near the assignment it is not obvious whether it is an auto_ptr<T> or a normal pointer.

In this respect auto_ptr<T> is better than ptr<T>, since ptr<T> does allow implicit construction, allowing the last error.
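The absence of implicit conversions can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler (for static_assert and the type_traits header) and a stripped-down auto_ptr stand-in in a namespace called sketch; ownership transfer is left out of the stand-in, and the function demo is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}   // explicit: no silent T* -> auto_ptr<T>
    ~auto_ptr() { delete p; }
    T* get() const { return p; }            // asking for the raw pointer is always visible
    // Deliberately no conversion operator to T*.
private:
    auto_ptr(const auto_ptr&);              // ownership transfer left out of this sketch
    auto_ptr& operator=(const auto_ptr&);
    T* p;
};

// Checked at compile time: neither implicit conversion exists.
static_assert(!std::is_convertible<int*, auto_ptr<int> >::value,
              "a raw pointer must not silently become an auto_ptr");
static_assert(!std::is_convertible<auto_ptr<int>, int*>::value,
              "an auto_ptr must not silently become a raw pointer");

// Explicit construction and explicit access remain available.
int demo() {
    auto_ptr<int> ap(new int(42));   // fine: direct initialization
    int* raw = ap.get();             // fine: ap keeps the ownership
    assert(raw != 0);
    // auto_ptr<int> bad = raw;      // error: needs an implicit conversion
    // int* worse = ap;              // error: no conversion to int*
    return *raw;
}

} // namespace sketch
```

Uncommenting either of the two marked lines should make the block fail to compile, which is the whole point.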

Controlled and visible rebinding and release of ownership
If we want to rebind an auto_ptr<T> object to another memory area, it is important that the memory area currently owned by the object (if any) is deallocated. The member function reset takes care of that. Calling ap.reset(p) will deallocate whatever ap owns (if anything) and make it own whatever p points to.

If we want a normal pointer from an auto_ptr<T> object, we can get it in two ways, depending on the desired effect. The member function release gives us a normal pointer to the memory area owned by the auto_ptr<T> object, and also gives us the ownership, so that we will be responsible for the deallocation. If we do not want that responsibility, but temporarily need a normal pointer to the memory area, we use the get member function.

As an example of the difference, consider a function f that holds an object in an auto_ptr object p, passes it to a function func, and then returns it as a raw pointer. The function func requires a normal pointer, but it does not assume ownership, so we use the get member to temporarily get the pointer and pass it to func. The function f then returns the raw pointer, obtained with release, if func does its job, but if func fails with an exception, p will deallocate the memory in its destructor.
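The difference between get and release can be sketched as below. This is a minimal sketch under assumptions: the auto_ptr is a stripped-down stand-in in a namespace called sketch, and the type X, the fail flag, the body of func and the function demo are made up for illustration; only the names f, func and p come from the text.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {

template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}
    ~auto_ptr() { delete p; }
    T* get() const { return p; }                 // look, but ownership stays put
    T* release() { T* t = p; p = 0; return t; }  // ownership moves to the caller
private:
    auto_ptr(const auto_ptr&);
    auto_ptr& operator=(const auto_ptr&);
    T* p;
};

// Hypothetical type that counts live objects.
struct X {
    static int alive;
    int value;
    X() : value(0) { ++alive; }
    ~X() { --alive; }
};
int X::alive = 0;

// Hypothetical func: works on the object through a raw pointer, never owns it.
void func(X* x, bool fail) {
    if (fail) throw std::runtime_error("func failed");
    x->value = 1;
}

X* f(bool fail) {
    auto_ptr<X> p(new X);
    func(p.get(), fail);   // get: lend the pointer, p remains the owner
    return p.release();    // release: the caller becomes the owner
}

bool demo() {
    auto_ptr<X> p(new X);
    X* borrowed = p.get();
    bool ok = (p.get() == borrowed);        // still owned by p after get
    X* owned = p.release();
    ok = ok && (p.get() == 0) && (owned == borrowed);  // p owns nothing now
    delete owned;                           // so deleting is up to us

    X* x = f(false);
    assert(x != 0);
    ok = ok && (x->value == 1);
    delete x;
    try { f(true); } catch (const std::runtime_error&) {}
    return ok && X::alive == 0;             // nothing leaked on the failing path
}

} // namespace sketch
```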

Since ptr<T> was specifically designed to disallow transfer of ownership, this functionality is added value for auto_ptr<T>.

Works with dynamic types
Just as a normal pointer to a base class can point to an object of a publicly derived class, an auto_ptr<T> can too. This is not particularly strange:

    // virtual destructor, so that a B can safely be deleted through an A*
    class A { public: virtual ~A() {} };
    class B : public A {};
    auto_ptr<A> pa(new B);
    auto_ptr<B> pb(new B);
    pa = pb;
    auto_ptr<A> pa2(pb);

The reverse is (of course) not allowed. For ptr<T> this is not a problem, since the functionality is only required if ownership transfer is allowed.

Pointer-like syntax for pointer-like behaviour
For the small subset of a pointer's functionality that is implemented in the auto_ptr<T> class template, the syntax is exactly the same. We get access to the element pointed to with operator* and operator->. This is the only functionality of a pointer that is implemented. Here it is a tie between auto_ptr<T> and ptr<T>, since the functionality is exactly the same and so is the syntax.

Implementation
The definition of auto_ptr<T> contains three new details. The first two are the keyword explicit in front of the constructor, and template <class Y> inside the class definition. Both of these are relatively recent additions to the C++ language and far from all compilers support them. Third, and most important (please take note of this), is that the copy constructor and assignment operator take a non-const reference to their right hand side, so that it can be modified.

explicit is what disallows implicit construction of objects, for example in function calls (see the error example above, when attempting to call a function requiring an auto_ptr<T> parameter with a normal pointer). This keyword is, strictly speaking, not needed. There is a fake around it, which you will see when we get to the implementation details.

Member templates, as the template <class Y> constructs used inside the class definition are called, are a way of creating new member functions as needed. This is what makes it possible to say

    auto_ptr<A> pa;
    auto_ptr<B> pb;
    pa = pb;

With this mini-example, and the class definition, we can see that a member function

    auto_ptr<A>& auto_ptr<A>::operator=(auto_ptr<B>&) throw()

will be generated. If class B is publicly derived from class A, the generated code will compile just fine, otherwise we will get an error message from the compiler. This feature can, to the best of my knowledge, not be worked around. It is an essential addition to the C++ language. Unfortunately even fewer compilers support this than support the explicit keyword.
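A class definition along the lines described above might look like the following. This is a sketch under assumptions, not the listing from auto_ptr.h: the exception specifications are left out, the class lives in a namespace called sketch, and the types A and B (with a made-up id member) and the function demo exist only for illustration. A is given a virtual destructor, since deleting a B through an A* would otherwise be undefined behaviour.

```cpp
#include <cassert>

namespace sketch {

template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}
    auto_ptr(auto_ptr& t) : p(t.release()) {}                // non-const: t is modified
    template <class Y> auto_ptr(auto_ptr<Y>& t);             // member template
    auto_ptr& operator=(auto_ptr& t) { reset(t.release()); return *this; }
    template <class Y> auto_ptr& operator=(auto_ptr<Y>& t);  // member template
    ~auto_ptr() { delete p; }
    T& operator*() const { return *p; }
    T* operator->() const { return p; }
    T* get() const { return p; }
    T* release() { T* t = p; p = 0; return t; }
    void reset(T* t = 0) { if (t != p) { delete p; p = t; } }
private:
    T* p;
};

// Defining a member template outside the class takes two template headers in a row.
template <class T>
template <class Y>
inline auto_ptr<T>::auto_ptr(auto_ptr<Y>& t) : p(t.release()) {}

template <class T>
template <class Y>
inline auto_ptr<T>& auto_ptr<T>::operator=(auto_ptr<Y>& t) {
    reset(t.release());   // compiles only if Y* converts implicitly to T*
    return *this;
}

// Hypothetical class hierarchy.
struct A {
    virtual ~A() {}
    virtual int id() const { return 1; }
};
struct B : public A {
    int id() const { return 2; }
};

bool demo() {
    auto_ptr<A> pa(new B);
    auto_ptr<B> pb(new B);
    pa = pb;                   // generates operator= for Y = B; pa's old B is deleted
    auto_ptr<B> pb2(new B);
    auto_ptr<A> pa2(pb2);      // generates the converting constructor for Y = B
    // auto_ptr<B> bad(pa);    // error: A* does not convert to B*
    bool ok = pa->id() == 2 && (*pa2).id() == 2
           && pb.get() == 0 && pb2.get() == 0;
    assert(ok);
    return ok;
}

} // namespace sketch
```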

The code
Let us do the member functions one by one, beginning with the constructor. The only thing it needs to do is to initialize the auto_ptr<T> object with the pointer to the memory area; if the object points to anything at all, it owns it (by definition).

The inline keyword is new for this course, although it has been part of C++ for a very long time. Marking a function inline is a way of hinting to the compiler that you think this function is so simple that it can insert the function code directly where needed, instead of making a function call. This is just a hint; a compiler is free to ignore it, and likewise a good compiler may inline even functions not marked as inline (provided you cannot see any difference in the behaviour of the program). Few compilers are smart enough to inline automatically, however, so there's a place for the inline keyword, and it will be used for all member functions of auto_ptr<T>.

This constructor is marked explicit in the class definition. I mentioned above that the explicit keyword is not, strictly speaking, necessary. The promised work-around is to let the constructor take, instead of a T*, an object of a small helper class template, explicit<T*>, which is implicitly constructed from a pointer value and has a conversion operator that gives the value back.

The way this works is as follows: By default, implicit conversions are allowed, but only one user defined implicit conversion may take place. It is an error if two or more user defined implicit conversions are required to get the desired effect. Constructing an object is a user defined conversion, and executing a conversion operator is too. Consider two usages: at //**1 an auto_ptr<int> object is constructed directly from an int* value, and at //**2 the function termination, which requires an auto_ptr<int> parameter, is called with an int* value.

The code at //**1 is not in error, because we say that we want an object of type auto_ptr<int>. Since we've been so stern about this, we will be obeyed. It may seem like there are two implicit conversions taking place here, one for creating the explicit<T> object, and one for getting the value out of it, but that is not quite true. Our auto_ptr accepts as its parameter an explicit<int*>, which is implicitly created from the pointer value. Then, it is a detail of the innards of the auto_ptr<T> constructor how it is used, in this case to get the value from it.

The code at //**2, however, is in error, because the call to termination requires two user defined conversions: one from int* to explicit<int*>, and one from explicit<int*> to auto_ptr<int>.
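The work-around can be sketched like this. It is a minimal sketch under assumptions, not the code from auto_ptr.h: explicit is a reserved word in standard C++, so the helper is called explicit_arg here, which is a name made up for this sketch. The stand-in auto_ptr lives in a namespace called sketch, the body of termination and the function demo are illustrative, and the static_assert needs a C++11 compiler.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch {

// Helper playing the role of the article's explicit<T>.
template <class T>
class explicit_arg {
public:
    explicit_arg(T t) : value(t) {}        // implicit on purpose: T -> explicit_arg<T>
    operator T() const { return value; }   // gets the value back out
private:
    T value;
};

template <class T>
class auto_ptr {
public:
    auto_ptr(explicit_arg<T*> t) : p(t) {}       // no explicit keyword used
    auto_ptr(auto_ptr& t) : p(t.release()) {}
    ~auto_ptr() { delete p; }
    T* get() const { return p; }
    T* release() { T* t = p; p = 0; return t; }
private:
    auto_ptr& operator=(const auto_ptr&);        // not needed in this sketch
    T* p;
};

// Hypothetical function that requires ownership of its argument.
int termination(auto_ptr<int> x) { return *x.get(); }

// int* -> auto_ptr<int> would take two user defined conversions, so it is not implicit.
static_assert(!std::is_convertible<int*, auto_ptr<int> >::value,
              "two user defined conversions must not be chained implicitly");

int demo() {
    auto_ptr<int> ap(new int(7));    // **1 fine: one conversion, int* -> explicit_arg<int*>
    assert(ap.get() != 0);
    // termination(new int(7));      // **2 error: int* -> explicit_arg<int*> -> auto_ptr<int>
    return termination(ap);          // fine: ap already is an auto_ptr<int>
}

} // namespace sketch
```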

Please see the provided source code for how to allow both versions to coexist for different compilers in the same source file.

Next is the copy constructor. There is not much strange going on here, except that the parameter is a non-const reference. As mentioned far above, the member release relieves the object of ownership and returns the pointer. Thus this constructor makes p point to what t did point to, and alters t so that it becomes the 0 pointer.

The code for the member template constructor is, of course, the same as for the previous one. Note that both are necessary. I made a mistake with the auto_ptr<T> implementation available in the adapted SGI STL, in that I thought the latter would imply the former. It doesn't, however, even though it may seem so.

Note, by the way, the syntax for a member template, with two consecutive template <...> headers when it is defined outside the class definition.

Of course, users of compilers that do not implement member templates will get compilation errors on this member function. Please see the source code for how to work around the compilation error (the workaround is simply not to have this member function, which means that the resulting auto_ptr will be limited in functionality).

Next is the destructor. If the object owns anything, it will be deleted by the destructor. Note that deleting the 0 pointer is legal, and does nothing at all. If the object does not own anything, p will be the 0 pointer.

Then come operator* and operator->. These are not identical with the version of ptr<T> from the previous issue of the course. One word on the way, though. operator-> can, of course, only be used if T is a struct or class type. On some older compilers, it's even illegal to instantiate auto_ptr<T> if T is not a struct or class type. In most cases this is a minor limitation; after all, it is normally structs and classes you handle this way, and not built-in types.

The member function release is next. Not much to say, is there? The object is relieved of ownership by making p the 0 pointer, and the value previously held by p is returned, just as mentioned in the introduction of the class.

The member function reset deletes what the object points to and sets p to the given value. Nothing strange, except the safety guard against resetting to the value already held. If we didn't have this guard, resetting to the current value would deallocate the memory and keep the ownership of it, for later deletion again! It seems like a better way is to just do nothing if the situation ever arises.

Not much to say about this one.
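The destructor, release and reset described above, including the guard in reset, could be written roughly as follows. This is a minimal sketch under assumptions: the auto_ptr is a stripped-down stand-in in a namespace called sketch, without the copy operations, and the counting type X and the function demo are made up for illustration.

```cpp
#include <cassert>

namespace sketch {

template <class T>
class auto_ptr {
public:
    explicit auto_ptr(T* t = 0) : p(t) {}
    ~auto_ptr() { delete p; }        // delete on the 0 pointer does nothing
    T* get() const { return p; }
    T* release() {
        T* t = p;
        p = 0;                       // the object no longer owns anything
        return t;
    }
    void reset(T* t = 0) {
        if (t != p) {                // guard: resetting to the held value is a no-op
            delete p;
            p = t;
        }
    }
private:
    auto_ptr(const auto_ptr&);       // ownership transfer left out of this sketch
    auto_ptr& operator=(const auto_ptr&);
    T* p;
};

// Hypothetical type that counts live objects.
struct X {
    static int alive;
    X() { ++alive; }
    ~X() { --alive; }
};
int X::alive = 0;

bool demo() {
    auto_ptr<X> ap(new X);
    ap.reset(ap.get());              // guarded: nothing is deleted
    bool ok = (X::alive == 1);
    ap.reset(new X);                 // the old X is deleted, the new one is owned
    ok = ok && (X::alive == 1);
    X* raw = ap.release();           // ap gives up the ownership
    assert(raw != 0);
    ok = ok && (ap.get() == 0) && (X::alive == 1);
    delete raw;                      // so the deletion is up to us
    ap.reset();                      // nothing owned: harmless
    return ok && X::alive == 0;
}

} // namespace sketch
```

Without the guard, the first reset call in demo would delete the X while ap still pointed to it, and the destructor or the next reset would then delete it a second time.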

Efficiency
The question of efficiency pops up now and then. How much does it cost, performance- and memory-wise, to use auto_ptr<T> instead of ptr<T> from last month?

If you use auto_ptr<T> instead of ptr<T>, and use only the functionality that ptr<T> offers, the price is nothing at all. The constructor, destructor, operator* and operator-> hold exactly the same code for both templates. You pay only for what you use.

Compared to raw pointers and doing your own deletion? I do not know. It will depend a lot on how clever your compiler is with inlining. Most probably the cost is close to none at all. If you have a measurable speed difference in a real-world application, I would say the difference is that with auto_ptr<T> you do many more deletions (i.e. you have mended memory leaks you were not aware of having).

Recap
The news this month was:
 * The standard class template auto_ptr<T> handles memory deallocation and ownership transfer.
 * Automatic memory deallocation and ownership transfer reduce the risk of memory leaks, especially when exceptions occur.
 * Implicit conversions between raw pointers and smart pointers are bad (even if they may seem tempting at first).
 * The explicit keyword disallows implicit construction of objects.
 * The explicit keyword can be faked.
 * Member templates can be used to create member functions at compile time, just like function templates can be used to create functions at compile time.
 * inline hints to a compiler that you think a function is so small that it is better to directly insert the function code where required instead of making a function call.

Exercises

 * Why is it a bad idea to have arrays (or other collections) of auto_ptr<T>?
 * Can smart pointers be dangerous? When? auto_ptr<T> too?
 * What is a better name for this function template?
 * What happens if ~T throws an exception?

Coming up
If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles.

Next month we'll have a look at a smarter pointer; a reference counted one. I'm beginning to dry up on topics now, however, so please write and give me suggestions for future topics to cover.