An Introduction to C++ Programming - Part 12/13

Written by Björn Fahller

Introduction
Last month's auto_ptr class template, documents and implements ownership transfer of dynamically allocated memory. Often, however, we do not want to be bothered with ownership. We want several places of the code to be able to access the memory, but we also want to be sure that the memory is deallocated when no longer needed. The general solution to this is called automatic garbage collection (something you can read several theses on, and also buy a few commercially available libraries for.) The less general solution is reference counting; no owner, but last one out locks the door. The idea is that a counter is attached to every object allocated. When allocated it is set to 0. When the first smart pointer attaches to it, the count is incremented to 1. Every smart pointer attaching to the resource, increments the reference count; and every smart pointer detaching from a resource (the smart pointer destroyed, or assigned another value) the resource's counter is decremented. If the counter reaches zero, no one is referring to it anymore, so the resource must be deallocated. The weakness of this compared to automatic garbage collection is that it does not work with circular data structures (the count never goes below 1.)

The problems to solve
Many of the problems with a reference counting pointer are the same as for the auto pointer. The list is actually a bit shorter, since there's no need to worry about ownership. This might also be the place to mention a problem not to solve; that of how to stop reference counting a resource. Adding this functionality is not difficult, but it quickly leads to user code that is extremely hard to maintain. This is exactly what we want to avoid, so it is better not to have the functionality.
 * exception safety
 * no confusion with normal pointers
 * controlled and visible rebinding and access
 * works with polymorphic types
 * pointer-like syntax for pointer-like behaviour
 * automatic deletion when no longer referring to the object

Here is how it is supposed to work when we are done: counting_ptr P1(new int(value));

After creating a counting pointer P1, the reference count for the value pointed to is set to one. counting_ptr P2(P1);

When a second counting pointer P2 is created from P1, the object pointed to is not duplicated, but the reference count is incremented. counting_ptr P3(P2); When three counting pointers refer to the same object, the value of the counter is three. P1.manage(new int(other));

As one of the pointers referring to the first object created is reinitialized to another object, the old reference count is decremented (there are only two references to it now) and the new one is assigned a reference count of 1. P2=P1;

When yet one of the pointers move attention from the old object to the new one, the counter for the old one is yet again decremented, and for the new one it is incremented. P3=P2;

Now that the last counting pointer referring to the old object moves its attention away from it, the old objects reference count goes to zero, and the object is deallocated. Now instead the new object has a reference count of 3, since there are three reference counting pointers referring to it.

Interface outline
The interface of the reference counting smart pointer will, for obvious reasons, share much with the auto pointer. The differences lie in accessing the raw pointer and giving the pointer a new value. While these aspects could use the same interface as does the auto pointer, giving the reference counting pointer an identical interface, this would be very unfortunate since their semantics differ dramatically. Here is a suggested interface. template   class counting_ptr {  public: explicit counting_ptr(T* t = 0); counting_ptr(const counting_ptr& t) throw; template       counting_ptr(const counting_ptr& ty) throw; ~counting_ptr throw ; counting_ptr& operator=(const counting_ptr& t) throw ; template       counting_ptr<T>& operator=(const counting_ptr<Y>& ty) throw; T& operator*(void) const throw ; T* operator->(void) const throw ; T* peek(void) const throw; void manage(T* t); };

Compared to the auto pointer, the only differences are that some member functions do not have an empty exception specification, and there is no member function corresponding to auto_ptr<T>::release</tt> (which stops the managing of the pointer.) The member function get</tt> is here named peek</tt>. I think this name better describes what is happening. To me the word get</tt> associates with a transfer, and there is no transfer occurring. All we do is to peek inside and see what the internal raw pointer value is. The member function reset</tt> is here named manage</tt>, and the reason is a big difference in semantics.

Where to store the reference count
Before we can dive into implementation, we must figure out where to store the reference count. It is obvious that the counter belongs to the object referred to, so it cannot reside in the smart pointer object. A solution that easily springs to mind is to use a struct with a counter, and the type referred to, like this: template <class T>  class counting_pointer { public: private: struct representation { unsigned counter; T value; };    representation* ptr; }; With this construct, we have the object and counter together. Unfortunately there are two severe drawbacks. We cannot use it for dynamic binding, since the value</tt> component is indeed a value and dynamic binding only works through pointers and references. It also becomes very difficult to write the constructor and manage</tt> member function.

The work around is simple; use a T*</tt> instead. All we need to work this way is to make sure to allocate this representation struct on heap in the constructor and manage</tt> member functions (and of course to deallocate the struct when we're done with it.)

There is a performance disadvantage with this, however. Whenever accessing the object referred to, we must follow two pointers, the pointer to the representation and the pointer to the object from the representation.

The best solution I have seen is to decouple the representation from the object and instead allocate an unsigned</tt> and in every counting pointer object keep both a pointer to the counter and to the object referred to. This gives the following data section of our counting pointer class template: template <class T>  class counting_pointer { public: private: T* ptr; unsigned* pcount; };

Accessibility
The solution outlined above is so good it almost works. For compilers that do not support member templates, such that the assignment and construction from a counting pointer of another type are impossible, it is all we need.

However, if we want the ability to assign a counting_ptr<T></tt> object from a value of type counting_ptr<Y></tt> if a T*</tt> can be assigned from a Y*</tt>, we must think of something. The problem is that counting_ptr<T></tt> and <tt>counting_ptr<Y></tt> are two distinct types, so both the member variables are private and thus inaccessible. The value of the raw pointer member can be accessed through the public <tt>peek</tt> member function, but we need a solution for accessing the counter without making it publicly available.

This kind of problem is exactly what <tt>friend</tt> declarations are for, but there is a two-fold problem with that. To begin with, extremely few compilers support template friends; a rather new addition to the language. Second, member templates open up holes in the type system you can only dream of. For the curious, please read Scott Meyer's paper on the topic.

One step on the way towards a solution is to see that the management of the counter is independent of the type T, so we can implement the counter managing code in a separate class. A reference counting class may look like: class counting_base {  public: counting_base(unsigned count = 0); counting_base(const counting_base& cb) throw; counting_base& operator=(const counting_base& cb) throw ; ~counting_base(void) throw; int release throw; int reinit(unsigned count=1); private: unsigned* pcount; }; The idea here is that this class handles every aspect of the reference counter. We just need to tell it how to behave, and it reports the results to us.

The default constructor allows us to choose whether we want a reference counter or not. If our reference counting pointer is initialized with the 0 pointer, there is no need to waste time and memory by allocating a reference counter for it.

The member functions <tt>release</tt> and <tt>reinit</tt> return 1 if the old counter is discarded (and hence, we should delete the object we refer to) or 0 if it was just decremented.

Base implementation
It is fairly easy to implement, and makes life easier later on, even when member templates are not available. inline counting_base::counting_base(unsigned count) : pcount(count ? new unsigned(count) : 0) {    if (count && !pcount) throw bad_alloc; } Initialize a counter with the value of the parameter. A count of 0 is represented by a 0 pointer. If you have a modern compiler, the function body throwing the exception will not be necessary; operator new will throw <tt>bad_alloc</tt> in out-of-memory conditions. For older compilers, however, you need to define the <tt>bad_alloc</tt> class. inline counting_base::counting_base(    const counting_base& cb   ) throw : pcount(cb.pcount) {    if (pcount) ++(*pcount); } Copying a reference counting object means adding one to the reference counter (since there is now one more object referring to the counter.) When we have 0 pointers, there is nothing to update. Note that the copy constructor never involves any allocation or deallocation of dynamic memory. In fact, there is nothing in the copy constructor that can throw exceptions. inline counting_base::~counting_base throw {    release; } Destroying a reference counting object means decrementing the reference counter and deallocating it if the count goes to zero (last one out locks the door.) Since this code is needed in the assignment operator, as well as in the public interface for use by the reference counted smart pointer class template, we implement that work in the <tt>release</tt> member function. inline counting_base& counting_base::operator=(    const counting_base& cb   ) throw {    if (pcount != cb.pcount) { release; pcount=cb.pcount; if (pcount) ++(*pcount); }    return *this; } Assignment of reference counting objects means decrementing the reference count for the left hand side object, and incrementing it for the right hand side object; since there will be one less object referring to the counter from the left hand side object, and one more referring to the counter from the right hand side object. inline int counting_base::release throw {    if (pcount && --(*pcount) == 0) { delete pcount; pcount = 0; return 1; }    pcount=0; return 0; } If the reference count goes to zero the counter is deallocated. A return value of 1 means deallocation took place (hinting to the user of this class that it should deallocate whatever object it refers to, since it is the last one referring to it.) In both cases the pointer to the counter is set to 0 as a precaution. It may be that <tt>release</tt> is called just prior to destruction, and the destructor calls <tt>release</tt> to deallocate memory. If the pointer to the counter is not set to zero it means either referring to just deleted memory, or decrementing the reference count twice. inline int counting_base::reinit(unsigned count) {    if (pcount && --(*pcount) == 0) { *pcount = count; return 1; }    pcount = count ? new unsigned(count) : 0; if (count && !pcount) throw bad_alloc; return 0; } Strictly speaking, <tt>reinit</tt> is not needed. Its purpose is to release from the current counter and initialize a new one with a defined value. It is an optimization of memory handling, however, in that if this object was the last reference the counter is reinitialized instead of deallocated and then allocated again. If it was not the last object referring to the counter, a new counter must of course be allocated.

As an exercise, prove to yourself that this reference counting base class does not have any memory handling errors (i.e. it always deallocates what it allocates, never deallocates the same area twice, never accesses uninitialized or just deallocated memory, and never dereferences the 0 pointer.)

Accessibility again
As nice and convenient the above helper class is, it really does not solve the accessibility problem. It does make the implementation a bit easier, though.

The problem remains. class <tt>counting_ptr<T></tt> and <tt>counting_ptr<Y></tt> are different classes and because of this are not allowed to see each others private sections. The easy way out is to use public inheritance, and say that every counting pointer is-a counting base. That is a solution that is simple, sweet and dead wrong. There is no is-a relationship here, but an is-implemented-in-terms-of relationship. Such relationships are implemented through private member data or private inheritance - almost. This is bordering on abuse, but it works fine. Instead of having the member functions of the <tt>counting_base</tt> class public, they can be declared protected. As such they do not become part of the public interface, and the is-a relationship is mostly imaginary.

Implementation of a reference counting pointer
Finally we can get to work and write the reference counting pointer, and since the helper class does most of the dirty work, the implementation is not too convoluted. template <class T>  inline counting_ptr<T>::counting_ptr(T* t)     : counting_base(t ? 1 : 0), pt(t) {  }

Initialize the counter to 1 if we have a pointer value, and 0 otherwise. template <class T>  inline counting_ptr<T>::counting_ptr(     const counting_ptr<T>& t   ) throw : counting_base(t), pt(t.pt) {  }   template <class T> template <class Y>   inline counting_ptr<T>::counting_ptr(     const counting_ptr<Y>& ty   ) throw : counting_base(ty), pt(ty.peek) {  } Note how the latter makes use of the knowledge that a <tt>counting_ptr<Y></tt> is-a <tt>counting_base</tt>. inline counting_ptr<T>::~counting_ptr throw {    if (release) delete pt; } The destructor is important to understand. If you review the functionality of the <tt>release</tt> member function of <tt>counting_base</tt>, you see that it deallocates the counter if, and only if, the count reaches 0 and then returns 1, otherwise it returns 0. This means that in the destructor we will deallocate whatever <tt>pt</tt> points to if, and only if the reference count for it reaches zero and is deallocated. template <class T>  inline counting_ptr<T>& counting_ptr<T>::operator=(    const counting_ptr<T>& t   ) throw {    if (pt != t.pt) { if (release) delete pt; counting_base::operator=(t); pt=t.pt; }    return *this; }  template <class T> template <class Y>   inline counting_ptr<T>& counting_ptr<T>::operator=(    const counting_ptr<Y>& ty   ) throw {    if (pt != ty.peek) { if (release) delete pt; counting_base::operator=(ty); pt=ty.peek; }    return *this; } Please spend some time convincing yourself that the above works. What happens is that if the left hand side and right hand side counting pointers already refer to the same object, nothing happens. If they refer to different objects, on the other hand, the object referred to by the left hand side reference counting pointer is deleted if, <tt>release</tt> returns 1, which it does if, and only if, the reference count goes to zero and the counter is deallocated. Since <tt>release</tt> also resets the counter pointer to zero, the assignment operator for <tt>counting_base</tt> does not decrement the counter again. Instead it is simply assigned. After this the raw pointer value can safely be copied.

The access operators are all trivial and need no further explanation: T* operator-> const T& operator* const T* peek const

Now only the <tt>manage</tt> member function remains: template <class T>  inline void counting_ptr<T>::manage(T* t)   { if (!t && release || t && t != pt && reinit) delete pt; pt = t;  } This one is only slightly tricky. If the parameter is the zero pointer, we want the reference count to also be the zero pointer, and thus <tt>release</tt> is called. Otherwise we want the counter value to be 1, so <tt>reinit</tt> is called instead. If either of those tells that the old counter is discarded, the object referred to is deallocated.

Efficiency
There is no question about it; there is a cost involved in using reference counting pointers. Every time an object is allocated, a pointer is also allocated, and vice-versa for deallocation. Depending on how efficient your compiler's memory manager is for small objects this cost may be negligible, or it may be severe. Every time a counting pointer is assigned, constructed, destroyed and copied, counter manipulation is done, and that costs a few CPU cycles.

Exercises

 * What happens if <tt>~T</tt> throws an exception?
 * What happens if allocation of a new counter fails?
 * The <tt>counting_base</tt> implementation and use is suboptimal. For example at destruction, the <tt>release</tt> member function is called twice, and the first time the counter pointer is set to zero to prevent decrementing the counter twice. Improve the implementation to never (implicitly) set the pointer to zero and yet always be safe.
 * Does the public derivation from <tt>counting_base</tt> open up any holes in the type safety, despite that all member functions are protected?

Coming up
If I missed something, or you want something clarified further or disagree with me, please drop me a line and I'll address your ideas in future articles.

Next month is devoted to Run Time Type Identification (RTTI,) one of the new functionalities in C++.