|
Introduction
Last month's ``auto_ptr<T>'' class template,
documents and implements ownership transfer of dynamically
allocated memory. Often, however, we do not want to be bothered with
ownership. We want several places of the code to be able to access the
memory, but we also want to be sure that the memory is deallocated when no
longer needed. The general solution to this is called automatic garbage
collection (something you can read several theses on, and also buy a few
commercially available libraries for.) The less general solution is
reference counting; no owner, but last one out locks the door. The idea is
that a counter is attached to every object allocated. When allocated it is
set to 0. When the first smart pointer attaches to it, the count is
incremented to 1. Every smart pointer attaching to the resource,
increments the reference count; and every smart pointer detaching from a
resource (the smart pointer destroyed, or assigned another value) the
resource's counter is decremented. If the counter reaches zero, no one is
referring to it anymore, so the resource must be deallocated. The weakness
of this compared to automatic garbage collection is that it does not work
with circular data structures (the count never goes below 1.)
The problems to solve
Many of the problems with a reference counting pointer are the same as for the
auto pointer. The list is actually a bit shorter, since there's no need to worry
about ownership.
- exception safety
- no confusion with normal pointers
- controlled and visible rebinding and access
- works with polymorphic types
- pointer-like syntax for pointer-like behaviour
- automatic deletion when no longer referring to the object
This might also be the place to mention a problem not to solve; that of
how to stop reference counting a resource. Adding this functionality is
not difficult, but it quickly leads to user code that is extremely hard to
maintain. This is exactly what we want to avoid, so it is better not to
have the functionality.
Here is how it is supposed to work when we are done:
counting_ptr<int> P1(new int(value));
After creating a counting pointer ``P1'', the
reference count for the value pointed to is set to one.
counting_ptr<int> P2(P1);
When a second counting pointer ``P2'' is created
from ``P1'', the object pointed to is not
duplicated, but the reference count is incremented.
counting_ptr<int> P3(P2);
When three counting pointers refer to the same object, the value of the
counter is three.
P1.manage(new int(other));
As one of the pointers referring to the first object created is
reinitialized to another object, the old reference count is decremented
(there are only two references to it now) and the new one is assigned a
reference count of 1.
P2=P1;
When yet one of the pointers move attention from the old object to the new
one, the counter for the old one is yet again decremented, and for the new
one it is incremented.
P3=P2;
Now that the last counting pointer referring to the old object moves its
attention away from it, the old objects reference count goes to zero, and
the object is deallocated. Now instead the new object has a reference
count of 3, since there are three reference counting pointers referring to
it.
Interface outline
The interface of the reference counting smart pointer will, for obvious
reasons, share much with the auto pointer. The differences lie in
accessing the raw pointer and giving the pointer a new value. While these
aspects could use the same interface as does the auto pointer, giving the
reference counting pointer an identical interface, this would be very
unfortunate since their semantics differ dramatically. Here is a suggested
interface.
template <class T>
class counting_ptr
{
public:
explicit counting_ptr(T* t = 0);
counting_ptr(const counting_ptr<T>& t) throw();
template <class Y>
counting_ptr(const counting_ptr<Y>& ty) throw();
~counting_ptr() throw ();
counting_ptr<T>&
operator=(const counting_ptr<T>& t) throw ();
template <class Y>
counting_ptr<T>&
operator=(const counting_ptr<Y>& ty) throw();
T& operator*(void) const throw ();
T* operator->(void) const throw ();
T* peek(void) const throw();
void manage(T* t);
};
Compared to the auto pointer, the only differences are that some member
functions do not have an empty exception specification, and there is no
member function corresponding to
``auto_ptr<T>::release()'' (which stops the
managing of the pointer.) The member function
``get'' is here named
``peek''. I think this name better describes what
is happening. To me the word ``get'' associates
with a transfer, and there is no transfer occurring. All we do is to peek
inside and see what the internal raw pointer value is. The member function
``reset'' is here named
``manage'', and the reason is a big difference
in semantics.
Where to store the reference count
Before we can dive into implementation, we must figure out where to store
the reference count. It is obvious that the counter belongs to the object
referred to, so it cannot reside in the smart pointer object. A solution
that easily springs to mind is to use a struct with a counter, and the
type referred to, like this:
template <class T>
class counting_pointer {
public:
private:
struct representation {
unsigned counter;
T value;
};
representation* ptr;
};
With this construct, we have the object and counter together.
Unfortunately there are two severe drawbacks. We cannot use it for dynamic
binding, since the ``value'' component is indeed a
value and dynamic binding only works through pointers and references. It
also becomes very difficult to write the constructor and
``manage'' member function.
The work around is simple; use a ``T*''
instead. All we need to work this way is to make sure to allocate this
representation struct on heap in the constructor and
``manage'' member functions (and of course to
deallocate the struct when we're done with it.)
There is a performance disadvantage with this, however. Whenever
accessing the object referred to, we must follow two pointers, the pointer
to the representation and the pointer to the object from the
representation.
The best solution I have seen is to decouple the representation from
the object and instead allocate an ``unsigned''
and in every counting pointer object keep both a pointer to the counter
and to the object referred to. This gives the following data section of
our counting pointer class template:
template <class T>
class counting_pointer {
public:
private:
T* ptr;
unsigned* pcount;
};
Accessibility
The solution outlined above is so good it almost works. For compilers that
do not support member templates, such that the assignment and construction
from a counting pointer of another type are impossible, it is all we need.
However, if we want the ability to assign a
``counting_ptr<T>'' object from a value of
type ``counting_ptr<Y>'' if a
``T*'' can be assigned from a
``Y*'', we must think of something. The problem is
that ``counting_ptr<T>'' and
``counting_ptr<Y>'' are two distinct types,
so both the member variables are private and thus inaccessible. The value
of the raw pointer member can be accessed through the public
``peek'' member function, but we need a solution
for accessing the counter without making it publicly available.
This kind of problem is exactly what ``friend''
declarations are for, but there is a two-fold problem with that. To begin
with, extremely few compilers support template friends; a rather new
addition to the language. Second, member templates open up holes in the
type system you can only dream of. For the curious, please read Scott Meyer's paper on
the topic.
One step on the way towards a solution is to see that the management of
the counter is independent of the type T, so we can implement the counter
managing code in a separate class. A reference counting class may look
like:
class counting_base
{
public:
counting_base(unsigned count = 0);
counting_base(const counting_base& cb) throw();
counting_base&
operator=(const counting_base& cb) throw ();
~counting_base(void) throw();
int release() throw();
int reinit(unsigned count=1);
private:
unsigned* pcount;
};
The idea here is that the this class handles every aspect of the reference counter.
We just need to tell it how to behave, and it reports the results to us.
The default constructor allows us to choose whether we want a reference
counter or not. If our reference counting pointer is initialized with the
0 pointer, there is no need to waste time and memory by allocating a
reference counter for it.
The member functions ``release'' and
``reinit'' return 1 if the old counter is
discarded (and hence, we should delete the object we refer to) or 0 if it
was just decremented.
Base implementation
It is fairly easy to implement, and makes life easier later on, even when
member templates are not available.
inline counting_base::counting_base(unsigned count)
: pcount(count ? new unsigned(count) : 0)
{
if (count && !pcount) throw bad_alloc();
}
Initialize a counter with the value of the parameter. A count of 0 is represented
by a 0 pointer. If you have a modern compiler, the function body throwing the
exception will not be necessary; operator new will throw ``bad_alloc''
in out-of-memory conditions. For older compilers, however, you need to define
the ``bad_alloc'' class.
inline counting_base::counting_base(
const counting_base& cb
) throw ()
: pcount(cb.pcount)
{
if (pcount) ++(*pcount);
}
Copying a reference counting object means adding one to the reference
counter (since there is now one more object referring to the counter.)
When we have 0 pointers, there is nothing to update. Note that the copy
constructor never involves any allocation or deallocation of dynamic
memory. In fact, there is nothing in the copy constructor that can throw
exceptions.
inline counting_base::~counting_base() throw()
{
release();
}
Destroying a reference counting object means decrementing the reference
counter and deallocating it if the count goes to zero (last one out locks
the door.) Since this code is needed in the assignment operator, as well
as in the public interface for use by the reference counted smart pointer
class template, we implement that work in the
``release'' member function.
inline counting_base&
counting_base::operator=(
const counting_base& cb
) throw()
{
if (pcount != cb.pcount) {
release();
pcount=cb.pcount;
if (pcount) ++(*pcount);
}
return *this;
}
Assignment of reference counting objects means decrementing the reference
count for the left hand side object, and incrementing it for the right
hand side object; since there will be one less object referring to the
counter from the left hand side object, and one more referring to the
counter from the right hand side object.
inline int counting_base::release() throw ()
{
if (pcount && --(*pcount) == 0) {
delete pcount;
pcount = 0;
return 1;
}
pcount=0;
return 0;
}
If the reference count goes to zero the counter is deallocated. A return
value of 1 means deallocation took place (hinting to the user of this
class that it should deallocate whatever object it refers to, since it is
the last one referring to it.) In both cases the pointer to the counter is
set to 0 as a precaution. It may be that
``release'' is called just prior to destruction,
and the destructor calls ``release'' to deallocate
memory. If the pointer to the counter is not set to zero it means either
referring to just deleted memory, or decrementing the reference count
twice.
inline int counting_base::reinit(unsigned count)
{
if (pcount && --(*pcount) == 0) {
*pcount = count;
return 1;
}
pcount = count ? new unsigned(count) : 0;
if (count && !pcount) throw bad_alloc();
return 0;
}
Strictly speaking, ``reinit'' is not needed. Its
purpose is to release from the current counter and initialize a new one
with a defined value. It is an optimization of memory handling, however,
in that if this object was the last reference the counter is reinitialized
instead of deallocated and then allocated again. If it was not the last
object referring to the counter, a new counter must of course be
allocated.
As an exercise, prove to yourself that this reference counting base
class does not have any memory handling errors (i.e. it always deallocates
what it allocates, never deallocates the same area twice, never accesses
uninitialized or just deallocated memory, and never dereferences the 0
pointer.)
Accessibility again
As nice and convenient the above helper class is, it really does not solve
the accessibility problem. It does make the implementation a bit easier,
though.
The problem remains. class
``counting_ptr<T>'' and
``counting_ptr<Y>'' are different classes
and because of this are not allowed to see each others private sections.
The easy way out is to use public inheritance, and say that every counting
pointer is-a counting base. That is a solution that is simple,
sweet and dead wrong. There is no is-a relationship here, but an
is-implemented-in-terms-of relationship. Such relationships are
implemented through private member data or private inheritance - almost.
This is bordering on abuse, but it works fine. Instead of having the
member functions of the ``counting_base'' class
public, they can be declared protected. As such they do not become part of
the public interface, and the is-a relationship is mostly imaginary.
Implementation of a reference counting pointer
Finally we can get to work and write the reference counting pointer,
and since the helper class does most of the dirty work, the implementation
is not too convoluted.
template <class T>
inline counting_ptr<T>::counting_ptr(T* t)
: counting_base(t ? 1 : 0),
pt(t)
{
}
Initialize the counter to 1 if we have a pointer value, and 0 otherwise.
template <class T>
inline counting_ptr<T>::counting_ptr(
const counting_ptr<T>& t
) throw()
: counting_base(t),
pt(t.pt)
{
}
template <class T> template <class Y>
inline counting_ptr<T>::counting_ptr(
const counting_ptr<Y>& ty
) throw()
: counting_base(ty),
pt(ty.peek())
{
}
Note how the latter makes use of the knowledge that a
``counting_ptr<Y>'' is-a
``counting_base''.
inline counting_ptr<T>::~counting_ptr() throw()
{
if (release()) delete pt;
}
The destructor is important to understand. If you review the functionality
of the ``release'' member function of
``counting_base'', you see that it deallocates the
counter if, and only if, the count reaches 0 and then returns 1, otherwise
it returns 0. This means that in the destructor we will deallocate
whatever ``pt'' points to if, and only if the
reference count for it reaches zero and is deallocated.
template <class T>
inline counting_ptr<T>&
counting_ptr<T>::operator=(
const counting_ptr<T>& t
) throw()
{
if (pt != t.pt) {
if (release()) delete pt;
counting_base::operator=(t);
pt=t.pt;
}
return *this;
}
template <class T> template <class Y>
inline counting_ptr<T>&
counting_ptr<T>::operator=(
const counting_ptr<Y>& ty
) throw()
{
if (pt != ty.peek()) {
if (release()) delete pt;
counting_base::operator=(ty);
pt=ty.peek();
}
return *this;
}
Please spend some time convincing yourself that the above works. What
happens is that if the left hand side and right hand side counting
pointers already refer to the same object, nothing happens. If they refer
to different objects, on the other hand, the object referred to by the
left hand side reference counting pointer is deleted if,
``release'' returns 1, which it does if, and only
if, the reference count goes to zero and the counter is deallocated. Since
``release'' also resets the counter pointer to
zero, the assignment operator for
``counting_base'' does not decrement the counter
again. Instead it is simply assigned. After this the raw pointer value can
safely be copied.
The access operators are all trivial and need no further explanation:
T* operator->() const
T& operator*() const
T* peek() const
Now only the ``manage'' member function remains:
template <class T>
inline void counting_ptr<T>::manage(T* t)
{
if (!t && release() || t && t != pt && reinit())
delete pt;
pt = t;
}
This one is only slightly tricky. If the parameter is the zero pointer, we
want the reference count to also be the zero pointer, and thus
``release'' is called. Otherwise we want the
counter value to be 1, so ``reinit'' is called
instead. If either of those tells that the old counter is discarded, the
object referred to is deallocated.
Efficiency
There is no question about it; there is a cost involved in using reference
counting pointers. Every time an object is allocated, a pointer is also allocated,
and vice-versa for deallocation. Depending on how efficient your compiler's
memory manager is for small objects this cost may be negligible, or it may be
severe. Every time a counting pointer is assigned, constructed, destroyed and
copied, counter manipulation is done, and that costs a few CPU cycles.
Exercises
- What happens if ``~T'' throws an exception?
- What happens if allocation of a new counter fails?
- The ``counting_base'' implementation and use is suboptimal. For
example at destruction, the ``release'' member function is called
twice, and the first time the counter pointer is set to zero to prevent
decrementing the counter twice. Improve the implementation to never (implicitly) set the pointer to zero and yet always be safe.
- Does the public derivation from ``counting_base'' open up any holes
in the type safety, despite that all member functions are protected?
Coming up
If I missed something, or you want something clarified further or disagree with
me, please drop me a line and I'll address your ideas in future articles.
Next month is devoted to Run Time Type Identification (RTTI,) one of
the new functionalities in C++.
/Björn.
|
|