Reference counting is technique that allows multiple objects with the same value to share a sinple representation of that value.
Consider a customized naive version of class String;
: its assignment operator is implemented in a naive way:
|
|
When we write statement a = b = c = d = e = "Hello";
where a
, b
, c
, d
and e
are all String
type, we get five objects, each with the same value “Hello”:
┌───┐ ┌───────┐ ┌───┐ ┌───────┐
│ a │ --> │ Hello │ │ b │ --> │ Hello │
└───┘ └───────┘ └───┘ └───────┘
┌───┐ ┌───────┐ ┌───┐ ┌───────┐
│ c │ --> │ Hello │ │ d │ --> │ Hello │
└───┘ └───────┘ └───┘ └───────┘
┌───┐ ┌───────┐
│ e │ --> │ Hello │
└───┘ └───────┘
Ideally, we’d like to change the picture to look like this:
┌───┐
│ a ├──┐
└───┘ |
┌───┐ |
│ c ├──┤
└───┘ |
┌───┐ | ┌───────┐
│ e ├──┼─>│ Hello │
└───┘ | └───────┘
┌───┐ |
│ e ├──┤
└───┘ │
┌───┐ │
│ e ├──┘
└───┘
In practice, we need to keep track of how many objects are sharing - refering to- a value to make sure the best time to destroy or modify the value “Hello”, so we need to add reference count into the picuture:
┌───┐
│ a ├──┐
└───┘ |
┌───┐ |
│ c ├──┤
└───┘ |
┌───┐ | ┌───┐ ┌───────┐
│ c ├──┼─>│ 5 ├───>│ Hello │
└───┘ | └───┘ └───────┘
┌───┐ |
│ d ├──┤
└───┘ │
┌───┐ │
│ e ├──┘
└───┘
Implementing Reference Counting
From the picture above, we can see we need one reference count per string value, instead of one per string object. This implies a decoupling between values and reference counts, leading to our first design: nesting a StringValue
struct in the private part of String
class, so that all the members of String
class get full access to this inner data structure, while everybody else get denied (except friends of the class).
|
|
The primary purpose of StringValue
is to provide a place to associate a particular value with a count of the number of String
objects sharing that value, so there’s need to define copy constructor or assignment operator for this inner struct, and we provide the manipulation of the refCount
field in String
class:
|
|
Copy-on-write
Now comes the troublesome one: an array-bracket operator([]), which allows individual characters within strings to be read and written:
|
|
It’s straightforward to implement the const version, because it’s a read-only operation:
|
|
However, since non-const version of operator[]
might be called to write a character, the implementation must consider more scenario to avoid modifying the value of other String
objects that happen to be sharing the same StringValue
object - since there’s no way for C++ compilers to tell us whether a particular use of operator[]
is for a read or a write, we must be pessimistic and assume all calls to the non-const operator[]
are for writes (Proxy classes casn help us differentiate reads from writes, see MECpp item 30.)
|
|
This technique - to share a value with other objects until we have to write on our own copy of the value - is the well-knwon copy-on-write, which is a specific example of lazy evaluation (MECpp item 17), which is a more general approach to efficiency.
Pointers, References, and Copy-on-write
Consider this code:
|
|
The data structure looks like this:
┌───┐
│s1 ├──┐ ┌───┐ ┌───────┐
└───┘ ├─>│ 2 ├───>│ Hello │
┌───┐ │ └───┘ └───────┘
│s2 ├──┘ ↑
└───┘ p
Now there is a dangerous situation, where pointer p
modifies both s1
and s2
:
|
|
To eliminate the problem, we add a flag to each StringValue
object indicating whether that object is shareable. Initially, the flag is set to true
(indicating shareable), but turn it off whenever the non-const operator[]
is invoked on the value represented by that object (once the flag is set to false
, it stays that way forever).
|
|
A Reference-Counting Base class
Reference counting is useful for more than just strings, so it’s good practice to separate reference counting code in a context-independent manner. This leads us to the design of a base class RCObject
. Any class wishing to take advantage of automatic reference counting may inherit from this class. Basically, for general purpose usage, RCObject
class should include
- the reference count, as well as functions for incrementing and decrementing that count.
- the code for destroying a value when it is no longer in use (count == 0)
- a field that keeps track of whether this value is shareable, as well as functions to query this flag and set it to false
|
|
|
|
In this design, there are a few things worth noting:
- The
refCount
is set to 0 in both constructors to simplifies the set up process for the creaters ofRCObject
s when they setrefCount
to 1 themselves - Copy constructor sets
refCount
to 0, because this function is meant to create a new object representing a value, which is always unshared and referenced only by their creator (who will set uprefCount
properly later). - The assignment operator is unlikely to be called, since
RCObject
is a base class for a shared value object, which means in a reference counting system, it is usually the object pointing to these base-class objects that are assigned to one another, with onlyrefCount
being modified as a result. We don’t declare assignment operatorprivate
, because there’s a chance that someone does have a reason to allow assignment of reference-counted values(e.g., change the string value stored insideStringValue
in the example above), so we adopt this “do nothing” implementation, which is exactly the right thing to do, because the assignment of value objects doesn’t affect the reference count of objects pointing to eitherlhs
orrhs
of assignment operation: this base-class level assignment is meant to changelhs
’s value, meaning all the objects pointing tolhs
now pointing to a new value. - Here we use
delete this;
forremoveReference
, which is safe only if we know that*this
is a heap object. In order to ensure this, we might need technichs discussed in MECpp item 27 to restrictRCObject
to be created only on the heap1.
Now taking advantage of this new reference-counting base class, we modify StringValue
to inherit its reference counting capabilities from RCObject
:
|
|
After this change, RCObject
now provide the manipulation ability of the refCount
field, instead of StringValue
.
Automating Reference Count Manipulations
The RCObject
class only gives us a place to store a reference count, as well as the member functions to manipulate the refCount
field. However, the calls to these functions must still be mannually inserted in other classes: String
copy constructor and assignment operator need to call addReference
and removeReference
on StringValue
objects, which is not good practice for reuse.
To remove such manual works, we introduce smart pointer for help:
|
|
|
|
There are three assumptions in this implementation:
T
has a deep-copying constructor, becausepointee = new T(*pointee);
will callT
’s copy constructor. In the example above,String::StringValue
lack such a user-defined copy constructor, and compiler generated default copy constructor will not copychar*
stringdata
points to, so we need to add a customized version of copy constructor:
|
|
- For the same statement calling
T
’s copy constructor, we assume the type of*pointee
isT
rather thanT
’s derived class. If, however, chances arepoinee
might point toT
’s derived class instances, we need to use a virtual copy constructor. T
should prove all the functionality thatRCObject
does, either or not by inheriting fromRCObject
.
Puting Everyting Together
┌──────────┐
┌──────────┐ │ RCObject │
│ String │ │ class │
│ object │ └──────────┘
│ │ ↑ public inheritance
│ ┌─────┐ │ ┌───────────┐ ┌────────────┐
│ │RCPtr├──┼────────>│StringValue├────────>| Heap Memory|
│ └─────┘ │ pointer │ object | pointer └────────────┘
└──────────┘ └───────────┘
The class declaration looks like this:
|
|
It is worth to note that we don’t need the copy constructor or assignment operator for String
anymore: compiler-generated copy constructor for Stirng
will automatically call the copy constructor for Stirng
’s RCPtr
member, and the copy constructor for that class will perform all the necessary manipulations of the StringValue
object, including its reference count, and the same goes for assignment and destruction. That’s why it is called smart pointer.
Now here is all the implementation:
|
|
|
|
|
|
|
|
Adding Refenrence Counting to Existing Classes
Given some class Widget
that’s in a library we can’t modify, and suppose we want to apply the benefits of reference counting to Widget
without being able to inherit Widget
from RCObject
, we solve the problem with an additional level of indirection by adding a new class CountHolder
, which does three jobs:
- Hold the reference
- Inherit from
RCObject
- Contain a pointer to a
Widget
The only thing left to do is just an equivalent smart pointer as RCPtr
, and we call it RCIPtr
, where “I” stands for “indirect”. Thus, we get someting like this:
┌──────────┐
┌──────────┐ │ RCObject │
│ RCWidget │ │ class │
│ object │ └──────────┘
│ ┌──────┐ │ ↑ public inheritance
│ |RCIPtr| │ ┌───────────┐ ┌─────────────┐
│ |Object├─┼────────>│CountHolder├────────>|Widget Object|
│ └──────┘ │ pointer │ object | pointer └─────────────┘
└──────────┘ └───────────┘
Since here CountHolder
is just an implementation detial of RCIPtr
, we can simply nested it inside RCIPtr
, just as how StringValue
relates with String
.
|
|
Then, for a library class Widget
with following interface:
|
|
We can implementing wrapper RCWidget
by simply forwarding the call through underlying RCIPtr
to a Widget
object:
|
|
As with Stirng
class, there’s no need to write copy constructor, assignment operator, or destructor, because the default versions do the right thing.
-
In this example, we guarantee the value objects should be created only via
new
by declaringStringValue
asprivate
inString
, so onlyString
can createStringValue
objects and the auther of theString
class is able to ensure all such objects are allocated vianew
. ↩︎