Template metaprogramming can shift work from runtime to compile-time (thus enabling earlier error dettection and higher runtime performance), can be used to generate custom code based on combinations of policy choices, and can also be used to avoid generating code inappropriate for particular types.

Template metaprogramming (TMP) is the process of writing template-based C++ programs that execute inside the C++ compiler, ending up with pieces of C++ source code instantiated from templates, which are then compiled as usual. TMP was discovered, not designed, in the early 1990s, and has later been shown to be Turing-complete, which means that it is powerful enough to compute anything (declare variables, perform loops, write and call functions, etc).

Strength of TMP

TMP has two great strengths:

it makes some things easy that would otherwise be hard or impossiblle
it shifts work from runtime to compile-time¹

By making use of TMP, we can expect following good consequences:

Some kinds of errors originally detected at runtime could be found during compilation
C++ progrmas become more efficient in about every way: smaller executables, shorter runtimes, lesser memory requirements

To take a glimpse into how things work in TMP, let’s look at two examples.

Examples

Ex1: if...else conditionals in TMP

As item 47 shows, if...else conditionals in TMP are expressed via templates and template specializations². Let’s take a comparason between the “normal” C++ (using typeid) and TMP (using traits) based on implementing the pseudo part of if (iter is a random access iterator) from STL’s advance

Below is the “normal” C++ approach, which evaluates at runtime:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12


template<typename IterT, typename DistT>
void advance(IterT& iter, DistT d)
{
    if (typeid(typename std::iterator_traits<IterT>::iterator_categoray) == 
        typeid(std::random_access_iterator_tag)) {
            iter += d; // use iterator arithmetic for random access iters
    }
    else {
        if (d >= 0) { while(d--) ++iter; }
        else { while(d++) --iter; }
    }
}

Compared to the one using traits, the typeid-based approach has following issues:

less efficient: testing occurs at runtime; executable is larger (since the code to do the testing must be present in the executable)
introduce compilation problems: if iter isn’t a random access iterator, iter will not support operator +=, so iter += d will not be valid, but compilers will still check this part of code and complain about its invalidation, because they are obliged to make sure that all source code is valid, even if it’s not executed.

Ex2: loops in TMP

TMP has no real looping construct, so the effect of loops is accomplished via recursion, or more specifically, recursive template instantiations. As an example, TMP factorial computation demonstates how it works:

1
2
3
4
5
6
7
8
9


template<unsigned n>  // general case: the value of Factorial<n> is n times the value of Factorial<n-1>
struct Factorial {
    enum { value = n * Factorial<n-1>::value };
};

template<>
struct Factorial<0> {
    enum { value = 1 };
};

The looping part of the code occurs where the template instantiation Factorial<n> references the tempalte instantiation Factorial<n-1>, until hitting the special case, Factorial<0>, that causes the recursion to terminate.

Each instantiation of the Factorial template is a struct, and each struct uses the enum hack (item 2) to declare a TMP variable named value, which holds the current value of the factorial computation. After recursive template instantiation, each instantiation gets its own copy of value.

To use Factorial:

1
2
3
4
5


int main()
{
    std::cout << Factorial<5>::value;  // prints 120
    std::cout << Factorial<10>::value; // prints 3628800
}

To sum up the technique keywords: templates and specializations and recursive instantiations and enum hacks.

What can be accomplished in TMP

1. Ensuring dimensional unit correctness

In scientific and engineering applications, it’s essential that dimensional units (e.g., mass, distance, time, etc.) be combined correctly. Using TMP, it’s possible to ensure (during compilation) that all dimensional unit combinations in a program are correct, no matter how complex the calculations - good example for early error detection.

2. Optimizang matrix operations

Consider the following code,

1
2
3
4


typedef SqureMatrix<double, 10000> BigMatrix;
BigMatrix m1, m2, m3, m4, m5;  // create matrices
... // give them values
BigMatrix result = m1 * m2 * m3 * m4 * m5; // compute the product

Calculating result in the “normal” way calls for the creation of four temporary matrices, one for the reuslt of each call to operator*. Furthermore, the independent multiplications generate a sequence of four loops over the matrix elements.

Using an advanced template technology related to TMP called expression templates, it’s possible to eliminate the temporaries and merge the loops, without changing the syntax of the client code above while enabling the program consume less memory and run dramatically fast.

3. Generating custom design pattern implementations

Design patterns like Strategy (item 35), Observer, Visitor, etc. can be implemented in many ways. Using a TMP-based technology called policy-based design, it’s possible to create templates representing independent design choices (“policies”) that can be combined in arbitrary ways to yield pattern implementations with custom behavior.

Generalized beyond the domain of programming artifacts like design patterns, this technology is a basis for what’s known as generative programming.

Programs using TMP may take much longer to compile than their non-TMP counterparts. ↩︎
Basic constructs, such as declaring variables, performing loops, and calling function, may look very different from their “normal” C++ counterparts, but that’s assembly-level TMP - it’s worth to know that libraries for TMP (e.g., Boost’s MPL, item 55) offer a higher-level syntax. ↩︎

Item-48 Be aware of template metaprogramming