Templates

Templates are notorious for the code bloat they produce. Some organisations explicitly forbid the use of templates in their internal C++ coding standards. However, templates are a very powerful tool: it is very difficult (if not impossible) to write generic source code that can be reused in multiple independent projects/platforms without using templates and without incurring significant performance penalties. I think developers who are afraid of templates, or are not allowed to use them, will have to implement the same concepts/modules over and over again with minor project/platform-specific differences. To properly master templates we have to see the duplicated assembly code that the compiler generates when templates are used. Let's try to compile a simple application, test_cpp_templates, that uses a templated function with different types of input parameters:

template <typename T> 
void func(T startValue) 
{ 
    for (volatile T i = startValue; i < startValue * 2; i += 1) {} 
    for (volatile T i = startValue; i < startValue * 2; i += 2) {} 
    for (volatile T i = startValue; i < startValue * 2; i += 3) {} 
    for (volatile T i = startValue; i < startValue * 2; i += 4) {} 
    for (volatile T i = startValue; i < startValue * 2; i += 5) {} 
    for (volatile T i = startValue; i < startValue * 2; i += 6) {} 
} 

int main(int argc, const char** argv) 
{ 
    static_cast<void>(argc); 
    static_cast<void>(argv); 

    int start1 = 100; 
    unsigned start2 = 200; 

    func(start1); 
    func(start2); 

    while (true) {}; 
    return 0; 
}

You may notice that function func is called twice, once with an argument of type int and once with an argument of type unsigned. Both types have the same size, so the two calls should generate more or less identical code. Let's take a look at the generated code of the main function:

00008504 <main>:
    8504:    e92d4008     push    {r3, lr}
    8508:    e3a00064     mov    r0, #100    ; 0x64
    850c:    ebfffefc     bl    8104 <_Z4funcIiEvT_>
    8510:    e3a000c8     mov    r0, #200    ; 0xc8
    8514:    ebffff3a     bl    8204 <_Z4funcIjEvT_>
    ...

Yes, indeed, there are two calls to two different functions. However, the assembly code of these functions is almost identical. Let's also try to use the same function with the same types, but from a different source file:

void other() 
{ 
    int start1 = 300; 
    unsigned start2 = 500; 

    func(start1); 
    func(start2); 
}

The generated code is:

000080d8 <_Z5otherv>: 
    80d8:    e92d4008     push    {r3, lr} 
    80dc:    e3a00f4b     mov    r0, #300    ; 0x12c 
    80e0:    eb000007     bl    8104 <_Z4funcIiEvT_> 
    80e4:    e3a00f7d     mov    r0, #500    ; 0x1f4 
    80e8:    eb000045     bl    8204 <_Z4funcIjEvT_> 
    80ec:    e8bd8008     pop    {r3, pc}

We see that the same functions at the same addresses are called, i.e. the linker does its job of removing duplicates of the same functions from different object files.
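The deduplication above happens at link time: each object file that uses func with a given type normally carries its own copy of the instantiation, and the linker keeps only one. If emitting those per-object copies is itself undesirable (for example, because of build time or object file size), C++11 explicit instantiation declarations are one way to avoid them. The following is only a minimal sketch collapsed into a single listing; the file names in the comments are hypothetical, and func returns a value here (unlike the original) purely to make its result observable.

```cpp
// --- func.h (hypothetical header included by every source file) ---
template <typename T>
T func(T startValue)
{
    // Sum of all values in [startValue, startValue * 2).
    T sum = 0;
    for (T i = startValue; i < startValue * 2; ++i) {
        sum += i;
    }
    return sum;
}

// Explicit instantiation declarations: a source file that sees these
// does not instantiate func<int> / func<unsigned> implicitly and
// expects the definitions to come from elsewhere.
extern template int func<int>(int);
extern template unsigned func<unsigned>(unsigned);

// --- func.cpp (hypothetical, exactly one source file) ---
// Explicit instantiation definitions: the single place where the code
// for these two instantiations is generated.
template int func<int>(int);
template unsigned func<unsigned>(unsigned);
```

Note that this does not make func<int> and func<unsigned> share code; it only controls where each instantiation is emitted. The final image should contain the same two functions as before.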

Let's also try to wrap the same function in a class and add one more template argument:

#include <cstddef> // for std::size_t

template <typename T, std::size_t TDummy>
struct SomeTemplateClass
{
    static void func(T startValue)
    {
        for (volatile T i = startValue; i < startValue * 2; i += 1) {}
        for (volatile T i = startValue; i < startValue * 2; i += 2) {}
        for (volatile T i = startValue; i < startValue * 2; i += 3) {}
        for (volatile T i = startValue; i < startValue * 2; i += 4) {}
        for (volatile T i = startValue; i < startValue * 2; i += 5) {}
        for (volatile T i = startValue; i < startValue * 2; i += 6) {}
    }
};

Please note the dummy template parameter TDummy that is not used. Now, we add two more calls to the main function:

int main(int argc, const char** argv)
{
    ...
    SomeTemplateClass<int, 5>::func(500);
    SomeTemplateClass<int, 10>::func(500);

    while (true) {};
    return 0;
}

Note that the functionality of the two calls is identical; the only difference is the dummy template argument. Let's take a look at the generated code:

00008504 <main>:
    ...
    8518:    e3a00f7d     mov    r0, #500    ; 0x1f4
    851c:    ebffff78     bl    8304 <_ZN17SomeTemplateClassIiLj5EE4funcEi>
    8520:    e3a00f7d     mov    r0, #500    ; 0x1f4
    8524:    ebffffb6     bl    8404 <_ZN17SomeTemplateClassIiLj10EE4funcEi>
    8528:    eafffffe     b    8528 <main+0x24>

The compiler generated calls to two different functions whose binary code is identical.

CONCLUSION: Templates do require extra care and consideration. It is also important not to overthink things. The well-known advice “Do not do premature optimisations. It is much easier to make correct code faster than fast code correct.” also applies to code size. Do not try to optimise your template code before the need arises. Make it work, and work correctly, first.
