Let’s imagine that you are responsible for the construction—from the ground up—of a brand new module in a big weather prediction system. Your task is to take care of the distribution of complex computations on a large computing grid, while another team has the responsibility for the actual computation algorithms (in a library created two decades previously).
We will see in this chapter what kinds of problems arise when you try to interface two bricks that were created 20 years apart, examine the typical approaches, and see if the template metaprogramming approach brings any benefit.
After two years of development, your distributed weather system is at last done! You’ve been very thorough in applying modern C++ principles all along, and took advantage of pass-by-value everywhere you could. You are happy with the performance, the software is now stable, and you’ve made the design as sound as possible given the time you had.
But now, you need to interface with “the Thing,” aka “The Simulation Library of Awesomeness,” or SLA for short.
The SLA was designed in the 1990s by developers who have now gone insane or missing. Every time you install the SLA on a system, it is no longer possible to run any other kind of software without having a team of senior system administrators perform a week-long ritual to cleanse the machine.
Last but not least, the SLA only believes in one god, and that god is The Great Opaque Pointer. All interfaces are made as incoherent as possible to ensure that you join the writers in an unnamable crazy laughter, ready to be one with The Great Opaque Pointer.
If you didn’t have several years of experience up your sleeve, you would advocate a complete rewrite of the SLA—but you know enough about software engineering to know that “total rewrite” is another name for “suicide mission.”
Are we dramatizing? Yes, we are. But let’s have a look at a function of the SLA:
// we assume alpha and beta to be parameters to the mathematical
// model underlying the weather simulation algorithms--any
// resemblance to real algorithms is purely coincidental
void adjust_values(double * alpha1,
                   double * beta1,
                   double * alpha2,
                   double * beta2);
Now let’s have a look at how you designed your application:
class reading
{
    /* stuff */
public:
    double alpha_value(location l, time t) const;
    double beta_value(location l, time t) const;
    /* other stuff */
};
Let us not try to determine what those alpha and beta values are, whether the design makes sense, or what exactly adjust_values does. What we really want to see is how we adapt two pieces of software that have very different logic.
Interfacing your software with other software is part of your job. It is easy to mock the lack of logic or cleanliness of a program that has been running and maintained for 25 years, but at the end of the day, it must work; no excuses.
In this case, you might be tempted to take a pragmatic approach and just interface functions as needed, with a wrapper like this:
std::tuple<double, double, double, double> get_adjusted_values(
    const reading & r,
    location l,
    time t1,
    time t2)
{
    double alpha1 = r.alpha_value(l, t1);
    double beta1 = r.beta_value(l, t1);

    double alpha2 = r.alpha_value(l, t2);
    double beta2 = r.beta_value(l, t2);

    adjust_values(&alpha1, &beta1, &alpha2, &beta2);

    return std::make_tuple(alpha1, beta1, alpha2, beta2);
}
You can see that we use a tuple to “return a bunch of otherwise unrelated stuff.” This is a common pattern in modern C++, and later you will see why using tuples has some advantages when it comes to metaprogramming.
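As a small illustration of how a caller might consume such a tuple, the sketch below unpacks the returned values with std::tie and std::get. The names sum_and_product and sum_minus_product are hypothetical stand-ins chosen for this example; they are not part of the weather system or the SLA.

```cpp
#include <tuple>

// Hypothetical stand-in for a wrapper returning several loosely related results.
std::tuple<double, double> sum_and_product(double a, double b)
{
    return std::make_tuple(a + b, a * b);
}

// Unpacks the tuple into named variables with std::tie.
double sum_minus_product(double a, double b)
{
    double sum = 0.0;
    double product = 0.0;
    std::tie(sum, product) = sum_and_product(a, b);
    return sum - product;
}
```

A single member can also be read directly, for example with std::get<1>(sum_and_product(2.0, 3.0)).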
But if we look again at the manual approach, we can see a certain number of issues:
You could retort, “Fine, let’s make it generic,” as shown here:
template <typename Reading>
std::tuple<double, double, double, double> get_adjusted_values(
    const Reading & r,
    location l,
    time t1,
    time t2)
{
    double alpha1 = r.alpha_value(l, t1);
    double beta1 = r.beta_value(l, t1);

    double alpha2 = r.alpha_value(l, t2);
    double beta2 = r.beta_value(l, t2);

    adjust_values(&alpha1, &beta1, &alpha2, &beta2);

    return std::make_tuple(alpha1, beta1, alpha2, beta2);
}
Sure, it’s an improvement, but not a big improvement. To which you will reply, “Fine, let’s make the methods generic!” as in this example:
template <typename AlphaValue, typename BetaValue>
std::tuple<double, double, double, double> get_adjusted_values(
    AlphaValue alpha_value,
    BetaValue beta_value,
    location l,
    time t1,
    time t2)
{
    double alpha1 = alpha_value(l, t1);
    double beta1 = beta_value(l, t1);

    double alpha2 = alpha_value(l, t2);
    double beta2 = beta_value(l, t2);

    adjust_values(&alpha1, &beta1, &alpha2, &beta2);

    return std::make_tuple(alpha1, beta1, alpha2, beta2);
}
And you would call the function as follows:
reading r;

// some code

auto res = get_adjusted_values(
    [&r](double l, double t){ return r.alpha_value(l, t); },
    [&r](double l, double t){ return r.beta_value(l, t); },
    /* values */);
What we will see here is how we can push this principle of reusability and genericity much further, thanks to template metaprogramming.
What we want to avoid writing is all the systematic code that takes the results from our C++ methods, puts them in the correct form for the C function, passes them to that C function, and gets the results in a form compatible with our C++ framework.
We can call it the boilerplate.
With template metaprogramming techniques, we will make the compiler work for us and avoid a lot of mistakes and tedious work.
You may be thinking, “I can write a Python script that will generate the code for me.” This is indeed doable, if the wrapping code isn’t too complex and you do not need comprehensive C++ parsing. It will increase the complexity of building and maintaining your application, however, because in addition to requiring a compiler, you will now require a scripting language, probably with a certain set of libraries. This kind of solution is another form of automation.
You might also create an abstraction around the library, or at least a facade. You’ll still have one problem left, though: you have to write all of the tedious code.
But… computers are very good at repetitive tasks, so why not program the computer to write the facade for you? Wouldn’t that greatly increase your productivity?
Why not give it a try? In other words, let’s write a program that will generate the program. Let’s metaprogram!
If we look at the problem from a higher perspective, we see that we have on one side methods working with values, and on the other side functions working with pointers.
The typical C++ approach for a function that takes one parameter and returns one parameter is straightforward:
template <typename ValueFunction, typename PointerFunction>
double magic_wand(ValueFunction vf, PointerFunction pf, double param)
{
    double v = vf(param);
    pf(&v);
    return v;
}
We take a callable, vf, that accepts a double as a parameter and returns a double. Because we’re using a template, we don’t need to be specific about what exactly vf is (it can be a function, a functor, or a method bound to an object instance).
The callable pf accepts a pointer to a double as a parameter and updates the value. We then return the updated value.
We called that function magic_wand because it’s the magic wand that makes your type problem go away!
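To make the single-parameter case concrete, here is a minimal usage sketch. The template is the one described in the text; celsius_to_kelvin and halve_in_place are hypothetical functions invented for this example, one working on values and one working through a pointer in the C style.

```cpp
// Same single-parameter magic_wand as in the text.
template <typename ValueFunction, typename PointerFunction>
double magic_wand(ValueFunction vf, PointerFunction pf, double param)
{
    double v = vf(param);
    pf(&v);
    return v;
}

// Hypothetical C++-style function working on values.
double celsius_to_kelvin(double celsius)
{
    return celsius + 273.0;
}

// Hypothetical C-style function updating a value through a pointer.
void halve_in_place(double * value)
{
    *value /= 2.0;
}
```

Calling magic_wand(celsius_to_kelvin, halve_in_place, 27.0) first computes the value and then lets the pointer-based function update it. A lambda works equally well for either callable.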
But the problem is that we have more than one function and more than one parameter. We therefore need to somehow guess the type of the function, manipulate the type to correctly extract values, pass a pointer to these values to the PointerFunction, and return the result.
If you pause to think about it, you’ll quickly realize that we need two capabilities:
In other words, we’d like to write C++ that modifies types and not values. Template metaprogramming is the perfect tool for compile-time type manipulations.
Let us take a look at a general case. How could we write a program that takes a double and transforms it into a pointer to a double?
Since C++11, the standard library has come with a fair number of type traits to manipulate types. For example, if you’d like to transform a double into a double *, you can do this:
#include <type_traits>

// double_ptr_type will be double *
using double_ptr_type = std::add_pointer<double>::type;
And vice versa:
#include <type_traits>

// double_type will be double
using double_type = std::remove_pointer<double *>::type;

// note that removing a pointer from a nonpointer type is safe
// the type of double_too_type is double
using double_too_type = std::remove_pointer<double>::type;
These kinds of type manipulations (adding and removing pointers, references, and constness) are basic building blocks and extremely useful when dealing with type constraints. For example, your template parameter might have to be a const reference when you actually need a value. With these tools you can ensure that your type is exactly what you need.
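The const-reference case mentioned above can be sketched with standard type traits. The alias plain_value_t is a hypothetical helper introduced for this example; the static_assert lines document, and let the compiler verify, what each transformation yields.

```cpp
#include <type_traits>

// Hypothetical helper alias: strips the reference first, then the const qualifier.
template <typename T>
using plain_value_t =
    typename std::remove_const<typename std::remove_reference<T>::type>::type;

static_assert(std::is_same<plain_value_t<const double &>, double>::value,
              "const double & becomes double");
static_assert(std::is_same<plain_value_t<double>, double>::value,
              "a plain value is left untouched");

// Order matters: remove_const applied to a reference type does nothing,
// because the reference itself is not const-qualified.
static_assert(std::is_same<std::remove_const<const double &>::type,
                           const double &>::value,
              "remove_const alone leaves const double & unchanged");
```

The standard library also offers std::decay, which performs a similar cleanup and additionally converts arrays and functions to pointers.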
The generic version of the magic wand can take an arbitrary number of functions, concatenate the results into a structure, pass pointers to these results to our legacy C function that will apply the weather model, and return its output.
In other words, in pseudocode, we want something like this:
MagicListOfValues generic_magic_wand(OldCFunction old_f,
                                     ListOfFunctions functions,
                                     ListOfParameters params)
{
    MagicListOfValues values;

    /* wait, something is wrong, we can't do this
    for(auto f : functions)
    {
        values.push_back(f(params));
    }
    */

    old_f(get_pointers(values));

    return values;
}
The only problem is that we can’t do that.
Why? The first problem is that we need a collection of values, but those values might have heterogeneous types. Granted, in our example we return doubles and we could use a vector.
The other problem is a performance issue—why resize the collection at runtime when you know exactly its size at compile time? And why use the heap when you can use the stack?
That’s why we like tuples. Tuples allow for heterogeneous types to be stored, their size is fixed at compile time, and they can avoid a lot of dynamic memory allocation.
That raises some questions, though. How do we build these tuples based on the parameters of our legacy C function? How do we iterate on a tuple? How do we work on the list of functions? How do we pass parameters?
The first step of the process is, for a given function F, to build a tuple matching the parameters.
We will use the pattern matching algorithms of partial template specialization to do that:
template <typename F>
struct make_tuple_of_params;

template <typename Ret, typename... Args>
struct make_tuple_of_params<Ret (Args...)>
{
    using type = std::tuple<Args...>;
};

// convenience alias
template <typename F>
using make_tuple_of_params_t = typename make_tuple_of_params<F>::type;
In C++11, the semantics of the ... operator have been changed and greatly extended to enable us to say to the compiler, “I expect a list of types of arbitrary length.” It has no relationship anymore with the old C ellipsis operator. This operator is a pillar of modern C++ template metaprogramming.
With our new function, we can therefore do the following:
template <typename F>
void magic_wand(F f)
{
    // if F is in the form void(double *, double *)
    // make_tuple_of_params_t<F> is std::tuple<double *, double *>
    make_tuple_of_params_t<F> params;

    // ...
}
We now have a tuple of params we can load with the results of our C++ functions and pass to the C function. The only problem is that the C function is in the form void(double *, double *, double *, double *), and we work on values.
We will therefore modify our make_tuple_of_params functor accordingly:
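template <typename F>
struct make_tuple_of_derefed_params;

template <typename Ret, typename... Args>
struct make_tuple_of_derefed_params<Ret (Args...)>
{
    // std::remove_pointer_t requires C++14 and <type_traits>
    using type = std::tuple<std::remove_pointer_t<Args>...>;
};

// convenience alias
template <typename F>
using make_tuple_of_derefed_params_t =
    typename make_tuple_of_derefed_params<F>::type;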
Now the function works as follows:
template <typename F>
void magic_wand(F f)
{
    // if F is in the form void(double *, double *)
    // make_tuple_of_derefed_params_t<F> is std::tuple<double, double>
    make_tuple_of_derefed_params_t<F> params;

    // ...
}
We just need to load up the results!
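One detail is worth checking before going further: when a function such as adjust_values is passed by value to a template, F is deduced as a pointer to function, for instance void (*)(double *, double *), which a specialization written only for Ret(Args...) does not match. The sketch below is a hedged variant, not the chapter's exact code: it adds a second specialization for pointers to functions (our own assumption) and uses static_assert to confirm the resulting tuple types. legacy_fn is a hypothetical stand-in for an SLA function.

```cpp
#include <tuple>
#include <type_traits>

template <typename F>
struct make_tuple_of_derefed_params;

// Matches plain function types such as void(double *, double *).
template <typename Ret, typename... Args>
struct make_tuple_of_derefed_params<Ret (Args...)>
{
    using type = std::tuple<typename std::remove_pointer<Args>::type...>;
};

// Assumption (not in the text): also match pointers to functions,
// which is what a function decays to when it is passed by value.
template <typename Ret, typename... Args>
struct make_tuple_of_derefed_params<Ret (*)(Args...)>
    : make_tuple_of_derefed_params<Ret (Args...)>
{
};

template <typename F>
using make_tuple_of_derefed_params_t =
    typename make_tuple_of_derefed_params<F>::type;

// Hypothetical stand-in for a legacy C function (declaration only).
void legacy_fn(double *, double *);

static_assert(std::is_same<make_tuple_of_derefed_params_t<decltype(legacy_fn)>,
                           std::tuple<double, double>>::value,
              "function type");
static_assert(std::is_same<make_tuple_of_derefed_params_t<decltype(&legacy_fn)>,
                           std::tuple<double, double>>::value,
              "pointer to function type");
```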
Now that we can extract the contents of the C function’s parameters, we need to assemble them in objects that we can manipulate easily in C++.
Indeed, you might be tempted to write this:
template <typename... Functions, typename... Params>
void magic_wand(/* stuff */, Functions... f, Params... p)
{
    // stuff
}
After all, you have a list of functions and a list of parameters, and you want both of them. The only problem is, how can the compiler know when the first list ends and the second list begins?
Again, tuples come to the rescue:
template <typename... Functions, typename... Params>
void magic_wand(/* stuff */,
                const std::tuple<Functions...> & f,
                const std::tuple<Params...> & p1,
                const std::tuple<Params...> & p2)
{
    // stuff
}
This enables the compiler to know that multiple tuples of arbitrary and unrelated lengths are expected. You could, of course, make a tuple of tuples if you expect more than two sets of parameters, but there’s no need to make our example more complex than it needs to be.
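A short sketch can confirm that the compiler really does deduce two independent packs when each one is wrapped in its own tuple. The function pack_sizes is a hypothetical helper written for this example; it simply reports the length of each deduced pack.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// Each std::tuple carries its own parameter pack, so both can be deduced.
template <typename... Functions, typename... Params>
std::pair<std::size_t, std::size_t> pack_sizes(
    const std::tuple<Functions...> &,
    const std::tuple<Params...> &)
{
    return std::make_pair(sizeof...(Functions), sizeof...(Params));
}
```

For example, pack_sizes(std::make_tuple(1, 2, 3), std::make_tuple(4.0)) deduces a pack of three types for the first tuple and a pack of one type for the second.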
Although compilers are getting very good at removing unnecessary copies, and rvalue references help with moving objects, be mindful of what you put inside your tuples and how many of them you create.
Passing the values, in our example, becomes the following:
magic_wand(/* stuff */,
           // our C++ functions
           std::make_tuple(
               [&r](double l, double t){ return r.alpha_value(l, t); },
               [&r](double l, double t){ return r.beta_value(l, t); }),
           // first set of params
           std::make_tuple(l, t1),
           // second set of params
           std::make_tuple(l, t2));
Which means that inside the body of the magic_wand function, we will have tuples containing the functions we need to call and the parameters we need to pass to them.
We’ve progressed, but we have not arrived. On one hand we have tuples of values to pass to the C function; on the other hand, we have a tuple of functions and parameters.
We now want to fill the tuple of values with the results, which means calling every function inside the tuple and passing the correct parameters:
template <typename LegacyFunction,
          typename... Functions,
          typename... Params>
auto magic_wand(LegacyFunction legacy,
                const std::tuple<Functions...> & functions,
                const std::tuple<Params...> & params1,
                const std::tuple<Params...> & params2)
{
    make_tuple_of_derefed_params_t<LegacyFunction> params =
    {
        /* we would like to do
        for(auto f : functions)
        {
            f(params1);
        }
        for(auto f : functions)
        {
            f(params2);
        }*/
    };

    // rest of the code
}
In C++14 you don’t need to be explicit about the return type of a function; the type can be determined at compile time contextually. Using auto in this case greatly simplifies the writing of generic functions.
In template metaprogramming, there is no iterative construct. You can’t iterate on your list of types by using for. You can, however, use recursion to apply a callable on every member of the tuple. This approach has been used since 2003 to great effect, but it has the disadvantage of generating a huge amount of intermediate types and therefore increases compilation time.
Whenever you can, you should use the ... operator to apply a callable to every member of a list. This is faster, it doesn’t generate all the unneeded intermediate types, and the code is often more concise.
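As a minimal sketch of applying a callable to every element of a pack without recursion, the example below expands the pack inside a braced initializer list, a common C++11/14 idiom. The names for_each_arg and sum_three are hypothetical and exist only for this illustration.

```cpp
#include <utility>

// Calls f once per argument, in order, using a pack expansion inside a
// braced initializer list (its elements are evaluated left to right).
template <typename F, typename... Args>
void for_each_arg(F f, Args &&... args)
{
    int expand[] = {0, (f(std::forward<Args>(args)), 0)...};
    (void)expand; // silence the unused-variable warning
}

// Hypothetical usage: sums three values without any recursion.
double sum_three(double a, double b, double c)
{
    double total = 0.0;
    for_each_arg([&total](double v) { total += v; }, a, b, c);
    return total;
}
```

The leading 0 in the initializer list keeps the array non-empty when the pack itself is empty.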
How can we use the ... operator for that? Here, we will create a sequence that matches the size of the tuple in order to apply a functor to each member:
template <typename F, typename Params, std::size_t... I>
auto dispatch_params(F f,
                     Params & params,
                     std::index_sequence<I...>)
{
    return f(std::get<I>(params)...);
}
What happens here is the following:
template <typename F, typename Params, std::size_t... I>
auto dispatch_params(F f,
                     Params & params,
                     std::index_sequence<I...>)
{
    // not real C++ code
    return f(std::get<0>(params),
             std::get<1>(params),
             std::get<2>(params),
             ...,
             std::get<N>(params));
    // where N is the last index
}
The advantage is that all of the work is done by the compiler and it’s much faster than recursion (or macros).
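The following sketch shows the expansion at work on a concrete tuple. dispatch_params is the function from the text; call_with_tuple, weighted_sum, and demo_call are hypothetical helpers added for the example, and the code assumes a C++14 compiler for std::index_sequence and return type deduction.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// dispatch_params as in the text: expands the tuple into the call to f.
template <typename F, typename Params, std::size_t... I>
auto dispatch_params(F f, Params & params, std::index_sequence<I...>)
{
    return f(std::get<I>(params)...);
}

// Hypothetical convenience wrapper that builds the index sequence
// from the size of the tuple.
template <typename F, typename... Args>
auto call_with_tuple(F f, std::tuple<Args...> & params)
{
    return dispatch_params(f, params,
                           std::make_index_sequence<sizeof...(Args)>());
}

// Hypothetical function to call.
double weighted_sum(double a, double b, double w)
{
    return (a + b) * w;
}

double demo_call()
{
    auto params = std::make_tuple(1.0, 2.0, 0.5);
    // Equivalent to weighted_sum(1.0, 2.0, 0.5).
    return call_with_tuple(weighted_sum, params);
}
```

Since C++17 the standard library provides std::apply, which performs the same expansion.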
The trick is to create an index sequence of the right size, whose sole purpose is to give us an index on which to apply the ... operator. This is done as follows:
static const std::size_t params_count = sizeof...(Params);

std::make_index_sequence<params_count>();
At compile time, when you need to know how many elements you have in your list, you use sizeof...(). Note that in this case we stored that into a static const variable, but it would actually be better to use a std::integral_constant. You will learn more about this in Chapter 3.
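To see what these tools produce, the sketch below checks the type generated by std::make_index_sequence and shows one possible way of storing a pack size in a std::integral_constant. The alias params_count is a hypothetical helper for this example and is not necessarily the form a later chapter would use.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// std::make_index_sequence<N> is an alias for std::index_sequence<0, 1, ..., N-1>.
static_assert(std::is_same<std::make_index_sequence<3>,
                           std::index_sequence<0, 1, 2>>::value,
              "indices 0, 1, 2");

// Hypothetical helper storing the pack size in a std::integral_constant
// instead of a static const variable.
template <typename... Params>
using params_count = std::integral_constant<std::size_t, sizeof...(Params)>;

static_assert(params_count<double, double, int>::value == 3, "three types");
static_assert(params_count<>::value == 0, "an empty pack has size zero");
```

Because the count is now a type, it can be passed to other templates or used for overload selection, not just read as a value.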
We are getting very close to solving our problem; that is, automating the generation of facade code to adapt the simulation library to our distributed system.
But the problem is not fully solved yet because we need to somehow “iterate” on the functions. We will modify our dispatch function so that it accepts the tuple of functions as a parameter and takes an index, as demonstrated here:
template <std::size_t FunctionIndex,
          typename FunctionsTuple,
          typename Params,
          std::size_t... I>
auto dispatch_params(FunctionsTuple & functions,
                     Params & params,
                     std::index_sequence<I...>)
{
    return (std::get<FunctionIndex>(functions))(std::get<I>(params)...);
}
And we will use the same index_sequence trick to call dispatch_params on every function of the tuple:
template <typename FunctionsTuple,
          std::size_t... I,
          typename Params,
          typename ParamsSeq>
auto dispatch_functions(FunctionsTuple & functions,
                        std::index_sequence<I...>,
                        Params & params,
                        ParamsSeq params_seq)
{
    return std::make_tuple(dispatch_params<I>(functions,
                                              params,
                                              params_seq)...);
}
The previous code enables us to aggregate the result of the successive calls to each element of the tuple into a single tuple.
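Here is a small self-contained sketch of that aggregation. The two dispatch templates follow the text; add_values, multiply_values, and demo_dispatch are hypothetical names introduced so the example can be compiled on its own (C++14 assumed).

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

template <std::size_t FunctionIndex, typename FunctionsTuple,
          typename Params, std::size_t... I>
auto dispatch_params(FunctionsTuple & functions, Params & params,
                     std::index_sequence<I...>)
{
    return (std::get<FunctionIndex>(functions))(std::get<I>(params)...);
}

template <typename FunctionsTuple, std::size_t... I,
          typename Params, typename ParamsSeq>
auto dispatch_functions(FunctionsTuple & functions, std::index_sequence<I...>,
                        Params & params, ParamsSeq params_seq)
{
    return std::make_tuple(
        dispatch_params<I>(functions, params, params_seq)...);
}

// Hypothetical functions sharing the same parameter list.
double add_values(double a, double b) { return a + b; }
double multiply_values(double a, double b) { return a * b; }

std::tuple<double, double> demo_dispatch()
{
    auto functions = std::make_tuple(add_values, multiply_values);
    auto params = std::make_tuple(3.0, 4.0);
    // One result per function, each called with the same parameters.
    return dispatch_functions(functions, std::make_index_sequence<2>(),
                              params, std::make_index_sequence<2>());
}
```

Note that the order in which the functions are actually invoked is unspecified, because they are evaluated as function call arguments; only the position of each result in the tuple is guaranteed.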
The final code thus becomes:
template <typename LegacyFunction,
          typename... Functions,
          typename... Params>
auto magic_wand(LegacyFunction legacy,
                const std::tuple<Functions...> & functions,
                const std::tuple<Params...> & params1,
                const std::tuple<Params...> & params2)
{
    static const std::size_t functions_count = sizeof...(Functions);
    static const std::size_t params_count = sizeof...(Params);

    make_tuple_of_derefed_params_t<LegacyFunction> params =
        std::tuple_cat(
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params1,
                               std::make_index_sequence<params_count>()),
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params2,
                               std::make_index_sequence<params_count>()));

    /* rest of the code */
}
As you can see, the logic of our function makes generalization to an arbitrary list of parameters possible.
We now have loaded in a tuple the results of our C++ method calls. Now we want to pass a pointer to these values to the C function. With all the concepts we have seen so far, we know how to solve that problem.
We need to determine the size of our results tuple, which we can do with the std::tuple_size trait (evaluated at compile time), and then do exactly what we’ve done previously to pass all of the parameters:
template <typename F, typename Tuple, std::size_t... I>
void dispatch_to_c(F f,
                   Tuple & t,
                   std::index_sequence<I...>)
{
    f(&std::get<I>(t)...);
}
The only twist is that we take the address of each tuple member because the C function requires a pointer to the value to update. It is safe because std::get<> returns a reference to the tuple value.
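The address-of expansion can be tried in isolation. In the sketch below, dispatch_to_c is the function from the text, while swap_and_double and demo_to_c are hypothetical names standing in for a legacy C function and its caller.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

template <typename F, typename Tuple, std::size_t... I>
void dispatch_to_c(F f, Tuple & t, std::index_sequence<I...>)
{
    f(&std::get<I>(t)...);
}

// Hypothetical C-style function that updates both values in place.
void swap_and_double(double * a, double * b)
{
    double tmp = *a;
    *a = *b * 2.0;
    *b = tmp * 2.0;
}

std::tuple<double, double> demo_to_c()
{
    auto values = std::make_tuple(1.0, 5.0);
    constexpr std::size_t count = std::tuple_size<decltype(values)>::value;
    dispatch_to_c(swap_and_double, values, std::make_index_sequence<count>());
    return values; // the tuple members were modified through the pointers
}
```

The tuple must be passed by non-const reference, as it is here, so that std::get yields modifiable references whose addresses the C function can write through.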
Here is the completed function:
template <typename LegacyFunction,
          typename... Functions,
          typename... Params>
auto magic_wand(LegacyFunction legacy,
                const std::tuple<Functions...> & functions,
                const std::tuple<Params...> & params1,
                const std::tuple<Params...> & params2)
{
    static const std::size_t functions_count = sizeof...(Functions);
    static const std::size_t params_count = sizeof...(Params);

    using tuple_type = make_tuple_of_derefed_params_t<LegacyFunction>;

    tuple_type params =
        std::tuple_cat(
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params1,
                               std::make_index_sequence<params_count>()),
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params2,
                               std::make_index_sequence<params_count>()));

    static const std::size_t t_count = std::tuple_size<tuple_type>::value;

    // pass a pointer to every tuple member to the legacy C function
    dispatch_to_c(legacy,
                  params,
                  std::make_index_sequence<t_count>());

    return params;
}
Wouldn’t it be nice if we didn’t need to specify the type of the result of the tuple concatenation? After all, the compiler knows which kind of tuple it’s going to be. But in that case, how could we compute the size of the resulting tuple?
We can use the decltype specifier to access the type of a variable:
auto val = /* something */;

decltype(val) // get type of val
This simplifies the code and removes the need for the make_tuple_of_derefed_params_t helper, as shown here:
template <typename LegacyFunction,
          typename... Functions,
          typename... Params>
auto magic_wand(LegacyFunction legacy,
                const std::tuple<Functions...> & functions,
                const std::tuple<Params...> & params1,
                const std::tuple<Params...> & params2)
{
    static const std::size_t functions_count = sizeof...(Functions);
    static const std::size_t params_count = sizeof...(Params);

    auto params =
        std::tuple_cat(
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params1,
                               std::make_index_sequence<params_count>()),
            dispatch_functions(functions,
                               std::make_index_sequence<functions_count>(),
                               params2,
                               std::make_index_sequence<params_count>()));

    static constexpr auto t_count =
        std::tuple_size<decltype(params)>::value;

    dispatch_to_c(legacy,
                  params,
                  std::make_index_sequence<t_count>());

    return params;
}
You could also improve the efficiency of the code by using rvalue references and ensuring that you use perfect forwarding semantics.
How can we use what we’ve built to finalize facade generation?
For clarity, we will use an explicit return type, but we could use auto. Using an explicit return type has the advantage of generating a compilation error if your type conversions are incorrect.
Another important reason for this decision is that we can consider get_adjusted_values as a public API function. Using an auto return type makes the function more difficult to use because its return type isn’t clear. Your users aren’t compilers!
Let’s have a look at the code:
template <typename Reading>
std::tuple<double, double, double, double> get_adjusted_values(
    Reading & r,
    location l,
    time t1,
    time t2)
{
    return magic_wand(
        adjust_values,
        std::make_tuple(
            [&r](double l, double t)
            {
                return r.alpha_value(l, t);
            },
            [&r](double l, double t)
            {
                return r.beta_value(l, t);
            }),
        std::make_tuple(l, t1),
        std::make_tuple(l, t2));
}
The power of this new function is that if the legacy C function or the C++ object changes, there will be little to no code rewriting to be done.
Writing the wrappers will also be extremely straightforward, safe, and productive: just call the magic_wand function with the required values. You can make it even more generic by wrapping the parameters in other functors and deducing the right types as needed.
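For readers who want to compile the whole chain, here is a condensed end-to-end sketch under stated assumptions. The reading class, the body of adjust_values, and the location and time_point aliases are hypothetical stand-ins invented for this example (time_point is used instead of time to avoid clashing with the C library function of that name); only the dispatch machinery follows the chapter. C++14 is assumed.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>

// --- Hypothetical stand-ins (assumptions, not the real SLA or weather code) ---
using location = double;
using time_point = double;

struct reading
{
    double alpha_value(location l, time_point t) const { return l + t; }
    double beta_value(location l, time_point t) const { return l * t; }
};

// Pretend legacy function: adds 1 to the alphas and doubles the betas.
void adjust_values(double * alpha1, double * beta1,
                   double * alpha2, double * beta2)
{
    *alpha1 += 1.0; *beta1 *= 2.0;
    *alpha2 += 1.0; *beta2 *= 2.0;
}

// --- Machinery from the chapter ---
template <std::size_t FunctionIndex, typename FunctionsTuple,
          typename Params, std::size_t... I>
auto dispatch_params(FunctionsTuple & functions, Params & params,
                     std::index_sequence<I...>)
{
    return (std::get<FunctionIndex>(functions))(std::get<I>(params)...);
}

template <typename FunctionsTuple, std::size_t... I,
          typename Params, typename ParamsSeq>
auto dispatch_functions(FunctionsTuple & functions, std::index_sequence<I...>,
                        Params & params, ParamsSeq params_seq)
{
    return std::make_tuple(dispatch_params<I>(functions, params, params_seq)...);
}

template <typename F, typename Tuple, std::size_t... I>
void dispatch_to_c(F f, Tuple & t, std::index_sequence<I...>)
{
    f(&std::get<I>(t)...);
}

template <typename LegacyFunction, typename... Functions, typename... Params>
auto magic_wand(LegacyFunction legacy,
                const std::tuple<Functions...> & functions,
                const std::tuple<Params...> & params1,
                const std::tuple<Params...> & params2)
{
    using functions_seq = std::make_index_sequence<sizeof...(Functions)>;
    using params_seq = std::make_index_sequence<sizeof...(Params)>;

    auto params = std::tuple_cat(
        dispatch_functions(functions, functions_seq(), params1, params_seq()),
        dispatch_functions(functions, functions_seq(), params2, params_seq()));

    static constexpr auto t_count = std::tuple_size<decltype(params)>::value;
    dispatch_to_c(legacy, params, std::make_index_sequence<t_count>());
    return params;
}

std::tuple<double, double, double, double> get_adjusted_values(
    const reading & r, location l, time_point t1, time_point t2)
{
    return magic_wand(
        adjust_values,
        std::make_tuple(
            [&r](location loc, time_point t) { return r.alpha_value(loc, t); },
            [&r](location loc, time_point t) { return r.beta_value(loc, t); }),
        std::make_tuple(l, t1),
        std::make_tuple(l, t2));
}
```

With these stand-ins, get_adjusted_values(r, 2.0, 3.0, 5.0) computes (5, 6, 7, 10) from the reading and then lets the pretend legacy function turn that into (6, 12, 8, 20).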
And guess what? It’s also possible to write code to generate all the wrappers for you based on the function profiles. We’ve seen in this chapter all of the building blocks to achieve that.
Did we accomplish our mission? We’d like to believe that, yes, we did.
With the use of a couple of template metaprogramming tricks, we managed to drastically reduce the amount of code required to get the job done. That’s the immediate benefit of automating code generation. Less code means fewer errors, less testing, less maintenance, and potentially better performance.
This is the strength of metaprogramming. You spend more time carefully thinking about a small number of advanced functions, so you don’t need to waste your time on many trivial functions.
Now that you have been exposed to template metaprogramming, you probably have many questions. How can I check that my parameters are correct? How can I get meaningful error messages if I do something wrong? How can I store a pure list of types, without values?
More importantly, can these techniques be made reusable?
Let’s take it from the beginning…