"The time has come," the Walrus said,
"To talk of many things:
Of tuples, trees and composites;
Of visitors and kings."
1
Welcome
“Good morning, everyone, and welcome to the Wonderland Social Club annual treasure hunt. I am the Walrus.” ( coo-coo coo-choo ) “Well, not a walrus, but I am quite long in the tooth.” ( groan )
“This year the clues are all in trees. On each clue sheet there’s a clue to an item of treasure. Some clue sheets also contain two further clues, which lead to more clue sheets. With each treasure clue there is an indication of the value of the treasure at that location.”
“You have until 6 o’clock this evening to find as much treasure as you can. The team with the most valuable hoard of treasure will be the winner. The first clue is outside in the garden. See you back here in the Carpenter’s Arms at 6 o’clock. Good luck everybody!”
Planning the Route
There were three teams: four trainee nurses called the Pre-Tenders, three employees of the Royal Mail called the Post Men and two publicans called the Inn Keepers. The Pre-Tenders decided to do the easy clues first; the Post Men chose to visit the nearest places first; and the Inn Keepers settled for finding the most valuable treasure first.
Overload readers will have spotted immediately that the treasure hunters’ problem involves the traversal of an abstract binary tree. The Walrus had drawn the tree on a sheet of paper so that he could refer to it when he was adding up the scores at the end of the day. And, as it turned out, the Pre-Tenders would visit the treasure locations in pre-order sequence, the Post Men in post-order sequence and the Inn Keepers in in-order sequence.
Encoding the Problem
Bill, a nerdy-looking youth with thick oyster-shell glasses had spotted this, too. He was a C# programmer and was often to be seen in the corner of the bar with a beer and a lap-top. To him the treasure hunters’ tree seemed to be a set of dynamically constructed, garbage collected, polymorphic objects. “It’s a binary tree”, he said to his best mate, Ben. “Yes”, agreed Ben, but Ben’s mental imagery was very different. “Nested structs”, said Ben.
struct L { char a, i; }; struct C { L l; char e; }; |
Listing 1 - A simple tree as nested structs. |
Bill looked at him blankly for a moment, decided Ben must have been joking and replied with an ironic, “Yeah, right”. But Ben was a C++ programmer and he wasn’t joking. “No, really” said Ben, pulling out his own lap-top, “Look, I’ll show you what I mean”.
Ben drew a tree with just five nodes, A, L, I, C and E, as shown in Listing 1. He defined classes for the root node, C, and the other non-leaf node L. For the leaf nodes he just used
chars
.
Bill was not impressed at all. “I thought you were a C++ programmer”, he retorted. “That’s C code. Where are the classes? Where are the methods? What’s that public data doing there?” “This just captures the
structure
of the tree”, Ben explained. “I’ve used a
char
as a place-holder for the treasure clues and the value of the treasure.”
Bill wasn’t convinced. Inserting his
Design Patterns
CD into his PC he brought up the Composite structure diagram and typed in an improved version of Ben’s classes. “Even I can write better C++ than that”, Bill scoffed. Referring to the example in the
Design Patterns
book, Bill wrote the code shown in Listing 2. Now it was Ben’s turn to be unimpressed. Bill’s code was more than three times longer (32 non-blank lines to 9), it had more types in more complex relationships, it brought up apparently irrelevant implementation issues (such as using vectors or lists and how to implement
Leaf::push_back()
) and it was less efficient in time and space (particularly if the tree nodes were allocated on the heap). But Ben was more interested in other things.
class Component { public: virtual ~Component() {} virtual void operation() = 0; virtual void push_back(Component*) = 0; typedef std::vector<Component*>::iterator Iterator; virtual Iterator begin() = 0; virtual Iterator end() = 0; }; class Leaf : public Component { public: Leaf(int v) : leaf_value(v) {} int value() { return leaf_value; } private: virtual void operation() { /* . . . */ } virtual void push_back(Component*) { throw "Can't add to Leaf!"; } virtual Iterator begin() { return children.begin(); } virtual Iterator end() { return children.end(); } int leaf_value; std::vector<Component*> children; // always empty }; class Composite : public Component { private: virtual void operation() { /* . . . */ } virtual void push_back(Component* c) { children.push_back(c); } virtual Iterator begin() { return children.begin(); } virtual Iterator end() { return children.end(); } std::vector<Component*> children; }; |
Listing 2 - Classes for the Composite Pattern in C++. |
“OK”, said Ben, in an uncharacteristically conciliatory tone, “but your tree has to be built at run time and it gives special status to
Component::operation()
. What if the tree structure is fixed at compile time? And what if we want to capture the tree structure, but don’t know what operations will be performed on the nodes?” Bill’s face fell. He could see what was coming. Any second now Ben was going to say something about those confounded C++ templates. Two of the trainee nurses from the Pre-Tenders team had been powdering their noses. As they came out of the Ladies and left the pub to start the treasure hunt Bill’s face brightened momentarily. But his brief reverie was broken by Ben’s words. “We should be able to write a function template that applies an arbitrary function to each node in a tree. Any tree. Something like this.” And Ben typed in the code in Listing 3.
int main() { C tree = /* ... */; visit<in_order>(tree, Write(cout)); return 0; } |
Listing 3 - A generic tree visit function. |
“Let’s stick with our 5-node tree and let’s not worry about how it’s created, for now. Write is a function object that writes information about a node to an
ostream
. And
visit
is a function template that walks the tree and calls a function (
Write
in this case) for each node. The
in_order
template parameter is a policy class that specifies in-order traversal.”
The Generalised Visit Algorithm
Thinking out loud, Ben said: “When we visit a node we need to apply the given function to the node itself and each of its child nodes. The parent defines a natural order on the children and we’ll use that to visit each child in turn. The traversal policy defines when to visit the parent (before the children, after the children or somewhere in between). It’s like having a parent that’s an STL container of children and the traversal policy provides an iterator into the container – except that it all happens at compile time. So the algorithm is:
- visit the children in the range [first, pvp)
- visit the parent
- visit the children in the range [pvp, last) where pvp is the ‘parent visit position’ iterator obtained from the traversal policy.”
Ben’s code is shown in Listing 4.
template<typename Traversal, typename Node, typename Op> void visit(Node& root, Op op) { typedef typename begin<Node>::type first; typedef typename end<Node>::type last; typedef typename Traversal::template apply<Node>::type pvp; children<first, pvp>::template visit_each<Traversal>(root, op); op(root); children<pvp, last>::template visit_each<Traversal>(root, op); } |
Listing 4 - The generalised visit algorithm. |
At first this looked more like hieroglyphics than source code to Bill. He could see Ben’s “iterators”
first
,
last
and
pvp
, but they were
types
and real iterators are
objects
. He could see function calls that seemed to correspond to the three steps of the visit algorithm, too, but it was far from clear how the
visit_each
functions would work. The “iterators” were being used to instantiate the
children
class template instead of being passed to the
visit_each
function, so how could the function step through the children as the algorithm requires?
Compile-Time Iterators
Ben was well aware that the code needed some explanation. “Compile time algorithms work with constants and types, which are both immutable”, he went on. “There’s no such thing as a compile-time variable. Instead of incrementing an iterator we must create a new iterator pointing to the next element in the container. For example, the root node in our 5-node tree has two children: l and e. We could think of using a pointer-to-memberof-C as an iterator. Then an iterator to the first child of C would point to
C::l
and we can imagine incrementing that iterator to point to
C::e
. But C++ doesn’t have a pointer-to-member-of-C type – it only has pointer-to-member-of-C-of-type-L (
L C::*
); and that can’t be incremented because it would change type in the process (becoming
int C::*
).”
Bill wasn’t sure if he understood this, but he wasn’t going to admit it, so he let Ben continue. “Instead we can define a metafunction that takes a compile-time iterator as a template parameter and generates a new compile-time iterator pointing to the next element. Something like this.” Ben produced listing 5.
template<typename C, typename M, M C::*> struct next; template<> struct next<C, L, &C::l> { static int L::* const value; }; int C::* const next<C, L, &C::l>::value = &C::e; |
Listing 5 - The compile-time analogue of incrementing an iterator. |
“We pass in
&C::l
(as a template parameter) and we get out
&C::e
(as a class static value).”
“But that’s hideous”, said Bill. “It is pretty ugly”, admitted Ben. “And it only works for child nodes that are accessible data members of the parent node, too. That’s very limiting. But the
visit
function doesn’t use pointer-to-member values as iterators – it uses types, and that’s much more flexible.”
Ben was enjoying himself. “Suppose L was a base class of C instead of a member”, he enthused. “We can’t use a pointer-tomember value to identify a base class, but we can encode a pointer-to-member value as a type.” (In fact, the
next<>
metafunction in Listing 5 does just that.) “And then we can use types as iterators to base classes
and
data members. It’s probably worth declaring class templates for these two families of types:
member_iterator<>
points to a node that’s a member of its parent and
base_iterator<>
points to a node that’s a base class sub-object of its parent.”
“I’ll refer to these collectively as ‘child iterators’. The
begin<>
and
end<>
meta-functions can return classes instantiated from these templates for the
visit<>
function to use. Those metafunctions are the compile-time analogue of the
begin()
and
end()
functions provided by STL containers.”
“OK”, said Bill slowly, still far from convinced, “We can’t increment these iterators, but we still have to compare them and dereference them, right? How do we do that?”
Iterating Through the Children
“That’s a good question”, replied Ben. “We need to see how the
visit_each<>
function works to answer that.”
“As you can see, there are no loop constructs. There can’t be because loops require a loop variable and there are no compile-time variables. So we use recursion instead. The idea is to visit the first child (by calling the
visit<>
function) and then call
visit_each<>
recursively to visit all the remaining children.” Ben paused to take another gulp of his Black Sheep bitter and Bill seized the opportunity to ask why
visit_each<>
is a static function of a template class. “I’ll come to that in a minute”, Bill responded, “but for now you just need to know that the primary
children<>
template defines the general case and the specialisation provides the terminating condition for the recursion.”
Ben pondered for a moment before continuing. “If
First
is a
member_iterator<>
the
visit_each<>
function can retrieve a pointer-to-member value from it, which must be bound to an object using the
.*
or
->*
operator before we can access the data member it refers to. We can’t store an object reference or pointer in the iterator because it’s a
compile-time
iterator and references and pointers are run-time entities – they don’t have a value at compile time. And that means we can’t dereference a
member_iterator<>
– it doesn’t hold enough information. So, the
visit_each<>
function must bind a compile-time iterator to a (run-time) reference to the parent node, which produces a reference to the child node pointed to by the iterator. And, of course, if we have to do this for
member_iterator<>
s we should do the same for other child iterators such as
base_iterator<>
s.”
template<typename C, typename M, M C::*> struct member_iterator; template<typename D, typename B> struct base_iterator; |
Listing 6 - Member and base class iterators. |
“Is that clear?”, asked Ben. “Errm, I think so”, said Bill, who wasn’t used to thinking about the interaction of compile-time operations with the run-time environment. “So, a child iterator is a type, not an object; it can’t be incremented; and it can’t be dereferenced. That’s a weird sort of iterator!” Ben agreed, “Weird, yes, but it’s still an iterator in my book”.
Bill was studying the
visit_each<>
function and another question occurred to him: “
bind<>
seems to be a class template with a nested function. Why isn’t it just a template function?” “Ah”, said Ben, “that’s so that we can define separate bind functions for member-iterators and base-iterators. If C++ supported partial specialisation of function templates we could define partial specialisations for each of the two iterator families; but it doesn’t, so I’ve just put a static function in a class template and defined partial specialisations of the class template.”
template<typename First, typename Last> struct children { template<typename Traversal, typename Node, typename Op> static void visit_each(Node& node, Op op) { visit<Traversal>(bind<First>::function(node), op); typedef typename next<First>::type Next; children<Next,Last>::template visit_each<Traversal>(node, op); } }; template<typename Iter> struct children<Iter, Iter> { template<typename Traversal, typename Node, typename Op> static void visit_each(Node&, Op) {} }; |
Listing 7 - The visit_each<> function. |
Hardly pausing for breath Ben continued his commentary. “Here, the
parent<>
and
child<>
meta-functions just extract the type of the parent and child nodes from the child iterator.” (See Appendix 1.) “The
member_iterator<>
specialisation uses the
.*
operator to bind a pointer-to-member to the parent; the
base_iterator<>
specialisation just uses the implicit conversion from derived class reference to base class reference.” (Although Ben didn’t think to mention it, he could have used the Boost
enable_if<>
templates to define separate overloads of the bind function instead.)
template<typename Iter> struct bind; template<typename C, typename M, M C::* member> struct bind< member_iterator<C, M, member> > { typedef member_iterator<C, M, member> Iter; typedef typename child<Iter>::type Child; typedef typename parent<Iter>::type Parent; static Child& function(Parent& parent) { return parent.*member; } }; template<typename D, typename B> struct bind< base_iterator<D,B> > { typedef base_iterator<D,B> Iter; typedef typename child<Iter>::type Child; typedef typename parent<Iter>::type Parent; static Child& function(Parent& parent) { return parent; } }; |
Listing 8 - Simulated partial specialisations of the bind function. |
Ben leant back in his chair with a satisfied smile. “The rest is easy”, he said. “The
visit_each<>
function uses the
next<>
meta-function to generate an iterator to the next child node and calls itself to visit the remaining children (if any). Eventually, the two iterators become equal and the
children<Iter,Iter>
specialisation comes into play. At that point an empty
visit_each<>
function is instantiated and the recursion terminates.”
Bill felt cheated. “So we don’t compare the iterators, either”, he said tetchily. “At least, not explicitly.” “Well, no, I suppose not”, Ben replied with an air of superiority, “but that’s how it is with compile-time algorithms.”
Epilogue
“Glad to see everyone back on time”, said the Walrus cheerily. “Now, let’s see how you all got on.” And with that he collected the clue sheets and sat down to work out what the teams had scored. Some 20 minutes later he stood up again and gravely announced that he had a problem – all three teams had the same score! “So I’ll offer a bonus point to the first team to find Tweedledum and Tweedledee.” One of the Pre-Tenders gave a little squeal, waved one hand in the air and pointed excitedly with the other in the direction of Bill and Ben. “It’s them”, she cried. And everyone could see she was right. They looked like two peas in a pod and they were arguing like schoolboys.
“Compile-time is better”, said Ben. “Nah, run-time”, insisted Bill. “Compile-time.” “Run-time.” ...
Phil Bass
phil@stoneymanor.demon.co.uk
Full code listings provided in appendices.
Appendix 1 – Visit.hpp
The following code implements the compile-time visit algorithm itself. |
struct nil {}; template<typename Iter> struct parent; template<typename Iter> struct child; template<typename Iter> struct bind; template<typename Iter> struct next; template<typename C, typename M, M C::*> struct member_iterator; template<typename C, typename M, M C::* member> struct parent< member_iterator<C,M,member> > { typedef C type; }; template<typename C, typename M, M C::* member> struct child< member_iterator<C,M,member> > { typedef M type; }; template<typename C, typename M, M C::* member> struct bind< member_iterator<C, M, member> > { typedef member_iterator<C, M, member> Iter; typedef typename child<Iter>::type Child; typedef typename parent<Iter>::type Parent; static Child& function(Parent& parent) { return parent.*member; } }; template<typename D, typename B> struct base_iterator; template<typename D, typename B> struct parent< base_iterator<D,B> > { typedef D type; }; template<typename D, typename B> struct child< base_iterator<D,B> > { typedef B type; }; template<typename D, typename B> struct bind< base_iterator<D,B> > { typedef base_iterator<D,B> Iter; typedef typename child<Iter>::type Child; typedef typename parent<Iter>::type Parent; static Child& function(Parent& parent) { return parent; } }; template<typename T> struct begin { typedef nil type; }; template<typename T> struct end { typedef nil type; }; template<typename First, typename Last> struct children; template<typename Traversal, typename Node, typename Op> void visit(Node& root, Op op) { typedef typename begin<Node>::type first; typedef typename end<Node>::type last; typedef typename Traversal::template apply<Node>::type pvp; children<first, pvp> ::template visit_each<Traversal>(root, op); op(root); children<pvp, last> ::template visit_each<Traversal>(root, op); } template<typename First, typename Last> struct children { template<typename Traversal, typename Node, typename Op> static void visit_each(Node& node, Op op) { visit<Traversal> (bind<First>::function(node), op); typedef typename next<First>::type Next; children<Next,Last> ::template visit_each<Traversal>(node, op); } }; template<typename Iter> struct children<Iter, Iter> { template<typename Traversal, typename Node, typename Op> static void visit_each(Node&, Op) {} }; |
Appendix 2 - Visiting Alice (with children as nested structs)
A program using the visit algorithm described here needs to provide some information about the structure of the tree whose nodes are to be visited. This is done by defining suitable child-iterator types, the next
|
struct L { char a, i; }; typedef member_iterator<L, char, &L::a> L_a_iterator; typedef member_iterator<L, char, &L::i> L_i_iterator; template<> struct next<L_a_iterator> { typedef L_i_iterator type; }; template<> struct next<L_i_iterator> { typedef nil type; }; template<> struct begin<L> { typedef L_a_iterator type; }; struct C { L l; char e; }; typedef member_iterator<C, L, &C::l> C_l_iterator; typedef member_iterator<C, char, &C::e> C_e_iterator; template<> struct next<C_l_iterator> { typedef C_e_iterator type; }; template<> struct next<C_e_iterator> { typedef nil type; }; template<> struct begin<C> { typedef C_l_iterator type; }; struct in_order { template<typename T> struct apply { typedef nil type; }; }; template<> struct in_order::apply<C> { typedef C_e_iterator type; }; template<> struct in_order::apply<L> { typedef L_i_iterator type; }; |
The traversal policy is in the form of a meta-function class (a class with a nested class template). This gives the flexibility of template template parameters while still allowing the policy to be used by compile-time algorithms operating on types. |
Appendix 3 - Visiting Alice (with a base class as a child)
The case of the ALICE tree where L is a base class of C is only slightly different. The root node, C, needs a constructor and the child-iterator pointing to node L becomes a base_iterator<>. The following code shows the support required for node C (most of which is the same as for the example in which L is a member of C). |
struct C : L { C(const L& l, char e_) : L(l), e(e_) {} char e; }; typedef base_iterator<C, L> C_l_iterator; typedef member_iterator<C, char, &C::e> C_e_iterator; template<> struct next<C_l_iterator> { typedef C_e_iterator type; }; template<> struct next<C_e_iterator> { typedef nil type; }; template<> struct begin<C> { typedef C_l_iterator type; }; template<> struct in_order::apply<C> { typedef C_e_iterator type; }; |
For compile-time trees defined using tuples and inheritance it is possible to provide the child iterators in a library. For example, the ALICE tree could be defined as: |
#include <boost/tuple.hpp> using boost::tuple; typedef tuple<char,char> L; typedef tuple<L,char> C; C tree = C( L('A','I'), 'E' ); |
Since Boost tuples are implemented using inheritance it is easy to provide predefined base-iterators and meta-functions supporting the visit<> algorithm. In this case, no information about the structure of the tree needs to be provided in the client code; visiting Alice comes for free. |
-
After Lewis Caroll’s “The Walrus and the Carpenter”. By the way, he lied about the kings. ↩