Visiting Alice

"The time has come," the Walrus said,
"To talk of many things:
Of tuples, trees and composites;
Of visitors and kings." ¹

Welcome

“Good morning, everyone, and welcome to the Wonderland Social Club annual treasure hunt. I am the Walrus.” ( coo-coo coo-choo ) “Well, not a walrus, but I am quite long in the tooth.” ( groan )

“This year the clues are all in trees. On each clue sheet there’s a clue to an item of treasure. Some clue sheets also contain two further clues, which lead to more clue sheets. With each treasure clue there is an indication of the value of the treasure at that location.”

“You have until 6 o’clock this evening to find as much treasure as you can. The team with the most valuable hoard of treasure will be the winner. The first clue is outside in the garden. See you back here in the Carpenter’s Arms at 6 o’clock. Good luck everybody!”

Planning the Route

There were three teams: four trainee nurses called the Pre-Tenders, three employees of the Royal Mail called the Post Men and two publicans called the Inn Keepers. The Pre-Tenders decided to do the easy clues first; the Post Men chose to visit the nearest places first; and the Inn Keepers settled for finding the most valuable treasure first.

Overload readers will have spotted immediately that the treasure hunters’ problem involves the traversal of an abstract binary tree. The Walrus had drawn the tree on a sheet of paper so that he could refer to it when he was adding up the scores at the end of the day. And, as it turned out, the Pre-Tenders would visit the treasure locations in pre-order sequence, the Post Men in post-order sequence and the Inn Keepers in in-order sequence.

Encoding the Problem

Bill, a nerdy-looking youth with thick oyster-shell glasses had spotted this, too. He was a C# programmer and was often to be seen in the corner of the bar with a beer and a lap-top. To him the treasure hunters’ tree seemed to be a set of dynamically constructed, garbage collected, polymorphic objects. “It’s a binary tree”, he said to his best mate, Ben. “Yes”, agreed Ben, but Ben’s mental imagery was very different. “Nested structs”, said Ben.

struct L
{
    char a, i;
};
struct C
{
    L l;
    char e;
};

Listing 1 - A simple tree as nested structs.

Bill looked at him blankly for a moment, decided Ben must have been joking and replied with an ironic, “Yeah, right”. But Ben was a C++ programmer and he wasn’t joking. “No, really” said Ben, pulling out his own lap-top, “Look, I’ll show you what I mean”.

Ben drew a tree with just five nodes, A, L, I, C and E, as shown in Listing 1. He defined classes for the root node, C, and the other non-leaf node L. For the leaf nodes he just used chars .

Bill was not impressed at all. “I thought you were a C++ programmer”, he retorted. “That’s C code. Where are the classes? Where are the methods? What’s that public data doing there?” “This just captures the structure of the tree”, Ben explained. “I’ve used a char as a place-holder for the treasure clues and the value of the treasure.”

Bill wasn’t convinced. Inserting his Design Patterns CD into his PC he brought up the Composite structure diagram and typed in an improved version of Ben’s classes. “Even I can write better C++ than that”, Bill scoffed. Referring to the example in the Design Patterns book, Bill wrote the code shown in Listing 2. Now it was Ben’s turn to be unimpressed. Bill’s code was more than three times longer (32 non-blank lines to 9), it had more types in more complex relationships, it brought up apparently irrelevant implementation issues (such as using vectors or lists and how to implement Leaf::push_back() ) and it was less efficient in time and space (particularly if the tree nodes were allocated on the heap). But Ben was more interested in other things.

class Component
{
public:
    virtual ~Component() {}
    virtual void operation() = 0;
    virtual void push_back(Component*) = 0;

    typedef std::vector<Component*>::iterator Iterator;
    virtual Iterator begin() = 0;
    virtual Iterator end() = 0;
};

class Leaf : public Component
{
public:
    Leaf(int v) 
    : leaf_value(v) {}

    int value()
    {
        return leaf_value;
    }

private:
    virtual void operation()
    {
        /* . . . */
    }

    virtual void push_back(Component*)
    {
        throw "Can't add to Leaf!";
    }

    virtual Iterator begin()
    {
        return children.begin();
    }

    virtual Iterator end()
    {
        return children.end();
    }

    int leaf_value;
    std::vector<Component*> children;
    // always empty
};

class Composite : public Component
{
private:
    virtual void operation()
    {
        /* . . . */
    }

    virtual void push_back(Component* c)
    {
        children.push_back(c);
    }

    virtual Iterator begin()
    {
        return children.begin();
    }

    virtual Iterator end()
    {
        return children.end();
    }

    std::vector<Component*> children;
};

Listing 2 - Classes for the Composite Pattern in C++.

“OK”, said Ben, in an uncharacteristically conciliatory tone, “but your tree has to be built at run time and it gives special status to Component::operation() . What if the tree structure is fixed at compile time? And what if we want to capture the tree structure, but don’t know what operations will be performed on the nodes?” Bill’s face fell. He could see what was coming. Any second now Ben was going to say something about those confounded C++ templates. Two of the trainee nurses from the Pre-Tenders team had been powdering their noses. As they came out of the Ladies and left the pub to start the treasure hunt Bill’s face brightened momentarily. But his brief reverie was broken by Ben’s words. “We should be able to write a function template that applies an arbitrary function to each node in a tree. Any tree. Something like this.” And Ben typed in the code in Listing 3.

int main()
{
    C tree = /* ... */;
    visit<in_order>(tree, Write(cout));
    return 0;
}

Listing 3 - A generic tree visit function.

“Let’s stick with our 5-node tree and let’s not worry about how it’s created, for now. Write is a function object that writes information about a node to an ostream . And visit is a function template that walks the tree and calls a function ( Write in this case) for each node. The in_order template parameter is a policy class that specifies in-order traversal.”

The Generalised Visit Algorithm

Thinking out loud, Ben said: “When we visit a node we need to apply the given function to the node itself and each of its child nodes. The parent defines a natural order on the children and we’ll use that to visit each child in turn. The traversal policy defines when to visit the parent (before the children, after the children or somewhere in between). It’s like having a parent that’s an STL container of children and the traversal policy provides an iterator into the container – except that it all happens at compile time. So the algorithm is:

visit the children in the range [first, pvp)
visit the parent
visit the children in the range [pvp, last) where pvp is the ‘parent visit position’ iterator obtained from the traversal policy.”

Ben’s code is shown in Listing 4.

template<typename Traversal, typename Node, typename Op>

void visit(Node& root, Op op)
{
    typedef typename begin<Node>::type first;
    typedef typename end<Node>::type last;
    typedef typename Traversal::template apply<Node>::type pvp;

    children<first, pvp>::template visit_each<Traversal>(root, op);
    op(root);
    children<pvp, last>::template visit_each<Traversal>(root, op);
}

Listing 4 - The generalised visit algorithm.

At first this looked more like hieroglyphics than source code to Bill. He could see Ben’s “iterators” first , last and pvp , but they were types and real iterators are objects . He could see function calls that seemed to correspond to the three steps of the visit algorithm, too, but it was far from clear how the visit_each functions would work. The “iterators” were being used to instantiate the children class template instead of being passed to the visit_each function, so how could the function step through the children as the algorithm requires?

Compile-Time Iterators

Ben was well aware that the code needed some explanation. “Compile time algorithms work with constants and types, which are both immutable”, he went on. “There’s no such thing as a compile-time variable. Instead of incrementing an iterator we must create a new iterator pointing to the next element in the container. For example, the root node in our 5-node tree has two children: l and e. We could think of using a pointer-to-memberof-C as an iterator. Then an iterator to the first child of C would point to C::l and we can imagine incrementing that iterator to point to C::e . But C++ doesn’t have a pointer-to-member-of-C type – it only has pointer-to-member-of-C-of-type-L ( L C::* ); and that can’t be incremented because it would change type in the process (becoming int C::* ).”

Bill wasn’t sure if he understood this, but he wasn’t going to admit it, so he let Ben continue. “Instead we can define a metafunction that takes a compile-time iterator as a template parameter and generates a new compile-time iterator pointing to the next element. Something like this.” Ben produced listing 5.

template<typename C, typename M, M C::*> 
struct next;

template<>
struct next<C, L, &C::l>
{
    static int L::* const value;
};

int C::* const next<C, L, &C::l>::value = &C::e;

Listing 5 - The compile-time analogue of incrementing an iterator.

“We pass in &C::l (as a template parameter) and we get out &C::e (as a class static value).”

“But that’s hideous”, said Bill. “It is pretty ugly”, admitted Ben. “And it only works for child nodes that are accessible data members of the parent node, too. That’s very limiting. But the visit function doesn’t use pointer-to-member values as iterators – it uses types, and that’s much more flexible.”

Ben was enjoying himself. “Suppose L was a base class of C instead of a member”, he enthused. “We can’t use a pointer-tomember value to identify a base class, but we can encode a pointer-to-member value as a type.” (In fact, the next<> metafunction in Listing 5 does just that.) “And then we can use types as iterators to base classes and data members. It’s probably worth declaring class templates for these two families of types: member_iterator<> points to a node that’s a member of its parent and base_iterator<> points to a node that’s a base class sub-object of its parent.”

“I’ll refer to these collectively as ‘child iterators’. The begin<> and end<> meta-functions can return classes instantiated from these templates for the visit<> function to use. Those metafunctions are the compile-time analogue of the begin() and end() functions provided by STL containers.”

“OK”, said Bill slowly, still far from convinced, “We can’t increment these iterators, but we still have to compare them and dereference them, right? How do we do that?”

Iterating Through the Children

“That’s a good question”, replied Ben. “We need to see how the visit_each<> function works to answer that.”

“As you can see, there are no loop constructs. There can’t be because loops require a loop variable and there are no compile-time variables. So we use recursion instead. The idea is to visit the first child (by calling the visit<> function) and then call visit_each<> recursively to visit all the remaining children.” Ben paused to take another gulp of his Black Sheep bitter and Bill seized the opportunity to ask why visit_each<> is a static function of a template class. “I’ll come to that in a minute”, Bill responded, “but for now you just need to know that the primary children<> template defines the general case and the specialisation provides the terminating condition for the recursion.”

Ben pondered for a moment before continuing. “If First is a member_iterator<> the visit_each<> function can retrieve a pointer-to-member value from it, which must be bound to an object using the .* or ->* operator before we can access the data member it refers to. We can’t store an object reference or pointer in the iterator because it’s a compile-time iterator and references and pointers are run-time entities – they don’t have a value at compile time. And that means we can’t dereference a member_iterator<> – it doesn’t hold enough information. So, the visit_each<> function must bind a compile-time iterator to a (run-time) reference to the parent node, which produces a reference to the child node pointed to by the iterator. And, of course, if we have to do this for member_iterator<> s we should do the same for other child iterators such as base_iterator<> s.”

template<typename C, typename M, M C::*> struct member_iterator;
template<typename D, typename B> struct base_iterator;

Listing 6 - Member and base class iterators.

“Is that clear?”, asked Ben. “Errm, I think so”, said Bill, who wasn’t used to thinking about the interaction of compile-time operations with the run-time environment. “So, a child iterator is a type, not an object; it can’t be incremented; and it can’t be dereferenced. That’s a weird sort of iterator!” Ben agreed, “Weird, yes, but it’s still an iterator in my book”.

Bill was studying the visit_each<> function and another question occurred to him: “ bind<> seems to be a class template with a nested function. Why isn’t it just a template function?” “Ah”, said Ben, “that’s so that we can define separate bind functions for member-iterators and base-iterators. If C++ supported partial specialisation of function templates we could define partial specialisations for each of the two iterator families; but it doesn’t, so I’ve just put a static function in a class template and defined partial specialisations of the class template.”

template<typename First, typename Last>
struct children
{
    template<typename Traversal, typename Node, typename Op>
    static void visit_each(Node& node, Op op)
    {
        visit<Traversal>(bind<First>::function(node), op);

        typedef typename next<First>::type Next;

        children<Next,Last>::template visit_each<Traversal>(node, op);
    }
};

template<typename Iter>
struct children<Iter, Iter>
{
    template<typename Traversal, typename Node, typename Op>
    static void visit_each(Node&, Op) {}
};

Listing 7 - The visit_each<> function.

Hardly pausing for breath Ben continued his commentary. “Here, the parent<> and child<> meta-functions just extract the type of the parent and child nodes from the child iterator.” (See Appendix 1.) “The member_iterator<> specialisation uses the .* operator to bind a pointer-to-member to the parent; the base_iterator<> specialisation just uses the implicit conversion from derived class reference to base class reference.” (Although Ben didn’t think to mention it, he could have used the Boost enable_if<> templates to define separate overloads of the bind function instead.)

template<typename Iter> struct bind;

template<typename C, typename M, M C::* member>
struct bind< member_iterator<C, M, member> >
{
    typedef member_iterator<C, M, member> Iter;
    typedef typename child<Iter>::type Child;
    typedef typename parent<Iter>::type Parent;

    static Child& function(Parent& parent)
    {
        return parent.*member;
    }
};

template<typename D, typename B>
struct bind< base_iterator<D,B> >
{
    typedef base_iterator<D,B> Iter;
    typedef typename child<Iter>::type Child;
    typedef typename parent<Iter>::type Parent;

    static Child& function(Parent& parent)
    {
        return parent;
    }
};

Listing 8 - Simulated partial specialisations of the bind function.

Ben leant back in his chair with a satisfied smile. “The rest is easy”, he said. “The visit_each<> function uses the next<> meta-function to generate an iterator to the next child node and calls itself to visit the remaining children (if any). Eventually, the two iterators become equal and the children<Iter,Iter> specialisation comes into play. At that point an empty visit_each<> function is instantiated and the recursion terminates.”

Bill felt cheated. “So we don’t compare the iterators, either”, he said tetchily. “At least, not explicitly.” “Well, no, I suppose not”, Ben replied with an air of superiority, “but that’s how it is with compile-time algorithms.”

Epilogue

“Glad to see everyone back on time”, said the Walrus cheerily. “Now, let’s see how you all got on.” And with that he collected the clue sheets and sat down to work out what the teams had scored. Some 20 minutes later he stood up again and gravely announced that he had a problem – all three teams had the same score! “So I’ll offer a bonus point to the first team to find Tweedledum and Tweedledee.” One of the Pre-Tenders gave a little squeal, waved one hand in the air and pointed excitedly with the other in the direction of Bill and Ben. “It’s them”, she cried. And everyone could see she was right. They looked like two peas in a pod and they were arguing like schoolboys.

“Compile-time is better”, said Ben. “Nah, run-time”, insisted Bill. “Compile-time.” “Run-time.” ...

Phil Bass
phil@stoneymanor.demon.co.uk

Full code listings provided in appendices.

Appendix 1 – Visit.hpp

The following code implements the compile-time visit algorithm itself.

struct nil {};
template<typename Iter> struct parent;
template<typename Iter> struct child;
template<typename Iter> struct bind;
template<typename Iter> struct next;
template<typename C, typename M, M C::*> struct member_iterator;

template<typename C, typename M, M C::* member>
struct parent< member_iterator<C,M,member> >
{
    typedef C type;
};

template<typename C, typename M, M C::* member>
struct child< member_iterator<C,M,member> >
{
    typedef M type;
};

template<typename C, typename M, M C::* member>
struct bind< member_iterator<C, M, member> >
{
    typedef member_iterator<C, M, member> Iter;
    typedef typename child<Iter>::type Child;
    typedef typename parent<Iter>::type Parent;

    static Child& function(Parent& parent)
    {
        return parent.*member;
    }
};

template<typename D, typename B> struct base_iterator;

template<typename D, typename B>
struct parent< base_iterator<D,B> >
{
    typedef D type;
};

template<typename D, typename B>
struct child< base_iterator<D,B> >
{
    typedef B type;
};

template<typename D, typename B>
struct bind< base_iterator<D,B> >
{
    typedef base_iterator<D,B> Iter;
    typedef typename child<Iter>::type Child;
    typedef typename parent<Iter>::type Parent;

    static Child& function(Parent& parent)
    {
        return parent;
    }
};

template<typename T> struct begin
{
    typedef nil type;
};

template<typename T> struct end
{
    typedef nil type;
};

template<typename First, typename Last> struct children;

template<typename Traversal, typename Node, typename Op>
void visit(Node& root, Op op)
{
    typedef typename begin<Node>::type first;
    typedef typename end<Node>::type last;
    typedef typename Traversal::template apply<Node>::type pvp;

    children<first, pvp> ::template visit_each<Traversal>(root, op);
    op(root);
    children<pvp, last> ::template visit_each<Traversal>(root, op);
}

template<typename First, typename Last>
struct children
{
    template<typename Traversal, typename Node, typename Op>
    static
    void visit_each(Node& node, Op op)
    {
        visit<Traversal> (bind<First>::function(node), op);
        typedef typename next<First>::type Next;
        children<Next,Last> ::template visit_each<Traversal>(node, op);
    }
};

template<typename Iter>
struct children<Iter, Iter>
{
    template<typename Traversal, typename Node, typename Op>
    static void visit_each(Node&, Op) {}
};

Appendix 2 - Visiting Alice (with children as nested structs)

A program using the visit algorithm described here needs to provide some information about the structure of the tree whose nodes are to be visited. This is done by defining suitable child-iterator types, the next meta-function, the begin metafunction and specialisations of the traversal order policy templates. The following code shows the definitions for the ALICE tree used in the main text.

struct L
{
    char a, i;
};

typedef member_iterator<L, char, &L::a> L_a_iterator;
typedef member_iterator<L, char, &L::i> L_i_iterator;

template<> struct next<L_a_iterator> { typedef L_i_iterator type; };
template<> struct next<L_i_iterator> { typedef nil          type; };
template<> struct begin<L>           { typedef L_a_iterator type; };

struct C
{
    L l;
    char e;
};

typedef member_iterator<C, L,    &C::l> C_l_iterator;
typedef member_iterator<C, char, &C::e> C_e_iterator;

template<> struct next<C_l_iterator> { typedef C_e_iterator type; };
template<> struct next<C_e_iterator> { typedef nil          type; };
template<> struct begin<C>           { typedef C_l_iterator type; };

struct in_order
{
    template<typename T> struct apply { typedef nil type; };
};

template<> struct in_order::apply<C> { typedef C_e_iterator type; };
template<> struct in_order::apply<L> { typedef L_i_iterator type; };

The traversal policy is in the form of a meta-function class (a class with a nested class template). This gives the flexibility of template template parameters while still allowing the policy to be used by compile-time algorithms operating on types.

Appendix 3 - Visiting Alice (with a base class as a child)

The case of the ALICE tree where L is a base class of C is only slightly different. The root node, C, needs a constructor and the child-iterator pointing to node L becomes a base_iterator<>. The following code shows the support required for node C (most of which is the same as for the example in which L is a member of C).

struct C : L
{
    C(const L& l, char e_) : L(l), e(e_) {}
    char e;
};

typedef base_iterator<C, L> C_l_iterator;
typedef member_iterator<C, char, &C::e> C_e_iterator;

template<> struct next<C_l_iterator> { typedef C_e_iterator type; };
template<> struct next<C_e_iterator> { typedef nil          type; };
template<> struct begin<C>           { typedef C_l_iterator type; };
template<> struct in_order::apply<C> { typedef C_e_iterator type; };

For compile-time trees defined using tuples and inheritance it is possible to provide the child iterators in a library. For example, the ALICE tree could be defined as:

#include <boost/tuple.hpp>
using boost::tuple;
typedef tuple<char,char> L;
typedef tuple<L,char> C;
C tree = C( L('A','I'), 'E' );

Since Boost tuples are implemented using inheritance it is easy to provide predefined base-iterators and meta-functions supporting the visit<> algorithm. In this case, no information about the structure of the tree needs to be provided in the client code; visiting Alice comes for free.

After Lewis Caroll’s “The Walrus and the Carpenter”. By the way, he lied about the kings. ↩