This article is the result of the conversations between the two authors (Phil Bass and Stefan Heinzmann) that were triggered by the latter's article in this very issue of Overload [ Heinzmann ]. Stefan originally wanted to have the resolution of the problems outlined in his article published in the next issue in order to keep you in creative suspension for a while, but the Editor found that to be too cruel, so we tried to finish this article in record time.
Article [ Heinzmann ] ended with 4 unsolved problems which are repeated here:
-
It requires the ugly cast for passing the null pointer as the third argument to lookup
-
lookup returns the result by value, which can be inefficient
-
It is still unclear why I couldn't use the typedef s from std::binary_function in the LessKey predicate (only with GCC)
-
Neither do I know why the compiler wanted to convert the wrong way between Val and EVal
The first two problems relate to the lookup function template, while the last two problems relate to the LessKey predicate. In addition, the text itself lists as a fifth problem: How to ensure at compile time that the lookup table is sorted. Just for a change, let's tackle those problems in reverse order.
Ensuring the Lookup Table is Sorted
Thaddaeus Frogley suggested that rather than sorting the table at compile time, which may be impossible, the debug build could check the sorting when the program starts up. You would need to write a function that checks the table sorting and call it in an assert macro that is run at program startup. As an additional service, the checking code could generate a sorted version of the table in a format that can be cut and pasted into the source code, so that the burden of keeping the table sorted is not on the struggling programmer. We'll not actually show any code for this here, as we believe that it is fairly straightforward. Furthermore, this issue was not the main point in [ Heinzmann ] anyway.
The LessKey Predicate
Here, we owe you an explanation why the typedef s in the binary_function base class could not be used in the definition of LessKey::operator() . It turns out that this is because of the name lookup rules, namely two-phase lookup. If you want to know the full story you need to turn to The Book [ Vandevoorde- ] chapter 9.4.2, but here's the bottom line:
As std::binary_function is a base class for LessKey that depends on LessKey 's template parameters, it is called a dependent base class. The C++ standard says that nondependent names are not looked up in dependent base classes. Hence a standard conforming compiler will not find result_type and its siblings and emit an error message. The error messages that were actually emitted by GCC weren't particularly helpful, in particular the mentioning of typename was misleading, but the code was definitely wrong. So what can we do about it? There are a number of possibilities here, though none are particularly appealing:
-
Dodge the issue in the way Stefan did it, i.e. by not using result_type etc. inside LessKey . The downside of this is that the fairly complicated element type must be written out several times.
-
Explicitly qualify the typedef names with the name of the base class. This makes the names dependent, therefore they are looked up and found in the base class. However that removes the whole point of using the typedef s, because the base class name is a complicated template.
-
Bring the typedef names into the scope of the derived class with a using declaration. That doesn't save any typing either, because the base class name needs to be spelled out, too.
-
Add another typedef to the derived class. That is also fairly verbose, so it may not be an improvement over the first solution.
That's disappointing, isn't it? It means that deriving from std::binary_function in order to make the predicate adaptable according to the rules of the STL is only half as useful as you'd wish it to be. It makes you wonder whether you want to derive from std::binary_function at all. After all, you can make your predicate adaptable by providing the required typedefs yourself, like this:
templatestruct LessKey { typedef bool result_type; typedef const Pair &first_argument_type; typedef first_argument_type second_argument_type; result_type operator()(first_argument_type a, second_argument_type b) const { return a.key < b.key; } };
If you don't want the predicate to be adaptable, you can apply a clever trick that appeared in [ Sposato ]: You make the predicate work on the key type directly. This removes the need to construct an element with all its associated problems mentioned in [ Heinzmann ]. If we needn't construct an element, we need no default constructor for the Val type either, which is an additional advantage. Here's the code:
templatestruct CleverLessKey { typedef const Pair Elem; bool operator()(Elem &elem, const Key &key) const { return elem.key < key; } bool operator()(const Key &key, Elem &elem) const { return key < elem.key; } };
Note the overloading of operator() to allow passing the arguments in any order. Inside the lookup function template the key can now be passed directly to equal_range :
templateEVal lookup(const Pair (&tbl)[n], const Key &key, const Val &def) { typedef CleverLessKey Pred; typedef const Pair Elem; std::pair range = std::equal_range(tbl, tbl+n, key, Pred()); if(range.first != range.second) return range.first->val; return def; }
This bends the intuitive rule of what a good predicate is; ordinarily you would think both argument types had to be the same, but as far as we know there's nothing in the C++ standard that would make this illegal. It certainly works with VC++ 7.1 and GCC 3.3.
The Comeau compiler [ comeau ] disagrees, however. Apparently, the library implementation used by Comeau contains compile-time concept checks that verify whether the two argument types of the predicate are the same. As ours are not, those checks fail and the compiler rejects the code. We feel that this is overly restrictive.
That leads us to the weird error mentioned in [ Heinzmann ] where the compiler apparently wanted to convert an int to a const char [4] . We owe you an explanation here, too. Let us repeat the relevant code where the error occurs:
templateEVal lookup(const Pair (&tbl)[n], const Key &key, const Val &def) { typedef LessKey Pred; typedef const Pair Elem; Elem entry = { key, Val() }; // error here std::pair range = std::equal_range(tbl, tbl+n, entry, Pred()); if(range.first != range.second) return range.first->val; return def; }
The error message was strange, but there's indeed an error. The compiler correctly deduces the following type for Val , namely const char [4] . That means that Val() tries to default construct a temporary of type const char [4] , which is impossible. No conversion from int to const char [4] is involved, the error displayed by the compiler is again rather misleading here.
The fix employed in [ Heinzmann ] was correct, but there is another, more elegant possibility:
templateEVal lookup(const Pair (&tbl)[n], const Key &key, const Val &def) { typedef LessKey Pred; typedef const Pair Elem; Elem entry = { key }; // second part of Pair // omitted std::pair range = std::equal_range(tbl, tbl+n, entry, Pred()); if(range.first != range.second) return range.first->val; return def; }
As you may have noted, it is not actually necessary to explicitly initialize the val member of the variable entry , as we don't use it anyway. Omitting the initializer for val causes it to be value-initialized, which in practice does the same as calling EVal() explicitly.
Anyway, we will use our clever predicate henceforth, despite the problems with the Comeau compiler.
The lookup Function Template
Now that we've got those niggles out of the way, we
can turn to the remaining problems. Remember what the
question was: Given the following program fragment, how can
we implement the
lookup()
function
as a function template that works for arbitrary
instantiations of
Pair
templatestruct Pair { Key key; Val val; }; const Pair table[] = { { 0, "Ok" }, { 6, "Minor glitch in self-destruction module" }, { 13, "Error logging printer out of paper" }, { 101, "Emergency cooling system inoperable" }, { 2349, "Dangerous substances released" }, { 32767, "Game over, you lost" } }; int main() { const char *result = lookup(table,6,(char*)0); std::cout << (result ? result : "not found") << std::endl; }
The implementation we arrived at in the last chapter still has those two problems:
-
It requires an ugly cast for passing the null pointer as the third argument to lookup.
-
lookup returns the result by value, which can be inefficient.
Replacing equal_range by lower_bound
First, note that the lookup() function given above is based on std::equal_range() , but actually behaves more like std::lower_bound() . It returns the mapped value of the first element matching the key, if any. Since lower_bound() is a simpler and slightly more efficient algorithm we will use it in the subsequent code samples. However, it is a bit trickier to use, as you'll see shortly.
While equal_range returns a pair of iterators, lower_bound returns just one. If there are suitable elements (according to the predicate) in the collection, the returned iterator points to the first one found. Otherwise it points to the place where such an element could be inserted without breaking the sorting order. This makes it more complicated to test for success. Here's what lookup would look like using lower_bound instead of equal_range :
templateEVal lookup(const Pair (&tbl)[n], const Key &key, const Val &def) { CleverLessKey pred; const Pair *pos = std::lower_bound(tbl, tbl+n, key, pred); if(pos == tbl+n || pred(key,*pos)) return def; return pos->val; }
Generalization for Arbitrary Maps
Lets take a step back and look at the problem from a generic angle: Why restrict ourselves to arrays? Surely, it should be possible to write lookup() so that it works with any map-like container. In particular, we would expect to be able to write:
// ... table definition, etc. as before ... using namespace std; int main() { mapmessage_map; message_map[6] = "Minor glitch in self-destruction module"; cout << lookup(message_map, 6, "not found”) << endl; cout << lookup(message_map, 6, 0) << endl; cout << lookup(table, 6, "not found") << endl; cout << lookup(table, 6, 0) << endl; }
Although this is a slightly different problem, it points the way towards a solution to the original problem that removes the remaining blemishes in Stefan's code.
The lookup function needs to be implemented in quite different ways for maps and arrays. For a map, lookup should call std::map<>::find() ; for an array, it should call std::lower_bound() . These two functions have different interfaces. The former is a member function taking a single parameter; the latter is a non-member function taking four parameters (for the overload we need). Somehow, the lookup function template must deduce these differences from the type of the container passed to it.
In principle, we could just provide separate overloads for maps and arrays:
templateconst Val& lookup(const std::map &, const Key&, const Val&); template const Val& lookup(const Pair (&)[n], const Key&, const Val&);
But that gets us back to where we came in. Stefan's article was all about the difficulties of implementing the second of those lookup() function overloads.
An alternative approach treats lookup as an algorithm that can be applied to any map-like container. There is a single lookup() function, but there can be several types of “map”. Or, more precisely, we define a generic Map concept, write the lookup() function template in terms of the Map interface and provide as many implementations of the Map concept as we need (including std::map<> s and arrays of Pair<> s).
// pseudo-code templateconst MapVal& lookup(const Map&, const MapKey&, const MapVal&);
Here, we have used MapKey and MapVal to stand for the Map's key and mapped value types. In real C++ code these types would be deduced from the Map type. In the case of std::map<> the MapKey and MapVal types are immediately available: MapKey is std::map<>::key_type and MapVal is std::map<>::mapped_type . In the case of arrays things are less straightforward. Arrays are not classes, so we can't add nested typedef s. Instead we can use the traits class technique .
templateconst typename map_traits
Now we can put information about the differences between maps and arrays in the traits class and use that information in our implementation of lookup() .
Note that now there's only one template parameter for the compiler to deduce: The type of the map itself. Only the first argument to the lookup function participates in this deduction, as the types of the other arguments are dependent on the result of this deduction. This greatly reduces the chance for deduction problems such as ambiguities.
Implementation of lookup() Traits
As noted above, lookup() will need to call either std::map<>::find() or std::lower_bound() . A map_traits<>::find() function is introduced to hide this from the lookup() function itself. Then lookup() needs to check whether the key was found and return either the default value or the value part of the element matching the key. Here, we use a map_traits<>::end() function to get the appropriate past-the-end iterator for the ‘key found' test. Retrieving the mapped value via an iterator depends on the element type (not the map type), so an element traits template with a mapped_value() member is used for that.
The traits functions represent operations that might be useful in other contexts. In those cases writing
map_element_traits::mapped_value(element)
for example, is both tedious and verbose. So, we provide nonmember wrapper functions to simplify the code in these situations and take advantage of them in the lookup() function itself.
// The mapped_value() convenience function templateinline const typename map_element_traits ::mapped_type& mapped_value(const Elem& element) { return map_element_traits ::mapped_value(element); } // The find() convenience function template inline typename map_traits ::const_iterator find(const Map& map, const typename map_traits ::key_type& key) { return map_traits ::find(map, key); } // The lookup() function template const typename map_traits ::mapped_type& lookup(const Map& map, const typename map_traits ::key_type& target_key, const typename map_traits ::mapped_type& default_value) { typename map_traits ::const_iterator i = find(map, target_key); return (i == end(map)) ? default_value : mapped_value(*i); }
The traits templates are declared (but not defined) and then specializations are defined for each of the map-like containers we wish to support. This prevents the instantiation of the traits templates with unexpected template arguments (e.g. via template argument deduction), which should help to minimize the number of incomprehensible error messages if the programmer makes a mistake.
templatestruct map_element_traits; template struct map_traits;
Traits for std::map<>
The traits templates for std::map<> are straightforward. A partial specialisation of the map element traits template is provided for any std::pair<> .
templatestruct map_element_traits< std::pair > { typedef std::pair value_type; typedef typename value_type:: first_type key_type; typedef typename value_type::second_type mapped_type; static const key_type& key( const value_type& element) { return element.first; } static const mapped_type& mapped_value( const value_type& element) { return element.second; } };
Similarly, a partial specialization of the map traits template is provided for any std::map<> .
templatestruct map_traits< std::map > { typedef std::map map_type; typedef typename map_type::key_type key_type; typedef typename map_type::mapped_type mapped_type; typedef typename map_type::value_type value_type; typedef typename map_type::const_iterator const_iterator; static const_iterator begin(const map_type& map) { return map.begin(); } static const_iterator end(const map_type& map) { return map.end(); } static const_iterator find(const map_type& map, const key_type& key) { return map.find(key); } };
These traits classes adapt std::map for use with our lookup() function.
Traits for Arrays of Key,Value Pairs
The traits for arrays of
Pair
templatestruct map_element_traits< Pair > { typedef Pair value_type; typedef Key key_type; typedef Val mapped_type; static const key_type& key( const value_type& element) { return element.key; } static const mapped_type& mapped_value( const value_type& element) { return element.val; } };
And there is a partial specialisation of
map_traits<>
for arrays of
Pair
templatestruct map_traits< Pair [n] > { typedef Key key_type; typedef Val mapped_type; typedef Pair value_type; typedef const value_type* const_iterator; static const_iterator begin( const value_type (&map)[n]) { return &map[0]; } static const_iterator end( const value_type (&map)[n]) { return &map[n]; } static const_iterator find( const value_type (&map)[n], const key_type& target_key) { const_iterator i = std::lower_bound(begin(map), end(map), target_key, key_value_compare ()); return (i == end(map) || key(*i) != target_key) ? end(map) : i; } };
The find() function uses lower_bound() passing a key comparison predicate with two asymmetric function call operators. The predicate class is generated from the following template:
templatestruct key_value_compare { typedef typename map_element_traits ::key_type key_type; typedef typename map_element_traits ::value_type value_type; bool operator()(const value_type& x, const value_type& y) const { return map_element_traits ::key(x) < map_element_traits ::key(y); } bool operator()(const value_type& elem, const key_type& key) const { return map_element_traits ::key(elem) < key; } bool operator()(const key_type& key, const value_type& elem) const { return key < map_element_traits ::key(elem); } };
This is a generalisation of the CleverLessKey class shown above. It uses the key() function from the map element traits to ensure that the predicate class can be generated for any element of a map-like class and only those elements. A third operator() overload is provided so that two elements can be compared to each other. We omitted the additional typedef s needed for adaptability.
Problem Solved! - Problem Solved?
The code presented in the previous sections does fix all
the imperfections in Stefan's version. At least, it
does if you are using the GCC compiler (Code tested on gcc
3.2 and 3.3.). But VC++ 7.1 produces an error. It turns out
that it fails to find the
map_traits
specialization for
Pair
So here is the entire code in all its glory:
#include#include #include #include // Generic map element declarations. template struct map_element_traits; template inline const typename map_element_traits ::mapped_type& mapped_value(const Elem& element) { return map_element_traits ::mapped_value(element); } template inline const typename map_element_traits ::key_type& key(const Elem& element) { return map_element_traits ::key(element); } template struct key_value_compare { typedef typename map_element_traits ::key_type key_type; typedef typename map_element_traits ::value_type value_type; bool operator()(const value_type& x, const value_type& y) const { return map_element_traits ::key(x) < map_element_traits ::key(y); } bool operator()(const value_type& elem, const key_type& key) const { return map_element_traits ::key(elem) < key; } bool operator()(const key_type& key, const value_type& elem) const { return key < map_element_traits ::key(elem); } }; // Generic map declarations. template struct map_traits; template inline typename map_traits ::const_iterator find(const Map& map, const typename map_traits ::key_type& key) { return map_traits ::find(map, key); } template inline typename map_traits ::const_iterator end(Map const& map) { return map_traits ::end(map); } // The lookup() function template const typename map_traits ::mapped_type& lookup(const Map& map, const typename map_traits ::key_type& target_key, const typename map_traits ::mapped_type& default_value) { typename map_traits ::const_iterator pos = find(map, target_key); return (pos == end(map)) ? default_value : mapped_value(*pos); } // specializations for std::map template struct map_element_traits< std::pair > { typedef std::pair value_type; typedef typename value_type:: first_type key_type; typedef typename value_type::second_type mapped_type; static const key_type& key(const value_type& element) { return element.first; } static const mapped_type& mapped_value( const value_type& element) { return element.second; } }; template struct map_traits< std::map > { typedef std::map map_type; typedef typename map_type::key_type key_type; typedef typename map_type::mapped_type mapped_type; typedef typename map_type::value_type value_type; typedef typename map_type::const_iterator const_iterator; static const_iterator begin(const map_type& map) { return map.begin(); } static const_iterator end(const map_type& map) { return map.end(); } static const_iterator find(const map_type& map, const key_type& key) { return map.find(key); } }; // Our own Pair type suitable for aggregate // initialization template struct Pair { Key key; Val val; }; // Specializations for Pair template struct map_element_traits< Pair > { typedef Pair value_type; typedef Key key_type; typedef Val mapped_type; static const key_type& key( const value_type& element) { return element.key; } static const mapped_type& mapped_value( const value_type& element) { return element.val; } }; template struct map_traits< Pair [n] > { // for GCC typedef Key key_type; typedef Val mapped_type; typedef Pair value_type; typedef const value_type* const_iterator; static const_iterator begin( const value_type (&map)[n]) { return &map[0]; } static const_iterator end( const value_type (&map)[n]) { return &map[n]; } static const_iterator find( const value_type (&map)[n], const key_type& target_key) { const_iterator i = std::lower_bound( begin(map), end(map), target_key, key_value_compare ()); return (i == end(map) || key(*i) != target_key) ? end(map) : i; } }; template struct map_traits< const Pair [n] > { // for VC++ typedef Key key_type; typedef Val mapped_type; typedef Pair value_type; typedef const value_type* const_iterator; static const_iterator begin( const value_type (&map)[n]) { return &map[0]; } static const_iterator end( const value_type (&map)[n]) { return &map[n]; } static const_iterator find( const value_type (&map)[n], const key_type& target_key) { const_iterator i = std::lower_bound( begin(map), end(map), target_key, key_value_compare ()); return (i == end(map) || key(*i) != target_key) ? end(map) : i; } }; // Test code typedef const Pair Elem; Elem table[] = { { 0, "Ok" }, { 6, "Minor glitch in self-destruction module" }, { 13, "Error logging printer out of paper" }, { 101, "Emergency cooling system inoperable" }, { 2349, "Dangerous substances released" }, { 32767, "Game over, you lost" } }; using namespace std; int main() { map message_map; message_map[6] = "Minor glitch in self-destruction module"; const char *result = lookup(message_map, 6, "not found"); cout << "lookup(map, 6, \"not found\") = " << result << endl; result = lookup(message_map, 6, 0); cout << "lookup(map, 6, 0) = " << result << endl; result = lookup(table, 5, "not found"); cout << "lookup(table, 5, \"not found\") = " << result << endl; result = lookup(table, 6, 0); cout << "lookup(table, 6, 0) = " << result << endl; }
Afterwords
Phil's Afterword
First, I completely agree with Stefan that C++ is too big, complicated and difficult to use for most programmers and most organisations. C++ is a great language for developing high quality, flexible and efficient software components - but (and I've said this before) most software is not like that.
The main lesson we can take from this apparently simple exercise is that experience counts. I became involved in Stefan's problem because I felt instinctively that the first step should have been to reduce the number of template parameters, not (as Stefan tried to do) to increase it. How did I know this would help? I'd been there before.
My advice to C++ programmers struggling with templates (or any other part of the language) is this: learn the rules, don't guess; don't panic; simplify the problem as far as you can; think carefully; experiment; and, finally, don't be afraid to ask for help.
If there is a moral to this story I would say it is that software is a very young discipline. As a profession we still have a lot to learn. One of the problems that remains unsolved is how to design a programming language that is easy to learn, is easy to use, is applicable to a wide range of applications and which generates compact and efficient code. In my opinion, C++ is about the state of the art. It makes most of this possible, but it's not always easy.
Phil Bass
Stefan's Afterword
So we've solved all the problems. And I've learned a lot in the process! That's a happy end, isn't it?
I don't think so. Look at what I wanted to achieve at the beginning and what we ended up with. All this is just a clever way to call std::lower_bound , isn't it? (Or the find member function of std::map ). Ok, I'm a bit sarcastic here.
If we subtract the test code and the code related to std::map , which don't really count here, we have written in excess of 100 lines of source code, some of it quite tricky. And most of it is just the scaffolding needed to make std::lower_bound usable in a nice way with constant ROMable key/value pairs. If we include all compiler quirks and crappy error messages, this matter was suitable for filling a fair number of Overload pages.
Clearly there's something amiss here. If this sort of thing does not get easier a lot of programmers will get frustrated by C++.
Stefan Heinzmann
Editor's Afterword
Reading Stefan's contributions to this issue brought back memories for me of an article I wrote nine years ago (Overload 8). There isn't space to reproduce it in this issue, or the editorial response it elicited (this was longer than the article). I won't go into the detail of the arguments, but just quote from the end of that response:
"At the end of the day, I basically agree with Alan - C++ is harder to use than C - and I think his comparison between a Stylophone and a violin is well drawn. I don't blame the language (and I don't really think Alan does either) - I blame IT management for giving everyone a violin and saying "right, now play a tune!" What C++ highlights is the need for better training, better tools and more realistic expectations." - Sean A Corfield
Nine years have passed and nothing significant has changed. C++ is still to hard to use, lacks decent tools and expectations are seldom realistic.
Alan Griffiths
References
[Heinzmann] Stefan Heinzmann: “The tale of a struggling template programmer”, this issue of Overload
[Vandevoorde-] D. Vandevoorde, N. M. Josuttis: C++ Templates: The complete guide , Addison-Wesley 2003
[Sposato] Rich Sposato: “A More Flexible Container”, Overload issue 58 December 2003
[comeau] http://www.comeaucomputing.com