A library should be customisable and have good performance. Matthew Wilson shows how to achieve both.
This article, the second in the series on the new FastFormat formatting library, discusses various ways in which the library can be extended. In doing so, it reveals some important aspects of the FastFormat design, and discusses some of the mechanisms by which it achieves its high performance characteristics.
Introduction
The two main subjects of the article are custom argument types - usually in the form of user-defined classes - and custom sink types. It would be natural to discuss the use of custom argument types first, because that is likely to be the most common way in which the library is extended. Both subjects involve performance considerations, but examining how custom sinks are defined will give you a better appreciation for the internals of FastFormat, and make clearer some of the design decisions that come into play when defining custom argument adaptors.
Before I start on either of those, though, I'm going to have a bit of a soapbox moment, to get you in the mood.
Performance? Really??
One of my fun-but-in-a-different-way day jobs is writing and/or conducting technical interviews for senior engineers, architects and development managers on behalf of my clients. One of my favourite, most revealing, questions is a seemingly simple parsing scenario, with loose similarities to the examples we've already seen. It seems to matter little what programming language candidates wish to employ; the devil is in the detail of the algorithms they choose to use, and the adjustments they make in answer to my (ceaselessly) changing requirements.
Let's look again at the main Professor Yaffle example discussed in part 1, wearing a language-independent programming hat.
std::string forename  = "Professor";
char        surname[] = "Yaffle";
int         age       = 134;
std::string result;

AcmeFormat(result, "My name is %0 %1; I am %2 years old; call me %0", forename, surname, age);
Picture a whiteboard, some coloured pens, a penetrating and relentless interviewer, and just a few minutes to come up with an efficient replacement strategy. If I were to ask you in what form, rather than how, you would effect the replacements, you may well come up with the following:
- Take the "My name is " bit of the format
- Take the forename
- Take the " " bit of the format
- Take the surname
- Take the "; I am " bit of the format
- Take the age and turn it into a string
- Take the " years old; call me " bit of the format
- Take the forename again
- Concatenate them all together
Suppose I then gave you a large sheet of square-gridded paper (memory), a pair of scissors (malloc()) and some pens (memcpy()) and asked you to produce a 1xN rectangular piece containing exactly the result described above. In that case, I hope that your algorithm would be:
- Calculate the sum of all part lengths to determine the total length of the result, which in this case is 11 + 9 + 1 + 6 + 7 + 3 + 20 + 9 = 66 (the final part is the forename again, so it is 9 characters long)
- Cut out a 1x66 piece of paper.
- Copy the exact number of characters for each part of the resulting statement, starting each at the square after the last one in the preceding part.
I hope that you would not use an algorithm such as:
- Cut out a 1x11 piece and write "My name is " in it.
- Repeat Step 1 for the remaining 7 individual parts to give a total of eight pieces.
- Take the first two pieces from this pile, and determine their combined length.
- Cut out a piece of this size.
- Copy in the contents of the first piece.
- Copy in the contents of the second piece, directly after the last square occupied by the first piece's contents, e.g. "My name is Professor"
- Discard the first two pieces in the pile - "My name is " and "Professor" - and place the new piece on top of the pile.
- Repeat steps 3-7 (a further 6 times) until only one piece, the result, remains.
Sound like fun? Definitely not! Not to mention the amount of wasted paper. The second algorithm would not help you in the interview. So why is it that we're prepared to tolerate such things operating in many (if not most) of the world's largest and most important software systems?
The main reason that FastFormat is so much faster than its peers is that they follow the second algorithm, and it follows the first. Neither the FastFormat core nor the application layer does any intermediate memory allocation, copying or concatenation. Naturally, the question is how. That will be discussed as we go through the remaining parts of this series.
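To make the contrast concrete, here is a minimal sketch of the first algorithm in standard C++. It is an illustration only, not FastFormat code: the name concat_once and the use of std::string_view (C++17) to represent the parts are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Illustrative only: concat_once is a hypothetical name, not a FastFormat API.
// Measure every part first, reserve once, then copy each part into place.
inline std::string concat_once(std::vector<std::string_view> const& parts)
{
    std::size_t total = 0;
    for(std::string_view part : parts)
    {
        total += part.size();
    }

    std::string result;
    result.reserve(total);          // the only request for memory we make
    for(std::string_view part : parts)
    {
        result.append(part.data(), part.size());
    }

    assert(result.size() == total); // every measured character was copied
    return result;
}
```

The second algorithm would instead build a new string for every pair of pieces, allocating and copying the early parts over and over.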
Custom sink types
Before we look at the sink mechanism, we have to discuss string slices. A slice is a view onto a contiguous area of memory. In the case of a (character) string, a slice is a read-only view onto an array of character elements forming part, or the whole, of a string.
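As an illustration of the idea, rather than of FastFormat's own types, a slice can be modelled in a few lines of standard C++. The names slice, make_slice and copy_out below are hypothetical, invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical slice type for illustration: a length plus a pointer.
// It owns nothing, and the characters it views need not be nul-terminated.
struct slice
{
    std::size_t len;    // number of characters viewed
    char const* ptr;    // first character viewed
};

// View 'len' characters of the C-style string 's', starting at 'offset'.
inline slice make_slice(char const* s, std::size_t offset, std::size_t len)
{
    assert(offset + len <= std::strlen(s));
    slice sl = { len, s + offset };
    return sl;
}

// Copy the viewed characters out; the length, never a terminator,
// bounds the copy.
inline std::string copy_out(slice const& sl)
{
    return (0 != sl.len) ? std::string(sl.ptr, sl.len) : std::string();
}
```

Note that a slice onto the middle of a string is followed by more characters, not by a nul, which is why the length must always travel with the pointer.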
In FastFormat, a string slice is represented by the type ff_string_slice_t, which is defined as a length + a pointer. (ff_char_t is char or wchar_t, for multibyte or wide string builds, respectively.)
struct ff_string_slice_t
{
    size_t      len;    // # of chars
    ff_char_t*  ptr;    // ptr to 1st char
};
This is the only type understood by the FastFormat core. Furthermore, pointers to arrays of slices are the type that sinks receive to represent the replacement/concatenation results. Let's look at how this works. Say we want to write a sink for the Windows OutputDebugString() API function:
void OutputDebugString(TCHAR const* s);
There are three important things to note about this function. First, it takes a single C-style string, so whatever we pass to it must be nul-terminated. It should also be non-NULL, by the way.
Second, the function outputs to a debug stream potentially shared by all threads on the host system. Although the function itself operates atomically, it means that if you try to do things such as the following it is possible, indeed likely, that the three parts of your output will be interleaved with output from other threads/processes on the system.
void fn(char const* str)
{
    OutputDebugString("fn(");
    OutputDebugString(str);
    OutputDebugString(")\n");

    ... // rest of fn()
This means that we must combine all parts of the statement, including new-line, if required, before sending it to OutputDebugString() .
Finally, what appears as a single function is actually a #define to one of the following two actual functions, depending on whether the UNICODE pre-processor symbol is defined.
void OutputDebugStringA(char const* s);
void OutputDebugStringW(wchar_t const* s);

#ifdef UNICODE
# define OutputDebugString  OutputDebugStringW
#else /* ? UNICODE */
# define OutputDebugString  OutputDebugStringA
#endif /* UNICODE */
Standard string sinks
Before we get into the implementation of the sink for OutputDebugString(), I'd like to walk you through the stock sink support, which works with any type (std::basic_string in particular) that provides the reserve() and append() methods. It is defined in Listing 1. The function is an action shim [XSTLv1] - a composite shim type that both controls and may modify its primary parameter - called fastformat::sinks::fmt_slices, meaning that it is an overload of a function named fmt_slices() defined in the namespace fastformat::sinks.
// File: fastformat/shims/action/fmt_slices/generic_string.hpp
// in namespace fastformat::sinks
template <typename S>
S& fmt_slices(
    S&                          sink
,   int                         flags
,   size_t                      total
,   size_t                      numResults
,   ff_string_slice_t const*    results
)
{
    sink.reserve(sink.size() + total + 2);

    { for(size_t i = 0; i != numResults; ++i)
    {
        ff_string_slice_t const& slice = results[i];

        if(0 != slice.len)
        {
            sink.append(slice.ptr, slice.len);
        }
    }}

    if(flags::ff_newLine & flags)
    {
        const ff_string_slice_t newLine = fastformat_getNewlineForPlatform();

        sink.append(newLine.ptr, newLine.len);
    }

    return sink;
}
Listing 1
First consider the signature. It's a function template - to support either std::string or std::wstring, depending on the ambient character encoding of the build - with five parameters. The first parameter, sink, is a mutating (non-const) reference to the sink, which allows it to be changed. The second parameter is a bit-mask of flags that moderate the formatting operation. Currently two stock flags are defined:
- fastformat::flags::ff_newLine
- fastformat::flags::ff_flush
The final two parameters, numResults and results , define an array of string slices representing all the constituent parts of the resulting statement.
The third parameter, total , is the total length of all the string slices. It is an advisory, to enable optimisation in any allocation that may have to be performed to assemble the results. In this case, it facilitates the call to reserve() , which means that there will only be, at most, one memory allocation associated with preparing the result. This prescience regarding required memory is one of the secondary reasons why FastFormat is fast.
The loop is pretty straightforward: each slice is appended to the sink via the append() method, specifying the pointer and length. It is very important that the length is always used along with the pointer, because (i) the pointer may not point to a nul-terminated string, and (ii) the pointer may actually be NULL when the length is 0. In the case of std::basic_string, the requirements of 21.3.5.2/6 and 21.3.1/6 necessitate the conditional test against length.
The only remaining task of the function body is to handle the request - via fmtln() or writeln() - for a new-line to be written. Once again, this is achieved using the sink's append() method. It writes from a special instance of ff_string_slice_t returned from the helper function fastformat_getNewlineForPlatform() . This function returns a slice of one or two characters in length, depending on whether the platform's newline is "\r", "\r\n" or "\n" . (Since specifying the wrong value to reserve() does not result in a functional error, the use of the magic number 2 is valid because I 'know' that it cannot be more than that. If the newline slice ever changed, then I would change the implementation of the string sink accordingly.)
Finally, the sink reference is returned, allowing for concatenation of format statements, if required (which is seldom, by the way).
std::string sink;
std::string sink2;

ff::fmt(ff::fmt(sink, "{0}", 1), "{0}", 2);   // sink => "12"
ff::write(ff::write(sink2, 1), 2);            // sink2 => "12"
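The shape of the stock string sink can be tried out without the library at all. The following is a sketch in standard C++ under stated assumptions: slice_t, flag_new_line and append_slices are stand-ins invented for the example, not the FastFormat declarations, and the newline is assumed to be a single character.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-ins for illustration; these are NOT the FastFormat declarations.
struct slice_t
{
    std::size_t len;
    char const* ptr;
};
enum { flag_new_line = 0x01 };  // hypothetical flag value

// Same shape as the stock string sink: reserve once, append each non-empty
// slice, then append a newline if asked. Works for any S providing size(),
// reserve() and append(ptr, len).
template <typename S>
S& append_slices(S& sink, int flags, std::size_t total,
                 std::size_t numResults, slice_t const* results)
{
    assert(0 == numResults || 0 != results);

    sink.reserve(sink.size() + total + 1);
    for(std::size_t i = 0; i != numResults; ++i)
    {
        if(0 != results[i].len)     // ptr may be null when len is 0
        {
            sink.append(results[i].ptr, results[i].len);
        }
    }
    if(flag_new_line & flags)
    {
        sink.append("\n", 1);       // assumption: a one-character newline
    }
    return sink;
}
```

Because the sink is returned by reference, calls can be nested in the same way as the format statements above, and an empty slice with a null pointer is skipped rather than passed to append().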
OutputDebugString sink
Armed with this knowledge, let's look at the OutputDebugString() sink. One striking difference to the string sink is that OutputDebugString() is a function: there are no instances of a class to use as sinks. So we must make one (Listing 2).
// file: fastformat/sinks/OutputDebugString.hpp
// in namespace fastformat::sinks
class OutputDebugString_sink
{
    ... // T.B.D.
};

inline OutputDebugString_sink& fmt_slices(
    OutputDebugString_sink&     sink
,   int                         flags
,   size_t                      total
,   size_t                      numResults
,   ff_string_slice_t const*    results
)
{
    ... // T.B.D.
}
Listing 2
This would be used as follows:
void fn(char const* str)
{
    ff::sinks::OutputDebugString_sink sink;

    ff::fmtln(sink, "fn({0})", str);

    ... // rest of fn()
The first question to ask when implementing the class and the associated action shim function is whether the logic should go in the class, or in the function. Some sink classes, such as speech_sink , are stateful, remembering options that moderate their output behaviour. Therefore, for consistency, I always place the logic in the class, and implement the action shim function in terms of the class's write() method, as shown in Listing 3. If you're sure your sink won't need to be stateful, feel free to do it all in the function and just have a simple empty struct for the sink type.
class OutputDebugString_sink
{
public: /// Member Types
    typedef OutputDebugString_sink  class_type;

public: /// Construction
    OutputDebugString_sink()
    {}

public: /// Operations
    class_type& write(int flags, size_t total, size_t numResults, ff_string_slice_t const* results);
};

inline OutputDebugString_sink& fmt_slices( ... )
{
    return sink.write(flags, total, numResults, results);
}
Listing 3
Now that we know the structure of the code, all that remains is to implement the write() method. Remembering our first two design constraints - the need to supply a non-NULL nul-terminated C-style string, and the shared final output destination - it's clear that we cannot follow the example of the standard string sink and write out a slice at a time. Rather, we must write into an intermediate buffer, appending a nul-terminator, and a new-line if required.
The STLSoft libraries [ STLSOFT ] have a class template called auto_buffer [ EVAB ] [ IC++ ] , which provides a middle ground between the speed of stack allocation and the flexibility of heap allocation. Simply, it has a fixed internal buffer from which it attempts to fulfil requests for memory. If the request is too large, it is satisfied from the heap. In many circumstances, this can lead to dramatic performance improvements [ EVAB ] [ IC++ ] . The more you learn about FastFormat, the more you'll see auto_buffer lending a high-performing hand in even the most unexpected places. Use of auto_buffer is the third reason why FastFormat is fast. (One point to note: even though it shares much with the interface of std::vector , it is important to realise that it is not a container, and you must not attempt to use it assuming any more intelligence than it is documented to have. See section 16.2 of [ XSTLv1 ] for more discussion on this point.)
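The principle can be sketched in a few lines of standard C++. This is a hypothetical small_buffer written for illustration; it is not the STLSoft class, omits most of its interface, and uses std::vector for the heap fallback purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the idea behind auto_buffer; not the STLSoft class.
// Requests for up to N elements are served from an internal array; larger
// requests fall back to the heap.
template <typename T, std::size_t N>
class small_buffer
{
public:
    explicit small_buffer(std::size_t n)
        : m_heap(n > N ? n : 0)     // allocate only when the array is too small
        , m_size(n)
    {}

    small_buffer(small_buffer const&) = delete;
    small_buffer& operator =(small_buffer const&) = delete;

    T*          data()          { return m_heap.empty() ? m_internal : m_heap.data(); }
    std::size_t size() const    { return m_size; }
    bool        on_heap() const { return !m_heap.empty(); }

    T& operator [](std::size_t i)
    {
        assert(i < m_size);
        return data()[i];
    }

private:
    T               m_internal[N];  // used when the request fits
    std::vector<T>  m_heap;         // used otherwise
    std::size_t     m_size;
};
```

When most requests fit in the internal array, as is typical for short log or trace lines, the common path involves no heap allocation at all.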
So what does this have to do with our new sink? Well, one of the utility functions that comes with the library, concat_slices() , takes an auto_buffer instance, along with the array of slices, and concatenates them all together, resizing the buffer as necessary. We can use this to simplify the implementation of write() (Listing 4).
#include <fastformat/util/sinks/helpers.hpp>

// The return type is qualified because class_type is not in scope
// until after the OutputDebugString_sink:: qualifier has been seen.
OutputDebugString_sink::class_type& OutputDebugString_sink::write(
    int                         flags
,   size_t                      total
,   size_t                      numResults
,   ff_string_slice_t const*    results
)
{
    const ff_string_slice_t newLine = fastformat_getNewlineForPlatform();

    stlsoft::auto_buffer<ff_char_t> buff(
        1 + total + ((flags::ff_newLine & flags) ? newLine.len : 0));

    fastformat::util::concat_slices(buff, numResults, results);

    if(flags::ff_newLine & flags)
    {
        ::memcpy(&buff[total], newLine.ptr, sizeof(ff_char_t) * newLine.len);
        total += newLine.len;
    }

    buff[total] = '\0';

    OutputDebugString(buff.data());

    return *this;
}
Listing 4
In this case, we need to know exact lengths, so we get hold of the platform newline at the start. We then calculate the exact length required for the auto_buffer , which will throw std::bad_alloc if the request cannot be satisfied. If all goes well, concat_slices() is invoked, and the slice contents are written into the buffer.
The last parts of the preparation are to write in the newline, if requested, and to nul-terminate the string. Then we just invoke OutputDebugString() . Q.E.D.
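The same sequence of steps can be exercised on any platform with a stand-in for the output function. The sketch below is an assumption-laden illustration: slice_t and write_atomically are hypothetical names, the buffer is a plain std::vector rather than an auto_buffer, and the newline is assumed to be a single character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-ins for illustration; not the FastFormat or Windows declarations.
struct slice_t
{
    std::size_t len;
    char const* ptr;
};

// Build one nul-terminated message from all the slices, optionally followed
// by a newline, and hand it to 'out' in a single call, so that output from
// other threads cannot land between the parts.
inline void write_atomically(void (*out)(char const*), bool newLine,
                             std::size_t total, std::size_t numResults,
                             slice_t const* results)
{
    std::vector<char> buff(total + (newLine ? 1 : 0) + 1);  // +1 for the nul
    std::size_t pos = 0;
    for(std::size_t i = 0; i != numResults; ++i)
    {
        if(0 != results[i].len)
        {
            std::memcpy(&buff[pos], results[i].ptr, results[i].len);
            pos += results[i].len;
        }
    }
    assert(pos == total);       // here 'total' must match the slices exactly
    if(newLine)
    {
        buff[pos++] = '\n';     // assumption: a one-character newline
    }
    buff[pos] = '\0';
    out(buff.data());           // exactly one call to the output function
}
```

Unlike the string sink, where total is only advisory, a sink of this kind depends on it being exact, because it sizes the buffer and positions the terminator.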
Except ... as some eagle-eyed readers may already have pondered, this is assuming consistency between the presence/absence of UNICODE and FASTFORMAT_USE_WIDE_STRINGS, the pre-processor symbol whose definition dictates whether the FastFormat library is built for wide strings or left as multibyte strings. It's possible that a user may demand that FastFormat be wide string while not correspondingly defining UNICODE. Putting aside whether you (or I) think this is meaningful/desirable, we can easily sidestep the whole issue, by simply using overloading.
Along with the sink and the action shim, the fastformat/sinks/OutputDebugString.hpp header also defines the helper structure OutputDebugString_helper , to which we can defer the decision-making (Listing 5).
struct OutputDebugString_helper
{
    static void fn(char const* s)
    {
        ::OutputDebugStringA(s);
    }
    static void fn(wchar_t const* s)
    {
        ::OutputDebugStringW(s);
    }
};

. . .

OutputDebugString_sink::class_type& OutputDebugString_sink::write( . . . )
{
    . . .

    buff[total] = '\0';

    OutputDebugString_helper::fn(buff.data());

    return *this;
}
Listing 5
And that's the final version. It writes atomically, provides nul-termination and, if requested, appends a new line, and works regardless of the character encodings of the library and/or the application.
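The overloading trick is not specific to Windows. As a portable sketch, with output_narrow and output_wide as hypothetical stand-ins that merely record their argument, the selection works like this:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the narrow and wide output functions, so that
// the sketch runs anywhere; they just record what they were given.
static std::string  g_narrowSeen;
static std::wstring g_wideSeen;
inline void output_narrow(char const* s)    { assert(0 != s); g_narrowSeen = s; }
inline void output_wide(wchar_t const* s)   { assert(0 != s); g_wideSeen = s; }

// The overload set does the choosing: the character type of the argument,
// not a pre-processor symbol, selects the function that is called.
struct output_helper
{
    static void fn(char const* s)    { output_narrow(s); }
    static void fn(wchar_t const* s) { output_wide(s); }
};

// A caller written once for any character type C.
template <typename C>
void emit(std::basic_string<C> const& message)
{
    output_helper::fn(message.c_str());
}
```

The decision is made per call by the compiler, so a wide-string build of the calling code and a multibyte build of the rest of the application can coexist.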
Atomicity
Just a last word on atomicity: In Part 1 [ FF1 ] I made just criticism of the other libraries that do not support atomic output, and observed that it is essential that it is the library, and not the user, that handles it. You can see from the two action shim implementations we've considered that it is possible to avoid paying the cost of copying and concatenating in a context where atomicity is a moot point, while being able to easily apply it otherwise. In this respect, FastFormat supports the best of both worlds, with the simple caveat that the writer of a sink must do the right thing.
Custom argument types
Probably the most common way in which a user would wish to extend a formatting library is in adding support for custom types. The remainder of this article will illustrate how that is done.
First, we need a user-defined type to be passed as an argument to the format statements. Listing 6 shows the definition of a simple superhero type.
class superhero
{
public: /// Member Types
    typedef std::string string_type;
    typedef superhero   class_type;

public: /// Construction
    superhero(string_type const& name, int weight, int strength, int goodness)
        : name(name)
        , weight(weight)
        , strength(strength)
        , goodness(goodness)
    {}
private:
    class_type& operator =(class_type const&);

public: /// Member Variables
    const string_type   name;
    const int           weight;
    const int           strength;
    const int           goodness;
};
Listing 6
Now let's try and insert one into some format statements, in Listing 7.
#include <fastformat/ff.hpp>
#include <fastformat/sinks/ostream.hpp>

...

superhero thing("The Thing", 200, 99, 100);
superhero batman("Batman", 100, 80, 95);

ff::writeln(std::cout, "Ben Grimm is ", thing);
ff::fmtln(std::cout, "Bruce Wayne is {0}", batman);
Listing 7
If you compile this, you'll get a number of errors along the lines of:
. . ./fastformat/internal/generated/helper_functions.hpp(160) : error: 'stlsoft::c_str_data_a' : none of the 4 overloads could convert all the argument types
    . . . while trying to match the argument list '(const superhero)'
. . ./fastformat/internal/generated/helper_functions.hpp(160) : error: 'stlsoft::c_str_len_a' : none of the 4 overloads could convert all the argument types
    . . . while trying to match the argument list '(const superhero)'
The compiler has failed to find matching string access shim overloads - of stlsoft::c_str_data_a() and stlsoft::c_str_len_a() - for the superhero type. This is to be expected, since we haven't yet defined any.
In point of fact, it's not actually necessary to define string access shims for our type. Indeed, there are several options for working with a user-defined type:
- Inserters (functions or classes)
- Type filters
- String access shims [IC++][XSTLv1]
With the first two approaches, what you define is instead an intermediary type for which string access shims are already defined. An obvious type would be std::string (or std::wstring , for wide string builds), although we'll see later that there are better options.
The hero format
Let's stipulate that the format for a super-hero is as follows:
<name> {weight=<weight>, strength=<strength>, goodness=<goodness>}
Inserter function
Let's start by building the simplest option, an inserter function. When defining stock inserters (for Pantheios [PAN], anyway, since I've not done the FastFormat ones yet), it's easy to think of names, such as pantheios::integer, fastformat::real, and so forth. When it comes to your own types, it can be a little trickier, since you want to be succinct, and you can't give the inserter the same name as the type unless you put it into a different namespace (which will hinder succinctness). For this example, when dressing up a bunch of superheroes I think the name is obvious. Listing 8.1 shows a first attempt.
std::string edna(superhero const& hero)
{
    std::string result;
    char        num[21];

    result += hero.name;
    result += " {weight=";
    result.append(num, sprintf(num, "%d", hero.weight));
    result += ", strength=";
    result.append(num, sprintf(num, "%d", hero.strength));
    result += ", goodness=";
    result.append(num, sprintf(num, "%d", hero.goodness));
    result += '}';

    return result;
}
Listing 8.1
Well, that will work, but it's ugly, not terribly maintainable, and not in the slightest bit localised. Furthermore, it's not efficient, and not strictly robust (although no sprintf() should ever return a negative result in this case).
We can handle most of the performance issue just by adding in a call to reserve() before the first concatenation, taking into account the length of the name, the literal fragments (32 characters in all), and the maximum sizes of the three integer attributes (Listing 8.2).
std::string edna(superhero const& hero)
{
    std::string result;
    char        num[21];

    result.reserve(hero.name.size() + 32 + (3 * 20));

    result += hero.name;
    . . .
Listing 8.2
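For readers who want to run the hand-rolled version, here is a complete, self-contained variant. It is a sketch under assumptions: hero_t is a minimal stand-in for the superhero class, describe() plays the part of edna(), and snprintf() is used in place of sprintf() so that the return value can be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Minimal stand-in for the superhero type; hypothetical, for illustration.
struct hero_t
{
    std::string name;
    int         weight;
    int         strength;
    int         goodness;
};

// Append the decimal form of 'value' to 'result'.
inline void append_int(std::string& result, int value)
{
    char      num[21];
    int const n = std::snprintf(num, sizeof(num), "%d", value);

    assert(n > 0 && n < static_cast<int>(sizeof(num)));
    result.append(num, static_cast<std::size_t>(n));
}

inline std::string describe(hero_t const& hero)
{
    std::string result;

    // name + 32 characters of literal text + up to 20 characters per integer
    result.reserve(hero.name.size() + 32 + (3 * 20));

    result += hero.name;
    result += " {weight=";      //  9 characters
    append_int(result, hero.weight);
    result += ", strength=";    // 11 characters
    append_int(result, hero.strength);
    result += ", goodness=";    // 11 characters
    append_int(result, hero.goodness);
    result += '}';              //  1 character

    return result;
}
```

The comments show where the figure of 32 comes from: it is the combined length of the four literal fragments.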
But that still leaves us with the other problems. What we really need here is a good formatting library . . .
I hope you're ahead of me here. We can rewrite this in terms of one of the FastFormat APIs. If we want to maximise performance, and we are able to forego localisation, then we'd use FastFormat.Write , as in Listing 8.3.
std::string edna(superhero const& hero)
{
    std::string result;

    ff::write(result, hero.name, " {weight=", hero.weight
            , ", strength=", hero.strength, ", goodness=", hero.goodness
            , "}");

    return result;
}
Listing 8.3
Note that we don't return the result of ff::write() , because the compiler doesn't know that the returned value is actually result, and we don't want to stymie its ability to apply the named return value optimisation [ IC++ ] .
If it must be localisable, then we'd use FastFormat.Format . Note the double {{ to produce the literal { in the result; see Listing 8.4.
std::string edna(superhero const& hero)
{
    std::string result;

    ff::fmt(result, "{0} {{weight={1}, strength={2}, goodness={3}}"
          , hero.name, hero.weight
          , hero.strength, hero.goodness);

    return result;
}
Listing 8.4
And to actually localise, we could use a resource bundle, as in Listing 8.5.
#include <fastformat/bundles/properties_bundle.hpp>

ff::properties_file_bundle const& getAppBundle();

std::string edna(superhero const& hero)
{
    std::string result;
    ff::properties_file_bundle const& bundle = getAppBundle();

    ff::fmt(result, bundle["superhero.format"]
          , hero.name, hero.weight
          , hero.strength, hero.goodness);

    return result;
}
Listing 8.5
I hope you'll see how convenient is the statelessness of FastFormat, allowing us to implement an inserter function using the library itself.
With any of these inserter functions, we can now successfully format a superhero:
ff::writeln(std::cout, "Ben Grimm is ", edna(thing));
ff::fmtln(std::cout, "Bruce Wayne is {0}", edna(batman));
The obvious little fly in the ointment is that edna() has to be called explicitly, and this intrudes slightly on the expressiveness of our application code.
Inserter class
If/when I write a future article on Pantheios, I'll explain the reason why inserter classes are preferred, since they can employ lazy evaluation to forego paying costs if logging is not enabled. With FastFormat, arguments to format statements are always used, so the use of classes is unnecessary, and functions suffice. (This is good, because they're a fair bit simpler.)
Type-filter
If we want the original formatting statements to work (without edna()), we have two options. The more specific of these, the type-filter mechanism, works only with FastFormat. It involves overloading a conversion shim [IC++] [XSTLv1] - a primary shim type that involves conversion of instances of heterogeneous types to a single type - called fastformat::filters::filter_type, meaning that it is an overload of a function named filter_type() defined in the namespace fastformat::filters.
Let's look at how this can be implemented for our superhero type in Listing 9.
// in namespace fastformat::filters
inline std::string filter_type(
    superhero const&        hero
,   superhero const*
,   char const volatile*
)
{
    std::string result;

    ff::fmt(result, "{0} {{weight={1}, strength={2}, goodness={3}}"
          , hero.name, hero.weight
          , hero.strength, hero.goodness);

    return result;
}
Listing 9
The body of this should be immediately recognisable, as it's a straight lift from edna() . (I hope she's not aggressively litigious!) What is probably not so recognisable is the strange function signature of the shim overload. What are the purposes of the second and third arguments, both of which are unused?
To understand these parameters we must peek a little inside the FastFormat application layer templates, which are responsible for translating your nice, heterogeneous application layer statements into arrays of string slices. Consider the two parameter overload of the fastformat::writeln() API function shown in Listing 10.
// file: fastformat/internal/generated/api_functions.hpp
// in namespace fastformat
template<   typename S
        ,   typename A0
        ,   typename A1
        >
inline S& writeln(S& sink, A0 const& arg0, A1 const& arg1)
{
    return fastformat::internal::helpers::write_outer_helper_2(
        sink
    ,   flags::ff_newLine
    ,   fastformat::filters::filter_type(arg0, &arg0, static_cast<ff_char_t const volatile*>(0))
    ,   fastformat::filters::filter_type(arg1, &arg1, static_cast<ff_char_t const volatile*>(0))
    );
}
Listing 10
The third parameter simply informs the conversion shim overload which character encoding it's being asked to work with. The purpose of the parameter is to allow different implementations for multibyte and wide string forms. In our example, we only defined the char-form, and it will only work in a multibyte build. We could instead have actually specified the third parameter as ff_char_t const volatile* , which would have allowed us to be encoding-agnostic.
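The role of that third parameter can be shown in isolation. In this sketch, which is standard C++ and independent of FastFormat, hero_t, to_text and convert are hypothetical names; the typed null pointer acts purely as a compile-time tag that selects the overload.

```cpp
#include <cassert>
#include <string>

struct hero_t { std::string name; };    // hypothetical stand-in type

// Two overloads distinguished only by the type of the unused third
// parameter, which acts as a compile-time tag for the character type.
inline std::string to_text(hero_t const& hero, hero_t const*, char const volatile*)
{
    return hero.name;
}
inline std::wstring to_text(hero_t const& hero, hero_t const*, wchar_t const volatile*)
{
    // Naive widening, adequate only for ASCII content.
    return std::wstring(hero.name.begin(), hero.name.end());
}

// The caller passes a typed null pointer; nothing is ever read through it.
template <typename C>
std::basic_string<C> convert(hero_t const& hero)
{
    return to_text(hero, &hero, static_cast<C const volatile*>(0));
}
```

If only the char overload were defined, convert<wchar_t>() would fail to compile, which mirrors the way a char-only filter_type overload works only in a multibyte build.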
The purpose of the second parameter is considerably less obvious. To understand this, we need to have a review of C++ law (and lore).
The pedantic pointer idiom
In C++, matching functions takes into account implicit conversions. Consider the following class hierarchy, and three functions that dump out information on instances of these classes.
class superhero
{};
class extrasuperhero
    : public superhero
{};

void dump(extrasuperhero const* xhero);
void dump(superhero const* hero);
void dump(void const* pv);
If we declare instances of the two hero types, and pass their addresses to dump() , all will be well.
superhero       hero;
extrasuperhero  xhero;

dump(&hero);
dump(&xhero);
If we now remove the dump(extrasuperhero const*) overload, the code still compiles, but xhero will be dumped in the form of a superhero. Since an extrasuperhero is-a superhero, this is probably OK, although it may not be. If we now also remove the dump(superhero const*) overload, the code still compiles, but both heroes will just be dumped like raw pointers. An ignominious end for such great men (or women)!
Readers who read part 1 [ FF1 ] will recognise this as the source of the design flaws that prevent IOStreams and Boost.Format from being adequately robust. The way around this is to define a single function template, dump() , and to pass off the work to appropriately defined two-parameter worker functions, as follows:
// Two-parameter workers, declared first so that the template can see them
void dump(extrasuperhero const* xhero, extrasuperhero const**);
void dump(superhero const* hero, superhero const**);
void dump(void const* pv, void const**);

template <typename T>
void dump(T const* t)
{
    dump(t, &t);
}
We add a second pointer parameter that is the address of the first parameter. By doing so, we sidestep any implicit conversions in the primary parameter, because the implicit conversions do not apply at an extra level of indirection. Just because superhero const* may happily convert to void const* , superhero const** will not implicitly convert to void const** .
So, if we now remove the dump(extrasuperhero const* xhero, extrasuperhero const**) overload, we will find that the request to dump &xhero will not compile.
I call this technique the pedantic pointer idiom . (For alliterative purposes, I ache to call it the pedantic pointer pattern , but it can't really claim to be a pattern.) It is used in several STLSoft components, and in my commercial work, to enforce 100% type-safety. Clearly it finds good use in the FastFormat application layer, facilitating infinite extensibility while enforcing total robustness.
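A small, self-contained version of the idiom makes the effect easy to verify. In this sketch the workers return a string naming the overload that was chosen, which is a device of the example rather than part of the technique; sidekick is a hypothetical extra type.

```cpp
#include <cassert>
#include <string>

class superhero {};
class extrasuperhero : public superhero {};

// Workers: the second parameter pins the exact static type, because
// T const** does not convert to Base const** or to void const**.
inline std::string dump(superhero const*, superhero const**)
{
    return "superhero";
}
inline std::string dump(extrasuperhero const*, extrasuperhero const**)
{
    return "extrasuperhero";
}

// Front end: forwards the pointer together with its own address.
template <typename T>
std::string dump(T const* t)
{
    return dump(t, &t);
}

// Hypothetical derived type with no worker of its own. Given
// 'sidekick robin;', the call dump(&robin) would fail to compile,
// rather than being silently treated as a superhero.
class sidekick : public superhero {};
```

Each type reaches exactly its own worker, and a type without one is rejected at compile time instead of degrading to a base class or to void const*.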
The one issue is that each time you derive from superhero you need to define a new two-parameter overload of dump(). You may see this as a cost; I see it as a huge benefit, since implicit conversions are far more trouble than they are worth. Naturally, this also applies to the filter_type conversion shim overloads. It's hardly onerous though, since if you don't need any new formatting you can just use a forwarding function, as in Listing 11.
// in namespace fastformat::filters
inline std::string filter_type(
    superhero const&        hero
,   superhero const*
,   char const volatile*
);

inline std::string filter_type(
    extrasuperhero const&   hero
,   extrasuperhero const*
,   char const volatile*    p
)
{
    superhero const& regular_hero = hero;

    return filter_type(regular_hero, &regular_hero, p);
}
Listing 11
One last point I'd like to make: the type-filter mechanism takes effect before any application of string access shims, so you can use a type-filter to override an existing conversion of a type (implemented using string access shims) that you don't happen to care for.
String access shims
The type-filter mechanism defines conversions that are usable only with FastFormat. Now, in most cases this is a positive thing. However, you may also be using other STLSoft-related libraries - I'm mainly thinking of a superlative logging API library here :-) - and wish to share your types' to-string conversions between them all. If so, you may instead define string access shims for your types. Let's do that now for our superhero type.
To do so, you must understand the rules for access shims. Unfortunately, there's not the space here to explain all the rules for shims; for that you'll have to consult Imperfect C++ [ IC++ ] and/or Extended STL, volume 1 [ XSTLv1 ] . (The most comprehensive and definitive explanation will be found in my next book, Breaking Up The Monolith , but since it's not yet finished, it's not much good to you.) Instead, I will show you how to do it, and point out the major issues as we go.
The string access shims are actually three sets of four shims. For simplicity we will consider only the ones that are used with multibyte strings: stlsoft::c_str_ptr_a, stlsoft::c_str_ptr_null_a, stlsoft::c_str_data_a, stlsoft::c_str_len_a. Further simplifying, we need only consider the pair of shims stlsoft::c_str_data_a and stlsoft::c_str_len_a for our extension of FastFormat. Analogous versions of all four exist for wide strings, with the _w suffix, and you'll need to define stlsoft::c_str_data_w and stlsoft::c_str_len_w for your extension if you wish to use FastFormat in wide string guise. (The use of the _a suffix for multibyte, rather than _m, is just historical, but unfortunately we're stuck with it.)
All shims have name , intent , category and ostensible return type . A shim is allowed to return any type that is implicitly convertible to the ostensible return type. For our two shims these are ( stlsoft::c_str_data_a ; obtain a pointer to the string representation of the given type; Access ; char const* ) and ( stlsoft::c_str_len_a ; obtain the length of the string representation of the given type; Access ; size_t ). The degenerate forms of each are given in Listing 12.
// in namespace stlsoft
inline char const* c_str_data_a( char const* s )
{
    return s;
}
inline size_t c_str_len_a( char const* s )
{
    return (NULL != s) ? ::strlen(s) : 0;
}
Listing 12
It is a requirement that, for any matched pair, c_str_len_a() always yields exactly the number of characters available at the pointer returned by c_str_data_a() . Definitions for std::string are equally simple, as shown in Listing 13.
// in namespace stlsoft
inline char const* c_str_data_a(std::string const& s)
{
  return s.data();
}
inline size_t c_str_len_a(std::string const& s)
{
  return s.size();
}

Listing 13
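To see why the pairing requirement matters, here is a minimal sketch of a generic consumer that takes both its pointer and its length from the shims. It is not FastFormat's or STLSoft's code: the demo namespace and the append_via_shims() function are hypothetical, and the overloads simply mirror Listings 12 and 13.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

namespace demo // hypothetical namespace; the real shims live in stlsoft
{
    // Degenerate forms for C-style strings (mirroring Listing 12)
    inline char const* c_str_data_a(char const* s) { return s; }
    inline std::size_t c_str_len_a(char const* s)
    {
        return (nullptr != s) ? std::strlen(s) : 0;
    }

    // Forms for std::string (mirroring Listing 13)
    inline char const* c_str_data_a(std::string const& s) { return s.data(); }
    inline std::size_t c_str_len_a(std::string const& s) { return s.size(); }

    // Hypothetical generic consumer: appends exactly c_str_len_a(arg)
    // characters starting at c_str_data_a(arg), whatever the argument type
    template <typename S>
    void append_via_shims(std::string& out, S const& arg)
    {
        out.append(c_str_data_a(arg), c_str_len_a(arg));
    }

    // Usage: a string literal and a std::string go through the same code path
    inline void usage_example()
    {
        std::string out;
        append_via_shims(out, "Professor");
        append_via_shims(out, std::string(" Yaffle"));
        assert(out == "Professor Yaffle");
    }
}
```

Because the consumer never calls strlen() itself, a mismatch between the two shims would lead directly to truncated output or a read past the end of the buffer.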
The complexity comes when dealing with types that are not strings, and do not already contain a viable string form representing their state. Our superhero type is one such. In this case, we must synthesise the string on the fly, as shown in Listing 14.
inline stlsoft::basic_shim_string<char> c_str_data_a(superhero const& hero)
{
  stlsoft::basic_shim_string<char> result;
  ff::fmt(result, "{0} {{weight={1}, strength={2}, goodness={3}}"
        , hero.name, hero.weight, hero.strength, hero.goodness);
  return result;
}
inline size_t c_str_len_a(superhero const& hero)
{
  // 32 == number of literal characters produced by the format
  size_t n = hero.name.size() + 32;
  char buff[21];
  // NOTE: not checking -ve return value!
  n += sprintf(buff, "%d", hero.weight);
  n += sprintf(buff, "%d", hero.strength);
  n += sprintf(buff, "%d", hero.goodness);
  return n;
}

Listing 14
Once again, for convenience we've used FastFormat to implement the conversion to string. If we were planning to use this string access shim in a context without FastFormat we'd have to resort to sprintf() or plain string concatenation. (But not IOStreams or Boost.Format, eh?!). Note that this would not detract from FastFormat's robustness claims, because the correctness of a custom conversion component such as these string access shim overloads can be assessed and verified independently of FastFormat (i.e. in a (practically) exhaustive test harness). For the length, we've added the length of the superhero format (minus the sizes of the insertions and the { escape character) plus the length of the name, plus the lengths of the string forms of the three integers. This is the common methodology of the string access shim pairs when synthesising string forms: the c_str_data[_a|_w]() overload creates the string form, and the corresponding c_str_len[_a|_w]() overload calculates its exact length.
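As an illustration of the FastFormat-free alternative, and of checking such a pair independently, the following sketch builds the string form with snprintf() and computes the length as Listing 14 does. The superhero definition is an assumption (only its member names appear above), and superhero_to_string(), superhero_length() and lengths_agree() are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Assumed shape of the superhero type
struct superhero
{
    std::string name;
    int weight;
    int strength;
    int goodness;
};

// Hypothetical stand-in for the data shim: builds the string form
inline std::string superhero_to_string(superhero const& hero)
{
    char tail[96]; // 32 literal characters + 3 ints of at most 11 characters each
    int const n = std::snprintf(tail, sizeof(tail)
                              , " {weight=%d, strength=%d, goodness=%d}"
                              , hero.weight, hero.strength, hero.goodness);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(tail));
    return hero.name + std::string(tail, static_cast<std::size_t>(n));
}

// Hypothetical stand-in for the length shim: same arithmetic as Listing 14
inline std::size_t superhero_length(superhero const& hero)
{
    std::size_t n = hero.name.size() + 32; // literal characters in the format
    char buff[21];
    int const fields[] = { hero.weight, hero.strength, hero.goodness };
    for (int field : fields)
    {
        int const r = std::snprintf(buff, sizeof(buff), "%d", field);
        assert(r > 0);
        n += static_cast<std::size_t>(r);
    }
    return n;
}

// The property a test harness would check for each value it tries
inline bool lengths_agree(superhero const& hero)
{
    return superhero_to_string(hero).size() == superhero_length(hero);
}
```

A harness would call lengths_agree() over a range of values, including negative numbers and empty names, which is exactly the kind of verification that can be done without FastFormat in the picture.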
You're probably looking at the definitions with three questions:
- What is a shim string?
- What happens when it goes out of scope?
- Does that sprintf() stuff in the second function look a bit dodgy?
A shim string is a specialisation of stlsoft::basic_shim_string , which is a component specifically designed to act as the intermediary return value for string access shims. It has two important characteristics:
- It uses an stlsoft::auto_buffer internally, such that many cases of conversion can be performed without a heap allocation
- It provides an implicit conversion operator to char const* (or wchar_t const* , when specialised with wchar_t ), which means that it fulfils the requirements of the shim's ostensible return type.
The second question touches on an important part of shim lore. Because conversion shims return instances of types by value, it is important that they are either copied or are used within the expression in which the shim is invoked. If not, crashes will ensue.
Access shims are a composite of attribute and conversion shims. Where you would use an attribute shim to access the string form of a std::string, because it is already in string form, you must use a conversion shim to access the string form of, say, struct tm. This composite nature means that the most restrictive rules from each of the primary shim categories must apply. In the case of access shims, you must observe the rule on the use of return values of conversion shims [XSTLv1]:
The return value from an access shim must not be used outside the lifetime of the expression in which the shim is invoked.
Thankfully, FastFormat, Pantheios and the other libraries and programs (I know of) that make use of strings observe this rule, and all is well. A temporary is returned, its value used, and then it is destroyed, all in the right order. If you're feeling adventurous, check out the fastformat/internal/generated/helper_functions.hpp file in the distribution to see how this is done.
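The rule is easier to see in code. The sketch below uses a hypothetical temp_string class and a hypothetical conversion shim for int in a demo namespace (neither is an STLSoft component) to contrast uses that respect the rule with one that does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

namespace demo // hypothetical namespace, not part of STLSoft
{
    // Stand-in for a conversion shim's return type: it owns the characters
    class temp_string
    {
    public:
        explicit temp_string(int value)
        {
            int const n = std::snprintf(m_buffer, sizeof(m_buffer), "%d", value);
            assert(n > 0);
            (void)n;
        }
        operator char const*() const { return m_buffer; }
    private:
        char m_buffer[21];
    };

    // Hypothetical conversion shim: returns its result by value
    inline temp_string c_str_data_a(int value) { return temp_string(value); }

    // OK: the temporary lives until the end of the enclosing full expression
    inline std::size_t safe_length(int value)
    {
        return std::strlen(c_str_data_a(value));
    }

    // OK: the characters are copied before the temporary is destroyed
    inline std::string safe_copy(int value)
    {
        return std::string(c_str_data_a(value));
    }

    // NOT OK (left as a comment on purpose): the pointer outlives its owner
    //   char const* p = c_str_data_a(value); // temporary destroyed here
    //   std::puts(p);                        // undefined behaviour
}
```

Both safe forms consume the pointer inside the expression that invoked the shim; the commented-out form stores it past that point, which is where the crashes come from.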
The answer to the third question is: it depends. If I were writing for a non-localised context, I would use another STLSoft component, the integer_to_string() function suite [ I2S ] [ IC++ ] , to effect the length calculation, as they're quicker than sprintf() , and do not have a (potential) failure return to worry us. However, all that would be moot if I were writing for a localised context, since I would not consider the extra cycles saved in making manual calculations worth the risk of getting them wrong. Instead, I would take advantage of the fact that stlsoft::basic_shim_string also has an implicit conversion operator to size_t and implement c_str_len_a() as:
inline size_t c_str_len_a( superhero const& hero ) { return c_str_data_a(hero); }
This looks strange, and indeed it is. It's very rarely good design for a class to have one implicit conversion operator [ EC++ ] [ GC++ ] [ IC++ ] , never mind two! But for this special-purpose class, it is not only proper, it is also very useful: the function pair is guaranteed to be correct (in terms of the number of characters available), and because there's likely to be no memory allocation anyway, the performance impact of doing the conversion twice is not likely to be that big. Nonetheless, it is not zero, and so this length-safe double conversion is the exceptional way of doing shims, not the norm.
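A miniature version of the idea might look as follows. This is only a sketch: mini_shim_string is a hypothetical class that wraps a std::string (the real stlsoft::basic_shim_string uses an auto_buffer), the superhero definition is assumed, and std::to_string() stands in for the formatting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

namespace demo // hypothetical namespace, not part of STLSoft or FastFormat
{
    // Assumed shape of the superhero type
    struct superhero
    {
        std::string name;
        int weight;
        int strength;
        int goodness;
    };

    // Miniature stand-in for a shim string: owns the characters and
    // converts implicitly to both a pointer and a length
    class mini_shim_string
    {
    public:
        explicit mini_shim_string(std::string value) : m_value(std::move(value)) {}
        operator char const*() const { return m_value.c_str(); }
        operator std::size_t() const { return m_value.size(); }
    private:
        std::string m_value;
    };

    inline mini_shim_string c_str_data_a(superhero const& hero)
    {
        return mini_shim_string(
            hero.name
          + " {weight=" + std::to_string(hero.weight)
          + ", strength=" + std::to_string(hero.strength)
          + ", goodness=" + std::to_string(hero.goodness)
          + "}");
    }

    // Length-safe form: performs the conversion again and lets the
    // result convert itself to size_t
    inline std::size_t c_str_len_a(superhero const& hero)
    {
        return c_str_data_a(hero);
    }

    // Usage: the two shims cannot disagree, by construction
    inline void usage_example()
    {
        superhero const hero = { "Batman", 100, 9, 7 };
        assert(c_str_len_a(hero) == std::strlen(c_str_data_a(hero)));
    }
}
```

The price is visible in c_str_len_a(): the whole string is built a second time just to be measured.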
As is always the case, you can have increased performance as long as you're prepared to wear the attendant increase in effort and/or risk. FastFormat allows you to be master of your own domain.
Summary
This article has discussed customising FastFormat in terms of adding new sinks, and of adding explicit and implicit support for user-defined types. In doing so, it has shone light on several aspects of the design and implementation that support the library's superior robustness, flexibility and performance, including introducing the pedantic pointer idiom.
The next and final part of the series will look at advanced functional usages and performance customisations, how FastFormat co-exists and cooperates with other libraries (both open-source and commercial), and see some examples from its use in real-world projects.
As before, requests, comments, abuse, and offers of help are all welcome, via the project website on SourceForge: http://sourceforge.net/projects/fastformat .
References
[EC++] Effective C++, 3rd Edition, Scott Meyers, Addison-Wesley, 2005
[EVAB] 'Efficient Variable Automatic Buffers', Matthew Wilson, C/C++ User's Journal, December 2003
[FF1] 'An Introduction to FastFormat, part 1: The State of the Art', Matthew Wilson, Overload #89, February 2009; http://accu.org/index.php/journals/1539
[GC++] C++ Gotchas, Steve Dewhurst, Addison-Wesley, 2002
[IC++] Imperfect C++, Matthew Wilson, Addison-Wesley 2004; http://www.imperfectcplusplus.com/
[I2S] 'Efficient Integer To String Conversions', Matthew Wilson, C/C++ User's Journal, December 2002; http://www.ddj.com/cpp/184401596
[PAN] 'The Pantheios Logging API Library', http://www.pantheios.org/ ; to see why it's the best choice in C++ logging APIs, check out http://www.pantheios.org/performance.html#sweet-spot , which shows graphically how Pantheios can be up to two orders of magnitude faster than the rest.
[STLSOFT] http://www.stlsoft.org/
[XSTLv1] Extended STL, volume 1, Matthew Wilson, Addison-Wesley 2007; http://www.extendedstl.com/
Erratum
The feature comparison table from part 1 [FF1] had a few defects in it. (This probably resulted from a manual preparation of its strings, rather than concatenating them robustly. Ho hum!) It was entirely my fault, and no reflection on the superb skills and dedication of the Overload staff. The correct version is available at [ http://www.fastformat.org/errata/overload/introduction-to-fastformat-part-1/table4.htm ] .