C++20 has brought in many changes. Spencer Collyer gives an introduction to the new formatting library.
Much of the talk about the C++20 standard has focused on the ‘big four’ items, i.e. modules, concepts, coroutines, and ranges. This tends to obscure other improvements and additions that may have a bigger impact on general programmers who spend their working lives on application code.
One such addition is the new text formatting library, std::format. This brings a more modern approach to text formatting, akin to Python’s str.format. This article is intended as a brief introduction to the library, outlining the main items that allow you to produce formatted text.
The original proposal for the library [P0645] was written by Victor Zverovich and was based on his {fmt} library [fmtlib]. It was subsequently extended and modified by further proposals. A brief history can be found in a blog post [Zverovich19].
This article deals with the formatting of fundamental types and strings. A later article will describe how you can write formatters for your own types.
Current implementation status
At the time of writing (September 2021), support for std::format in major compilers is patchy.
The C++ library support page for GCC [GCClib] indicates that support is not yet available. A query to the libstdc++ mailing list received the response that no work on implementing it was currently known.
For Clang, work is being carried out on an implementation, and the progress can be found at [ClangFormat]. It is expected that full support will be available in Clang 14, due for release in 2022.
For MSVC, the C++ library support page [MSVClib] indicates that support is available for std::format, but with a caveat that to use it you currently need to pass the /std:c++latest flag, because of ongoing work on the standard.
Given the above, the code samples in this article were compiled using the {fmt} library, version 8.0.1. This version provides std::format-compatible output. Versions of {fmt} before 8.0.0 had some differences, especially regarding some floating-point formatting and locale handling.
To convert the listings to use the standard library when available, replace the #include <fmt/format.h> with #include <format>, and remove the using namespace fmt line. One small wrinkle is that {fmt} has its own string_view class, so on the rare occasions when we use string_view in the examples, it is always qualified with the std namespace.
Text formatting functions
This section describes the main std::format functions. These are all you need if you just want to produce formatted text.
format
The first function provided by std::format is called format. Listing 1 gives an example of how you would use it, along with lines that produce the same output using printf and iostreams. Output of this program is given in Figure 1.
    #include <fmt/format.h>
    #include <iostream>
    #include <cstdio>
    #include <string>
    using namespace std;
    using namespace fmt;
    int main() {
      int i = 10;
      double f = 1.234;
      string s = "Hello World!";
      printf("Using printf : %d %g %s\n", i, f, s.c_str());
      cout << "Using iostreams: " << i << " " << f << " " << s << "\n";
      cout << format("Using format : {} {} {}\n", i, f, s);
    }
Listing 1
    Using printf : 10 1.234 Hello World!
    Using iostreams: 10 1.234 Hello World!
    Using format : 10 1.234 Hello World!
Figure 1
As can be seen from the listing, the interface to the format function is similar to the one for printf. It takes a format string specifying the format of the data to be output, followed by a list of values to use to replace fields defined in the string. In printf the fields to be replaced are indicated by preceding the format instructions with %, while in format they are delimited by { and } characters.
Looking at the strings passed to format in the listing, it is obvious that there is nothing in the replacement fields that indicates the types of values to be output. Unlike printf, format knows what types of arguments have been passed. This is because it is defined as a template function with the following signature:
template<class... Args> string format(string_view fmt, const Args&... args);
The fmt argument is the format string specifying what the output should look like. The args arguments are the values we want to output. Note that format returns a string, so to output it you need to write the string somewhere – in the example, we simply send it to cout.
The format string syntax will be described in more detail later, but for now, it is sufficient to know that the {} items output the corresponding value from args using its default formatting.
Because format knows what the types of each argument are, if you try to use incompatible formatting with a value it will throw an exception. Listing 2 demonstrates this, where we give an integer argument, but the format type is a string one. This program produces the following output:
    #include <fmt/format.h>
    #include <iostream>
    using namespace std;
    using namespace fmt;
    int main() {
      int i = 10;
      try {
        // 's' is a string presentation type, so it is invalid for an int
        cout << format("Using format: {:s}\n", i);
      }
      catch (const format_error& fe) {
        cout << "Caught format_error: " << fe.what() << "\n";
      }
    }
Listing 2
Caught format_error: invalid type specifier
This contrasts with printf, which in all likelihood will at best output garbage with no indication why, and at worst can crash your program.
format_to and format_to_n
The format function always returns a new string on each call. This is a problem if you want your output to be built up in several stages, as you would have to store each string produced and then stitch them all together at the end when outputting them.
To avoid this, you can use the format_to function. This appends the formatted text to the given output. The signature for this function is as follows:
template<class Out, class... Args> Out format_to(Out out, string_view fmt, const Args&... args);
The first parameter, out, is an output iterator, which has to model output_iterator<const char&>. The formatted output is sent to this output iterator. The function returns the iterator past the end of the written text.
Listing 3 shows how you might use format_to to output all the values in a vector. The output is a back_insert_iterator<string>, which matches the constraint, and appends the formatted values to the end of the string. Output from this program is in Figure 2.
    #include <fmt/format.h>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    using namespace std;
    using namespace fmt;
    string VecOut(const vector<int>& v) {
      string retval;
      back_insert_iterator<string> out(retval);
      for (const auto& i: v) {
        // format_to returns the iterator past the text just written
        out = format_to(out, "{} ", i);
      }
      return retval;
    }
    int main() {
      vector<int> v1{2, 3, 5};
      cout << VecOut(v1) << "\n";
      vector<int> v2{1, 2, 4, 8, 16, 32};
      cout << VecOut(v2) << "\n";
      vector<int> v3{1, 4, 9, 16, 25, 36, 49, 64, 81, 100};
      cout << VecOut(v3) << "\n";
    }
Listing 3
    2 3 5
    1 2 4 8 16 32
    1 4 9 16 25 36 49 64 81 100
Figure 2
If you also need to limit the number of characters written, use the format_to_n function. The signature for this function is similar to that for format_to, as follows:
template<class Out, class... Args> format_to_n_result<Out> format_to_n(Out out, iter_difference_t<Out> n, string_view fmt, const Args&... args);
This takes the maximum number of characters to write in parameter n. The return value of this function is a format_to_n_result structure, which contains the following members:
- out – Holds the output iterator past the text written.
- size – Holds the size that the formatted string would have had, before any potential truncation to a length of n. This can be used to detect if the output has been truncated, by checking if the value is greater than the n passed in.
The VecOut function in Listing 4 is similar to the one in Listing 3, but this time it limits the number of characters written for each value to 5. As can be seen from the output in Figure 3, the third value in v2 is truncated from 1000000 to 10000 – probably something you’d only want to do if you were putting together a toy program to illustrate how the format_to_n function works.
    #include <fmt/format.h>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    using namespace std;
    using namespace fmt;
    string VecOut(const vector<int>& v) {
      string retval;
      back_insert_iterator<string> out(retval);
      for (const auto& i: v) {
        // At most 5 characters are written; res.size holds the untruncated length
        auto res = format_to_n(out, 5, "{}", i);
        retval += ' ';
      }
      return retval;
    }
    int main() {
      vector<int> v1{1, 100, 10000};
      cout << VecOut(v1) << "\n";
      vector<int> v2{1, 1000, 1000000};
      cout << VecOut(v2) << "\n";
    }
Listing 4
    1 100 10000
    1 1000 10000
Figure 3
formatted_size
If you need to know how many characters would be output for a particular format string and set of arguments, you can call formatted_size. This can be used if you want to create a buffer of the right size to accept the output. The function has the following signature:
template<class... Args> size_t formatted_size(string_view fmt, const Args&... args);
The size_t value returned gives the length that the formatted string would have with the given arguments. If you are using this to create a buffer to write a C-style string to, remember that the value returned would not include any terminating '\0' character unless you include it in the format string.
Listing 5 illustrates the use of formatted_size. The output is in Figure 4. It may appear that the length output is incorrect but remember that the terminating newline character is included in the format string.
    #include <fmt/format.h>
    #include <iostream>
    #include <cstdio>
    #include <string>
    using namespace std;
    using namespace fmt;
    int main() {
      int i = 10;
      double f = 1.234;
      string s = "Hello World!";
      string fmt_str{"{} {} {}\n"};
      cout << "Length of formatted data: "
           << formatted_size(fmt_str, i, f, s) << "\n";
      cout << format(fmt_str, i, f, s);
      cout << "123456789|123456789|12\n";
    }
Listing 5
    Length of formatted data: 22
    10 1.234 Hello World!
    123456789|123456789|12
Figure 4
Wide-character support
The functions described above all deal with classes (string, string_view, the output iterators) that use char to represent the characters being handled. If you need to use wchar_t characters, there is an overload for each of the functions which takes or returns the appropriate class using wchar_t. For instance, the format function that uses wchar_t has the following signature:
template<class... Args> wstring format(wstring_view fmt, const Args&... args);
Note that as of the C++20 standard, std::format does not handle any of the charN_t types (e.g. char16_t, char32_t).
Error reporting
Any errors detected by std::format are reported by throwing objects of the class format_error. This is derived from the std::runtime_error class so it has a what function that returns the error string passed when the exception is created. Listing 2, presented previously, shows an example of catching a format_error.
Format string
The format strings used by the std::format functions consist of escape sequences, replacement fields, and other text. They are based on the style used by Python’s str.format, for anyone familiar with that. A similar style is also used in the .NET family of languages, and in Rust.
Escape sequences
The two escape sequences recognised are {{ and }}, which are replaced by { and } respectively. You would use them if you need a literal { or } in the output.
Obviously this is distinct from the normal string escapes that the compiler requires if you want to insert special characters in the string, such as \n. By the time the std::format functions see the string, these will have already been replaced by the compiler.
Replacement fields
A replacement field controls how the values passed to the std::format function are formatted. A replacement field has the following general format:
'{'[arg-id][':'format-spec]'}'
where:
- arg-id
If given, this specifies the index of the argument in the value list that is output by the replacement field. Argument indexes start at 0 for the first argument after the format string.
- format-spec
Gives the format specification to be applied to the value being handled. Note that if you give a format-spec, you have to precede it with a :, even if you do not give an arg-id.
Later sections will give more details on arg-ids and format-specs. Examples of valid replacement fields are {}, {0}, {:10d}, {1:s}.
Other text
Any text that isn’t part of a replacement field or an escape sequence is output literally as it appears in the format string.
Argument IDs
The first item in a replacement field is an optional arg-id. This specifies the index of the value in the argument list that you want to use for that replacement field. Argument index values start at 0.
If not specified, the arguments are simply used in the order that they appear in the function call. This is known as automatic argument numbering. For instance, in Listing 1 the format call has no arg-ids, so the arguments are just used in the order i, f, s.
A given format string cannot have a mix of manual and automatic argument numbering. If you use an arg-id for one replacement field you have to use arg-ids for all replacement fields in the format string.
A simple use for this argument numbering can be seen in Listing 6, where it is used to output the same value in three different bases, along with lines that do the same thing for both printf and iostreams. The output from this is in Figure 5.
    #include <fmt/format.h>
    #include <cstdio>
    #include <iostream>
    #include <string>
    using namespace std;
    using namespace fmt;
    int main() {
      int i = 10;
      printf("%d %o %x\n", i, i, i);
      cout << i << " " << std::oct << i << " " << std::hex << i
           << std::dec << "\n";
      cout << format("{0} {0:o} {0:x}\n", i);
    }
Listing 6
    10 12 a
    10 12 a
    10 12 a
Figure 5
Another important use for this facility will be described later in the section ‘Internationalization’.
Note that the format string does not have to specify arg-ids for all the arguments passed to the function. Any that are not given will simply be ignored. An example of this is shown in Listing 7, with the output in Figure 6.
    #include <fmt/format.h>
    #include <iostream>
    #include <string>
    using namespace std;
    using namespace fmt;
    void write_success(int warnings, int errors) {
      string fmtspec = "Compilation ";
      if (errors == 0) {
        fmtspec += "succeeded";
        if (warnings != 0) {
          fmtspec += " with {0} Warning(s)";
        }
      }
      else {
        fmtspec += "failed with {1} Error(s)";
        if (warnings != 0) {
          fmtspec += " and {0} Warning(s)";
        }
      }
      fmtspec += "\n";
      // Arguments not referenced by the format string are ignored
      cout << format(fmtspec, warnings, errors);
    }
    int main() {
      write_success(0, 0);
      write_success(10, 0);
      write_success(0, 10);
      write_success(10, 10);
    }
Listing 7
    Compilation succeeded
    Compilation succeeded with 10 Warning(s)
    Compilation failed with 10 Error(s)
    Compilation failed with 10 Error(s) and 10 Warning(s)
Figure 6
Format specifications
The standard format-spec has the following general format¹:
[[fill]align][sign][#][0][width][prec][L][type]
There should be no spaces between the items in the format-spec. Also, every item is optional, except that if fill is specified, it must be immediately followed by align. If align is given, any '0' will be ignored.
Anyone familiar with printf format strings will see that std::format uses a very similar style. However, there are some significant differences, so the following sections describe each item in the above format in detail, except for the 'L' character, which will be left until the section on internationalization.
Note that like printf format specifiers, but unlike many iostreams manipulators, the values given in a format-spec only apply to the current field and don’t affect any later fields.
The type option is called the presentation type. The valid values for each fundamental type are given below, along with a description of what effect they have. Remember that, unlike printf, std::format knows the type of value being output, so if you just want the default format for that value, you can omit the type option.
Text alignment and fill
The align value is a single character that gives the alignment to use for the current field. It can have any of the values <, >, or ^. The meaning of these is as follows:
- < – The value is left-justified in the field width. This is the default for string fields.
- > – The value is right-justified in the field width. This is the default for numeric fields.
- ^ – The value is centred in the field width. Any padding will be distributed evenly on the left and right sides of the value. If an odd number of padding characters is needed, the extra one will always be on the right.
If the first character in the format-spec is immediately followed by one of the alignment characters, that first character is treated as the fill character to use if the field needs padding. A fill character must be followed by a valid align character. You cannot use either of the characters { or } as fill characters.
Note: The fill and align values only make sense if you also specify a width value, although it is not an error to specify them without one.
Listing 8 shows the effect of the align and fill values. The output is in Figure 7.
    #include <fmt/format.h>
    #include <iostream>
    #include <string>
    using namespace std;
    using namespace fmt;
    int main() {
      string str;
      cout << "No fill character specified:\n";
      str = format("|{:^10}| |{:<10}| |{:^10}| | "
        "{:>10}|\n", "default", "left", "centre", "right");
      cout << str;
      string fmtstr = "|{0:10}| |{0:<10}| | "
        "{0:^10}| |{0:>10}|\n";
      str = format(fmtstr, 123);
      cout << str;
      str = format(fmtstr, 1.23);
      cout << str;
      str = format(fmtstr, "abcde");
      cout << str;
      cout << "\nFill character set to '*'\n";
      str = format("|{:*<10}| |{:*^10}| |{:*>10}|\n",
        "left", "centre", "right");
      cout << str;
      fmtstr = "|{0:*<10}| |{0:*^10}| |{0:*>10}|\n";
      str = format(fmtstr, 123);
      cout << str;
      str = format(fmtstr, 1.23);
      cout << str;
      str = format(fmtstr, "abcde");
      cout << str;
    }
Listing 8
    No fill character specified:
    | default | |left | | centre | | right|
    | 123| |123 | | 123 | | 123|
    | 1.23| |1.23 | | 1.23 | | 1.23|
    |abcde | |abcde | | abcde | | abcde|

    Fill character set to '*'
    |left******| |**centre**| |*****right|
    |123*******| |***123****| |*******123|
    |1.23******| |***1.23***| |******1.23|
    |abcde*****| |**abcde***| |*****abcde|
Figure 7
Sign, #, and 0
The sign value specifies how the sign for an arithmetic type is to be output. It can take the following values:
- + – A sign should always be output for both negative and non-negative values.
- - – A sign should only be output for negative values. This is the default.
- (space) – A sign should be output for negative values, and a space for non-negative values.
The # character indicates that the alternative form should be used for output of the given value. The meaning of this is described under the appropriate section below.
The 0 character is only valid when also specifying a width value. If present it pads the field with 0 characters after any sign character and/or base indicator. If an align value is present, any 0 character is ignored.
Note that the sign, #, and 0 values are only valid for arithmetic types, and for bool or char (wchar_t in wide string functions) when an integer presentation type is specified for them (see later).
Listing 9 shows the effect of the sign and 0 values. Output is shown in Figure 8. The effect of the # value will be shown in examples in the arithmetic type sections.
    #include <fmt/format.h>
    #include <iostream>
    #include <string>
    using namespace std;
    using namespace fmt;
    int main() {
      int ineg = -10;
      int ipos = 5;
      double fneg = -1.2;
      double fpos = 2.34;
      cout << format(
        "With sign '+' :|{:+}|{:+}|{:+}|{:+}|\n",
        ineg, fneg, ipos, fpos);
      cout << format(
        "With sign '-' :|{:-}|{:-}|{:-}|{:-}|\n",
        ineg, fneg, ipos, fpos);
      cout << format(
        "With sign ' ' :|{: }|{: }|{: }|{: }|\n",
        ineg, fneg, ipos, fpos);
      cout << format("With sign '+' and '0' "
        ":|{:+06}|{:+06}|{:+06}|{:+06}|\n",
        ineg, fneg, ipos, fpos);
      cout << format("With sign '-' and '0' "
        ":|{:-06}|{:-06}|{:-06}|{:-06}|\n",
        ineg, fneg, ipos, fpos);
      cout << format("With sign ' ' and '0' "
        ":|{: 06}|{: 06}|{: 06}|{: 06}|\n",
        ineg, fneg, ipos, fpos);
    }
Listing 9
    With sign '+' :|-10|-1.2|+5|+2.34|
    With sign '-' :|-10|-1.2|5|2.34|
    With sign ' ' :|-10|-1.2| 5| 2.34|
    With sign '+' and '0' :|-00010|-001.2|+00005|+02.34|
    With sign '-' and '0' :|-00010|-001.2|000005|002.34|
    With sign ' ' and '0' :|-00010|-001.2| 00005| 02.34|
Figure 8
Width and precision
The width value can be used to give the minimum width for a field. If the output value needs more characters than the specified width, it will be displayed in full, not truncated to the width. If you need the value to be truncated to a certain width you can use the format_to_n function to output the value, with the guarantee that only the given number of characters at most will be written.
The value given for the width field depends on whether you are hard-coding the width in the string, or need it to be specified dynamically at runtime. If it is to be hard-coded, it should be given as a literal positive decimal number. If you need to specify the width dynamically at runtime, you use a nested replacement field, which looks like {} or {n}.
Listing 10 demonstrates the use of the width value, using both literal values and nested replacement fields with automatic and manual numbering. As shown in Figure 9, if the value is wider than the given width, the width value is ignored and the field is wide enough to display the full value.
    #include <fmt/format.h>
    #include <iostream>
    using namespace std;
    using namespace fmt;
    int main() {
      int v1 = 10;
      int v2 = 10'000'000;
      cout << format("Specified width: |{0:4}| "
        "|{0:12}| |{1:4}| |{1:12}|\n", v1, v2);
      // Automatic numbering: each width follows the value it applies to
      cout << format("Variable width, automatic "
        "numbering: |{:{}}| |{:{}}|\n", v1, 5, v2, 12);
      for (int len = 7; len < 11; ++len) {
        cout << format("Variable width={0:>2}, manual "
          "numbering: |{1:{0}}| |{2:{0}}|\n", len, v1, v2);
      }
    }
Listing 10
    Specified width: |  10| |          10| |10000000| |    10000000|
    Variable width, automatic numbering: |   10| |    10000000|
    Variable width= 7, manual numbering: |     10| |10000000|
    Variable width= 8, manual numbering: |      10| |10000000|
    Variable width= 9, manual numbering: |       10| | 10000000|
    Variable width=10, manual numbering: |        10| |  10000000|
Figure 9
The prec value is formed of a decimal point followed by the precision, which like the width field can be a literal positive decimal number or a nested replacement field.
The prec value is only valid for floating-point or string fields. It has different meanings for the two types and will be described in the relevant section below.
If using a nested replacement field for either width or prec, you must use the same numbering type as for the arg-ids, e.g. if using manual numbering for arg-ids you must also use it for nested replacement fields.
If you use automatic numbering, the arg-ids are assigned based on the count of { characters up to that point, so the width and/or prec values come after the value they apply to. This contrasts with printf, where if using the * to indicate the value is read from the argument list, the values for width and prec appear before the value they apply to.
Integer presentation types
The available integer presentation types are given below. Where relevant, the effect of selecting the alternate form using the # flag is also listed. Note that any sign character will always precede the prefix added in alternate form.
- d – Decimal format. This is the default if no presentation type is given.
- b, B – Binary format. For alternate form, the value is prefixed with 0b for b, and 0B for B.
- o – Octal format. For alternate form, the value is prefixed with 0 as long as it is non-zero. For example, 7 outputs as 07, but 0 outputs as 0.
- x, X – Hexadecimal format. The case of digits above 9 matches the case of the presentation type. For alternate form, the value is prefixed with 0x for x, or 0X for X.
- c – Outputs the character with the code value given by the integer. A format_error will be thrown if the value is not a valid code value for the character type of the format string.
Listing 11 gives examples of outputting using all the presentation types, with and without alternate form where that is relevant. The output is shown in Figure 10.
    #include <fmt/format.h>
    #include <iostream>
    using namespace std;
    using namespace fmt;
    int main() {
      int i1 = -10;
      int i2 = 10;
      cout << format("Default: {} {} {}\n", i1, 0, i2);
      cout << format("Decimal type: {:d} {:d} {:d}\n",
        i1, 0, i2);
      cout << format("Binary type: {0:b} {0:B} {1:b} "
        "{1:B} {2:b} {2:B}\n", i1, 0, i2);
      cout << format("Binary '#' : {0:#b} {0:#B} "
        "{1:#b} {1:#B} {2:#b} {2:#B}\n", i1, 0, i2);
      cout << format("Octal type: {0:o} {1:o} "
        "{2:o}\n", i1, 0, i2);
      cout << format("Octal '#' : {0:#o} {1:#o} "
        "{2:#o}\n", i1, 0, i2);
      cout << format("Hex type: {0:x} {0:X} {1:x} "
        "{1:X} {2:x} {2:X}\n", i1, 0, i2);
      cout << format("Hex '#' : {0:#x} {0:#X} {1:#x} "
        "{1:#X} {2:#x} {2:#X}\n", i1, 0, i2);
      cout << format("Char type: |{:c}| |{:c}|\n",
        32, 126);
    }
Listing 11
    Default: -10 0 10
    Decimal type: -10 0 10
    Binary type: -1010 -1010 0 0 1010 1010
    Binary '#' : -0b1010 -0B1010 0b0 0B0 0b1010 0B1010
    Octal type: -12 0 12
    Octal '#' : -012 0 012
    Hex type: -a -A 0 0 a A
    Hex '#' : -0xa -0XA 0x0 0X0 0xa 0XA
    Char type: | | |~|
Figure 10
Floating-point presentation types
The available floating-point presentation types are given below.
- e – Outputs the value in scientific notation. If no prec value is given, it defaults to 6.
- f – Outputs the value in fixed-point notation. If no prec value is given, it defaults to 6.
- g – Outputs the value in general notation, which picks between e and f form. The rules are slightly arcane but are the same as used for g when used with printf. If no prec value is given, it defaults to 6.
- a – Outputs the value using scientific notation, but with the number represented in hexadecimal. Because e is a valid hex digit, the exponent is indicated with a p character.
If no presentation type is given, the output depends on whether a prec value is given or not. If prec is present the output is the same as using g. If prec is not present, the output is in either fixed-point or scientific notation, depending on which gives the shortest output that still guarantees that reading the value in again will give the same value as was written out.
If the floating-point value represents infinity or NaN, the values 'inf' and 'nan' will be output respectively. They will be preceded by the appropriate sign character if required. Specifying # does not cause a base prefix to be output for infinity or NaN.
The e, f, g, and a presentation types have equivalent E, F, G, and A types which perform the same, but output any alphabetic characters in uppercase rather than lowercase. For the f and F types this only affects output of infinity and NaN.
If the # character is used to select the alternate form, it causes a decimal point character to always be output, even if there are no digits after it. This does not apply to infinity and NaN values.
Listing 12 gives examples of outputting using the lowercase presentation types, and how prec and alternate form affect the output. Output is given in Figure 11.
    #include <fmt/format.h>
    #include <iostream>
    using namespace std;
    using namespace fmt;
    int main() {
      double small = 123.4567;
      double nodps = 34567.;
      double large = 1e10+12.345;
      double huge = 1e20;
      cout << "Default precision:\n";
      cout << format("Default: {} {} {} {} {}\n",
        0.0, small, nodps, large, huge);
      cout << format("Type f : {:f} {:f} {:f} {:f} "
        "{:f}\n", 0.0, small, nodps, large, huge);
      cout << format("Type e : {:e} {:e} {:e} {:e} "
        "{:e}\n", 0.0, small, nodps, large, huge);
      cout << format("Type g : {:g} {:g} {:g} {:g} "
        "{:g}\n", 0.0, small, nodps, large, huge);
      cout << format("Type a : {:a} {:a} {:a} {:a} "
        "{:a}\n", 0.0, small, nodps, large, huge);
      cout << "\nAlternate form:\n";
      cout << format("Default: {:#} {:#} {:#} {:#} "
        "{:#}\n", 0.0, small, nodps, large, huge);
      cout << format("Type f : {:#f} {:#f} {:#f} "
        "{:#f} {:#f}\n", 0.0, small, nodps, large, huge);
      cout << format("Type e : {:#e} {:#e} {:#e} "
        "{:#e} {:#e}\n", 0.0, small, nodps, large, huge);
      cout << format("Type g : {:#g} {:#g} {:#g} "
        "{:#g} {:#g}\n", 0.0, small, nodps, large, huge);
      cout << format("Type a : {:#a} {:#a} {:#a} "
        "{:#a} {:#a}\n", 0.0, small, nodps, large, huge);
      cout << "\nPrecision=3:\n";
      cout << format("Default: {:.3} {:.3} {:.3} "
        "{:.3} {:.3}\n", 0.0, small, nodps, large, huge);
      cout << format("Type f : {:.3f} {:.3f} {:.3f} "
        "{:.3f} {:.3f}\n", 0.0, small, nodps, large, huge);
      cout << format("Type e : {:.3e} {:.3e} {:.3e} "
        "{:.3e} {:.3e}\n", 0.0, small, nodps, large, huge);
      cout << format("Type g : {:.3g} {:.3g} {:.3g} "
        "{:.3g} {:.3g}\n", 0.0, small, nodps, large, huge);
      cout << format("Type a : {:.3a} {:.3a} {:.3a} "
        "{:.3a} {:.3a}\n", 0.0, small, nodps, large, huge);
      cout << "\nPrecision=3, alternate form:\n";
      cout << format("Default: {:#.3} {:#.3} {:#.3} "
        "{:#.3} {:#.3}\n", 0.0, small, nodps, large, huge);
      cout << format("Type f : {:#.3f} {:#.3f} "
        "{:#.3f} {:#.3f} {:#.3f}\n",
        0.0, small, nodps, large, huge);
      cout << format("Type e : {:#.3e} {:#.3e} "
        "{:#.3e} {:#.3e} {:#.3e}\n",
        0.0, small, nodps, large, huge);
      cout << format("Type g : {:#.3g} {:#.3g} "
        "{:#.3g} {:#.3g} {:#.3g}\n",
        0.0, small, nodps, large, huge);
      cout << format("Type a : {:#.3a} {:#.3a} "
        "{:#.3a} {:#.3a} {:#.3a}\n",
        0.0, small, nodps, large, huge);
    }
Listing 12
    Default precision:
    Default: 0 123.4567 34567 10000000012.345 1e+20
    Type f : 0.000000 123.456700 34567.000000 10000000012.344999 100000000000000000000.000000
    Type e : 0.000000e+00 1.234567e+02 3.456700e+04 1.000000e+10 1.000000e+20
    Type g : 0 123.457 34567 1e+10 1e+20
    Type a : 0x0p+0 0x1.edd3a92a30553p+6 0x1.0e0ep+15 0x1.2a05f2062c28fp+33 0x1.5af1d78b58c4p+66

    Alternate form:
    Default: 0.0 123.4567 34567.0 10000000012.345 1.e+20
    Type f : 0.000000 123.456700 34567.000000 10000000012.344999 100000000000000000000.000000
    Type e : 0.000000e+00 1.234567e+02 3.456700e+04 1.000000e+10 1.000000e+20
    Type g : 0.00000 123.457 34567.0 1.00000e+10 1.00000e+20
    Type a : 0x0.p+0 0x1.edd3a92a30553p+6 0x1.0e0ep+15 0x1.2a05f2062c28fp+33 0x1.5af1d78b58c4p+66

    Precision=3:
    Default: 0 123 3.46e+04 1e+10 1e+20
    Type f : 0.000 123.457 34567.000 10000000012.345 100000000000000000000.000
    Type e : 0.000e+00 1.235e+02 3.457e+04 1.000e+10 1.000e+20
    Type g : 0 123 3.46e+04 1e+10 1e+20
    Type a : 0x0.000p+0 0x1.eddp+6 0x1.0e1p+15 0x1.2a0p+33 0x1.5afp+66

    Precision=3, alternate form:
    Default: 0.00 123.0 3.46e+04 1.00e+10 1.00e+20
    Type f : 0.000 123.457 34567.000 10000000012.345 100000000000000000000.000
    Type e : 0.000e+00 1.235e+02 3.457e+04 1.000e+10 1.000e+20
    Type g : 0.00 123.0 3.46e+04 1.00e+10 1.00e+20
    Type a : 0x0.000p+0 0x1.eddp+6 0x1.0e1p+15 0x1.2a0p+33 0x1.5afp+66
Figure 11
Listing 13 gives examples of outputting infinity and NaN values. Because all presentation types give the same output for infinity and NaN, we only give the output for types f and F. Output is given in Figure 12.
#include <fmt/format.h>
#include <limits>
#include <iostream>

using namespace std;
using namespace fmt;

int main()
{
    auto pinf = std::numeric_limits<double>::infinity();
    auto ninf = -std::numeric_limits<double>::infinity();
    auto pnan = std::numeric_limits<double>::quiet_NaN();
    auto nnan = -std::numeric_limits<double>::quiet_NaN();
    cout << "Default:\n";
    cout << format("Default: {} {} {} {}\n", ninf, pinf, nnan, pnan);
    cout << format("Type f : {:f} {:f} {:f} {:f}\n", ninf, pinf, nnan, pnan);
    cout << format("Type F : {:F} {:F} {:F} {:F}\n", ninf, pinf, nnan, pnan);
    cout << "\nAlternate form:\n";
    cout << format("Default: {:#} {:#} {:#} {:#}\n", ninf, pinf, nnan, pnan);
    cout << format("Type f : {:#f} {:#f} {:#f} "
        "{:#f}\n", ninf, pinf, nnan, pnan);
    cout << format("Type F : {:#F} {:#F} {:#F} "
        "{:#F}\n", ninf, pinf, nnan, pnan);
    cout << "\nWidth=7:\n";
    cout << format("Default: |{:7}| |{:7}| |{:7}| "
        "|{:7}|\n", ninf, pinf, nnan, pnan);
    cout << "\nWidth=7, using '0':\n";
    cout << format("Default: |{:07}| |{:07}| "
        "|{:07}| |{:07}|\n", ninf, pinf, nnan, pnan);
}
Listing 13
Default:
Default: -inf inf -nan nan
Type f : -inf inf -nan nan
Type F : -INF INF -NAN NAN

Alternate form:
Default: -inf inf -nan nan
Type f : -inf inf -nan nan
Type F : -INF INF -NAN NAN

Width=7:
Default: |-inf   | |inf    | |-nan   | |nan    |

Width=7, using '0':
Default: |   -inf| |    inf| |   -nan| |    nan|
Figure 12
Character presentation types
The default presentation type for char
and wchar_t
is c
. It simply copies the character to the output.
You can also use the integer presentation types b
, B
, d
, o
, x
, and X
. They write the integer value of the character code to the output, and take account of the alternate form flag #
if relevant.
Listing 14 gives examples of outputting characters. Output is given in Figure 13.
#include <fmt/format.h>
#include <iostream>

using namespace std;
using namespace fmt;

int main()
{
    char c = 'a';
    cout << format("Default: {}\n", c);
    cout << format("Char type: {:c}\n", c);
    cout << format("Decimal type: {:d}\n", c);
    cout << format("Binary type: {0:b} {0:B} "
        "{0:#b} {0:#B}\n", c);
    cout << format("Octal type: {0:o} {0:#o}\n", c);
    cout << format("Hex type: {0:x} {0:X} {0:#x} "
        "{0:#X}\n", c);
}
Listing 14
Default: a
Char type: a
Decimal type: 97
Binary type: 1100001 1100001 0b1100001 0B1100001
Octal type: 141 0141
Hex type: 61 61 0x61 0X61
Figure 13
String presentation types
String formatting works for std::string
and std::string_view
as well as the various char*
types.
The only presentation type for strings is s
, which is also the default if not given. The default alignment for string fields is left-justified.
If a precision value is specified with prec, and it is smaller than the string length, it causes only the first prec characters from the string to be output. This has the effect of reducing the effective length of the string when checking against any width parameter.
Listing 15 shows examples of outputting various types of string, as well as the interaction between width and prec values. The output is shown in Figure 14.
#include <fmt/format.h>
#include <iostream>
#include <string>

using namespace std;
using namespace fmt;

int main()
{
    string s = "Hello World!";
    const char* cp = "Testing. Testing.";
    const char* cp2 = "Goodbye World!";
    std::string_view sv = cp2;
    cout << format("Default: {} {} {}\n", s, cp, sv);
    cout << format("Type : {:s} {:s} {:s}\n", s, cp, sv);
    cout << "\nUsing width and precision:\n";
    cout << format("With width: w=7:|{0:7s}| "
        "w=20:|{0:20s}|\n", s);
    cout << format("With precision: p=4:|{0:.4s}| "
        "p=15:|{0:.15s}|\n", s);
    cout << format("With width and precision: "
        "w=7,p=4:|{0:7.4s}| w=20,p=4:|{0:20.4s}|\n", s);
    cout << format("With width, precision, align: "
        "|{0:<8.4s}| |{0:^8.4s}| |{0:>8.4s}|\n", s);
}
Listing 15
Default: Hello World! Testing. Testing. Goodbye World!
Type : Hello World! Testing. Testing. Goodbye World!

Using width and precision:
With width: w=7:|Hello World!| w=20:|Hello World!        |
With precision: p=4:|Hell| p=15:|Hello World!|
With width and precision: w=7,p=4:|Hell   | w=20,p=4:|Hell                |
With width, precision, align: |Hell    | |  Hell  | |    Hell|
Figure 14
Bool presentation types
The default bool
presentation type is s
, which outputs true or false.
You can also use the integer presentation types b
, B
, d
, o
, x
, or X
. These behave like the same types for integers, treating false
as 0 and true
as 1.
You can also use c
as a bool
presentation type. It will output the characters with values 0x1 and 0x0 for true
and false
, which may not be what you expect (or particularly useful).
Listing 16 is an example of formatting bool
s, with the output in Figure 15.
#include <fmt/format.h>
#include <iostream>

using namespace std;
using namespace fmt;

int main()
{
    cout << " b B d o x X\n";
    cout << format("{0} {0:b} {0:B} {0:d} {0:o} {0:x} {0:X}\n", true);
    cout << format("{0} {0:b} {0:B} {0:d} {0:o} "
        "{0:x} {0:X}\n", false);
    cout << "\nUsing alternate form\n";
    cout << "#b #B #d #o #x #X\n";
    cout << format("{0:#b} {0:#B} {0:#d} {0:#o} "
        "{0:#x} {0:#X}\n", true);
    cout << format("{0:#b} {0:#B} {0:#d} {0:#o} "
        "{0:#x} {0:#X}\n", false);
    cout << "\nUsing type=s\n";
    cout << format("|{:s}| |{:s}|\n", false, true);
    cout << "\nUsing type=c\n";
    cout << format("|{:c}| |{:c}|\n", false, true);
}
Listing 16
 b B d o x X
true 1 1 1 1 1 1
false 0 0 0 0 0 0

Using alternate form
#b #B #d #o #x #X
0b1 0B1 1 01 0x1 0X1
0b0 0B0 0 0 0x0 0X0

Using type=s
|false| |true|

Using type=c
| | ||
Figure 15
The line that uses the c
presentation type appears to print nothing, but if you send the output to a file and then examine it using a program that displays the actual bytes in the file, you will see that the characters with codes 0x0 and 0x1 have been output. For instance, you can use od -c
on a Unix or Linux box.
Pointer presentation types
The only type value available for pointers is 'p
'. It can be omitted and std::format
will deduce the type from the argument. Note that if the pointer is to one of the char
types, it will be treated as a string, not as a pointer. If you want to output the actual pointer value you need to cast it to a void*
.
Pointer values are output in hexadecimal, with the prefix '0x' added, and digits a
to f
in lowercase. The output is right-justified by default, just like arithmetic types.
Note that the C++20 standard only provides pointer formatters for void*, const void*, and std::nullptr_t, which in practice means you need to cast any other pointer to void* before passing it to std::format. Listing 17 shows examples of pointer formatting, and does exactly that. Sample output is in Figure 16.
#include <fmt/format.h>
#include <iostream>
#include <memory>

using namespace std;
using namespace fmt;

int main()
{
    int* pi = new int;
    void* vpi = static_cast<void*>(pi);
    double* pd = new double;
    void* vpd = static_cast<void*>(pd);
    void* pnull = nullptr;
    cout << format("Default: |{}| |{}| |{}|\n", vpi, vpd, pnull);
    cout << format("Type : |{:p}| |{:p}| "
        "|{:p}|\n", vpi, vpd, pnull);
    cout << format("Width : |{:20p}| |{:20p}| "
        "|{:20p}|\n", vpi, vpd, pnull);
}
Listing 17
Default: |0x55595cbc8eb0| |0x55595cbc8ed0| |0x0|
Type : |0x55595cbc8eb0| |0x55595cbc8ed0| |0x0|
Width : |      0x55595cbc8eb0| |      0x55595cbc8ed0| |                 0x0|
Figure 16
Internationalization
Internationalization, or i18n as it is commonly written, is the process of writing a program so its output can be used natively by people speaking different languages and with different conventions for writing things like numbers and dates.
By default, std::format
takes no account of the current locale when outputting values. The reasons for this are described in the original proposal [P0645] in the section ‘Locale support’. In contrast, iostreams takes account of the locale on all output, even if it is set to the default.
Format strings
The ability to use manual argument numbering in format strings to reorder arguments is useful when using translated output. Allowing arguments to appear in a different order in the output can make for grammatically correct output in a given language.
Rather than hard-coding the format strings in your code, you could use a mechanism that provides the correct translated string to use, with manual argument numbers inserted. Libraries exist that make it easier to use such translated strings – for instance, GNU’s gettext
library [gettext] takes a string and looks up the translated version of it, as long as the translations have been provided in the correct format.
Locale-aware formatting
In a format-spec, the L
modifier can be used to specify that the field should be output in a locale-aware fashion. Without this modifier, the locale is ignored. You can use this for output of numeric values, and also bool
when doing string format output. When used with bool
it changes the ‘true’ and ‘false’ values to the appropriate numpunct::truename
and numpunct::falsename
instead.
The various output functions described earlier use the global locale when doing locale-aware formatting. You can change the global locale with a function call like the following:
std::locale::global(std::locale("de_DE"));
This will set the global locale to the one for Germany, de_DE
.
If you only want to change the locale for a single function call, there are overloads of the various output functions that take the locale as their first parameter. For instance, the format
function has the following overload:
template<class... Args> string format(const std::locale& loc, string_view fmt, const Args&... args);
Listing 18 shows examples of using locale-aware output, using both global locales and function-specific ones. Output from this program is shown in Figure 17.
#include <fmt/format.h>
#include <iostream>
#include <cstdio>
#include <string>

using namespace std;
using namespace fmt;

int main()
{
    double dval = 1.5;
    int ival = 1'000'000;
    cout << "Using default locale:\n";
    cout << format("format : {:.2f} {:12d}\n", dval, ival);
    cout << format("format+L: {:.2Lf} {:12Ld}\n", dval, ival);
    cout << "\nUsing global locale de_DE:\n";
    locale::global(locale("de_DE"));
    cout << format("format : {:.2f} {:12d}\n", dval, ival);
    cout << format("format+L: {:.2Lf} {:12Ld}\n", dval, ival);
    cout << "\nUsing function-specific locale:\n";
    cout << format(locale("en_US"),
        "en_US: {0:.2f} {0:.2Lf} {1:12d} {1:12Ld}\n", dval, ival);
    cout << format(locale("de_DE"),
        "de_DE: {0:.2f} {0:.2Lf} {1:12d} {1:12Ld}\n", dval, ival);
}
Listing 18
Using default locale:
format : 1.50      1000000
format+L: 1.50      1000000

Using global locale de_DE:
format : 1.50      1000000
format+L: 1,50    1.000.000

Using function-specific locale:
en_US: 1.50 1.50      1000000    1,000,000
de_DE: 1.50 1,50      1000000    1.000.000
Figure 17
Avoiding code bloat
The formatting functions are all template functions. This means that each time one of the functions is used with a new set of argument types, a new template instantiation will be generated. This could quickly lead to unacceptable code bloat if these functions did the actual work of generating the output values.
To avoid this problem, the format
and format_to
functions call helper functions to do the actual formatting work. These helper functions have names formed by adding 'v
' to the start of the name of the calling function. For instance, format
calls vformat
, which has the following signature:
string vformat(string_view fmt, format_args args);
The format_args
argument is a container of type-erased values, the details of which are probably only interesting to library authors. They are certainly outside the scope of this article.
There will only be a single instantiation of the vformat
function, and one vformat_to
instantiation for each type of output iterator.
The actual work of doing the formatting is done by these helper functions, so the amount of code generated for each call to format
or format_to
is considerably reduced.
These functions are part of the std::format
public API, so you can use them yourself if you want to. Listing 19, which is based on code presented in [P0645], shows one such use, with a logging function that takes any number of arguments. The output from this function is in Figure 18. The call to vlog_error
uses the make_format_args
function to generate the format_args
structure.
#include <fmt/format.h>
#include <iostream>
#include <string>

using namespace std;
using namespace fmt;

void vlog_error(int code, std::string_view fmt, format_args args)
{
    cout << "Error " << code << ": " << vformat(fmt, args) << "\n";
}

template<class... Args>
void log_error(int code, std::string_view fmt, const Args&... args)
{
    vlog_error(code, fmt, make_format_args(args...));
}

int main()
{
    int i = 10;
    double f = 1.234;
    string s = "Hello World!";
    log_error(1, "Bad input detected: {} is not "
        "an integer value", 10.1);
    log_error(10, "Oops - Type mismatch between {} "
        "and {}", "var1", 10);
    log_error(255, "Something went wrong!");
}
Listing 19
Error 1: Bad input detected: 10.1 is not an integer value
Error 10: Oops - Type mismatch between var1 and 10
Error 255: Something went wrong!
Figure 18
Conclusion
Hopefully this article has given you a taster of what std::format
can do. If you want to start trying it out you can use {fmt}
as a good proxy for it until library authors catch up with the standard.
In my own projects I am already using the std::format
compatible parts of {fmt}
, and in general find it easier and clearer than the equivalent iostreams code.
In future articles I intend to explore how to create formatters for your own user defined types, and also how to convert from existing uses of iostreams and printf
-family functions to std::format
.
Acknowledgements
I’d like to extend my thanks to Victor Zverovich for responding quickly to my various queries whilst writing this article, and also for reviewing draft versions of the article, making many useful suggestions for improvement. I’d also like to thank the Overload reviewers for making useful suggestions for improvements to the article. Any errors and ambiguities that remain are solely my responsibility.
Appendix 1: Comparison of format and printf format specifications
Both std::format
and printf
use format specifications to define how a field is to be output. Although they are similar in many cases, there are enough differences to make it worth outlining them here. The following list has the printf
items on the left and gives the std::format
equivalent if there is one.
- '-' – Replaced by the '<', '^', and '>' flags to specify alignment. Repurposed as a sign specifier.
- '+', space – These have the same meaning and have been joined as sign specifiers by '-'.
- '#' – Has the same meaning.
- '0' – Has the same meaning.
- '*' – Used in printf to say the width or precision is specified at run-time. Replaced by nested replacement fields. See Note 1 below on argument ordering differences when converting these.
- 'd' – In printf it specifies a signed integer. In std::format, it specifies any integer, but as it is the default it can be omitted, except when outputting the integer value of a character or bool.
- 'h', 'l', 'll', 'z', 'j', 't' – Not used. In printf, they specify the size of integer being output. std::format is type aware so these are not needed.
- 'hh' – Not used. In printf it specifies the value is a char to be output as a numeric value. Use 'd' in std::format to do the same thing.
- 'i', 'u' – Not used. Replaced by 'd'.
- 'L' – Not used. In printf it specifies a long double value is being passed, but std::format is type aware. The 'L' character has been repurposed to say the field should take account of the current locale when output.
- 'n' – Not used. In printf, it saves the number of characters output so far to an integer pointed to by the argument. Use formatted_size as a replacement.
- 'c', 'p', 's' – Have the same meaning, but as they are the default output type for their argument type, they can be omitted.
- 'a', 'A', 'e', 'E', 'f', 'F', 'g', 'G', 'o', 'x', 'X' – All have the same meaning.
Note 1: If using dynamic values for width or precision, and you are using automatic parameter numbering in std::format
, the order of assigning parameter numbers when parsing the string means the width and precision values come after the value being output, whereas in the printf
-family functions they come before the value.
Note 2: POSIX adds positional parameters to the printf
format specification. These are specified using %n$
for value fields, or *m$
for width and precision fields – e.g. %1$*2$.*3$f
. The std::format
format specification already supports these using manual numbering mode.
Table 1 shows examples of printf
formatting and the equivalent std::format
version.
Table 1
Appendix 2: std::format and {fmt}
As previously mentioned, std::format
is based on the {fmt}
library. However, {fmt}
offers a number of extra facilities that are not in the C++20 version of std::format
. This appendix gives a brief description of the main ones.
Direct output to a terminal
Both iostreams and the printf
-family functions have the ability to write directly to the terminal, either using std::cout
or the printf
function itself. The only way to do this in std::format
in C++20 is to use format_to
with an output iterator such as std::ostreambuf_iterator<char>
attached to std::cout
, but this is not recommended as it leads to slow performance. This is why the examples in this article write the strings produced to std::cout
.
The {fmt}
library provides a print
function to do this work. It can in fact write to any std::FILE*
stream, defaulting to stdout
if none is specified in the function call. A proposal to add this to C++23 has been made [P2093].
Named arguments
The {fmt}
library supports named arguments, so you can specify an argument in the replacement field by name as well as position.
Output of value ranges
There are a number of utility functions provided in {fmt}
. One of the most useful is fmt::join
, which can be used to output ranges of values in a single operation, with a given separator string between each value. The ranges output can be tuples, initializer_lists, any container to which std::begin
and std::end
can be applied, or any range specified by begin and end iterators.
References
[ClangFormat] Libc++ Format Status, https://libcxx.llvm.org//Status/Format.html
[fmtlib] {fmt} library, https://github.com/fmtlib/fmt
[GCClib] GCC library support, https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2020
[gettext] GNU gettext library, https://www.gnu.org/software/libc/manual/html_node/Translation-with-gettext.html
[MSVClib] MSVC C++ library support, https://docs.microsoft.com/en-us/cpp/overview/visual-cpp-language-conformance
[P0645] Text Formatting, Victor Zverovich, 2019, http://wg21.link/P0645
[P2093] Formatted Output, Victor Zverovich, 2021, http://wg21.link/P2093
[Zverovich19] std::format in C++20, Victor Zverovich, https://www.zverovich.net/2019/07/23/std-format-cpp20.html
Footnote
- This section describes the standard format-spec defined by std::format for formatting fundamental types, strings, and string_views. Other types, like std::chrono, have their own format-spec definitions, and user-defined types can also define their own.
Spencer has been programming for more years than he cares to remember, mostly in the financial sector, although in his younger years he worked on projects as diverse as monitoring water treatment works on the one hand, and television programme scheduling on the other.