CONSTANTS often shout. Jonathan Wakely considers why this happens in C and what the alternatives are in C++.
It’s common in C and C++ to use entirely uppercase names for constants, for example:
const int LOWER = 0;
const int UPPER = 300;
const int STEP = 20;
enum MetaSyntacticVariable { FOO, BAR, BAZ };
I think this is a terrible convention in C++ code and if you don’t already agree I hope I can convince you. Maybe one day we can rid the world of this affliction, except for a few carefully-controlled specimens kept far away where they can’t hurt anyone, like smallpox.
Using uppercase names for constants dates back (at least) to the early days of C and the need to distinguish symbolic constants, defined as macros or enumerators, from variables:
Symbolic constant names are conventionally written in upper case so they can be readily distinguished from lower case variable names. [ Kernighan88 ]
The quoted text follows these macro definitions:
#define LOWER 0   /* lower limit of table */
#define UPPER 300 /* upper limit */
#define STEP 20   /* step size */
This convention makes sense in the context of C. For many years C didn’t have the const keyword, and even today you can’t use a const variable where C requires a constant-expression, such as declaring the bounds of a (non-variable length) array or the size of a bitfield. Furthermore, unlike variables, symbolic constants don’t have an address and can’t be assigned new values. So I will grudgingly admit that macros are a necessary evil for defining constants in C, and that a consistent naming convention helps to distinguish them from variables. Reserving a set of identifiers (in this case, ‘all names written in uppercase’) for a particular purpose is a form of namespace, allowing you to tell at a glance that the names STEP and step are different and, by the traditional C convention, allowing you to assume one is a symbolic constant and the other is a variable.
Although some form of ad hoc namespace may be useful to tell symbolic constants and variables apart, I think it’s very unfortunate that the traditional convention reserves names that are so VISIBLE in the code and draw your ATTENTION to something as mundane as symbolic constants. An alternative might have been to always use a common prefix, say C_, for symbolic constants, but it’s too late to change nearly half a century of C convention now.
C’s restrictions on defining constants aren’t present in C++, where a const variable (of suitable type) initialised with a constant expression is itself a constant expression and where constexpr functions can produce compile-time constants involving non-trivial calculations on other constants. C++ also supports namespaces directly in the language, so the constants above could be defined as follows and referred to as FahrenheitToCelsiusConstants::step instead of STEP:
namespace FahrenheitToCelsiusConstants {
  // lower & upper limits of table, step size
  enum Type { lower=0, upper=300, step=20 };
}
That means C++ gives you much better tools than macros for defining properly typed and scoped constants.
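As a minimal sketch of those tools, the same table limits can also be written as const variables in a namespace, with a constexpr function deriving a further constant from them. The namespace name fahrenheit_table and the helpers row_count and fill_table are hypothetical, chosen only for illustration, and the sketch assumes C++11 or later.

```cpp
#include <cassert>

// Hypothetical namespace; the values mirror the LOWER/UPPER/STEP macros.
namespace fahrenheit_table {
    const int lower = 0;    // lower limit of table
    const int upper = 300;  // upper limit
    const int step  = 20;   // step size

    // A constexpr function computing a compile-time constant from the others.
    constexpr int row_count() { return (upper - lower) / step + 1; }
}

// The result is a constant expression, so it can be used as an array bound
// and checked at compile time.
int fahrenheit[fahrenheit_table::row_count()];
static_assert(fahrenheit_table::row_count() == 16, "0, 20, ..., 300");

// Fills the table with each Fahrenheit value and returns the number of rows.
int fill_table() {
    int n = 0;
    for (int f = fahrenheit_table::lower; f <= fahrenheit_table::upper;
         f += fahrenheit_table::step)
        fahrenheit[n++] = f;
    assert(n == fahrenheit_table::row_count());
    return n;
}
```

None of these names needs to shout: each one is typed, scoped to its namespace, and invisible to the preprocessor.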
Macros are very important in C but have far fewer uses in C++. The first rule about macros is: Don’t use them unless you have to. [ Stroustrup00 ]
There are good reasons for avoiding macros apart from the fact that C++ provides higher-level alternatives. Many people are familiar with problems caused by the min and max macros defined in <windows.h>, which interfere with the names of function templates defined in the C++ standard library. The main problem is that macros don’t respect lexical scoping; they’ll stomp over any non-macro with the same name. Functions, variables, namespaces, you name it, the preprocessor will happily redefine it.
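The following sketch shows that kind of interference without needing <windows.h>. The namespace util, the function template largest, the lowercase macro of the same name and the function biggest_of are all hypothetical stand-ins: util::largest plays the part of the standard library function template and the macro plays the part of the offending max macro.

```cpp
namespace util {   // hypothetical stand-in for the standard library
    template <typename T>
    const T& largest(const T& a, const T& b) { return a < b ? b : a; }
}

// Hypothetical lowercase function-like macro, defined later by some header.
#define largest(a, b) (((a) > (b)) ? (a) : (b))

int biggest_of(int x, int y) {
    // return util::largest(x, y);
    //   would not compile: the macro ignores the namespace qualifier and
    //   expands it to  util::(((x) > (y)) ? (x) : (y))

    // A function-like macro is only expanded when its name is followed by
    // '(', so wrapping the name in parentheses calls the real template.
    return (util::largest)(x, y);
}
```

The parenthesised call is a common workaround rather than a cure; had the macro been given an all-uppercase name, the clash would not have occurred.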
Preprocessing is probably the most dangerous phase of C++ translation. The preprocessor is concerned with tokens (the “words” of which the C++ source is composed) and is ignorant of the subtleties of the rest of the C++ language, both syntactic and semantic. In effect the preprocessor doesn’t know its own strength and, like many powerful ignoramuses, is capable of much damage. [ Dewhurst02 ]
Stephen Dewhurst devotes a whole chapter to gotchas involving the preprocessor, demonstrating how constants defined as macros can behave in unexpected ways, and how ‘pseudofunctions’ defined as macros may evaluate arguments more than once, or not at all. So given that macros are less necessary and (in an ideal codebase) less widely used in C++, it is important when macros are used to limit the damage they can cause and to draw the reader’s attention to their presence. We can’t use C++ namespaces to limit their scope, but we can use an ad hoc namespace in the form of a set of names reserved only for macros to avoid the problem of clashing with non-macros and silently redefining them. Conventionally we use uppercase names (and not single-character names: not only are short names undescriptive and unhelpful for macros, but single-character names like T are typically used for template parameters).
Also to warn readers, follow the convention to name macros using lots of capital letters. [ Stroustrup00 ]
By convention, macro names are written in uppercase. Programs are easier to read when it is possible to tell at a glance which names are macros. [ GCC ]
Using uppercase names has the added benefit of SHOUTING to draw ATTENTION to names which don’t obey the usual syntactic and semantic rules of C++.
Do #undefine macros as soon as possible, always give them SCREAMING_UPPERCASE_AND_UGLY names, and avoid putting them in headers. [ Sutter05 ]
When macro names stand out clearly from the rest of the code you can be careful to avoid reusing the name, and you know to be careful when using them, e.g. to be aware of side-effects being evaluated twice:
#define MIN(A,B) (A) < (B) ? (A) : (B)
const int limit = 100;
// ...
return MIN(++n, limit);
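To see the double evaluation concretely, the snippet can be wrapped in a function and compared with std::min. This is only a sketch: the function names next_with_macro, next_with_function and check_double_evaluation are hypothetical, while the macro and the constant are the ones from the snippet.

```cpp
#include <algorithm>
#include <cassert>

#define MIN(A,B) (A) < (B) ? (A) : (B)   // the macro from the snippet

const int limit = 100;

// Expands to: return (++n) < (limit) ? (++n) : (limit);
int next_with_macro(int& n) { return MIN(++n, limit); }

// std::min is a function template, so each argument is evaluated once.
int next_with_function(int& n) { return std::min(++n, limit); }

void check_double_evaluation() {
    int a = 0;
    assert(next_with_macro(a) == 2);     // incremented for the test, then again
    assert(a == 2);

    int b = 0;
    assert(next_with_function(b) == 1);  // incremented exactly once
    assert(b == 1);
}
```

The uppercase name is the only hint at the call site that the first version needs this extra care.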
But if you also use all-uppercase names for non-macros then you pollute the namespace. You no longer have the advantage of knowing which names are going to summon a powerful ignoramus to stomp on your code (including the fact that your carefully-scoped enumerator named FOO might not be used because someone else defined a macro called FOO with a different value), and the names that stand out prominently from the rest of the code might be something harmless and mundane, like the bound of an array. Constants are pretty dull; the actual logic using them is usually more interesting and deserving of the reader’s attention. Compare this to the previous code snippet, assuming the same definitions for the macro and constant, but with the case of the names changed:
return min(++n, LIMIT);
Is it more important to note that you’re limiting the return value to some constant LIMIT, rather than the fact that n is incremented? Or that you’re calling min rather than max or some other function? I don’t think LIMIT should be what grabs your attention here; it doesn’t even tell you what the limit is. It certainly isn’t obvious that n will be incremented twice!
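The enumerator clash mentioned earlier can be sketched in a few lines. The namespaces meta and quiet, the macro value 42 and the functions pick and pick_lowercase are hypothetical, invented only to show the effect; the enumeration itself is the one from the start of the article.

```cpp
namespace meta {
    enum MetaSyntacticVariable { FOO, BAR, BAZ };   // FOO has the value 0
}

// Hypothetical macro arriving later from an unrelated header.
#define FOO 42

int pick() {
    using namespace meta;
    return FOO;   // the preprocessor rewrites this to: return 42;
}

#undef FOO   // stop the macro affecting any later code

// With lowercase enumerators there is nothing for an uppercase macro to hit.
namespace quiet {
    enum MetaSyntacticVariable { foo, bar, baz };
}

int pick_lowercase() { return quiet::foo; }   // the enumerator, value 0
```

Here pick compiles without complaint yet returns 42 rather than the enumerator’s value of 0, while pick_lowercase is unaffected.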
So I’d like to make a plea to the C++ programmers of the world: stop naming (non-macro) constants in uppercase. Only use all-uppercase for macros, to warn your readers and limit the damage that the powerful ignoramus can do.
References
[Dewhurst02] C++ Gotchas, Stephen C. Dewhurst, Addison-Wesley, 2002.
[Kernighan88] The C Programming Language, Second Edition, Brian W. Kernighan & Dennis M. Ritchie, Prentice Hall, 1988.
[GCC] ‘The GNU Compiler Collection: The C Preprocessor’, Free Software Foundation, 2014, http://gcc.gnu.org/onlinedocs/cpp/Object-like-Macros.html#Object-like-Macros
[Stroustrup00] The C++ Programming Language, Special Edition, Bjarne Stroustrup, Addison-Wesley, 2000.
[Sutter05] C++ Coding Standards, Herb Sutter & Andrei Alexandrescu, Addison-Wesley, 2005.