A.2. Fundamental Abstractions of the Preprocessor

We began our discussion of template metaprogramming in Chapter 2 by describing its metadata (potential template arguments) and metafunctions (class templates). On the basis of those two fundamental abstractions, we built up the entire picture of compile-time computation covered in the rest of this book. In this section we'll lay a similar foundation for the preprocessor metaprogrammer. Some of what we cover here may be a review for you, but it's important to identify the basic concepts before going into detail.

A.2.1. Preprocessing Tokens

The fundamental unit of data in the preprocessor is the preprocessing token. Preprocessing tokens correspond roughly to the tokens you're used to working with in C++, such as identifiers, operator symbols, and literals. Technically, there are some differences between preprocessing tokens and regular tokens (see section 2 of the C++ standard for details), but they can be ignored for the purposes of this discussion. In fact, we'll be using the terms interchangeably here.

A.2.2. Macros

Preprocessor macros come in two flavors. Object-like macros can be defined this way:

#define identifier replacement-list

where the identifier names the macro being defined, and replacement-list is a sequence of zero or more tokens. Where the identifier appears in subsequent program text, it is expanded by the preprocessor into its replacement list.
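As a minimal sketch of the object-like form, the block below defines two macros whose names (BUFFER_SIZE, NOTHING) are hypothetical and chosen only for illustration. It shows a replacement list of several tokens, a replacement list of zero tokens, and the fact that substitution happens token by token rather than by value.

```cpp
#include <cassert>

// Hypothetical object-like macros; the names are illustrative only.
#define BUFFER_SIZE 16 * 4   // replacement list of three tokens: 16 * 4
#define NOTHING              // replacement list of zero tokens

// NOTHING expands to no tokens at all, so this is an ordinary function definition.
NOTHING inline int buffer_size() { return BUFFER_SIZE; }

inline void demo_object_like() {
    // BUFFER_SIZE is replaced by its tokens before compilation proper.
    assert(buffer_size() == 64);
    // Substitution is token-for-token: 128 / 16 * 4 groups as (128 / 16) * 4.
    assert(128 / BUFFER_SIZE == 32);
}
```

The second assertion is why replacement lists that form expressions are usually wrapped in parentheses.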

Function-like macros, which act as the "metafunctions of the preprocessing phase," are defined as follows:

#define identifier(a_1, a_2, ... a_n) replacement-list

where each a_i is an identifier naming a macro parameter. When the macro name appears in subsequent program text followed by a suitable argument list, it is expanded into its replacement-list, except that each argument is substituted for the corresponding parameter where it appears in the replacement-list.^[2]

^[2] We have omitted many details of how macro expansion works. We encourage you to take a few minutes to study section 16.3 of the C++ standard, which describes that process in straightforward terms.
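The function-like form can be sketched the same way. SQUARE, MAX, and largest_square are hypothetical names introduced for this example; the point is only that each argument's tokens are substituted wherever the corresponding parameter appears in the replacement list.

```cpp
#include <cassert>

// Hypothetical function-like macros; the names are illustrative only.
#define SQUARE(x) ((x) * (x))
#define MAX(a, b) ((a) < (b) ? (b) : (a))

inline int largest_square(int a, int b) {
    // Arguments may themselves contain macro invocations.
    return MAX(SQUARE(a), SQUARE(b));
}

inline void demo_function_like() {
    // The argument tokens 3 + 1 replace each x: ((3 + 1) * (3 + 1)).
    assert(SQUARE(3 + 1) == 16);
    assert(largest_square(-3, 2) == 9);
}
```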

A.2.3. Macro Arguments

Definition

A macro argument is a nonempty sequence of:

Preprocessing tokens other than commas or parentheses, and/or
Preprocessing tokens surrounded by matched pairs of parentheses.

This definition has consequences for preprocessor metaprogramming that must not be underestimated. Note, first of all, that the following tokens have special status:

   , ( )

As a result, a macro argument can never contain an unmatched parenthesis, or a comma that is not surrounded by matched parentheses. For example, both lines following the definition of FOO below are ill-formed:

   #define FOO(X) X // unary identity macro

   FOO(,)           // un-parenthesized comma or two empty arguments
   FOO())           // unmatched parenthesis or missing argument

Note also that the following tokens do not have special status; the preprocessor knows nothing about matched pairs of braces, brackets, or angle brackets:

   { } [ ] < >

As a result, these lines are also ill-formed:

   FOO(std::pair<int, long>)                 // two arguments
   FOO({ int x = 1, y = 2; return x+y; })    // two arguments

It is possible to pass either string of tokens above as part of a single macro argument, provided it is parenthesized:

   FOO((std::pair<int, long>))               // one argument
   FOO(({ int x = 1, y = 2; return x+y; }))  // one argument
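One way to observe how the preprocessor splits these arguments is to turn each one into a string literal. The sketch below assumes the stringizing operator #, which this section has not introduced, and uses hypothetical helper macros (STRINGIZE, FIRST, SECOND). The binary macros accept the unparenthesized text precisely because it is seen as two arguments.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helpers, not part of the text above.
#define STRINGIZE(x) #x             // # turns one argument into a string literal
#define FIRST(a, b)  STRINGIZE(a)   // binary macros expose each of two arguments
#define SECOND(a, b) STRINGIZE(b)

// The comma inside <...> is unprotected, so the preprocessor sees two arguments.
inline const char* first_half()  { return FIRST(std::pair<int, long>); }
inline const char* second_half() { return SECOND(std::pair<int, long>); }

// Parentheses protect the comma, so the unary macro receives one argument.
inline const char* whole() { return STRINGIZE((std::pair<int, long>)); }

inline void demo_argument_splitting() {
    assert(std::strcmp(first_half(), "std::pair<int") == 0);
    assert(std::strcmp(second_half(), "long>") == 0);
    assert(std::strcmp(whole(), "(std::pair<int, long>)") == 0);
}
```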

However, because of the special status of commas, it is impossible to strip parentheses from a macro argument without knowing the number of comma-separated token sequences it contains.^[3] If you are writing a macro that needs to be able to accept an argument containing a variable number of commas, your users will either have to parenthesize that argument and pass you the number of comma-separated token sequences as an additional argument, or they will have to encode the same information in one of the preprocessor data structures covered later in this appendix.

^[3] The C99 preprocessor, by virtue of its variadic macros, can do that and more. The C++ standardization committee is likely to adopt C99's preprocessor extensions for the next version of the C++ standard.
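The first option described above, a parenthesized argument plus an explicit count, might look like the following sketch. All macro names here (STRIP_PARENS_1 through STRIP_PARENS_3, DECLARE) are hypothetical, and the example assumes the token-pasting operator ##, which this section has not introduced. Because of how ## works, the count must be written as a literal rather than produced by another macro.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helpers: one stripping macro per supported count.
#define STRIP_PARENS_1(a)       a
#define STRIP_PARENS_2(a, b)    a, b
#define STRIP_PARENS_3(a, b, c) a, b, c

// DECLARE(n, (type), name): ## pastes the count n onto STRIP_PARENS_.
// The pasted macro name is then followed by the parenthesized type,
// so rescanning treats the pair as a macro invocation and drops the parentheses.
#define DECLARE(n, parenthesized_type, name) \
    STRIP_PARENS_##n parenthesized_type name

inline long sum_of_pair() {
    DECLARE(2, (std::pair<int, long>), p);   // becomes: std::pair<int, long> p;
    DECLARE(1, (long), total);               // becomes: long total;
    p.first = 1;
    p.second = 2;
    total = p.first + p.second;
    return total;
}

inline void demo_strip_parens() { assert(sum_of_pair() == 3); }
```

The cost of this design is visible at the call site: the user has to count the commas and keep the count in sync with the argument.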