Standard C Preprocessor
$ cat xx.c
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)
extern void NAME(mine)(char *x);
$ gcc -E xx.c
# 1 "xx.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "xx.c"
extern void mine_3(char *x);
$
Two levels of indirection
In a comment to another answer, Cade Roux asked why this needs two levels of indirection. The flippant answer is because that's how the standard requires it to work; you tend to find you need the equivalent trick with the stringizing operator too.
Section 6.10.3 of the C99 standard covers 'macro replacement', and 6.10.3.1 covers 'argument substitution'.
After the arguments for the invocation of a function-like macro have been identified,
argument substitution takes place. A parameter in the replacement list, unless preceded
by a #
or ##
preprocessing token or followed by a ##
preprocessing token (see below), is
replaced by the corresponding argument after all macros contained therein have been
expanded. Before being substituted, each argument’s preprocessing tokens are
completely macro replaced as if they formed the rest of the preprocessing file; no other
preprocessing tokens are available.
In the invocation NAME(mine)
, the argument is 'mine'; it is fully expanded to 'mine'; it is then substituted into the replacement string:
EVALUATOR(mine, VARIABLE)
Now the macro EVALUATOR is discovered, and the arguments are isolated as 'mine' and 'VARIABLE'; the latter is then fully expanded to '3', and substituted into the replacement string:
PASTER(mine, 3)
The operation of this is covered by other rules (6.10.3.3 'The ## operator'):
If, in the replacement list of a function-like macro, a parameter is immediately preceded
or followed by a ##
preprocessing token, the parameter is replaced by the corresponding
argument’s preprocessing token sequence; [...]
For both object-like and function-like macro invocations, before the replacement list is
reexamined for more macro names to replace, each instance of a ##
preprocessing token
in the replacement list (not from an argument) is deleted and the preceding preprocessing
token is concatenated with the following preprocessing token.
So, the replacement list contains x
followed by ##
and also ##
followed by y
; so we have:
mine ## _ ## 3
and eliminating the ##
tokens and concatenating the tokens on either side combines 'mine' with '_' and '3' to yield:
mine_3
This is the desired result.
If we look at the original question, the code was (adapted to use 'mine' instead of 'some_function'):
#define VARIABLE 3
#define NAME(fun) fun ## _ ## VARIABLE
NAME(mine)
The argument to NAME is clearly 'mine' and that is fully expanded.
Following the rules of 6.10.3.3, we find:
mine ## _ ## VARIABLE
which, when the ##
operators are eliminated, maps to:
mine_VARIABLE
exactly as reported in the question.
Traditional C Preprocessor
Robert Rüger asks:
Is there any way do to this with the traditional C preprocessor which does not have the token pasting operator ##
?
Maybe, and maybe not — it depends on the preprocessor. One of the advantages of the standard preprocessor is that it has this facility which works reliably, whereas there were different implementations for pre-standard preprocessors. One requirement is that when the preprocessor replaces a comment, it does not generate a space as the ANSI preprocessor is required to do. The GCC (6.3.0) C Preprocessor meets this requirement; the Clang preprocessor from XCode 8.2.1 does not.
When it works, this does the job (x-paste.c
):
#define VARIABLE 3
#define PASTE2(x,y) x/**/y
#define EVALUATOR(x,y) PASTE2(PASTE2(x,_),y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
Note that there isn't a space between fun,
and VARIABLE
— that is important because if present, it is copied to the output, and you end up with mine_ 3
as the name, which is not syntactically valid, of course. (Now, please can I have my hair back?)
With GCC 6.3.0 (running cpp -traditional x-paste.c
), I get:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_3(char *x);
With Clang from XCode 8.2.1, I get:
# 1 "x-paste.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 329 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "x-paste.c" 2
extern void mine _ 3(char *x);
Those spaces spoil everything. I note that both preprocessors are correct; different pre-standard preprocessors exhibited both behaviours, which made token pasting an extremely annoying and unreliable process when trying to port code. The standard with the ##
notation radically simplifies that.
There might be other ways to do this. However, this does not work:
#define VARIABLE 3
#define PASTER(x,y) x/**/_/**/y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun,VARIABLE)
extern void NAME(mine)(char *x);
GCC generates:
# 1 "x-paste.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "x-paste.c"
extern void mine_VARIABLE(char *x);
Close, but no dice. YMMV, of course, depending on the pre-standard preprocessor that you're using. Frankly, if you're stuck with a preprocessor that is not cooperating, it would probably be simpler to arrange to use a standard C preprocessor in place of the pre-standard one (there is usually a way to configure the compiler appropriately) than to spend much time trying to work out a way to do the job.
Best Answer
If you use a C99 or later compiler
It assumes you are using C99 (the variable argument list notation is not supported in earlier versions). The
do { ... } while (0)
idiom ensures that the code acts like a statement (function call). The unconditional use of the code ensures that the compiler always checks that your debug code is valid — but the optimizer will remove the code when DEBUG is 0.If you want to work with #ifdef DEBUG, then change the test condition:
And then use DEBUG_TEST where I used DEBUG.
If you insist on a string literal for the format string (probably a good idea anyway), you can also introduce things like
__FILE__
,__LINE__
and__func__
into the output, which can improve the diagnostics:This relies on string concatenation to create a bigger format string than the programmer writes.
If you use a C89 compiler
If you are stuck with C89 and no useful compiler extension, then there isn't a particularly clean way to handle it. The technique I used to use was:
And then, in the code, write:
The double-parentheses are crucial — and are why you have the funny notation in the macro expansion. As before, the compiler always checks the code for syntactic validity (which is good) but the optimizer only invokes the printing function if the DEBUG macro evaluates to non-zero.
This does require a support function — dbg_printf() in the example — to handle things like 'stderr'. It requires you to know how to write varargs functions, but that isn't hard:
You can also use this technique in C99, of course, but the
__VA_ARGS__
technique is neater because it uses regular function notation, not the double-parentheses hack.Why is it crucial that the compiler always see the debug code?
[Rehashing comments made to another answer.]
One central idea behind both the C99 and C89 implementations above is that the compiler proper always sees the debugging printf-like statements. This is important for long-term code — code that will last a decade or two.
Suppose a piece of code has been mostly dormant (stable) for a number of years, but now needs to be changed. You re-enable debugging trace - but it is frustrating to have to debug the debugging (tracing) code because it refers to variables that have been renamed or retyped, during the years of stable maintenance. If the compiler (post pre-processor) always sees the print statement, it ensures that any surrounding changes have not invalidated the diagnostics. If the compiler does not see the print statement, it cannot protect you against your own carelessness (or the carelessness of your colleagues or collaborators). See 'The Practice of Programming' by Kernighan and Pike, especially Chapter 8 (see also Wikipedia on TPOP).
This is 'been there, done that' experience — I used essentially the technique described in other answers where the non-debug build does not see the printf-like statements for a number of years (more than a decade). But I came across the advice in TPOP (see my previous comment), and then did enable some debugging code after a number of years, and ran into problems of changed context breaking the debugging. Several times, having the printing always validated has saved me from later problems.
I use NDEBUG to control assertions only, and a separate macro (usually DEBUG) to control whether debug tracing is built into the program. Even when the debug tracing is built in, I frequently do not want debug output to appear unconditionally, so I have mechanism to control whether the output appears (debug levels, and instead of calling
fprintf()
directly, I call a debug print function that only conditionally prints so the same build of the code can print or not print based on program options). I also have a 'multiple-subsystem' version of the code for bigger programs, so that I can have different sections of the program producing different amounts of trace - under runtime control.I am advocating that for all builds, the compiler should see the diagnostic statements; however, the compiler won't generate any code for the debugging trace statements unless debug is enabled. Basically, it means that all of your code is checked by the compiler every time you compile - whether for release or debugging. This is a good thing!
debug.h - version 1.2 (1990-05-01)
debug.h - version 3.6 (2008-02-11)
Single argument variant for C99 or later
Kyle Brandt asked:
There's one simple, old-fashioned hack:
The GCC-only solution shown below also provides support for that.
However, you can do it with the straight C99 system by using:
Compared to the first version, you lose the limited checking that requires the 'fmt' argument, which means that someone could try to call 'debug_print()' with no arguments (but the trailing comma in the argument list to
fprintf()
would fail to compile). Whether the loss of checking is a problem at all is debatable.GCC-specific technique for a single argument
Some compilers may offer extensions for other ways of handling variable-length argument lists in macros. Specifically, as first noted in the comments by Hugo Ideler, GCC allows you to omit the comma that would normally appear after the last 'fixed' argument to the macro. It also allows you to use
##__VA_ARGS__
in the macro replacement text, which deletes the comma preceding the notation if, but only if, the previous token is a comma:This solution retains the benefit of requiring the format argument while accepting optional arguments after the format.
This technique is also supported by Clang for GCC compatibility.
Why the do-while loop?
You want to be able to use the macro so it looks like a function call, which means it will be followed by a semi-colon. Therefore, you have to package the macro body to suit. If you use an
if
statement without the surroundingdo { ... } while (0)
, you will have:Now, suppose you write:
Unfortunately, that indentation doesn't reflect the actual control of flow, because the preprocessor produces code equivalent to this (indented and braces added to emphasize the actual meaning):
The next attempt at the macro might be:
And the same code fragment now produces:
And the
else
is now a syntax error. Thedo { ... } while(0)
loop avoids both these problems.There's one other way of writing the macro which might work:
This leaves the program fragment shown as valid. The
(void)
cast prevents it being used in contexts where a value is required — but it could be used as the left operand of a comma operator where thedo { ... } while (0)
version cannot. If you think you should be able to embed debug code into such expressions, you might prefer this. If you prefer to require the debug print to act as a full statement, then thedo { ... } while (0)
version is better. Note that if the body of the macro involved any semi-colons (roughly speaking), then you can only use thedo { ... } while(0)
notation. It always works; the expression statement mechanism can be more difficult to apply. You might also get warnings from the compiler with the expression form that you'd prefer to avoid; it will depend on the compiler and the flags you use.TPOP was previously at http://plan9.bell-labs.com/cm/cs/tpop and http://cm.bell-labs.com/cm/cs/tpop but both are now (2015-08-10) broken.
Code in GitHub
If you're curious, you can look at this code in GitHub in my SOQ (Stack Overflow Questions) repository as files
debug.c
,debug.h
andmddebug.c
in the src/libsoq sub-directory.