Abusing the C preprocessor for better variadic function arguments.
I am not a fan of C’s stance on variadic function arguments. They are
unsafe and their usage is fairly unorthodox.
I will not, however, claim to solve all of their issues. At best, I will
provide some techniques I have used that I think make variadic functions
more usable.
Be aware that there is no silver bullet: each technique has its
drawbacks.
Well then, without further ado, let us first quickly review how variadic
arguments work:
## Standard variadic arguments
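A minimal sketch of such a program, assuming `func` takes a count `n` followed by `n` C strings. The `int` return value (the number of strings printed) is an addition made here so that the result can be checked:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Prints n strings, one per line, and returns how many were printed.
 * n must match the number of variadic arguments actually passed. */
int func(int n, ...) {
    assert(n >= 0);

    va_list args;
    va_start(args, n);                  /* n is the last named parameter */
    for (int i = 0; i < n; ++i) {
        const char *s = va_arg(args, const char *);
        puts(s);
    }
    va_end(args);
    return n;
}
```

The call under discussion is `func(3, "1", "2", "3")`.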
Here, `func` is a variadic function that prints its parameters.
This simple program prints the strings "1", "2" and "3" to the standard
output, each on its own line.
The inner workings are straightforward: with the traditional stack-based
calling conventions, arguments are pushed from right to left, so the
variadic arguments, which can only come last in a parameter list, are
pushed first when the function is called.
However, there is no way, as-is, to access each variadic argument without
a point of reference, a marker telling where to look, so we need an
additional named parameter, a *sentinel*.
This is exactly what `va_start` is for: it initializes a `va_list` over
the variadic arguments, using `n` as the point of reference. Each argument
can then be pulled in sequence with `va_arg`, and when we are done,
we call `va_end`. Simple indeed, but incredibly unsafe.
### A sword of Damocles
The first thing that you might have noticed is that the sentinel `n`
indicates the number of variadic parameters: indeed, there is no way as-is
to know how many of them there are, so we need a hint. Some functions like
`printf` are blessed with parameters that convey both a meaning and
the number of expected variadic arguments (i.e. the format string), but
in the general case, that is not always possible.
So, can you guess what calling `func(4, "1", "2", "3")` does? That's
right, *Undefined Behavior*. At best, your process will read whatever has
been pushed before "3", try to dereference it, and crash; at worst,
it will continue to run as if nothing happened -- no need to tell how bad
this is.
Let's take another example:
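A hypothetical `func2` for this example, assuming it expects, in order, a string, a `double` and an `int`. The parameter types and the returned sum are assumptions made for illustration:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Expects exactly three variadic arguments:
 * a string, a double and an int, in that order. */
double func2(int n, ...) {
    assert(n == 3);

    va_list args;
    va_start(args, n);
    const char *s = va_arg(args, const char *);
    double d = va_arg(args, double);    /* the caller must really pass a double */
    int i = va_arg(args, int);
    va_end(args);

    printf("%s %g %d\n", s, d, i);
    return d + i;
}
```

A correct call is `func2(3, "1", 2.0, 3)`; nothing in the prototype tells the compiler which types are expected.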
What would happen if I called `func2(3, "1", 2, 3)`? Again, *Undefined
Behavior*. `va_arg` is *not* type safe, meaning that you have to get the
order and type of every argument exactly right (this can cause horrible
results if, for example, you pass `2` instead of `2.` where a `double` is
expected; subtle, but deadly).
Overall, I would say that the worst part is that your compiler will
**never** complain about any of this. The only special case is the
`printf` function family, where the arguments are type checked against
the format string (and this only happens because said functions carry
the `format` function attribute, a GCC/Clang extension).
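As a sketch of what that annotation looks like on a user-defined function (GCC/Clang only; `log_msg` is a hypothetical name, not something from the libraries discussed here):

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* format(printf, 1, 2): parameter 1 is the format string, and the
 * arguments to check against it start at parameter 2. */
__attribute__((format(printf, 1, 2)))
int log_msg(const char *fmt, ...) {
    assert(fmt != NULL);

    va_list args;
    va_start(args, fmt);
    int written = vprintf(fmt, args);   /* number of characters printed */
    va_end(args);
    return written;
}
```

With `-Wformat` enabled (it is part of `-Wall`), a call such as `log_msg("%s", 42)` is then diagnosed at compile time.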
## Improving the usability with sentinels
Let us ignore all the type safety concerns for the moment, and focus on
the first horrible problem variadic parameters have: the length hint.
99% of the time, we are not designing functions à la `printf`, where
one of the mandatory parameters doubles as the length hint, so we usually
end up with functions like:
```c
// T0 and T1 stand for whatever types param0 and param1 have
void func(T0 param0, T1 param1, int n, ...) {
    // function body
}
```
If we happen to have all our variadic parameters of the same type, we can
use a null sentinel at the end of the parameter list:
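A minimal sketch of such a function, assuming string parameters and a single named parameter `prefix` (both are assumptions made for illustration, as is the returned count):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Prints every string up to the terminating NULL, each preceded by
 * prefix, and returns how many were printed. The attribute asks
 * GCC/Clang to warn when a call does not end with a null pointer. */
__attribute__((sentinel))
int func(const char *prefix, ...) {
    assert(prefix != NULL);

    va_list args;
    va_start(args, prefix);
    int count = 0;
    const char *s;
    while ((s = va_arg(args, const char *)) != NULL) {
        printf("%s%s\n", prefix, s);
        ++count;
    }
    va_end(args);
    return count;
}
```

A call then looks like `func("> ", "1", "2", "3", NULL)`.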
The attribute here is of course GCC/Clang only, but can help to enforce
the length safety by generating a compiler warning when no NULL is
provided at the end of the parameter list.
This is better, but it only works when all the parameters share the same
type (in practice a pointer type, so that NULL cannot be mistaken for a
value), and we still have to remember to insert NULL; besides, we
programmers hate writing redundant things like this. It is time to ask
the C preprocessor for some help.
## Using the preprocessor to generate the length hints
### With sentinels
Wrapping the function to insert a trailing NULL is pretty
straightforward:
```c
// ANSI C
#define func(Param0, Param1, ...) func(Param0, Param1, __VA_ARGS__, NULL)

// GNU C
#define func(Param0, Param1, args...) func(Param0, Param1, ## args, NULL)
```
The only difference between the ANSI and GNU versions is that
`func(a, b)` will expand to `func(a, b, NULL)` in GNU C instead of
`func(a, b, , NULL)` in ANSI C (which makes at least one variadic
argument mandatory there).
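Put together with a sentinel-terminated function, the standard variant can be sketched as follows. The single named parameter `prefix` and the returned count are assumptions made for illustration:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Prints every string up to the terminating NULL and returns the count. */
int func(const char *prefix, ...) {
    assert(prefix != NULL);

    va_list args;
    va_start(args, prefix);
    int count = 0;
    const char *s;
    while ((s = va_arg(args, const char *)) != NULL) {
        printf("%s%s\n", prefix, s);
        ++count;
    }
    va_end(args);
    return count;
}

/* Defined after the function so that the definition above is not
 * expanded. A macro is never re-expanded inside its own replacement,
 * so the inner func refers to the real function. */
#define func(Prefix, ...) func(Prefix, __VA_ARGS__, (const char *) NULL)
```

Callers can now write `func("> ", "1", "2")` and the macro appends the terminator for them.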
### With the length directly
Generating a similar macro when the length hint is explicit is a bit
trickier. We first have to define a macro to count the variadic parameters
passed to it:
(Snippet taken from [libcsptr][libcsptr-varargs])
Here the snippet is GNU C only, but it can be adapted to work with other
compilers -- the only difference being that `ARG_LENGTH` will have to take at
least one parameter.
The `#if __STRICT_ANSI__` block is here to have consistent results with
different gcc/clang compiler options when `ARG_LENGTH()` is expanded: without this,
the macro expands to `1` instead of `0` when compiled with `-ansi`,
`-std=c99`, or `-std=c89`.
The macro itself relies on the fact that the expanded varargs will shift
all the trailing numbers to the right in `ARG_LENGTH_`, making `Count`
expand to the number of times the shift happens.
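The shifting trick can be sketched in portable C99 as follows. This is a simplified variant written for this explanation, not the libcsptr snippet: it is capped at 8 arguments and needs at least one.

```c
#include <assert.h>   /* only needed for the checks in the usage note */

/* Counts 1 to 8 arguments. The arguments push the descending list of
 * numbers to the right, so Count lands on the number of arguments. */
#define ARG_LENGTH(...) ARG_LENGTH_(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define ARG_LENGTH_(P1, P2, P3, P4, P5, P6, P7, P8, Count, ...) Count
```

For instance, `ARG_LENGTH(1, 2.0, "3")` expands to `3`. `ARG_LENGTH()` expands to `1` with this variant, which is precisely the case the GNU version takes care of.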
With this macro defined, it becomes trivial to wrap the variadic function:
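A sketch of such a wrapper, assuming a `func` that takes a count followed by that many strings, and using a simplified portable `ARG_LENGTH` (1 to 8 arguments) rather than the GNU one. The returned count is an addition for illustration:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Simplified counting macro: works for 1 to 8 arguments. */
#define ARG_LENGTH(...) ARG_LENGTH_(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define ARG_LENGTH_(P1, P2, P3, P4, P5, P6, P7, P8, Count, ...) Count

/* Prints n strings, one per line, and returns n. */
int func(int n, ...) {
    assert(n >= 0);

    va_list args;
    va_start(args, n);
    for (int i = 0; i < n; ++i)
        puts(va_arg(args, const char *));
    va_end(args);
    return n;
}

/* Defined after the function; the inner func is not re-expanded,
 * so it names the real function and receives the computed length. */
#define func(...) func(ARG_LENGTH(__VA_ARGS__), __VA_ARGS__)
```

`func("1", "2", "3")` now expands to `func(3, "1", "2", "3")`, so the length hint can no longer drift away from the argument list.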
## Tackling the type safety problem
This is a technique I discovered while experimenting with the interfaces
of [libcsptr][libcsptr] and [criterion][criterion]: it all comes down to
the fact that designated initializers and compound literals both use
commas as value separators, just like parameter lists -- the one major
difference being that types in those *are* checked.
Putting the concept into practice:
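A minimal sketch, assuming three optional parameters `a`, `b` and `c`. The struct name `func_params`, the by-value passing and the returned count of non-zero fields are assumptions made for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct func_params {
    int sentinel_;      /* always initialized by the macro below */
    int a;
    double b;
    const char *c;
};

/* Prints the parameters and returns how many of them are non-zero. */
int func(struct func_params p) {
    assert(p.sentinel_ == 0);
    printf("a=%d b=%g c=%s\n", p.a, p.b, p.c ? p.c : "(none)");
    return (p.a != 0) + (p.b != 0.0) + (p.c != NULL);
}

/* The arguments become the initializers of a compound literal: after
 * .sentinel_, positional values fill a, b and c in order, and each
 * one is checked against the type of the field it initializes. */
#define func(...) func((struct func_params) { .sentinel_ = 0, __VA_ARGS__ })
```

Passing the struct by value keeps the sketch short; passing a pointer to the compound literal works just as well.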
`sentinel_` is here to ensure that `func` called without parameters still
produces a valid compound literal.
*Edit*: [professorlava](http://snaipe.me/c/preprocessor/varargs/#comment-2070555044)
rightly pointed out that those who use GNU extensions can, in fact, drop
the sentinel. This will still produce a warning if `-Wmissing-field-initializers` is
enabled, so be sure to balance that out (is 4 bytes worth all this?).
What this brings to the table is a type safe interface for optional
parameters (which can be extended to variadic parameters by using arrays
instead of structs) that is 100% standard C99. C89 has neither compound
literals nor designated initializers, so there you would have to replace
the compound literal with a local variable and fill in its fields
explicitly.
Hence, this compiles:
```c
func();
func(1);
func(1, 2.0);
func(1, 2.0, "3");
```
... and this does not:
```c
func(1, "3");      // error: a pointer is not a double
func(1, 2.0, 3);   // error: an integer is not a pointer
```
A neat side effect of this technique is that `func` gets
a Python-like interface for keyword arguments:

```c
// All of these expressions are equivalent
func(1, .b = 2.0, .c = "3");
func(1, .c = "3", .b = 2.0);
func(.a = 1, .c = "3", .b = 2.0);
```
Of course, one must also consider that this technique gives unspecified
fields a default value: 0 (zero). This effect is sometimes desirable,
but in other cases it is not.
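The array variant mentioned above (variadic parameters that all share one checked type) can be sketched in the same way. `sum` and `sum_impl` are hypothetical names chosen for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Adds up the n ints stored in values. */
static int sum_impl(const int *values, size_t n) {
    assert(values != NULL);

    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += values[i];
    return total;
}

/* The arguments become the elements of an int array compound literal,
 * so each one has to be convertible to int, and the length comes from
 * sizeof instead of a hand-written hint. The sizeof operand is not
 * evaluated, so arguments with side effects run only once.
 * At least one argument is required: C99 has no empty initializer list. */
#define sum(...) \
    sum_impl((int[]) { __VA_ARGS__ }, \
             sizeof((int[]) { __VA_ARGS__ }) / sizeof(int))
```

`sum(1, 2, 3)` evaluates to `6`, while `sum(1, "2")` is diagnosed by the compiler because a pointer is not a valid initializer for an `int` element.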
* * *
## TL;DR
Variadic functions are unsafe and weird to use; the following techniques
try to provide a better interface:

| Technique | Usage | Pros | Cons |
|---|---|---|---|
| Sentinel | `f(param, "1", "2", NULL)` | Sentinel presence can be enforced; no length parameter | Trailing NULL (easily forgettable); all parameters must share one type; not type safe |
| Sentinel + Macro | `f(param, "1", "2")` | Benefits of sentinel without trailing NULL | All parameters must share one type; not type safe |
| Length parameter + Macro | `f(1, 2.0, "3")` | Inferred length parameter; parameters of different types | Not type safe |
| Struct + Macro | `f(1, 2.0, "3")` or `f(1, .c = "3", .b = 2.0)` | Type safe optional parameters; Python-like keyword arguments | Default values are always 0 |
## Conclusion
These techniques add some sugar on top of the C language to give variadic
parameters safer interfaces. I hope you find inspiration in them.