In Praise of do-while (false)
J. L. Sloan
jsloan@diag.com
2005-08-01
Updated 2006-02-18
By now I would have
thought that everyone knew the joys of the language construct do-while(false). It is a staple among
C, C++, and Java programmers. You can find articles written about it on the web
from as far back as 1994, which might as well be Neolithic cave drawings.
Yet I’ve continued
to have problems getting code using do-while(false)
through code inspections with inspectors who should know better, the result being
that I ended up submitting what is, in my no-reason-to-be-humble opinion, lower
quality code into the code bases of products. I did, however, conform to the
stringent ISO 9001 quality processes, so I guess that means it must be okay.
Except that it isn’t
okay.
There is nothing
magic about do-while(false). It does
exactly what you think it does, which is to say, very little. In fact, it does
so little, your typical optimizing compiler won’t generate any code for it.
Yet, it is really handy in a few circumstances.
(As usual, all of my
code snippets are written in C++ unless C is absolutely necessary, and all my
referenced examples are from the Digital Aggregates Desperado open source library
of reusable components from http://www.diag.com/navigation/downloads/Desperado.html.)
Common Exit Flow of Control
All C++ functions
have a common entry point. It is frequently desirable for functions to have a
common exit point as well. There are all sorts of reasons for this. The most pragmatic
reason is that having a common entry and exit point makes it easy to add debugging
statements that log the arguments being passed into the function, and the
results generated by the function. If the flow of control needs to return
prematurely, it can do so without bypassing the logging statement at the
common exit, just by doing a break.
int function1(int argument1) {
    int rc = 0;
    printf("%s[%d]: function1(%d)\n", __FILE__, __LINE__, argument1);
    do {
        // Some really complicated code.
        if (bogus) {
            rc = -1;
            break;
        }
        // Some more really complicated code.
        if (giveup) {
            rc = -2;
            break;
        }
        // Yet more really complicated code.
        if (error) {
            rc = -3;
            break;
        }
    } while (false);
    printf("%s[%d]: function1=%d\n", __FILE__, __LINE__, rc);
    return rc;
}
The inner logic uses
the break statement to drop out the bottom of the do-while (false). No need for labels. No need for maintaining and
checking flags. And if the inner logic completes and the flow of control finds
itself at the while (false), it
simply drops through. No harm, no foul, no iteration.
Adherents of other
languages both more modern and more ancient will recognize this control
structure as something you might have known as a do-end. It would be great if C++ (and C) had something similar,
perhaps a way to use break to exit
out the bottom of any compound statement (that is, a block of statements
enclosed in curly braces). Alas, a break
can only occur in the context of a switch or loop construct. So in order to use
break, we must provide the compiler with a loop construct, albeit one that
never actually loops.
This pattern is
applicable to any block of logic, not just functions. I frequently use it when
I am writing a long sequence of data transformations or function calls, all of
which must succeed for the result to be useful. If any step in the sequence
fails, it does a break to the end of the block. Refactoring fans will be
pleased that the pattern can be used to refactor spaghetti code into something
more readable. The Design By Contract
crowd will like the fact that code written with a common exit can establish preconditions
above the do and postconditions below the while
(false). Formal Verification folks will like the idea of establishing
invariants (assertions that remain true during execution) before and after the do-while (false). And although I find the idea of proving any
non-trivial piece of code correct pretty laughable, I do find the concept of
invariants when thinking about program correctness to be very powerful.
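As a minimal sketch of such a sequence of steps, assuming a hypothetical function scalePercent() that is not part of Desperado: each step either succeeds or sets a return code and breaks to the common exit. The precondition sits above the do and the postcondition below the while (false).

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical example: parse a decimal percentage (0..100) from text and
// scale it to 0..255. Returns the scaled value, or a negative code on failure.
int scalePercent(const char* text) {
    int rc = 0;
    std::printf("%s[%d]: scalePercent(%s)\n", __FILE__, __LINE__,
                text ? text : "(null)");
    do {
        if (text == nullptr) {
            rc = -1;                        // Nothing to parse.
            break;
        }
        char* end = nullptr;
        long value = std::strtol(text, &end, 10);
        if (end == text || *end != '\0') {
            rc = -2;                        // Not a complete decimal number.
            break;
        }
        if (value < 0 || value > 100) {
            rc = -3;                        // Out of range for a percentage.
            break;
        }
        rc = static_cast<int>((value * 255) / 100);
    } while (false);
    // Postcondition, checked at the single exit point.
    assert(rc >= -3 && rc <= 255);
    std::printf("%s[%d]: scalePercent=%d\n", __FILE__, __LINE__, rc);
    return rc;
}
```

Every path, successful or not, passes through the final assertion and logging statement.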
The use of the break statement is obviously not
universally applicable. If you are using it from inside another looping control
structure, including other do-while
(false) constructs, or from inside a switch
statement, then it is not going to drop to the bottom of the outer do-while (false).
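A small sketch of that limitation, using a hypothetical checkDigit() function invented for illustration: the break inside the for loop leaves only the for loop, so a flag is needed to carry the failure out to the enclosing do-while (false).

```cpp
#include <cassert>

// Hypothetical helper: sums the decimal digits in text and reduces the sum
// to a single check digit. Returns -1 if text contains a non-digit.
int checkDigit(const char* text) {
    assert(text != nullptr);
    int rc = 0;
    do {
        int sum = 0;
        bool bad = false;
        for (const char* p = text; *p != '\0'; ++p) {
            if (*p < '0' || *p > '9') {
                bad = true;
                break;          // Leaves only the for loop.
            }
            sum += *p - '0';
        }
        if (bad) {
            rc = -1;
            break;              // Leaves the do-while (false).
        }
        rc = sum % 10;          // Second step, reached only on success.
    } while (false);
    return rc;
}
```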
Instead, if you are
using an ancient language like C, which I have a lot of affection for, the same
way I might have had for Latin, had I studied Latin in high school instead of
goofing off in the computer lab, you could have accomplished the same thing
using a goto. In fact, this
application is one of the few in which I find the use of goto acceptable.
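For comparison, here is a sketch of the goto form, written so that it also compiles as C++. The function checkDigitGoto() is hypothetical. The goto leaves the loop and skips the remaining step in one jump, with no flag required.

```cpp
#include <cassert>

// Hypothetical helper: sums the decimal digits in text and reduces the sum
// to a single check digit. Returns -1 if text contains a non-digit.
int checkDigitGoto(const char* text) {
    assert(text != nullptr);
    int rc = 0;
    int sum = 0;    // Declared before any goto so no jump crosses an initialization.
    for (const char* p = text; *p != '\0'; ++p) {
        if (*p < '0' || *p > '9') {
            rc = -1;
            goto done;          // Leaves the loop and skips the step below.
        }
        sum += *p - '0';
    }
    rc = sum % 10;              // Second step, reached only on success.
done:
    return rc;
}
```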
But if you are using
a modern language that doesn’t have a goto,
or if like me you find the use of goto
a slippery slope, or even perhaps it is too reminiscent of those thousands of
lines of FORTRAN IV that you wrote decades ago, the memory of which you are
desperately trying to suppress, then this is a useful technique.
Desperado uses this pattern
in several places, but for a simple example, see the method CellRateThrottle::admissible(ticks_t ticks).
The use of do-while (false) to implement a common
exit flow of control is merely good practice. There is another context in which
it is absolutely necessary.
Compound Statements and Preprocessor Macros
My name is John, and
I use the C preprocessor when writing in C++. As much as the C++ purists like
inline functions (and truth be told so do I), there are situations in which
they just don’t cut it. Desperado makes use of the C preprocessor in its generics.h header file, which provides
preprocessor macros to do fun things like compute the largest signed two’s
complement binary number of any basic data type. I’ve tried to write an inline
function to do that, and I would be pleased to see the results of anyone who
did so successfully without using the preprocessor. (I’m sure it could be done
with a templated function, but then it could not be used in C.) The C
preprocessor is a powerful form of code reuse known as code generation, and
like all powers, with it comes responsibility. It must be used only for good
and never for evil.
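To give a flavor of the kind of macro I mean, here is a sketch of one way to compute the largest value of a signed integer type. It is an illustration only, not the definition in generics.h. The name SIGNED_MAX_OF is hypothetical, and the macro assumes a two's complement representation with no padding bits.

```cpp
#include <cassert>
#include <climits>

// Hypothetical macro: largest value of the signed integer type T.
// It builds 2^(N-2), where N is the width of T in bits, then computes
// ((2^(N-2) - 1) * 2) + 1 == 2^(N-1) - 1 without ever overflowing T.
#define SIGNED_MAX_OF(T) \
    ((T)(((((T)1 << (sizeof(T) * CHAR_BIT - 2)) - 1) * 2) + 1))

// Cross-check against the limits published in <climits>.
void checkSignedMax() {
    assert(SIGNED_MAX_OF(signed char) == SCHAR_MAX);
    assert(SIGNED_MAX_OF(short) == SHRT_MAX);
    assert(SIGNED_MAX_OF(int) == INT_MAX);
    assert(SIGNED_MAX_OF(long long) == LLONG_MAX);
}
```

Because the macro takes a type name rather than a value, an ordinary inline function cannot stand in for it.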
So given that I’m
going to use the C preprocessor whether the C++ crowd likes it or not, consider
the following code snippet.
#define TRANSFORM(_A_, _B_) \
function1(_A_); \
function2(_B_)
Now consider its use
in this context.
if (transformable)
TRANSFORM(x, y);
It’s not going to do
the right thing, is it? The preprocessor will expand it thusly.
if (transformable)
function1(x);
function2(y);
This is clearly not
what the user of the macro intended. You might be able to make up a lot of
excuses for writing macros like the one above, but regardless, you have done
something to surprise anyone that uses it. You have designed an abstraction
that does not conform to the behavior any competent programmer would expect.
You can argue that your coding standard requires curly braces around even
single statements in if-else blocks.
This is not going to be helpful to your fellow developer who has to port ten
thousand lines of third-party code, code which follows its own coding standard,
and wants to use your macro to make their job easier.
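The surprise is easy to demonstrate. In this sketch, function1() and function2() are stand-ins that merely count their calls, and runUnbraced() is a hypothetical test driver. Only the macro is taken from the text above.

```cpp
#include <cassert>

static int calls1 = 0;
static int calls2 = 0;

// Stand-in functions that only count how often they are called.
void function1(int) { ++calls1; }
void function2(int) { ++calls2; }

#define TRANSFORM(_A_, _B_) \
    function1(_A_); \
    function2(_B_)

// Returns (calls to function1) * 10 + (calls to function2).
int runUnbraced(bool transformable) {
    calls1 = 0;
    calls2 = 0;
    if (transformable)
        TRANSFORM(1, 2);    // Only function1(1) is governed by the if.
    return calls1 * 10 + calls2;
}

void demonstrate() {
    assert(runUnbraced(true) == 11);    // Both functions ran, as intended.
    assert(runUnbraced(false) == 1);    // function2 ran anyway.
}
```

The code compiles, perhaps with a warning about misleading indentation, and function2() runs whether or not the condition is true.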
The logical thing is
to place the function calls in a compound block instead.
#define TRANSFORM(_A_, _B_) \
{ \
    function1(_A_); \
    function2(_B_); \
}
Then our code
snippet will expand into something like this.
if (transformable)
{
function1(x);
function2(y);
}
Looks better at
first glance, doesn’t it? Now both
functions are part of the conditional.
So try this.
if (transformable)
    TRANSFORM(x, y);
else
    TRANSFORM(q, r);
Now our snippet
expands to something like this.
if (transformable)
{
function1(x);
function2(y);
};
else
{
function1(q);
function2(r);
}
This will not
compile. The semicolon trailing the first invocation of the TRANSFORM macro is actually a null
statement, separate from the compound block preceding it. It becomes a
statement in-between the if clause
and the else clause. Using a
semicolon following the macro invocation in the expected way leaves the else clause dangling.
The fact that this
code does not compile is the good news. The programmer using your macro will
merely think that you are incompetent, and will never use your macro, nor
probably any code that you write, ever again.
A much worse case
would be if the resulting code compiled, but did the wrong thing. I have tried
very hard to find a code snippet which compiles but does the wrong thing. I
have been unsuccessful. I’m not saying that such a code snippet does not exist,
merely that I am not smart enough to find it. If such a snippet exists, then
the programmer using your macro will think that you are incompetent while they
sit with a baseball bat in the bushes next to your house waiting for you to come
home. If we were truly judged by a jury of our peers, it would be completely
justifiable homicide.
A common approach to
fixing this is to use the macro without a semicolon at the end.
if (transformable)
    TRANSFORM(x, y)
else
    TRANSFORM(q, r)
This is an
unsatisfying solution. You are requiring the user to write code in an
unexpected and surprising way. Worse, the requirement to omit the semicolon is
merely an artifact of having to use a compound statement. If a thousand years
from now the definition of your macro changes so that it is no longer a compound
statement, then you must churn every single application of it to add the
semicolon. Or you have to put the semicolon in the macro definition itself, which
may cause all sorts of wackiness to ensue. Wouldn’t it be better to just make
the macro work like any other C++ statement?
Like Lassie, do-while (false) comes to our rescue.
What is it, girl? The barn is on fire? Timmy fell into the well? We write our
macro thusly.
#define TRANSFORM(_A_, _B_) \
do { \
    function1(_A_); \
    function2(_B_); \
} while (false)
The preprocessor now
expands our macro into a single C++ statement that must be properly terminated
by a semicolon. Hence
if (transformable)
    TRANSFORM(x, y);
else
    TRANSFORM(q, r);
becomes
if (transformable)
do
{
function1(x);
function2(y);
}
while (false);
else
do
{
function1(q);
function2(r);
}
while (false);
The semicolon, added
by the user of the macro, is now a required part of the syntax, not a dangling null
statement.
With the do-while (false) version of the macro, all of the usage snippets
I have shown not only compile, but do the expected thing when executed. The do-while (false) control structure
serves as a compound statement that is both syntactically and semantically well
behaved.
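Here is a sketch that exercises the do-while (false) form of the macro in an if-else. As before, function1() and function2() are stand-ins invented for the demonstration, this time accumulating their arguments, and runWrapped() is a hypothetical driver.

```cpp
#include <cassert>

static int total1 = 0;
static int total2 = 0;

// Stand-in functions that accumulate their arguments.
void function1(int value) { total1 += value; }
void function2(int value) { total2 += value; }

#define TRANSFORM(_A_, _B_) \
    do { \
        function1(_A_); \
        function2(_B_); \
    } while (false)

// Returns the combined total produced by whichever branch ran.
int runWrapped(bool transformable) {
    total1 = 0;
    total2 = 0;
    if (transformable)
        TRANSFORM(1, 2);        // The semicolon completes the do-while.
    else
        TRANSFORM(10, 20);
    return total1 + total2;
}

void demonstrate() {
    assert(runWrapped(true) == 3);      // Only the if branch ran.
    assert(runWrapped(false) == 30);    // Only the else branch ran.
}
```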
An example of this
use of do-while (false) can be found
in the reinitializeobject() macro in
the Desperado reinitializeobject.h
header file. This macro, which is so scary it merits an article all of its own,
re-initializes an existing object by using do-while(false)
to combine an explicit destructor call with a call to a placement new operator. (Before you send me
email, yes, this is a bad idea, which is why Desperado does not use this macro
itself.)
One context in which
do-while (false) does not work is
when you are using the preprocessor to generate code that declares variables.
#define ALLOCATE(_A_, _B_) \
do { \
    int _A_; \
    int _B_; \
} while (false)
The variables will
be allocated on the stack then immediately deallocated when the do-while (false) construct terminates.
This is fine if the scope of the variables is limited to the code inside the do-while (false). It is not so useful
if they are being declared for use outside of that scope. The simple compound
statement has the same flaw.
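The scoping problem can be seen in a small sketch. The macro ALLOCATE_AND_SET and the function outerAfterMacro() are hypothetical. The variable the macro declares lives only inside the do-while (false), so it merely shadows the outer variable of the same name for the duration of the block.

```cpp
#include <cassert>

// Hypothetical macro: declares and initializes an int inside the block.
#define ALLOCATE_AND_SET(name, value) \
    do { \
        int name = (value); \
        (void)name;     /* Silences unused-variable warnings. */ \
    } while (false)

int outerAfterMacro() {
    int counter = 1;
    ALLOCATE_AND_SET(counter, 99);  // Declares a new, inner counter.
    return counter;                 // The outer counter is untouched.
}

void demonstrate() {
    assert(outerAfterMacro() == 1);
}
```

Had there been no outer counter, any use of the name after the macro would simply fail to compile.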
The Little Control Structure That Could
I hope I have given
you a new appreciation of do-while
(false), the control structure that does so much, while generating so
little in return.
