|
|
|
Sponsored Link •
|
|
Advertisement
|
My CS201 "Software Development Methods" professor taught
that all anyone would ever need to know about C++'s ternary
conditional operator (?:) was that it was
poorly understood and best avoided. He was right about one
thing: the conditional operator is poorly
understood, by me at least. Ten years later, I've only begun
to appreciate the subtlety and the power of the conditional
operator. But contrary to what my professor would have me
believe, it is a unique and indispensable tool. In fact, the
lowly conditional operator is the magic ingredient in
BOOST_FOREACH, a macro that makes it simple to
loop over STL collections, arrays, null-terminated strings,
and iterator ranges, among other things. In this article,
I'll zoom in on this operator and discover its unique
properties. Using the conditional operator, you will see how
to encode the type of an expression without evaluating it
and how to detect whether an expression is an lvalue or an
rvalue. Thus armed, I'll take aim at a very vexing problem:
how to write a "safe" macro that does not reevaluate its
arguments. Along the way, I will implement a toy
FOREACH macro that allows iteration over any
STL container.
This article takes a very close look at a very small part of the language—the conditional operator. I make occasional reference to the standard, and the analysis gets fairly technical. Consider yourself warned.
In the November 2003 issue of the C/C++ Users' Journal,
Anson Tsao and I published an article entitled
"FOREACH and LOCK" in which we described how to
bring the functionality of C#'s foreach and
lock keywords to C++. The FOREACH
macro lets you iterate over STL containers with syntax like
this:
vector< string > vect_str( /* ... */ );
FOREACH( string str, vect_str )
{
cout << str << '\n';
}
Why would you want to do such a thing? It is more concise
and less error-prone than using iterators directly,
certainly, but wouldn't it be better to use the
std::for_each algorithm? It depends.
std::for_each is a good idea, but in practice
it is often a clumsy tool. If C++ had native support for
lambda expressions, std::for_each would
make sense, but in the language we have today
std::for_each forces you to move your loop body
into a function (or function object) at namespace scope, far
from where it will be used. Code that uses
std::for_each extensively is littered with
these small, one-off functions that make little sense in
isolation. Also, you can't use break or
continue once you are committed to using
std::for_each. By comparison,
FOREACH clearly expresses the programmer's
intent, is easy to use correctly, and lets you put your loop
body where it makes sense. If you think that macros are
ugly, you'll get no argument from me. A foreach
keyword and lambda functions sure would be nice, but that's
not the language we have to work with, unfortunately. And as
we'll see, the conditional operator is just the tool we need
to teach those evil, unruly macros some manners.
Listing 1 in file broken_foreach.cpp
shows a bare-bones, stripped down version of the FOREACH
macro as it was implemented at the time. Don't sweat the details yet;
we'll get to them in short order. This version has problems—it
evaluates the container expression multiple times, and it doesn't work if
the container is an unnamed temporary object. The following code
illustrates the reevaluation problem. It uses FOREACH to
loop over a container of doubles, but the container expression,
get_vect_dbl(), is a function, which leads to some
surprising behavior.
vector<double> vect_dbl( 3, 1.0 );
vector<double> const & get_vect_dbl()
{
cout << "here!\n";
return vect_dbl;
}
// elsewhere ....
FOREACH( double d, get_vect_dbl() )
{
cout << d << '\n';
}
You might expect to see "here!" displayed only once, but
what you actually see is:
here!
here!
here!
here!
1
here!
here!
here!
1
here!
here!
here!
1
here!
here!
This is because get_vect_dbl() is being
evaluated multiple times by the FOREACH macro.
Surprise! This is a typical macro bug-a-boo, and one of the
reasons why macros have such a bad reputation. Squashing
this bug is our first order of business, but first we'll
need to understand a bit about what the FOREACH
macro is doing, and why.
Imagine that it is your job to implement the
FOREACH macro. "Well, I'll probably need some
iterators," you think to yourself, and you write something
like this:
#define FOREACH( item, container ) \ ??? iter = (container).begin();Now you're stuck. What type do you use to declare your iterator? You don't know the type of the container, and you certainly don't know the type of its iterators. Somehow you must be able to declare variables for the iterators without knowing the iterators' type! It's not impossible, but it isn't obvious, either. Let's see how.
First, we use the ScopeGuard trick [1]: temporary objects of derived type can have their lifetime extended by binding them to a const reference to the base type. This is a magic property of const references and temporary objects. If you're surprised, don't worry—it surprises everyone at first. The following code illustrates the trick. It defines an empty base class, a derived class which wraps some piece of data, and a function which returns an iterator, suitably wrapped.
struct auto_any_base {};
template< class T > struct auto_any :
auto_any_base
{
auto_any( T const & t ) : item( t ) {}
mutable T item;
};
template< class Container >
auto_any< typename Container::const_iterator >
begin( Container const & c )
{
return c.begin();
}
Now, in our FOREACH macro we can say:
#define BOOST_FOREACH( item, container ) \ auto_any_base const & iter = begin( container ); \ ...
begin( container ) returns the begin iterator,
wrapped in an auto_any temporary object. That
temporary object gets bound to iter, which is a
const reference to auto_any_base, the empty
base class. This gives the temporary object a reprieve of
sorts—its lifetime is extended, along with the
iterator stored inside.[2]
As the name "auto_any" suggests, this is a
general mechanism for putting an object of unknown type in
automatic storage (i.e. it is not dynamically allocated
using new).
We have succeeded in storing an iterator, but we have lost
its type information. iter is a const reference
to an auto_any_base—that tells us nothing about what
is actually stored in there. It could be a vector
iterator, or it could be your Auntie May's
Christmas fruit cake. If we want to be able to increment the
iterator, for instance, we need to recover the lost type
information.
next() below does just that.
// extract a reference to the internally stored item
template< class T >
T & auto_any_cast( auto_any_base const & base )
{
return static_cast< auto_any< T > const & >( base ).item;
}
// infer the type of the iterator
template< class Container >
void next( auto_any_base const & iter, Container const& )
{
++auto_any_cast< typename Container::const_iterator >( iter );
}
The function auto_any_cast() takes a const
reference to an auto_any_base and casts it to
the requested derived type. It uses a
static_cast for this; this is kosher because we
are allowed to static_cast from a base to a
derived type. It returns a reference to the contained piece
of data. The function next() uses
auto_any_cast() to access the iterator and
increment it. We use next() by passing it both
iter and the container, so that
next() can deduce the type of the container and
infer the type of the iterator. If you recall, the
begin() function defined above puts a
Container::const_iterator into the
auto_any. The function next() must
follow suit by casting back to a
Container::const_iterator. Notice that the
container is not actually used by next() for
anything besides template argument deduction. The snippet
below shows how this all hangs together.
#define FOREACH( item, container ) \ auto_any_base const & iter = begin( container ); \ ... \ next( iter, container ); \ ...This works very well as long as container is something simple, like a variable of type
vector. But
imagine what happens if container is a function
call. Since container appears twice, it will
get called twice. And if the function has side-effects, all
heck breaks loose. This is the crux of the problem. It is
especially galling considering that in the call to
next(), we don't even need the container; we
just need its type. There's gotta be a better way!
|
Sponsored Links
|