The research described in this paper could be extended in several ways:
operator+, how
would clients do this in an expression such as “a + b” without compromising the
natural syntax that is the primary motivation for overloading operators?
Support for feature constraint checking on operators remains an open
question.
struct BasicGuarantee {};
struct StrongGuarantee: BasicGuarantee {};
struct NoThrowGuarantee: StrongGuarantee {};
It would be interesting to modify code feature constraint checking to offer support for such relationships.
AllCodeFeatures.
The current design assumes the existence of a container holding all possible feature classes (i.e., AllCodeFeatures). A more flexible
approach would do away with such a container, thus allowing different groups of
developers to independently define their own sets of feature classes that, absent name
conflicts, would work seamlessly in combination. The conventional approach to this
problem would be self-registration of feature classes, but conventional self-registration
takes place at runtime, and what's needed in this case is a compile-time approach.
I am unaware of other work facilitating the definition and combination of arbitrary code features, but the general issue of imposing restrictions on function invocation has certainly received attention. Bolten, for example, has described how the use of intermediate classes makes it possible to increase the granularity of friendship in C++ [6], and Alexandrescu has demonstrated how C++'s type system can be employed to enable compile-time detection of race conditions in multithreaded systems [4, 5]. Also related is Perl's Taint Mode [19], which tags data from outside sources and restricts the operations that may be performed with it. In contrast to the static checking underlying the work described in this paper and in the publications by Bolten and Alexandrescu, Perl's Taint Mode is enforced dynamically.
This paper describes a mechanism that allows users to define arbitrary sets of code features and to ensure during compilation that invoked functions offer all the code features callers require. The design takes advantage of C++'s template metaprogramming capabilities as embodied in the Boost MPL library. It applies to member functions, non-member functions, and function templates, but not to operators.
During my work on the research described in this paper as well as the paper
itself, I was the beneficiary of assistance from many people. The members of the
Boost Users mailing list provided invaluable help in learning how to apply the
MPL; Steven Watanabe went so far as to contribute the essence of the
MakeFeatures implementation [21]. Herb Sutter suggested improvements
to my original design that removed restrictions and laid the groundwork for
additional refinements that ultimately yielded a fully compile-time-checked
design [20]. Andrei Alexandrescu, Eric
Niebler, Bartosz Milewski, and Hendrik Schober provided useful comments on
earlier versions of this paper, including the exhortation to find a way to
eliminate the runtime checking I employed for virtual functions at that time.
Andrei suggested the use of inheritance as the basis of implicit conversions
among feature set types, and he also suggested the use of overloading to allow
virtual overrides in derived classes to offer more code features than their
base class counterparts. Steve Dewhurst suggested using positions in a master
type container as the basis for canonically ordering sequences of feature
classes.
When working with code features and the constraints they lead to, it can be convenient to refer to red code and green code. Red code lacks the feature(s) in question, hence is unconstrained: it can call any other functions. Green code offers the feature(s) being considered and is constrained to call only other functions that also offer the feature(s), i.e., it requires the feature(s) in the functions it calls.
When I was initially confronted with the problem of finding a way to keep green code from calling red code without an explicit syntactic indication (e.g., a cast), template metaprogramming (TMP) was not the approach that came to mind. I thought instead of namespaces. My idea was that red and green code could be separated into different namespaces, with the green code imported into the red namespace, but not vice versa. That would allow red code to call green code without adornment, but green code could call red code only with an explicit namespace qualification [15]. For example:
namespace green {
    void greenFunc();        // callable by anybody, but can call only other code in this namespace
}
namespace red {
    using namespace green;   // all green code is available to red code
    void redFunc();          // callable only by unconstrained code, but can call anything
}
void green::greenFunc()
{
    redFunc();               // error! Red code not visible in green namespace
    red::redFunc();          // okay – call explicitly allowed
}
void red::redFunc()
{
    greenFunc();             // okay
}
This approach quickly falls apart. It doesn't work for global functions, because they're not in a named namespace. If a green function makes an unqualified call to a red function with an argument whose type comes from the red namespace, C++'s argument-dependent lookup [11, 22] will cause the function in the red namespace to be found, thus circumventing the constraint checking the namespaces are supposed to provide. In addition, constraints may occur in arbitrary combinations, but namespaces must nest, and I was unable to envision a way to model arbitrary combinations of constraints using nested namespaces.
My next idea was to apply a technique akin to that used by Barton and Nackman to enforce dimensional unit correctness during compilation [8]. Their approach is based on associating information with objects, however, whereas I needed a way to associate it with functions, and it was not clear how their approach could be modified to bridge this difference.
The need to control functions got me to thinking about the use of enable_if technology to
enable and disable the applicability of functions for particular calls [13]. Unfortunately, there was a
semantic mismatch between what I wanted to do and what enable_if is designed to achieve. My
goal was that calls from constrained to unconstrained code should not compile, but when the
condition controlling an enable_if-guarded function is unsatisfied, the function is simply
removed from the overload set, i.e., from the set of candidate functions considered for the call.
The call itself might still compile, because overload resolution might succeed with a different
function.
An additional problem with enable_if is that it doesn't apply to functions, only to function
templates. This makes it unsuitable for virtual functions, because they may not be templatized.
It also leads to the possibility of code bloat, because function templates with different enable_if
arguments could, through multiple instantiations, lead to multiple copies of identical object
code. This problem is one I overlooked during my initial design work, and my first
implementation of code constraints [18], though not based on enable_if, did assume that all constrained functions were templates.
Unsatisfied with enable_if's behavior, I turned my attention to traits as a mechanism for
associating constraint information with functions. Traits are primarily employed to map types to
information, but they can associate information with values, too, so I considered using function
addresses as such values. I abandoned this idea, however, in part because it was not clear how to
deal with function templates (which generate multiple functions, hence multiple addresses), in
part because traits would physically separate the constraints for a function from the function's
declaration(s), and a function's constraints are a key part of its interface. As such, it's important
that they be documented at or near the point where the function itself is declared.
I then noticed that compile-time dimensional analysis, enable_if, and traits had something in
common: they were all based on template metaprogramming. That led me to ponder whether
TMP in general and the MPL in particular could be used to solve the code constraint problem,
and that, in conjunction with the observation that iterator categories in the STL are represented
by empty classes, was the genesis of the design described in this article.
Discuss this article in the Articles Forum topic, Enforcing Code Feature Requirements in C++.
Scott Meyers, an independent consultant, is the author of Effective C++, More Effective C++, and Effective STL; author and designer of Effective C++ CD; Consulting Editor for Addison Wesley's Effective Software Development Series; and was a founding member of the Advisory Board for The C++ Source. He has a Ph.D in Computer Science from Brown University. He can be contacted at smeyers@aristeia.com.
On the surface, Scott's code features seem to be tied to their C++ implementation. It turns out that they can be translated into at least one other programming language. I took the challenge of implementing them in D, a relatively new general purpose language loosely based on C++ (for details, see http://www.digitalmars.com/d). What makes D a good candidate for the task is its extensive and well integrated support for metaprogramming. Metaprogramming in D does not require:
Instead, a D compiler has a built-in D interpreter that can execute a substantial subset of D at compile time.
A metaprogram is a program that generates a program. In D you can generate a program in the form of a string. The string can then be converted to actual D code using a “string mixin”—all at compile time. For instance, this code:
mixin ("int x;");
is equivalent to:
int x;
The D implementation of the main article's code features is based on generating a string containing the definition of a hierarchy of types as in Figure 3. The crucial idea, suggested to me by Andrei Alexandrescu, was to use D interfaces rather than classes. D does not support multiple inheritance of classes, but interfaces can be multiply inherited and their inheritance is virtual.
Client code that defines a set of code features looks like this:
mixin (declareAllFeatures (["Tested", "Portable"]));
The function declareAllFeatures() is run at compile time. It takes an array of feature names and
generates a string with interface declarations. Here's the string corresponding to the above
example (complete with newlines for easier debugging):
"interface Portable: Portable_Tested {}
interface Tested: Portable_Tested {}
interface Portable_Tested {}
interface NoFeatures: Portable,Tested {}"
Incidentally, the same string can be generated and printed at run time using this line of code:
writeln (declareAllFeatures (["Tested", "Portable"]));
Such run time/compile time duality makes D metaprograms easy to test and debug.
Continuing with the client code, here's how you declare a function that guarantees "Portable" and "Tested":
void bar (ToType!(MakeFeatures (["Portable", "Tested"])) x)
The function MakeFeatures creates a string "Portable_Tested", which is converted to a D type
using the template ToType. Notice that Portable_Tested is one of the interfaces declared using
declareAllFeatures above.
The client may call the function bar with a particular set of requirements, which are declared
using MakeFeatures. For instance,
ToType!(MakeFeatures (["Tested"])) tested; // require Tested
bar (tested);
Notice that the interface Tested inherits from Portable_Tested, so this call will compile
successfully.
Just to give you a taste of compile-time programming in D, here's the implementation of
MakeFeatures:
string MakeFeatures (string[] a)
{
    if (a.length == 0)
        return "NoFeatures";
    else
        return ctConcat (ctSort (a), '_');
}
It takes an array of strings (names of features). If the array is empty, it generates NoFeatures,
the name I gave to the interface corresponding to the bottom class in Figure 3. Otherwise it sorts
the array (compile-time sort) and concatenates its elements using the underscore as separator.
Here's the implementation of the compile-time concatenation function, without the separator option for simplicity:
string ctConcat (string [] arr)
{
    string result = "";
    foreach (s; arr)
        result ~= s;
    return result;
}
It's pretty self-explanatory if you know that the operator tilde is used to concatenate arrays (strings in this case). Notice that local variables and loops are okay at compile time.
Compilation times for the D implementation are negligible for up to seven features. The compilation of eight features took two minutes, and the compiler ran out of memory at nine features.
The source code of the full D implementation is available at http://www.bartosz.com/features.
Bartosz Milewski is a member of the D design team.
const,” presentation to the Northwest
C++ Users Group, April 2007. Video available at http://video.google.com/videoplay?docid=-4728145737208991310&hl=en, presentation materials at http://www.nwcpp.org/Downloads/2007/redcode_-_updated.pdf.