Built-in Type Safety?

by Thomas Guest

June 25, 2005

Summary

C++ is a statically typed language but its type system is not bulletproof. This article reveals some all-too-common type glitches and how to fix them.

Built-in Type Safety?

Introduction

C++ is statically typed. Programmers inform their compiler about the types they are using. In return the compiler warns them of any type abuse it detects. Static typing doesn't cure all known defects, of course. On occasion, a defect which appears type-related can slip through the type system undetected. This article tells the true story of two such defects. It discusses how they might have been avoided and concludes with some more general advice.

Defect One

Signal Quality

The following function lives somewhere in the middleware of the Talk-To-Me wearable computer (see Sidebar below). The Talk-To-Me, when active, polls a hardware driver for signal quality information. This driver returns a Signal, which is a simple typedef for a built-in integral type. Some variation in the signal is to be expected, depending on location, atmospheric conditions and so on; but if a more substantial change occurs then the new value must be stored in RAM and written to flash memory. That's what this function does.

The Talk-To-Me Communicator

The Talk-To-Me communicator is a wearable computer. Despite its limited size and processing power it runs a range of software, including an email client, a personal organiser, and a cut-down web browser. It is an imaginary device used here to fictionalise the defects described in this article, which occurred, in real life, on a similar platform.

    /**
     * Handle a signal update.
     * If a change in the signal is detected, store the new value
     * and write it to flash. 
     */
    void
    signalUpdate(Signal update, Signal & stored)
    {
        int const tolerance = 10;
        
        bool const changed 
            = update > stored + tolerance
              ||
              update < stored - tolerance;
        
        if (changed)
        {
            flashWriteSignal(update);
            stored = update;
        }
    }

Maybe this function isn't perfect—perhaps the detection of change and subsequent action should be subfunctions, and certainly the numeric literal requires explanation—but it looks sound enough. In fact, it is so apparently sensible that the Talk-To-Me engineers are surprised when field tests reveal problems.

Problems

Writing to flash is expensive. It takes time, and, eventually, wears out the hardware. Field tests show that in poor signal conditions the Talk-To-Me is writing to flash far too often. The signalUpdate() function is to blame.

Attempted Solutions

Flash is only written to when the signal deviates by a certain amount from the previous stored value. Evidently the driver layer is reporting greater fluctuation than expected. The obvious solution is to bump up the magic number.

    int const tolerance = 20; // Used to be 10, but this caused
                              // too many calls to flashWriteSignal()

The changes are made, a patch is issued, field tests resume. Unfortunately, the problem persists—if anything, it seems worse.

There is only one way out:

    if (changed)
    {
        // flashWriteSignal(update); Avoid excessive writes to flash
        //                           Works around driver problems.
        stored = update;
    }

This rather drastic step means that the Talk-To-Me will lose settings when power-cycled, which isn't ideal, but at least a show-stopping defect can be downgraded to medium priority. Flaky hardware drivers get the blame. Field tests continue. The signalUpdate() function is left in this rather messy state.

Unit Tests

Once the panic has passed, we can refactor the function in order to test it:

    void
    signalUpdate(Signal update, Signal & stored)
    {
        int const tolerance = 20; // Used to be 10
        
        if (signalsDifferent(update, stored, tolerance))
        {
            // flashWriteSignal(update); Avoid excessive writes to flash
            stored = update;
        }
    }

In a new file, then, we have:

    ....

    /**
     * @return True if the input signals differ by more than the supplied
     * tolerance, false otherwise.
     */
    bool
    signalsDifferent(Signal s1, Signal s2, int tol)
    {
        return 
            s1 > s2 + tol || s1 < s2 - tol;
    }

The associated unit tests read:

    void testSignalsDifferentBoundaryCase()
    {
        assert(!signalsDifferent(0, 0, 0));
    }
    
    void testSignalsDifferentTypicalSignal()
    {
        assert(!signalsDifferent(70, 80, 10));
        assert(!signalsDifferent(80, 70, 10));
    }
    
    void testSignalsDifferentPoorSignal()
    {
        assert(!signalsDifferent(10, 10, 20));
    }

Test Failure

The testSignalsDifferentPoorSignal() test case fails. At last the real problem is evident.

For the purposes of this article I have been withholding a critical detail: namely which built-in integral type a Signal is. Inspecting the relevant header file shows that:

    /**
     * Typedef used to represent signal quality as a percentage.
     * 100 represents perfect signal, 0 no signal.
     */
    typedef unsigned Signal;

So, the expression in signalsDifferent(10, 10, 20) evaluates: 10u > 10u + 20 || 10u < 10u - 20 Now, when you subtract an int from an unsigned both values are promoted to unsigned and the result is unsigned. So, 10u - 20 is a very big unsigned number. Our expression is therefore equivalent to:

    false || true

which is of course true.

In fact, signalsDifferent() returns true whenever the second input Signal dips below the tolerance. So, if the signal quality drops below 20%, flash memory is written to every time the driver is polled. Hence the bug reported during field tests.

History

Rewinding the header file shows that once upon a time we had:

    /**
     * Typedef used to represent signal quality as a percentage.
     * 100 represents perfect signal, 0 no signal.
     */
    typedef int Signal;

The code worked just fine back then. Unfortunately the C++ type system offered no help when someone decided that an unsigned was more appropriate for a value in the range [0, 100]. No compiler warnings were, in this case, bad news.

Perhaps it would be more accurate to say that the C type system failed to help since this is one of those areas where C++ finds itself compromised and constrained by its C heritage. The integral promotion rules are well and truly entrenched in the common core of the two languages.

Solutions

There are many ways to fix this problem.

We could revert the header file and curse the engineer who violated the fundamental rule about fixing things which ain't broke. This approach, though, risks destabilising any newer code which assumes a Signal is unsigned.

We could recast the arithmetic so it works independently of signedness:

    bool
    signalsDifferent(Signal s1, Signal s2, int tol)
    {
        return 
            s1 > s2 + tol || s2 > s1 + tol;
    }

Note, though, that even this version of the function may catch us out if the additions overflow.

Or we could lean on the type system and get it working for us. The driver layer is written in C and may well prefer to offer the middleware built-in types but that doesn't mean the C++ middleware shouldn't use something safer: a range checked integer class (such as the one by Hubert Matthews [1]) perhaps:

    typedef CheckedInt<0, 100, SaturatedArithmetic> SignalPercent;

The perfect solution will have to be left to the reader. I'm not sure there is one. Finding the problem was the hard part: we really want to avoid similar problems in future.

Creating more specialised classes is no panacea. Such classes often end up implementing cast operators for ease-of-use which can make them susceptible to the very problems they're meant to prevent—only now we have an extra layer of abstraction to wade through. Matthews' implementation is sound, and the accompanying text describes the subtleties involved in getting such an apparently simple class right. Unfortunately it does not work out of the box on the Talk-To-Me, whose C++ compiler is closer to the embedded C++ specification than ISO/IEC 14882: 2003[2]).

Lessons Learned

There are several lessons we can learn from this.

Invest in unit tests. They catch bugs in even the simplest code.
Find a way to simulate poor signal conditions back in the lab, rather than waiting for field tests to throw up this kind of problem. This principle generalises: make sure your system tests simulate a full range of operating conditions.
If the types used in an interface must change in this way, don't make the changes backwards compatible. So, typedef int Signal would migrate to typedef unsigned SignalPercent rather than change silently. This means client code has to at least notice the change.
Take special care with the built-in types. They are designed to provide the most efficient code you can get on a platform short of writing machine code and, as such, are inevitably a source of portability problems. Although the standard lays down what you can expect, what you can expect often comes down to "implementation defined". When will an int overflow? What happens when you right shift a negative signed value? Is a char signed or unsigned? (Earlier, when I said that 10u - 20 was a very big unsigned number I wasn't being vague: the actual value is platform dependent.)

Defect Two

The Problem

Despite its tiny screen, the Talk-To-Me provides a rather neat email client. Users can navigate up and down a list of emails or view the selected email full-screen.

Once again, system testing indicates a problem.

The full-screen email view doesn't make use of the full screen. Text is being wrapped a few pixels short of the right hand boundary. This wouldn't matter much for a PC-based email client, which typically grabs plenty of space for toolbars, sliders and so forth—but on the Talk-To-Me's 288 by 256 pixel display, this is a big shame.

The Cause

The Talk-To-Me engineers are less suspicious of the driver layer this time. Other parts of the user interface use the full extent of the screen just fine.

Once investigated, the problem is swiftly tracked down to the following piece of code:

    /**
     * Render email full screen
     */
    void Email::fullScreen() const
    {
         Rectangle const 
             full_screen(0, 0, 288, 256);
         
         textRender(text,
                    full_screen,
                    true); // wrap text
    }

Here, textRender() requires as input:

the text data to render
the display region (a reference to a constant Rectangle)
a flag (which defaults to false) indicating whether or not the text should be wrapped.

Analysis

The numeric literals 288, 256 are not pretty but the Talk-To-Me comes with no other display options. The problem here is that the Rectangle constructor wanted them the other way round.

    class Rectangle
    {
    public: 
        /**
         * Construct a rectangle with the supplied input dimensions.
         */
         Rectangle(int top,
                   int left,
                   int bottom,
                   int right);

     ....

     };

So, the email was rendered to a width of 256 rather than 288 pixels.

The programmer makes the fix, feels ashamed (this has happened before), but won't get fooled again (fingers crossed).

Solutions

It's rather difficult to see how to guard against this sort of problem. The Rectangle class is fine; the Email class is using it in a reasonable way.

A unit test would have to be cunning. A system tester has to watch carefully.

You could argue—and I would agree—that the full screen rectangle ought to be defined just once and passed around as needed. That should sort out this particular problem and maybe a few similar ones. However most rectangles aren't full screen ones, so the problem hasn't been eradicated.

You could also argue that the type system could again be used to help—isn't it a type error to pass parameters in the wrong order?—but it's hard to see how X and Y coordinates could sensibly be made different types.

You can name your inputs:

    int const top    = 0;
    int const left   = 0;
    int const bottom = 256;
    int const right  = 288;

    Rectangle const 
        full_screen(top, left, bottom, right);

or resort to comments:

    Rectangle const 
        full_screen(0,    // top  
                    0,    // left
                    256,  // bottom
                    288); // right

These solutions are fragile. The compiler doesn't check comments for accuracy nor will it spot if your nicely named parameters match those declared in the constructor.

An Idea from Python

In Python, we might define a rectangle's initialiser as follows:

    class PyRect:
        '''Axis aligned rectangle'''

        def __init__(self, top, left, width, height)
            '''Initialise with the given position/dimensions'''
            ....

I have no particular reason for using width, height rather than right, bottom other than to indicate that there is no canonical way of specifiying an axis-aligned rectangle; if there were, the bug I'm discussing would have been less likely to trouble us.

We can now create PyRect objects:

    r1 = PyRect(0, 0, 288, 256)
    r2 = PyRect(top=0, left=0, width=288, height=256)
    r3 = PyRect(left=0, top=0, height=256, width=288)

Here, r1, r2 and r3 are equivalent. r2 and r3 are created using keyword argument syntax, which allows us to pass in parameters in the order we chose.

In C++, we can do something which superficially resembles this, allowing us to create Rectangles without fretting over parameter order.

    class Rectangle
    {
    public:
        /**
         * Default constructor builds an empty 
         * rectangle based at (0, 0).
         */
        Rectangle()  
        : top(0), left(0), width(0u), height(0u)
        {
        }
        
        /**
         * Set the rectangle top coordinate.
         * @param new_top
         * @returns a reference to self (for use
         * in method chaining).
         */
        Rectangle & top(int new_top)
        { 
            top = new_top;
            return *this;
        }
        
        Rectangle & left(int coord)
        .... 
                  
    private:              
        int top, left;
        unsigned width, height;
    };

We can construct objects of this class as follows:

    Rectangle r1 = Rectangle()
                       .height(256)
                       .width(288);
    
    Rectangle r2 = Rectangle()
                       .top(0)
                       .left(0)
                       .width(288)
                       .height(256);

    Rectangle r3 = Rectangle()
                       .width(288)
                       .height(256);

This technique is known as the Named Parameter Idiom [ 3].

An Idea from Smalltalk

Alternatively, we can adopt an idea originally documented in Smalltalk [ 4] and create wrapper types for the parameters required by the Rectangle constructor:

    class Xcoord 
    {        
    public:
        explicit Xcoord(int x);
    ....
    };

Having set up similar wrapper classes for Ycoord, Width, Height, we declare the Rectangle constructor:

    class Rectangle
    {
    public:
        Rectangle(Xcoord x,
                  Ycoord y,
                  Width w,
                  Height h);
    ....
    };

and the C++ type-system is once again working for us:

    Rectangle const full_screen(Xcoord(0),
                                Ycoord(0),
                                Width(288),
                                Height(256));

This technique is known as the Whole Value Pattern.

Note that with our strongly type-differentiated arguments we could, if we wanted, overload the constructor, allowing clients to supply parameters in a different order.

True Typedefs

A problem with the Whole Value Pattern approach is that we may end up creating different classes which essentially do the same thing (that is, wrap one of the built-in integral types). The Xcoord, Ycoord, Width and Height classes all need to implement the same arithmetical operators, and they must all allow access to the base value; in effect all we have done is replicate the same class to give it four distinct types.

(Note that if this really does turn out to be the case, then we've probably failed to identify the values used by our program correctly: the Whole Value Pattern isn't just about type-safety, it's also about identifying and properly encapsulating the separate value-based types in our problem domain.)

This problem is discussed in more depth in an article by Mark Radford [5], which identifies three candidate mechanisms for generating the required class families:

the C++ preprocessor
C++ templates
an external (non-C++) program—typically a script integrated into the build system

The STLSoft library [6] implements the second mechanism with its true_typedef class template [7], which allows us to define our wrapper classes as follows:

    #include <stlsoft/true_typedef.hpp>
    
    // Tag types
    struct Xcoord_t;
    struct Ycoord_t;
    struct Width_t;
    struct Height_t;
    
    typedef stlsoft::true_typedef<int, Xcoord_t> Xcoord;
    typedef stlsoft::true_typedef<int, Ycoord_t> Ycoord;
    typedef stlsoft::true_typedef<unsigned, Width_t> Width;
    typedef stlsoft::true_typedef<unsigned, Height_t> Height;

Here, declaration and use of the Rectangle constructor remain as before; the STLSoft header takes care of the rest.

Class Redesign

Even without named parameters, whole values or true typedefs, we can improve on things.

I mentioned before that there is no canonical way to construct a rectangle. There is, however, a canonical way to construct a point on a display screen— that is, to supply its X and Y coordinates, in that order. So, if we write:

 Point const bottom_right(288, 256);

we can be confident our Point really will be at the bottom right hand corner of the screen.

Instead of using coordinates and coordinate differences, we might, then, prefer to construct our Rectangle from two of its corners:

    class Rectangle
    {
    public Rectangle(Point const & top_left_corner,
                     Point const & bottom_right_corner);

    ....
    };

Anyone with a basic grasp of geometry will realise that if we wish to define a rectangular region on-screen from two corners, then those two corners must be diagonal opposites—just think of creating a rectangle in a graphical drawing package by anchoring one corner then dragging the other corner into position. There are eight permutations:

(Top Left, Bottom Right),
(Bottom Right, Top Left),
(Top Right, Bottom Left),
etc.

Of these, the first is the outstanding favourite, even without looking at the class documentation. What's more, a little simple logic in the Rectangle constructor allows it to make sense of the other seven, perhaps outputting a warning if it does not receive the expected pair.

So, then, although there is no one standard way to construct a rectangle, one of the standard ways is less vulnerable to misuse!

Conclusions

There may be some programmers who have been careful (or lucky) enough not to have been caught out by these problems. I suspect far more will know what I'm talking about from personal experience.

There are other points of weakness in the C/C++ type system. Enumerated values and booleans get mistaken for integral types all too easily. And what's worse, in a mixed language system, such as the one deployed on the Talk-To-Me, they can change size and alignment when passing from C to C++.

This article has offered a few survival tips already. I would like to conclude by adding a few more. There's nothing here which hasn't been said before, but I think these bear repeating in the light of the preceding.

Build Cleanly at High Warning Level

You don't want those warnings about "truncation, possible loss of data" to get submerged. They need attention. Preferably, build cleanly with more than one compiler. Ideally, have a Lint build target.

(Learn to) Use a Decent Editor

All good code editors have some notion of a source file in a wider context—as part of a project, for example. So when you write code to construct a rectangle you have immediate access to the declaration of the Rectangle's constructor. This makes it harder to submit parameters in the wrong order.

Invest in Unit Tests

I've already mentioned that unit tests can catch problems in even the simplest code. A unit testing regime can also improve the design of your code. To put a unit under test, it needs to be free of dependencies.

The source code presented in this article has been considerably simplified for expositional purposes. I do not think it a coincidence that the actual defective code was buried in the middle of rather more complicated functions.

Use Keyword Casts

I hope by now it is clear that if you really do need to cast between types you should:

do so intentionally
know exactly what you're doing.

C++ provides four different keyword cast operators. These allow programmers to write code which indicates clearly what kind of cast is being done, and why.

And Finally

Fixing defects can be fun. It can be strangely satisfying when, after many hours of setting breakpoints, watching variables, poring over log files, you finally expose the broken logic, the flawed arithmetic, the race-condition. (And how often does the fix turn out to be a one-liner, even a single character change?) Unfortunately the fun soon wears off as release dates approach, to be replaced by fear, panic, and the sense of despair which accompanies all hacky workarounds.

This article has focused on some C++ techniques to circumvent a couple of simple defects. Our best protection, is, however, language independent. It's down to the way in which we approach software development: and that will have to remain the subject of

Acknowledgements

I would like to thank the editorial team at The C++ Source for their help with this article.

References

CheckedInt: A policy-based range-checked integer, Hubert Matthews, http://www.oxyware.com/Publications.html.
The Embedded C++ Specification, http://www.caravan.net/ec2plus/language.html
C++ FAQ, The Named Parameter Idiom, http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.18
Whole Value, Ward Cunningham, http://c2.com/ppr/checks.html
Design Experiences in C++ - Generative Programming Goes Live!, Mark Radford, http://www.twonine.co.uk/documents.html
STLSoft, , http://stlsoft.org/
"True Typedefs", Matthew Wilson, Imperfect C++, Chapter 18, http://www.imperfectcplusplus.com/
The Boost Parameters Library , http://boost.org/ (As of June 2005, this library does not form part of the Boost 1.32.0 release, but can be downloaded directly from the CVS repository.)

Talk back!

Have an opinion? Readers have already posted 14 comments about this article. Why not add yours?

Introduction

Defect One

Signal Quality

Problems

Attempted Solutions

Unit Tests

Test Failure

History

Solutions

Lessons Learned

Defect Two

The Problem

The Cause

Analysis

Solutions

An Idea from Python

An Idea from Smalltalk

True Typedefs

Class Redesign

Other Problems

Conclusions

Build Cleanly at High Warning Level

(Learn to) Use a Decent Editor

Invest in Unit Tests

Use Keyword Casts

And Finally

Acknowledgements

References

Talk back!

About the author