As Simple As Possible?

An Editorial

by Chuck Allison

August 17, 2005

Summary

While the C++ Standards committee is about midway through formulating the next official version of C++, Chuck ponders the relationship between power and complexity.

I recently spent a month teaching advanced C++ and introductory Python at the world's fourth largest software company. The C++ students were sharp, experienced developers. The Python students were talented testers but fairly new to development. Naturally I couldn't keep myself from comparing these two important languages, as well as the two teaching experiences overall.

This was the first time I taught Python to non-programmers; my usual audience is my senior Analysis of Programming Languages class. It was also the first time I ever had students thank me as a class for each day's experience; they sincerely thanked me after each of the eight sessions, and even applauded on the last day. While I'd like to think that they were applauding me, I know that they were also applauding Python itself. It is easy to learn, yes, but it is also very powerful. Lists, tuples, and dictionaries are built-in abstractions, and the syntax is so clean it squeaks. On day one we were doing what it takes a std::map to do in C++ (and remember these were non-programmers). To show Python's object-oriented power, here's a nifty little example I got from Kent Johnson on one of the python.org mailing lists.

Like Smalltalk, Python supports class methods, which differ from static methods in that the exact dynamic type of the object being operated on is an implicit parameter to the method (as a type object). The following example keeps a separate per-class object counter for every class in a hierarchy (in this case, a Shape hierarchy):

class Shape(object):
    _count = 0  # A shared value for Shape classes with no current objects
        
    @classmethod
    def _incr(cls):
        cls._count += 1         # Create/update class attribute
      
    @classmethod
    def showCount(cls):
        print 'Class %s has count = %s' % (cls.__name__, cls._count)
        
    def __init__(self):         # A constructor
        self._incr()

class Point(Shape): pass        # An empty subclass of Shape
class Line(Shape): pass         # Ditto

This takes some perusing if you're new to Python. Python requires indentation to show logical subordination (whereas good programmers of other languages indent by convention), so you can readily see that the class Shape has three methods: _incr, showCount, and __init__. The last of these is the special name used for constructors. The other two methods are class methods (indicated by the @classmethod prefix), so their parameter is the unique class object for the instance on which they are invoked. (The term "class object" here refers to a unique, bona fide object that describes a class type, similar to, but more robust than, std::type_info objects in C++.) The method named _incr is only called by the constructor.
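The difference between static methods and class methods can be seen in a small sketch of my own. The names Base, Derived, static_name, and class_name are hypothetical and do not come from Kent Johnson's example, and the parenthesized print form is used only so the snippet runs under both Python 2 and Python 3:

```python
class Base(object):
    @staticmethod
    def static_name():
        # A static method receives no implicit argument at all,
        # so it cannot tell which class it was invoked through.
        return 'Base'

    @classmethod
    def class_name(cls):
        # A class method receives the class object it was invoked
        # through (or the class of the instance) as cls.
        return cls.__name__

class Derived(Base):    # Hypothetical empty subclass
    pass

print(Base.class_name())        # Base
print(Derived.class_name())     # Derived
print(Derived().class_name())   # Derived (called through an instance)
print(Derived.static_name())    # Base (no class information available)
```

The third call is the one the Shape constructor relies on: invoking a class method through an instance still passes the instance's class, not the class where the method was defined.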

Now consider what happens when the following lines execute:

p1 = Point()
p2 = Point()
p3 = Point()
Point.showCount()
Line.showCount()
x = Line()
Line.showCount()

The variable p1 is bound to a new instance of Point, whose constructor calls self._incr(). (The implicit parameter I have named self is just like this in C++, except it must appear explicitly in the method's parameter list.) Since _incr is a class method, it implicitly receives the class object for Point in its cls parameter, p1 being bound to a Point object. The statement

cls._count += 1

is equivalent in this case to

Point._count = Point._count + 1

The first time _incr executes on behalf of the Point class object, the attribute Point._count doesn't exist, so the expression Point._count on the right-hand side of the assignment accesses Shape._count (which is 0) by inheritance. The assignment then creates a new attribute named _count inside the Point class and initializes it to 1. At this point, Point._count is 1, and Line._count doesn't yet exist (all attributes are bound at runtime in Python). When p2 is created, Point._count already exists, so it is incremented. Calling Line.showCount() the first time refers to cls._count, which is really Line._count, but no such attribute exists yet in the Line class, so the _count attribute in Line's base class, Shape (still 0!), is used. When the Line variable named x is created a few lines down, Line._count is created and initialized to 1, just as Point._count was earlier. The final output of this code is:

Class Point has count = 3
Class Line has count = 0
Class Line has count = 1
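To watch the attribute being created, you can inspect each class's own namespace through its __dict__ attribute. The following is a sketch under my own assumptions: it uses a trimmed copy of the Shape hierarchy (showCount is omitted) and parenthesized print calls so that it runs under both Python 2 and Python 3:

```python
class Shape(object):
    _count = 0                  # Shared default, found by inheritance

    @classmethod
    def _incr(cls):
        cls._count += 1         # Reads via inheritance, writes to cls itself

    def __init__(self):
        self._incr()

class Point(Shape): pass
class Line(Shape): pass

# Before any Point exists, Point has no _count of its own.
print('_count' in Point.__dict__)       # False

p1 = Point()

# The augmented assignment has now created Point._count.
print('_count' in Point.__dict__)       # True
print('_count' in Line.__dict__)        # False
print('%s %s %s' % (Point._count, Line._count, Shape._count))   # 1 0 0
```

Line._count still resolves to Shape._count here, which is why the first call to Line.showCount() in the article's listing reports 0.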

After three years of study, I have concluded that Python is about as simple as a full-powered object-oriented language can get. My inner programmer just loves it. In a recent interview Scott Meyers was asked which language he thought would be ideal for introducing programming to novices. He replied:

"... a first language should foster a feeling of power and accomplishment out of the box -- it should get people excited about the limitless things that can be accomplished in software. Among other things, such a language would offer an extensive and powerful library..." [1]

I know of no language to which these words apply more than Python.

A C++ implementation of this counting idea is none other than the canonical example of the Curiously Recurring Template Pattern (aka a Self-parameterized Base Class), and goes like this:

// A base class that provides counting
template<class T> class Counted {
  static int count;
public:
  Counted() {
    ++count;
  }
  Counted(const Counted<T>&) {
    ++count;
  }
  ~Counted() {
    --count;
  }
  static int getCount() {
    return count;
  }
};

template<class T> int Counted<T>::count = 0;

// Curious class definitions
class CountedClass : public Counted<CountedClass> {};
class CountedClass2 : public Counted<CountedClass2> {};

Both CountedClass and CountedClass2 have inherited their own copy of a variable named count, because their base classes are different; e.g., CountedClass inherits from Counted<CountedClass>. This seems circular at first, but it works because the size of Counted<T> does not depend on T. The following test driver verifies that things work as expected:

#include <iostream>
using namespace std;

int main() {
  CountedClass a;
  cout << CountedClass::getCount() << endl;    // 1
  CountedClass b;
  cout << CountedClass::getCount() << endl;    // 2
  CountedClass2 c;
  cout << CountedClass2::getCount() << endl;   // 1
}

Which version do you find easier to grok? If you're not used to dynamically-typed languages, you may prefer the C++ version, but I think the Python version has a more out-of-the-box feel. (C++ novices don't learn templates, let alone CRTP, in their first week.)

Nonetheless, there are certain applications I wouldn't dream of writing in Python (or in Java or C#, for that matter). When efficiency is an issue, and you need ultimate flexibility, C++ is the only object-oriented (multi-paradigm, even) game in town. That's certainly one reason it's still so heavily used. That's why I was summoned to Symantec to teach it this summer. Their C++ developers do heavy lifting that the Python folks couldn't begin to do. I don't need to convince this audience.

Einstein's maxim, "Solutions should be as simple as possible, but no simpler," is worth pondering here. If a language is too simple, it will lack the power needed to craft industrial-strength applications (try solving systems of differential equations efficiently in Forth or Visual Basic). Any tool or discipline rich enough to solve any software problem within real-world resource constraint parameters is probably not "too simple." Likewise, any such tool or discipline will likely come with a steep learning curve, just like non-Euclidean geometry is as easy as it's ever going to get [2] :-).

Since my wife is a hand weaver, comparing languages to looms readily comes to mind. A loom has warp threads running vertically from the weaver's point of view. These threads are attached in varying configurations to a number of shafts; lifting a shaft lifts its attached threads, so when a horizontal (weft) thread passes through, it interleaves between the threads lifted by the shaft and those that are not, exposing only part of the weft thread in the finished result. The complexity of a fabric's overall design is proportional to the number of shafts in the loom; the more shafts, the richer the result. Beginners typically start with toy looms with two shafts, and then progress to four, eight, and so on. To implement an interesting design on a loom unassisted by computer or Jacquard attachments [3] requires sixteen or twenty-four shafts. The effort required to effectively use such looms is great, but the result is worth the effort and simply unattainable by more elementary looms.

In the world of computer programming, C++ is a 128-shaft loom.

But is C++ as simple as possible? It depends on what you think is "possible." I once asked Bjarne Stroustrup what language he would have designed if C compatibility were not an issue, and he let me know in no uncertain terms that the question was not a fair one. He was right, I suppose. I still hear it said that there is too much investment in the C/C++ relationship for these languages to "break up" now, but I wonder if it is time to revisit that assumption. (I personally wouldn't mind if they parted ways—my C days are far behind me; your mileage may vary). Even if the bothersome noise in C++ due to C compatibility isn't going away anytime soon, is there still some room to considerably simplify things anyway?

When asked to name three things he disliked about C++, Scott Meyers said:

"I'd like to answer this question with 'complexity, complexity, complexity!', but naming the same thing three times is cheating. Still, I think that C++'s greatest weakness is complexity. For almost every rule in C++, there are exceptions, and often there are exceptions to the exceptions. For example, const objects can't be modified, unless you cast away their constness, in which case they can, unless they were originally defined to be const, in which case the attempted modifications yield undefined behavior. As another example, names in base classes are visible in derived classes, unless the base class is instantiated from a template, in which case they're not, unless the derived class has employed a using declaration for them, in which case they are." [1]

I must admit that one of my greatest frustrations in teaching C++ is all of the arcane twists and exceptional cases. Consider the following "gotcha" that occurs in a beginning course. When introducing templates, you want to keep things simple, so here's a class template that holds a single value:

template<typename T>
class Box {
    T value;
public:
    Box(const T& t) {
        value = t;
    }
};

Now try to introduce a stream insertion operator as a friend in the usual way:

template<typename T>
class Box {
    T value;
public:
    Box(const T& t) {
        value = t;
    }
    friend ostream& operator<<(ostream&, const Box<T>&);
};

template<typename T>
ostream& operator<<(ostream& os, const Box<T>& b) {
    return os << b.value;
}

Students expect to be able to do this as they do for non-template classes. But wait a minute. Is operator<< here a template or not? And do I want it to befriend all specializations of Box or not? No conforming C++ compiler will let you instantiate the Box class and use the associated stream inserter. I think g++ gives the best diagnostic for this error:

box.cpp:11: warning: friend declaration 'std::ostream&
   operator<<(std::ostream&, const Box<T>&)' declares a non-template function
box.cpp:11: warning: (if this is not what you intended, make sure the function
   template has already been declared and add <> after the function name here)

This message suggests the following correct solution:


// Forward declarations
template<class T> class Box;
template<class T>
ostream& operator<<(ostream&, const Box<T>&);

template<class T> class Box {
    T value;
public:
    Box(const T& t) {
        value = t;
    }
    friend ostream& operator<< <>(ostream&, const Box<T>&);
};

template<class T>
ostream& operator<<(ostream& os, const Box<T>& b) {
    return os << b.value;
}

An alternative solution defines the operator friend function in situ:

template<typename T>
class Box {
    T value;
public:
    Box(const T& t) {
        value = t;
    }
    friend ostream& operator<<(ostream& os, const Box<T>& b) {
        return os << b.value;
    }
};

The first approach requires forward declarations and a very unusual syntax in the declaration for operator<<. The second, while simpler, is not exactly equivalent to the first, because operator<< is not a template in that case [4].

Do you think this bewilders beginners? You bet it does—and quite a few veteran developers as well.

No, I don't think C++ is as simple as it could be. I love templates, I love the flexibility of C++, but I too often lament lost brain cycles tracking all the gotchas, especially when teaching. Speaking of PL/I, Dijkstra remarked:

"I absolutely fail to see how we can keep our growing programs firmly within our intellectual grip when by its sheer baroqueness the programming language -- our basic tool, mind you! -- already escapes our intellectual control." [5]

Food for thought. My current leaning is to use Python when I can, C++ when I must (which is still often and usually enjoyable) [6], and to be ever on the lookout for a language that gives most of the power and flexibility of C++ but loses most of the headaches. Finding one is not going to be easy [7]. Here's hoping that the standards committee will do all it can to make using C++ as enjoyable as the incredible software we create with it.

Acknowledgements

Thanks to reader Steve Chaplin for suggesting a simplification to the Python example in the beginning of this article. As always, thanks go to our Editorial Board for their comments given prior to publication.

Notes and References

[1] An interview by BookPool with Scott Meyers, July 11, 2005.
[2] Quoted by Arthur J. Siegel on python.org's Edu-SIG email forum, May 10, 2005.
[3] The Jacquard loom of the early 1800s used punched cards and is considered the world's first working "computer." In a Jacquard loom, each thread is a virtual shaft, independently controlled, so arbitrarily-complex designs are feasible. Today's computer-controlled looms are essentially Jacquard looms, but of course much faster. In the domain of programming languages, using a Jacquard loom can be compared to using assembly language.
[4] Dan Saks calls this "making new friends," since a new non-template friend is created for each instantiation of Box.
[5] From his 1972 Turing Award Lecture, The Humble Programmer.
[6] For those of you who use both languages, you may want to familiarize yourself with Boost.Python, if you haven't already. It allows Python and C++ to comfortably interoperate.
[7] I see some promise in Walter Bright's D Programming Language. It's not a C++ "replacement," but it makes the C++ developer feel at home while providing support for systems programming, exported templates, template metaprogramming, nested function definitions and dynamic function closures, inner classes, garbage collection (but it's still compiled), unit testing, contract programming, and lots more.

About the author

Chuck Allison is the editor of The C++ Source. He has over 20 years of industrial software development experience, and is currently a professor of Computer Science at Utah Valley State College. He was a contributing member of the C++ Standards Committee throughout the 1990s and designed std::bitset. He was a contributing editor for the C/C++ Users Journal from 1992-2001 and Senior Editor from 2001-2003. Chuck has been a regular lecturer at Software Development Conference for over 10 years, and is author of C & C++ Code Capsules: A Practitioner's Guide (Prentice-Hall, 1998), and co-author with Bruce Eckel of Thinking in C++, Volume 2: Practical Programming (Prentice-Hall, 2004). His company, Fresh Sources, Inc., offers training and mentoring in C++ and Java development.