The Artima Developer Community

The C++ Source
In the Spirit of C
by Greg Colvin
June 21, 2004

<<  Page 2 of 3  >>


Necessity is a Mother (Anonymous)

So where did this remarkable tool called C come from? In 1969 Ken Thompson set out to write a FORTRAN compiler for the nascent Unix operating system being created on his PDP-7 with Dennis Ritchie, Doug McIlroy and others. As Ritchie relates:

As I recall, the intent to handle Fortran lasted about a week. What he produced instead was a definition of and a compiler for the new language B. B was much influenced by the BCPL language; other influences were Thompson's taste for spartan syntax, and the very small space into which the compiler had to fit.

Like BCPL, B was a typeless language with a rich set of operations on machine words, which could hold integers, bit patterns, characters, data addresses, and function addresses. Our GCD example is easily translated to B code:

gcd(m, n) {
   while( m > 0 ) {
      if( n > m ) {
         auto t = m; m = n; n = t;
      }
      m = m - n;
   }
   return n;
}

How well does B embody The Spirit? Almost perfectly, in my opinion. "Trust the programmer" and "Don't prevent the programmer from doing what needs to be done" are obvious characteristics of a typeless language. An 8-kilobyte space limit and Thompson's spartan tastes conspired to "Keep the language small and simple" -- the auto specifier is just about the only syntactic cruft. With so little syntax one is lucky to find even "one way to do an operation", and because B operates only on native machine words there is no impediment to a fast implementation.

So has evolution since B been just a Fall from Paradise?

"But of the tree of the knowledge of good and evil, thou shalt not eat..." (Genesis 2:17)

The PDP-7 was a word-addressed machine, but the PDP-11 was byte-addressed. As a result the clumsy handling of characters in B, packing and unpacking bytes to and from machine words, became an obstacle to performance. Also, the PDP-11 was promised to soon have a floating point unit, but the 16-bit machine word would not suffice to hold a floating point value. To achieve better performance Dennis Ritchie decided to bite the apple and add char and float data types to B. This marked the first compromise with Proverbs 2 and 3 of the Spirit of C, trading the simplicity of typeless programming for maximal performance.
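To make the packing and unpacking concrete, here is a minimal sketch in C++ rather than B. It assumes a 16-bit word holding two 8-bit characters; the names word, get_char, and put_char are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch: one "machine word" is 16 bits and holds
// two 8-bit characters, index 0 in the low byte and index 1 in the high byte.
using word = std::uint16_t;

// Unpack character i from word w by shifting and masking.
inline unsigned char get_char(word w, int i) {
    assert(i == 0 || i == 1);
    return static_cast<unsigned char>((w >> (8 * i)) & 0xFF);
}

// Pack character c into slot i of w and return the updated word.
inline word put_char(word w, int i, unsigned char c) {
    assert(i == 0 || i == 1);
    const word mask = static_cast<word>(0xFF << (8 * i));
    return static_cast<word>((w & ~mask) | (static_cast<word>(c) << (8 * i)));
}
```

Every character access costs a shift and a mask, and every store costs a read-modify-write of the whole word. On a byte-addressed machine with a char type, each of these can become a single load or store.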

Having bitten, Ritchie improved other aspects of B by adding user-defined struct and union types and introducing the rule that the name of an array is converted to the address of its first member when used in expressions. A syntax for declaring types was also provided, and the evolution from B to C was well under way.
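The array rule is easiest to see in a small example. The following is a sketch in C++ (the same rule carries over from C); the function names sum and sum_demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Sum n ints starting at p. The parameter is a plain pointer,
// so the function knows nothing about the array it came from.
inline int sum(const int* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];  // p[i] means *(p + i)
    }
    return total;
}

// Using the array name in an expression yields the address of its
// first element, so sum(a, 4) and sum(&a[0], 4) are the same call.
inline int sum_demo() {
    int a[4] = {1, 2, 3, 4};
    return sum(a, 4);
}
```

The length has to be passed separately because the conversion to a pointer discards it, which is one of the places where the programmer is trusted to get things right.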

The rule that int was the default type and the fact that pointer values would fit in an int type on the PDP-11 made it possible for the earliest C compilers to accept most B code, and most of Unix, originally coded in assembly and B, was rewritten in a nearly typeless style of C. So we have the first appearance of the rule that "programmers don't pay for what they don't use," and we also see a policy of backwards-compatible changes, both of which would have a great impact on the future evolution of C and C++.

As ever, the fruit of the Tree has its price. C remained a remarkably simple language, but with the more complex type system came more opportunities for error, and trusting the programmer to avoid error became more difficult. Over time tools like lint were provided to check for possible type errors, and compilers became more strict about the code they would accept. So Proverb 1 was also compromised as more trust was placed in the tools and less in the programmer.

Were you wondering was the gamble worth the price? (Joni Mitchell)

If Dennis Ritchie bit the apple, Bjarne Stroustrup went on to become a veritable Johnny Appleseed. The C++ language is very nearly completely type-safe, which is a good thing, as its type system is arguably the most complex of any language. In addition to the fundamental and derived types of C we have references, inheritance, multiple inheritance, virtual inheritance, virtual and pure virtual member functions, runtime type information, function templates, type templates, type deduction, and more. Proverb 3 was abandoned at the wayside, as the language gained expressive power at the expense of simplicity.

Although object-orientation was the initial motivation for extending C to C++, the most powerful extension has turned out to be the generic programming facility provided by templates. Templates were introduced to allow for type-safe containers, so that one could define a class like list<T> just once and then use it for any kind of list element. But in 1994 Erwin Unruh brought an innocent-looking little program to Santa Cruz that failed to compile, yet caused the compiler to generate a sequence of prime numbers in its diagnostic output. I recall being mystified, then amused, then horrified. By introducing templates we had inadvertently added a Turing-complete meta-language to C++. At that point we could have restricted the template facility to prevent such meta-programming, but instead we took a gamble and embraced it, which cost us no end of pain as the impact of templates rumbled through the language and library.
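Unruh's program itself is not reproduced here. As a rough illustration of the kind of compile-time computation it revealed, the following sketch tests primality purely through template instantiation. The names has_divisor and is_prime are hypothetical, and the technique is a simplified stand-in, not his original code.

```cpp
// True if some d in [2, D] divides N. The compiler "loops" by
// instantiating has_divisor<N, D - 1> recursively.
template <unsigned N, unsigned D>
struct has_divisor {
    static const bool value = (N % D == 0) || has_divisor<N, D - 1>::value;
};

// The recursion stops at D == 1, which is never counted as a divisor.
template <unsigned N>
struct has_divisor<N, 1> {
    static const bool value = false;
};

// N is prime if no d in [2, N - 1] divides it.
template <unsigned N>
struct is_prime {
    static const bool value = !has_divisor<N, N - 1>::value;
};

// 0 and 1 are not prime; handle them explicitly so the general
// template is never instantiated with N - 1 below 1.
template <> struct is_prime<0> { static const bool value = false; };
template <> struct is_prime<1> { static const bool value = false; };
```

No code runs at execution time: an expression such as is_prime<13>::value is a constant the compiler works out while instantiating templates, with specialization playing the role of a conditional and recursion the role of a loop.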

Does type safety mean the programmer need no longer be trusted? Hardly. All the undefined behavior possible in C remains possible in C++, along with new ways to go wrong. In my experience there is almost no limit to the damage that a sufficiently ingenious fool can do with C++. But there is also almost no limit to the degree of complexity that a skillful library designer can hide behind a simple, safe, and elegant C++ interface. Generic meta-programming in particular is proving to be an amazing tool for creating simple interfaces to maximally efficient implementations of complex facilities, so in my opinion the gamble has been worth the price.


