So where did this remarkable tool called C come from? In 1969 Ken Thompson set out to write a FORTRAN compiler for the nascent Unix operating system being created on his PDP-7 with Dennis Ritchie, Doug McIlroy and others. As Ritchie relates:
As I recall, the intent to handle Fortran lasted about a week. What he produced instead was a definition of and a compiler for the new language B. B was much influenced by the BCPL language; other influences were Thompson's taste for spartan syntax, and the very small space into which the compiler had to fit.
Like BCPL, B was a typeless language with a rich set of operations on machine words, which could hold integers, bit patterns, characters, data addresses, and function addresses. Our GCD example is easily translated to B code:
gcd(m, n) {
    while( m > 0 ) {
        if( n > m ) {
            auto t = m; m = n; n = t;
        }
        m = m - n;
    }
    return n;
}
How well does B embody The Spirit? Almost perfectly, in my opinion.
"Trust the programmer" and "Don't prevent the programmer from doing what
needs to be done" are obvious characteristics of a typeless language. An
8 kilobyte space limit and Thompson's spartan tastes conspired to "Keep
the language small and simple" -- the auto specifier is just
about the only syntactic cruft. With so little syntax one is lucky to
find even "one way to do an operation", and by operating only on native
machine words there is no impediment to a fast implementation.
So has evolution since B been just a Fall from Paradise?
The PDP-7 was a word-addressed machine, but the PDP-11 was byte
addressed. As a result the clumsy handling of characters in B, packing
and unpacking bytes to and from machine words, became an obstacle to
performance. Also, a floating-point unit had been promised for the
PDP-11, but the 16-bit machine word would not suffice to hold a
floating-point value. To achieve better performance, Dennis Ritchie decided to bite
the apple and add char and float data types to
B. This marked the first compromise with Proverbs 2 and 3 of the Spirit
of C, trading the simplicity of typeless programming for maximal
performance.
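To make the cost of packing concrete, here is a minimal sketch in C++ rather than B. It assumes a hypothetical 16-bit word holding two 8-bit characters, low byte first; the names Word, pack, char_at_packed, and char_at are invented for illustration and are not taken from any B or C library.

```cpp
#include <cstdint>

// Assumption for illustration: a 16-bit "machine word" that holds
// two 8-bit characters, low byte first.
using Word = std::uint16_t;

// Pack two characters into one word.
inline Word pack(char lo, char hi) {
    return static_cast<Word>(
        static_cast<unsigned char>(lo) |
        (static_cast<unsigned char>(hi) << 8));
}

// Typeless style: every character access needs an index computation,
// a shift, and a mask on top of the word load.
inline char char_at_packed(const Word* words, unsigned i) {
    Word w = words[i / 2];           // word containing character i
    unsigned shift = (i % 2) * 8;    // low byte or high byte
    return static_cast<char>((w >> shift) & 0xFF);
}

// With a char type on a byte-addressed machine, the same access
// is a single indexed load.
inline char char_at(const char* s, unsigned i) {
    return s[i];
}
```

Both functions return the same character for the same index; the difference is only in how much work each access takes.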
Having bitten, Ritchie improved other aspects of B by adding
user-defined struct and union types and
introducing the rule that the name of an array is converted to the address
of its first member when used in expressions. A syntax for declaring
types was also provided, and the evolution from B to C was well under way.
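The array rule survives unchanged in C and C++ today. As a small sketch (the function names sum and decays_to_first_element are made up for this example), an array name used in an expression yields the address of its first element, so a function that takes a pointer can be handed the array directly:

```cpp
#include <cstddef>

// Sums n ints starting at p. The function only ever sees a pointer,
// never the array itself.
int sum(const int* p, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += p[i];               // p[i] means *(p + i)
    }
    return total;
}

// Returns true if the array name behaves as the address of a[0].
bool decays_to_first_element() {
    int a[3] = {10, 20, 30};
    const int* p = a;                // implicit array-to-pointer conversion
    return p == &a[0]                // same address as the first element
        && *(a + 1) == a[1]          // indexing is pointer arithmetic
        && sum(a, 3) == 60;          // the array is passed as a pointer
}
```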
The rule that int was the default type and the fact that
pointer values would fit in an int type on the PDP-11 made it
possible for the earliest C compilers to accept most B code, and most of
Unix, originally coded in assembly and B, was rewritten in a nearly
typeless style of C. So we have the first appearance of the rule that
"programmers don't pay for what they don't use," and we also see a policy
of backwards-compatible changes, both of which would have a great impact
on the future evolution of C and C++.
As ever, the fruit of the Tree has its price. C remained a remarkably simple language, but with the more complex type system came more opportunities for error, and trusting the programmer to avoid error became more difficult. Over time tools like lint were provided to check for possible type errors, and compilers became more strict about the code they would accept. So Proverb 1 was also compromised as more trust was placed in the tools and less in the programmer.
If Dennis Ritchie bit the apple, Bjarne Stroustrup went on to become a veritable Johnny Appleseed. The C++ language is very nearly completely type-safe, which is a good thing, as its type system is arguably the most complex of any language. In addition to the fundamental and derived types of C we have references, inheritance, multiple inheritance, virtual inheritance, virtual and pure virtual member functions, runtime type information, function templates, type templates, type deduction, and more. Proverb 3 was abandoned by the wayside, as the language gained expressive power at the expense of simplicity.
Although object-orientation was the initial motivation for extending C to
C++, the most powerful extension has turned out to be the generic
programming facility provided by templates. Templates were introduced to
allow for type-safe containers, so that one could define a class like
list<T> just once and then use it for any kind of list element. But in
1994 Erwin Unruh brought an innocent-looking little program to Santa Cruz
that failed to compile, but caused the compiler to generate a sequence of
prime numbers in its diagnostic output. I recall being mystified, then
amused, then horrified. By introducing templates we had inadvertently
added a Turing-complete meta-language to C++. At that point we could
have restricted the template facility to prevent such meta-programming,
but instead we took a gamble and embraced it, which cost us no end of pain
as the impact of templates rumbled through the language and library.
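To give a flavor of what such meta-programming looks like, here is a small sketch of a compile-time primality test. It is not Unruh's program, which reported its primes through compiler diagnostics; it only illustrates the underlying idea that template instantiation and specialization can express recursion and branching. The names HasDivisor, IsPrime, and CountPrimes are invented for this example.

```cpp
// True if some d in [2, D] divides N. Each instantiation recurses to D - 1.
template <unsigned N, unsigned D>
struct HasDivisor {
    static const bool value = (N % D == 0) || HasDivisor<N, D - 1>::value;
};

// Base case: the recursion stops at D == 1, which never counts as a divisor.
template <unsigned N>
struct HasDivisor<N, 1> {
    static const bool value = false;
};

// N is prime if nothing in [2, N - 1] divides it.
template <unsigned N>
struct IsPrime {
    static const bool value = !HasDivisor<N, N - 1>::value;
};

// 0 and 1 are not prime; specializing them also avoids D == 0.
template <> struct IsPrime<0> { static const bool value = false; };
template <> struct IsPrime<1> { static const bool value = false; };

// Number of primes in [0, N], computed entirely by the compiler.
template <unsigned N>
struct CountPrimes {
    static const unsigned value =
        CountPrimes<N - 1>::value + (IsPrime<N>::value ? 1u : 0u);
};

template <> struct CountPrimes<0> { static const unsigned value = 0; };

// These checks are evaluated at compile time; no code runs.
static_assert(IsPrime<2>::value, "2 is prime");
static_assert(IsPrime<13>::value, "13 is prime");
static_assert(!IsPrime<9>::value, "9 is 3 * 3");
static_assert(CountPrimes<30>::value == 10, "ten primes up to 30");
```

The templates serve as functions, the specializations as the terminating branches, and the compiler as the interpreter.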
Does type safety mean the programmer need no longer be trusted? Hardly. All the undefined behavior possible in C remains possible in C++, along with new ways to go wrong. In my experience there is almost no limit to the damage that a sufficiently ingenious fool can do with C++. But there is also almost no limit to the degree of complexity that a skillful library designer can hide behind a simple, safe, and elegant C++ interface. Generic meta-programming in particular is proving to be an amazing tool for creating simple interfaces to maximally efficient implementations of complex facilities, so in my opinion the gamble has been worth the price.