Guerrilla Development
The Myth of External Program Documentation
by B. Scott Andersen
September 7, 2003

Summary
If you read Computer Science textbooks you might imagine external program documentation actually exists. Yet, when I asked many of my colleagues if they had ever had such documentation throughout a project's life cycle, they laughed a bit uncomfortably and said, "well, no, not really." What's going on here?

Documentation

"Do you have written documentation for your project? You know, all that stuff the books say you're supposed to have: theory of operations documents and high-level designs and stuff like that?", I asked.
"You're kidding, right?", was the response.

In this blog I'd like to pick up on something I ended with last time. Here's the final note from The Future of UML:

I ran into somebody who had still been associated with the project this summer, some 4+ years after the first new Prescriber had shipped. "Had the documentation been helpful?", I asked. "Are you kidding?!", my friend replied. "It scared the hell out of 'em." I looked aghast. He continued, "they take one look at the size of the notebook you left and they say I'm not touching that! It looks complicated!"

If you read Computer Science textbooks you might imagine external program documentation such as high-level designs, specifications, and theory of operation descriptions actually exists in most projects. Yet, when I asked many of my friends and colleagues if they had ever had such documentation throughout a project's life cycle, they laughed a bit uncomfortably and said, "well, no, not really." What's going on here?

While we're on it, how do you explain the reaction to the project documentation I'd left for the Prescriber? Is this common? The answer to that appears to be "yes". External program documentation, despite what you may have learned in all those high-minded computer science textbooks, remains one of the weak links in software development. How did this come to be and what is it costing us? And why, if the industry standard is to eschew external program documentation, do we continue to preach the importance of such things?

Obsolescence

In his book Extreme Programming Explained: Embrace Change, Kent Beck expands on the original programmer adage "good, fast, cheap: pick two" and claims customers actually choose three of four factors: cost, schedule, features, and quality. Developers choose (or are at least stuck with) the last variable. In either case I think we have, at best, done a poor job of identifying costs or, at worst, have openly lied about costs, thereby skewing what customers might have selected.

Program maintenance constitutes 40-80% of software costs, yet I recall few discussions during program planning or development about such things. Further, maintenance costs fall largely into the category of enhancements. As Robert Glass puts it in his book Facts and Fallacies of Software Engineering:

The 60/60 rule: 60 percent of software's dollar is spent on maintenance, and 60 percent of that maintenance is enhancement. Enhancing old software is, therefore, a big deal.

Given this large number one might expect that we'd be looking for ways to manage this cost and mitigate any associated risks. Steve Rakitin, in his book Software Verification and Validation for Practitioners and Managers, makes the observation (also quoting from DeMarco and Lister) that "Turnover is incredibly expensive." Rakitin's book focuses squarely on software quality and I believe he's right to discuss turnover in his "Balancing People, Process, and Product" chapter. But why is turnover so expensive?

It is estimated that roughly 30% of the total maintenance time is spent "understanding the existing product". (Again, I turned to Glass's book for this figure since it was handy.) This fact relates directly to the turnover number, as illustrated by an Air Force study in 1983 where researchers found that the "biggest problem of software maintenance" was "high [staff] turnover" (at 8.7 on a scale of 1 to 10), followed by "understanding and lack of documentation" (7.5) and "determining the place to make a change" (6.9). I contend they are all related. If you have no usable documentation then all of the information is in people's heads. If the heads walk out the door (turnover) then the information needs to be rediscovered. That is not cheap.
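As a back-of-the-envelope sketch, the figures quoted above can be combined into rough proportions. The $100 budget and the class name are hypothetical, and treating 30% of maintenance time as 30% of maintenance dollars is an assumption the sources do not make.

```java
// Rough arithmetic only. Assumption (not from the sources): maintenance
// time converts to maintenance dollars one-for-one.
class MaintenanceShare {
    // Returns `percent` percent of `amount`, in whole dollars.
    static int percentOf(int percent, int amount) {
        return amount * percent / 100;
    }

    public static void main(String[] args) {
        int total = 100;                                 // hypothetical lifetime budget
        int maintenance = percentOf(60, total);          // 60/60 rule, first 60
        int enhancement = percentOf(60, maintenance);    // 60/60 rule, second 60
        int understanding = percentOf(30, maintenance);  // "understanding the existing product"

        System.out.println("Maintenance:   $" + maintenance);
        System.out.println("Enhancement:   $" + enhancement);
        System.out.println("Understanding: $" + understanding);
    }
}
```

Under those assumptions, a bit less than a fifth of every software dollar goes to rediscovering what the existing product does.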

Of course, useless documentation is no help. Since a small percentage of the software life cycle is dedicated to the creation of documentation, the quality of it is immediately suspect. After all, if the documentation is already untrustworthy, why maintain it? Why throw good money after bad? But, is a given document useless 10 minutes after it is written? How about 10 days? How about 10 weeks? The tendency is to dismiss anything outside the code as "out-of-sync with reality" whether, objectively, that is true or not.

Finally, there is a notion that the code is golden (or at least documents itself) even when no external document exists to describe it. Tools like Javadoc, which scrape the Java source code of your project and create a hierarchy of web pages, can create the documentation at the push of a button. The bits will be new and fresh but are they right? That is, why should the comments describing the inner workings of a particular subsystem be any more accurate, descriptive, and insightful just because they were pulled from the Java code? Put another way, do source code and its associated comments ever get "out-of-sync"? Of course they do. At best the co-location of the documentation and the code can eliminate the need for "finding the right place to update the documentation", but it isn't a panacea. It still takes work, and discipline, to ensure the documentation is correct.
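To make the point concrete, here is a minimal sketch of a comment drifting away from the code it sits on. The class RetryPolicy and its numbers are hypothetical, invented purely for illustration. A documentation generator would faithfully publish the comment, stale claim and all.

```java
// Hypothetical example: the names and values are invented for illustration.
class RetryPolicy {
    /**
     * Returns the maximum number of connection attempts.
     * The limit is 3, after which the caller should give up.
     */
    static int maxAttempts() {
        // The limit was later raised to 5, but the doc comment above was never updated.
        return 5;
    }

    public static void main(String[] args) {
        // The generated web page would still promise 3; the code says otherwise.
        System.out.println("Documented limit: 3, actual limit: " + maxAttempts());
    }
}
```

The comment and the code live three lines apart and still disagree. Co-location made the fix easy to find, but nobody was forced to make it.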

Assume for a moment we're willing to maroon the maintenance programmer with no (useful) documentation. Can documentation early in the software life cycle help mitigate risk? It can't be out-of-sync too far while we're still writing it, can it?

Failure to Drive Out Risk

One of the arguments for external program documentation, especially before the coding stage, is that it should help drive out risk. I actually agree with this, at least in principle, but have been forced to also recognize this activity's limits. For example, it isn't unheard of to have a feasibility study to determine if a project is even worth attempting. Certainly this would qualify as a risk-reducing activity. Yet, as Glass recounts in his book, he was in the audience at the International Conference on Software Engineering (ICSE) in Tokyo in 1987 when Jerry Weinberg presented his keynote. Weinberg asked the audience who among them had ever participated in a feasibility study where the answer came back "No". Not a single hand of the 1,500 in attendance was raised. Which raises the question, "how many of these documents are good science and how many are simply position papers?"

Before we have too much fun beating up management on their feasibility studies I think we should take a hard look at our own writings. That is, how many of our (scant few) documents are intended to be good science and how many have simply been constructed to deflect attacks from our critics, get our way on some technical issue, select our favorite vendor, choose our favorite product, or simply embarrass those who dared to disagree with us?

Glass quotes a fellow named Bill Curtis who said "in a room full of top software designers, if any two of them agree, that's a majority." We programmers do like to have our opinions! But, it is sometimes difficult to even have documentation crisp enough to know what it is we are arguing about. I've seen some poor designs win over better alternatives put forth because the advocate for the better alternative simply didn't have the presentation skills to get the facts out there. Winning the argument and getting the right answer are two different things. In the case where the poorer design won, it is difficult to justify the documentation process as risk mitigation.

Literature

In many ways, a good design document is like a novel. It speaks of a world thus far only fictitious, in a tense that makes it sound like it already exists. "The program does this and this and that..." Since most engineers are poor writers (or, at least, inexperienced novelists), it is no wonder good design documents are hard to find.

I think it goes beyond that, however. It takes some courage to put things in writing and then live with the consequences. Going on the record may not be in your (personal) best interest -- even if you're right. And, of course, being wrong, in writing, can hang about your neck like the proverbial albatross.

In some ways software design documents differ from novels: usually the novelist knows how the story is going to end. Such is not always the case in software development. Yet, the design decisions made early in a project will affect the software throughout its life cycle. Have you ever worked on something and asked, either under your breath or even out loud, "what were they thinking?!" Documentation of the thinking at the time, even if it drew incorrect conclusions, would be revealing to a maintenance programmer later. Such revelations could save that programmer hours or even days if they had such a document. "Oh, I see where they were going with this; and, I can see why it didn't work out." Still, with all the value that might give, we don't do it.

The final comparison to literature I believe is most apt. Writers make little money. This may also be true in the software world. Our boss or our customer needs the software; the documentation is our problem or just some internal software concern. Customers and management don't put any emphasis on it, so software developers concentrate on what they're being measured upon: delivering code.

Myth vs. Reality

My problem with all this is that I can't reconcile the world of the software engineering practices books with the real world. The software books blithely continue to tell us about external program documentation and how it is used throughout the software life cycle. For the vast majority of projects I have worked on, there are no such documents. I contend we should fix one or the other. The Extreme Programming crew has made their choice: they don't pretend to create such artifacts. (Castigate them if you must, but at least they are honest about what they do.) For the rest of us, it is a world of denial. I can't help but wonder if we couldn't do better.

References

Facts and Fallacies of Software Engineering by Robert L. Glass.
Extreme Programming Explained: Embrace Change by Kent Beck.
Curtis, B., R. Guindon, H. Krasner, D. Waltz, J. Elam, and N. Iscoe. 1987. Empirical Studies of the Design Process: Papers for the Second Workshop on Empirical Studies of Programmers. MCC Technical Report Number STP-260-87.
I wasn't able to find this on the web. The reference is from the Glass book (above). It sounds interesting, though. If somebody has a pointer to it, please let me know. I'd like to read the original.
Software Verification and Validation for Practitioners and Managers, Second Edition by Steven R. Rakitin. My review of that work appears on its amazon.com page.
Peopleware: Productive Projects and Teams, 2nd Edition by Tom DeMarco and Timothy Lister.
The Psychology of Computer Programming, Silver Anniversary Edition by Gerald M. Weinberg. My review of the work may also be found on its amazon.com page. Weinberg explores what does, and does not, motivate software professionals which, of course, is directly related to this.

About the Blogger

B. Scott Andersen has 20+ years of experience in software development splitting his time between individual contributor and management. He is now a Principal Software Engineer with Verocel, Inc., a company specializing in helping safety-critical system developers attain certification for their products. The opinions expressed here are his own and he takes full responsibility for them... unless, of course, they are worth money, at which point they belong to his employer.

