Programming Defensively

A Conversation with Andy Hunt and Dave Thomas, Part IX

by Bill Venners

April 28, 2003

Summary

Pragmatic Programmers Andy Hunt and Dave Thomas talk with Bill Venners about the importance of programming defensively against your own and others' mistakes, of crashing near the cause, and of understanding the proper use of assertions.

Andy Hunt and Dave Thomas are the Pragmatic Programmers, recognized internationally as experts in the development of high-quality software. Their best-selling book of software best practices, The Pragmatic Programmer: From Journeyman to Master (Addison-Wesley, 1999), is filled with practical advice on a wide range of software development issues. They also authored Programming Ruby: A Pragmatic Programmer's Guide (Addison-Wesley, 2000), and helped to write the now famous Agile Manifesto.

In this interview, which is being published in ten weekly installments, Andy Hunt and Dave Thomas discuss many aspects of software development:

In Part I. Don't Live with Broken Windows, they discuss the importance of software craftsmanship and the importance of staying on top of the small problems in your projects.
In Part II. Orthogonality and the DRY Principle, they discuss the importance of keeping your system orthogonal, and the real meaning of the DRY, or Don't Repeat Yourself, principle.
In Part III. Good Enough Software, they discuss the myth of bug-free software, the importance of specifying level of quality as a system requirement, and the need for every team member to inject quality throughout the development cycle.
In Part IV. Abstraction and Detail, they discuss an approach to design in which details are pulled out of the code and stored as metadata.
In Part V. Building Adaptable Systems, they discuss reversible design decisions, the cost of change curve, going beyond the requirements, and making systems configurable.
In Part VI. Programming Close to the Domain, they discuss the benefit of programming in a language close to the business domain.
In Part VII. Programming is Gardening, not Engineering, they discuss a gardening metaphor for software development, the reasons coding is not mechanical, and the stratification of development jobs.
In Part VIII. Tracer Bullets and Prototypes, they discuss the importance of getting feedback during development by firing tracer bullets and building prototypes.
In this installment, they discuss the importance of programming defensively against your own and others' mistakes, of crashing near the cause, and of understanding the proper use of assertions.

Expecting Mistakes

Bill Venners: In your book, The Pragmatic Programmer, you suggest programming defensively against the potential mistakes of others, and against our own mistakes. Why shouldn't I trust myself or others, and what should I do about it?

Dave Thomas: The simple answer to that is, when was the last time you or I or anyone else was perfect? We know that we make mistakes all the time. Our compilers, our computers, our applications, and our users tell us we make mistakes. So why shouldn't we learn that lesson of history, and account for mistakes up front?

Andy Hunt: There is an interesting parallel in woodworking, which I do as a hobby. A woodworking magazine recently had an article suggesting that how a woodworker deals with mistakes is what differentiates a master from someone who's not. Everyone makes mistakes: you cut something too short; there's a crack in the board; something goes wrong. Do you have the skill to work with the mistake, to get beyond it? Or are you just dead in the water and you have to start over?

Dave Thomas: That analogy is a good one. When you're working with wood and you make a mistake, you have to decide, "Do I go on with this? Or do I stop and go way back, fix the problem, and start again?"

Andy Hunt: Is it salvageable?

Dave Thomas: Right. And that's something developers would do well to think about.

Andy Hunt: You can look at it from the other way around too. Once you cut a piece of wood too short, it's really hard to grow it back long again. That's a bad thing. Once you've done something to a piece of code where you've really got to start over, that's where you just have to bite the bullet and really start over. You can't patch it. But programmers don't like to make that decision. I think most programmers prefer to say, "No, I'm sure I can just patch around it." And that's equivalent to trying to glue that little bit of extra stuff on the end of a piece of wood. It just doesn't work out real well. You've got to know when to cut your losses.

Dave Thomas: And just to throw in another side of the same equation: as well as this attitude of, "I've worked real hard on this, I'm not going to throw it away," there is a complementary attitude. Quite often if you show programmers someone else's code, and that code has a single bug in it, their response will be, "Oh, I have to rewrite this." So you have to be good at working out what the problem is and what you need to do about it.

Andy Hunt: You have to figure out what is the appropriate level of repair.

Bill Venners: How do I decide how much to check myself? It takes time to write unit tests. It takes time to do design by contract. How paranoid should I be about my own code?

Dave Thomas: You use feedback. To start off, you choose a level that sounds reasonable. You use it, and you ask, "Is this level finding most problems?" If it is, then you say, "Maybe I could cut it back a bit." If you cut back and the situation gets worse, you go back up again. If you're doing some unit testing and a few asserts and still a whole lot of bugs are getting through that you're embarrassed about, then you're not doing enough. It's not one of the prescription things where you say, "A two-year programmer has to do 80% unit test coverage." It's what works for you.

Andy Hunt: It comes down to feedback. If you think, "I'm doing great unit testing. I couldn't do more. I'm really at the top of my game." And yet you're getting a flood of bug reports on your code back from Quality Assurance (QA) or users, you probably need to do a bit more.

Crash Early

Bill Venners: In your book you suggest we try to detect problems as early as possible, so we can make the program crash before it does damage. I have often felt the need for a ShouldNeverHappenException in Java. I'm programming along, and I get to a case that I'm confident will really never happen. But just in case it ever happens, I want to throw an exception there, but what exception? I usually end up throwing a RuntimeException and putting in a comment that "This should never happen." But it takes time to add that throw statement, as does any way of crashing early. Checking every pointer in a C program for null before it's used, for instance, would take a lot of time. Where do you draw the line? How do you decide the investment is worth it?

Dave Thomas: That's interesting, because quite often you don't have to do anything special to crash early. For example, as long as you're sure a null pointer is going to cause an error immediately, then I don't see much difference in throwing a random RuntimeException or throwing a NullPointerException. The bad thing is to propagate an error.

The reason you crash early is to stop errors from propagating far away from the cause. Because once you have an error that's a million instructions away from the cause, finding the cause is a pain in the butt. Quite often, the check is done for you by the compiler. What we're trying to say is when the checks are not put in by the compiler, that's when you start needing to put the checks in yourself.
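As a rough illustration of keeping the crash near the cause, the Java sketch below rejects a null value at the point where it enters an object, instead of letting it surface later in some unrelated method. The class and method names (CrashEarlyDemo, Account, ownerNameLength) are hypothetical and do not come from the interview; only Objects.requireNonNull is a standard library call.

```java
import java.util.Objects;

// Hypothetical example class; the names are illustrative only.
final class CrashEarlyDemo {

    static final class Account {
        private final String owner;

        Account(String owner) {
            // Fail here, where the bad value arrives, rather than much later
            // when some other method finally dereferences the field.
            this.owner = Objects.requireNonNull(owner, "owner must not be null");
        }

        int ownerNameLength() {
            // Safe: the constructor guarantees owner is non-null.
            return owner.length();
        }
    }

    public static void main(String[] args) {
        try {
            new Account(null);
        } catch (NullPointerException e) {
            // The stack trace points at the constructor call, next to the cause.
            System.out.println("Rejected at construction: " + e.getMessage());
        }
    }
}
```

Without the check in the constructor, the same NullPointerException would still occur, but only when ownerNameLength() is eventually called, possibly far from the code that passed in the null.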

Andy Hunt: It's more an issue of localization, keeping the crash near the cause.

Bill Venners: You write in your book, "When the system does fail, will it fail gracefully?" And in a footnote, you write, "Our editors wanted us to change this sentence to, 'If the system does fail...' We resisted." Why?

Andy Hunt: We actually quite deliberately put in "When." I think we had this argument in a draft of the book. Somebody else reading the draft made the same comment, that we should say, "If the system fails..." No, that's wrong. It should be, "When the system fails..." Every system fails. There is no such thing as perfect software. So part of phrasing that sentence that way is to encourage people to get over this in-bred arrogance that the system can't fail. Of course it can. Every system can fail. The question should not be, "Can the system fail?" It should be, "When the system fails, how are you going to handle it?"

Using Assertions

Bill Venners: You write in your book, "Don't use assertions in place of real error handling. Assertions check for things that should never happen." Can you elaborate?

Andy Hunt: You shouldn't use assertions to validate user input, for instance, or for general errors. Say you are doing some systems programming on a Unix platform, for example, and you are about to check the return value of opening /etc/passwd. We could probably debate about this, but I would think that would be more a class of assertion, because /etc/passwd should be there. It can never happen that that file is not there. OK, there are some extreme cases where that file may not be there, but it means something really, really bad is going on. I would be inclined to say that's not traditional error handling, because that could "never happen." Now, if you're just looking to go open some properties file, or some file the user told you to go open, that's just normal error handling. The file may be there. It may not. It doesn't fall into the class of can't ever happen, so it's not an appropriate use for an assertion.
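One possible way to express that distinction in Java is sketched below. A missing user-supplied file is treated as an ordinary outcome, while a missing required system file is reported as an AssertionError. The class and method names are hypothetical, and the choice of Optional and AssertionError is an assumption made for this sketch, not something the interview prescribes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class; the names are illustrative only.
final class AssertVersusErrorHandling {

    // Normal error handling: the user may well name a file that is not there,
    // so absence is an expected result, not a "can't happen" condition.
    static Optional<List<String>> readUserFile(Path path) {
        try {
            return Optional.of(Files.readAllLines(path));
        } catch (NoSuchFileException e) {
            return Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // "Should never happen" case: a file the system is assumed to provide.
    // If it cannot be read, something is badly wrong, so stop right here.
    static List<String> readRequiredSystemFile(Path path) {
        try {
            return Files.readAllLines(path);
        } catch (IOException e) {
            throw new AssertionError("required system file unreadable: " + path, e);
        }
    }

    public static void main(String[] args) {
        Path userFile = Paths.get(args.length > 0 ? args[0] : "settings.properties");
        Optional<List<String>> lines = readUserFile(userFile);
        if (lines.isPresent()) {
            System.out.println(userFile + ": " + lines.get().size() + " lines");
        } else {
            System.out.println(userFile + " not found, falling back to defaults");
        }
    }
}
```

On a Unix system, readRequiredSystemFile(Paths.get("/etc/passwd")) would correspond to the case Andy describes; the demo above deliberately exercises only the user-file path.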

Bill Venners: What are the benefits and costs of using asserts?

Andy Hunt: There is a great anecdote from a small company that was making network diagnostic software. This company had a very strict policy about asserts. They asserted just everything—everything that could go wrong, that they thought might be a problem. They had tons and tons of asserts. And what made them different from other companies was they left in all of the asserts in their shipping product.

This was a real-time sensitive application. It wasn't just a report writer. This was time-critical network monitoring software. They left all the asserts in, to the point where if something managed to get through testing, and produced an error out in the field, they would actually put up a nice warning to the user saying, "Here's some information. We need you to call up and get tech support, because something bad happened."

And as a result of leaving all these assertions in, getting that feedback loop all the way out to the end user for the few bugs that did escape testing, they had a nearly bug-free product. They did so well they got purchased by some other software company for about a billion dollars. So as far as cost versus benefit, ...

Bill Venners: It sounds like they got about $1000 per assert.

Andy Hunt: Yeah, whatever the asserts cost them, their company got purchased for a huge amount of money. They did very well.
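For readers working in Java, note that the assert keyword is skipped at run time unless the JVM is started with -ea, so checks written with it do not stay active in a shipped product by default. The sketch below shows one way to keep checks always on and turn a failure into a message for the user. It is not the code of the company in the anecdote; the names check, averageLatencyMillis, and supportReport are hypothetical.

```java
// Hypothetical sketch of always-on internal checks; the names are illustrative only.
final class AlwaysOnChecks {

    // Unlike the `assert` keyword, this check runs regardless of JVM flags.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new IllegalStateException("Internal check failed: " + message);
        }
    }

    static int averageLatencyMillis(int totalMillis, int samples) {
        check(samples > 0, "samples must be positive, got " + samples);
        return totalMillis / samples;
    }

    // Turns an internal failure into something a user can pass on to support.
    static String supportReport(RuntimeException e) {
        return "Something unexpected happened. Please contact technical support and quote: "
                + e.getMessage();
    }

    public static void main(String[] args) {
        try {
            averageLatencyMillis(120, 0);
        } catch (IllegalStateException e) {
            System.err.println(supportReport(e));
        }
    }
}
```

The run-time cost of each check here is a single comparison, which is the kind of cost the anecdote suggests can be acceptable even in time-critical software, though that trade-off has to be measured for each application.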

Dave Thomas: So let's look at the cost of actually writing an assert. There are two aspects to actually writing an assert. One is, you have to think about what you want to be true at this point in the code. Secondly, you have to find a way to express it. Finding out what you want to be true at this point in the code is in my mind the definition of programming. That's exactly how you write a line of code. You must ask, "What change do I want to make in the state of the world in this line of code?" So you have to have answered that question anyway if you're programming correctly. And people who say, "Oh, I can't work out what an assertion would be at this point," aren't programming. You should be able to know what assertion to put at any point in your code to verify that what you've just written does what you think it does.

Andy Hunt: You have to clarify your assumptions. To me, you can't use the word assertion without having the word assumption real close to it. Because with everything we program we've got this huge raft of assumptions in our mind—"Of course this must be like this. I know the state of this is set, and I'm about to do this." You've got this whole raft of assumptions. What you have to do is just take those assumptions, or some subset of those, and put that into an assert. I want to make sure that my assumption really does hold.

Bill Venners: In your book you say, "Whenever you find yourself thinking, 'Of course that could never happen,' write code to check it." Personally, I feel the urge for an assertion when there's enough complexity, for example, if there are several methods that must work together to keep something true at this point in this method. I think it works, but I'm not 100% confident I fully grasp the complexity. And I'm not confident that although it may work now, the people making changes over time may not grasp the complexity sufficiently to avoid breaking it in the future. That's when I feel the urge to put in an assertion.

Dave Thomas: That's really an interesting observation. I would say that there are many people who do that. I do the same myself, in particular if I'm doing something that is full of off-by-one boundary condition issues. I will put asserts in there just to check boundary conditions. Invariably, what it really means is I don't understand what my code is doing. So I put the asserts in there because I think I understand, but I don't really understand, what the code is doing. So I'll let some user check it for me. Whenever I find myself putting in asserts to try and clarify something, then I try to use that as a little warning bell that I should step back, simplify the code so I can understand it, and then put decent asserts in.

The other side of the coin is this. When we say in the book, "Whenever you find yourself thinking something can't happen, put asserts in," that could be misunderstood. We're not saying you have to assert everything. We're trying to undermine the kind of arrogance of the attitude, "I've just written this. It can't go wrong." Clearly there is code, setter and getter methods for example, where there is zero point in doing asserts just as there is zero point in doing unit testing. But it's more the case that this can never happen, because this file must exist. This socket must be open. This parameter must be greater than zero. Those are the ones where maybe that arrogance isn't quite appropriate.
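A small Java sketch of the "this parameter must be greater than zero" kind of check, combined with boundary assertions, might look like the following. The method lastN and its class are hypothetical. Using an exception for the caller's argument and assert for internal assumptions is a common Java convention assumed here, not a rule stated in the interview. The assert lines run only when the JVM is started with -ea.

```java
import java.util.Arrays;

// Hypothetical example; the names are illustrative only.
final class BoundaryAsserts {

    // Returns the last n elements of values, or all of them if there are fewer than n.
    static int[] lastN(int[] values, int n) {
        // Assumption about the caller, checked unconditionally.
        if (n <= 0) {
            throw new IllegalArgumentException("n must be greater than zero, got " + n);
        }

        int start = Math.max(0, values.length - n);

        // Assumptions about our own arithmetic, written down rather than trusted.
        assert start >= 0 && start <= values.length : "start out of range: " + start;

        int[] result = Arrays.copyOfRange(values, start, values.length);

        assert result.length == Math.min(n, values.length) : "unexpected length: " + result.length;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lastN(new int[] {1, 2, 3, 4, 5}, 2)));
        System.out.println(Arrays.toString(lastN(new int[] {1}, 3)));
    }
}
```

If the assertions in a method like this start to pile up, that may be the warning bell Dave mentions: a hint to simplify the index arithmetic before adding more checks.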

Bill Venners: That makes sense. What you're saying is in the areas where I do feel confident, don't be so sure of myself.

Dave Thomas: Exactly.

Bill Venners: In areas where I'm not sure of myself, take that as a red flag that I should maybe try and simplify and clarify the code.

Dave Thomas: Yeah.

Andy Hunt: The other related topic is, in other engineering disciplines, bridge building for example, they are much more focused on what can possibly go wrong. Unfortunately, we have a tendency when we're writing software to focus on trying to find that one path through that will go right. And that's a very different focus. And we get so focused on finding that one path through that might work right, we tend not to spend so much time thinking about the 100 million things that could go wrong. And that's where you get all these errors that we don't trap, assumptions we don't check, boundary and edge conditions we don't deal with properly. That really is what makes the difference between a good programmer and a bad one. The good programmer tries to think of and deal with all the things that can go wrong.

Next Week

Come back Monday, May 5 for Part X of this conversation with Pragmatic Programmers Andy Hunt and Dave Thomas.

Resources

Andy Hunt and Dave Thomas are authors of The Pragmatic Programmer, which is available on Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/020161622X/

The Pragmatic Programmer's home page is here:
http://www.pragmaticprogrammer.com/

About the author

Bill Venners is president of Artima Software, Inc. and editor-in-chief of Artima.com. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project that produced the ServiceUI API. The ServiceUI became the de facto standard way to associate user interfaces to Jini services, and was the first Jini community standard approved via the Jini Decision Process. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community. He currently devotes most of his energy to building Artima.com into an ever more useful resource for developers.