Cool Tools and Other Stuff
How Unit Testing and Refactoring Work Together
by Eric Armstrong
April 17, 2006

Summary
Refactoring is a wonderful thing. It lets elegance evolve, instead of making you live with faulty initial design decisions for the life of the project. Unit testing makes it possible to refactor safely, for sure. That's an important reason for unit tests. But the need for unit tests goes well beyond that...

Unit tests record the important use cases -- use cases that you frequently discover as you're going along, and then forget about some number of weeks later after you've modified the code to handle them. You're now several months down the road, and you've got some humungous wad of code that badly needs refactoring.

I'm looking at one of those at the moment, in fact. It's a link-managing routine that takes an input link like http://xyz, ../xyz, xyz, or /xyz. It has to normalize relative links, take into account whether there was a "base" directive on the page it came from, and map the result to a new location, if a mapping has been supplied by the user. It also has to deal with http:/, https:/, ftp:/, and file:/ protocols, as well as plain directory paths.
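To make the job concrete, here is a minimal sketch of what such a routine has to do. It is not the code discussed in this entry: the function name resolve_link, its parameters, and the prefix-based mapping scheme are all assumptions made for illustration, and it leans on Python's standard urljoin rather than hand-written path handling.

```python
from urllib.parse import urljoin


def resolve_link(link, page_url, base_href=None, mappings=None):
    """Resolve `link` found on `page_url`, then apply an optional remapping.

    Hypothetical signature: `base_href` stands for the page's "base"
    directive, and `mappings` is an assumed dict of old-prefix -> new-prefix.
    """
    # A "base" directive, when present, replaces the page URL as the
    # reference point for relative links.
    base = urljoin(page_url, base_href) if base_href else page_url

    # urljoin handles absolute links, "../xyz", "xyz", and "/xyz".
    absolute = urljoin(base, link)

    # Apply the first user-supplied prefix mapping that matches, if any.
    for old_prefix, new_prefix in (mappings or {}).items():
        if absolute.startswith(old_prefix):
            return new_prefix + absolute[len(old_prefix):]
    return absolute
```

With a page at http://example.com/docs/guide/page.html, the link ../xyz resolves to http://example.com/docs/xyz, and a mapping from http://example.com/docs/ to some new prefix relocates that result. Even a sketch this small has three interacting inputs, which is exactly where forgotten cases hide.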

Frankly, the code is a mess. It was patched in multiple places, each time to solve a problem introduced by some set of factors that I hadn't originally taken into account. Looking at it now, I can see code that can never be reached--a sure sign that it has grown too large and too complex for my feeble brain to manage.

Obviously, refactoring is needed. But what kind of refactoring? The answer, naturally enough, depends on what problems the code is trying to solve.

The question is, what were those problems? They arrived over the course of a couple of years. The code shows the result of the attempted solutions, but I've long since forgotten the problems I needed to solve.

One way to recover a list of those problems is to examine the version modification history. With sufficient scouring, the problems could be ferreted out. That's one useful reason for maintaining version-controlled sources.

A bug tracking system could also be examined. Out of all the problems that needed to be solved, it would be possible to extract the ones that involve this particular part of the code.

But a better answer is unit tests. Every time the code broke, my first task should have been to create a unit test that replicated the break. That speeds up the edit/debug cycle, too. A small test takes a lot less time to run than the real-life data that generates the error. That's another great reason for unit tests.
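As a sketch of what "a unit test that replicated the break" might look like, the example below pins down one invented failure: a relative link on a page with a "base" directive being resolved against the page instead of the base. The function resolve_link is a hypothetical stand-in for the real routine, and the URLs are made up.

```python
import unittest
from urllib.parse import urljoin


def resolve_link(link, page_url, base_href=None):
    # Hypothetical stand-in for the routine under test.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, link)


class ResolveLinkRegressionTest(unittest.TestCase):
    PAGE = "http://example.com/docs/guide/page.html"

    def test_relative_link_honors_base_directive(self):
        # Invented break: the link was resolved against the page URL
        # even though the page declared a base. The test records the
        # input that triggered it and the result that was expected.
        result = resolve_link("xyz", self.PAGE,
                              base_href="http://example.com/base/")
        self.assertEqual(result, "http://example.com/base/xyz")

    def test_parent_relative_link_without_base(self):
        # The case that already worked, kept so the fix cannot break it.
        result = resolve_link("../xyz", self.PAGE)
        self.assertEqual(result, "http://example.com/docs/xyz")
```

Saved in a file, this runs with python -m unittest in a fraction of a second, with no need to reproduce the real-life page that first exposed the problem.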

But more importantly, had those tests been constructed, they would now provide a complete list of the problems the code had to solve. And the list would be organized by section, with all the tests that involve this method in one place.

In addition, there would be a place to record new issues, as they arise. (An almost certain occurrence, since the history of the project has been one of finding out that the code had to deal with things that I never knew were possible.)

With that list of cases, I could be sure that when I refactor to solve the problem in front of me, I won't create a regression for some issue that I've completely forgotten about--something that the current code is handling successfully, no matter how ugly it is.

Over time, then, the unit tests you collect tell you what the code has to do. In effect, they give you a complete, detailed specification--a specification you can use in an automated way to ensure that your newly refactored design achieves all of the goals that have been identified for it, past and present.
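One way to picture tests acting as a specification is a table of cases that any implementation can be checked against. The sketch below is only an illustration under assumed names: CASES, check, legacy_resolve, and refactored_resolve are all hypothetical, and the "legacy" version is a small stand-in for patched-over code, not the routine described here.

```python
from urllib.parse import urljoin

PAGE = "http://example.com/docs/guide/page.html"

# The accumulated cases: (link, base_href, expected result).
CASES = [
    ("http://other.example/xyz", None, "http://other.example/xyz"),
    ("../xyz", None, "http://example.com/docs/xyz"),
    ("xyz", None, "http://example.com/docs/guide/xyz"),
    ("/xyz", None, "http://example.com/xyz"),
    ("xyz", "http://example.com/base/", "http://example.com/base/xyz"),
]


def check(implementation):
    """Return every case the given implementation gets wrong."""
    failures = []
    for link, base_href, expected in CASES:
        actual = implementation(link, PAGE, base_href)
        if actual != expected:
            failures.append((link, base_href, expected, actual))
    return failures


def legacy_resolve(link, page_url, base_href=None):
    # Stand-in for the original: special cases handled one at a time.
    base = base_href if base_href else page_url
    if "://" in link:
        return link
    scheme, rest = base.split("://", 1)
    host, _, path = rest.partition("/")
    if link.startswith("/"):
        return f"{scheme}://{host}{link}"
    parts = path.split("/")[:-1]  # drop the file name
    for piece in link.split("/"):
        if piece == "..":
            if parts:
                parts.pop()
        elif piece != ".":
            parts.append(piece)
    return f"{scheme}://{host}/" + "/".join(parts)


def refactored_resolve(link, page_url, base_href=None):
    # The cleaned-up version, delegating to the standard library.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, link)
```

If check(legacy_resolve) and check(refactored_resolve) both return an empty list, the refactoring preserved every recorded behavior. A rewrite that forgot the "base" directive would come back with exactly one failing case, naming the problem it reintroduced.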

So the moral here is that unit tests be very, very good. They set you up so you can refactor safely. They record the reasons for the patches you made over time. And they tell you when a new refactoring is successful. Taken all together, that's a heck of a lot of value.

About the Blogger

Eric Armstrong has been programming and writing professionally since before there were personal computers. His production experience includes artificial intelligence (AI) programs, system libraries, real-time programs, and business applications in a variety of languages. He works as a writer and software consultant in the San Francisco Bay Area. He wrote The JBuilder2 Bible and authored the Java/XML programming tutorial available at http://java.sun.com. Eric is also involved in efforts to design knowledge-based collaboration systems.

