Agitating Thoughts & Ideas
Pardon My French, But This Code Is C.R.A.P. (2)
by Alberto Savoia
July 19, 2007

Summary
Part II - We let the C.R.A.P. out of the bag. We explain our thinking behind the first version of the CRAP metric and then unveil the formula.

In Part I, we noted that there’s a lot of crappy code in the world. We also noted that most developers don’t consider their own code to fall into that category. Finally, we talked about the fact that, as long as the application does what it’s supposed to do, most software organizations don’t seem to worry too much about bad code – until the code has to be maintained or enhanced and the original developers have skipped town.

From a developer’s point of view, crap is other people’s code that they have to maintain. Unfortunately, most developers spend most of their career working with code that they have not written themselves – code that they consider crappy because they find it nasty, ugly, convoluted, hard-to-understand, and hard to work with.

Crappy code is a big problem for software organizations; so my AgitarLabs’ colleague, Bob Evans, and I decided to do some research to see if the level of crappiness in the code could be analyzed and quantified in some useful form; which brings us to the C.R.A.P. index.

The C.R.A.P. (Change Risk Analysis and Predictions) index is designed to analyze and predict the amount of effort, pain, and time required to maintain an existing body of code.

Below is some of our thinking behind the C.R.A.P. index:

[] We believe that software metrics, in general, are just tools. No single metric can tell the whole story; it’s just one more data point. Metrics are meant to be used by developers, not the other way around – the metric should work for you, you should not have to work for the metric. Metrics should never be an end unto themselves. Metrics are meant to help you think, not to do the thinking for you.

[] We believe that, in order to be useful and become widely adopted, a software metric should be easy to understand, easy to use, and – most importantly – easy to act upon. You should not have to acquire a bunch of additional knowledge in order to use a new metric. If a metric tells you that your inter-class coupling and coherence score (I am making this up) is 3.7, would you know if that’s good or bad? Would you know what you need to do to improve it? Are you even in a position to make the kind of deep and pervasive architectural changes that might be required to improve this number?

[] We believe that the formula for the metric, along with various implementations of the software to calculate the metric should be open-source. We will get things started by hosting a Java implementation of the C.R.A.P. metric (called crap4j) on SourceForge.

[] The way we design, develop, and deploy software changes all the time. We believe that with software metrics, as with software itself, you should plan for, and expect, changes and additions as you gain experience with them. Therefore the C.R.A.P. index will evolve and, hopefully, improve over time. In that spirit, what we present today is version 0.1 and we solicit your input and suggestions for the next version.

[] We believe that a good metric should have a clear and very specific purpose. It should be optimized for that purpose, and it should be used only for that purpose. The more general and generic a metric is, the weaker it is. The C.R.A.P. index focuses on the risk and effort associated with maintaining and changing an existing body of code by people other than the original developers. It should not be abused or misused as a proxy for code quality, evaluating programmers’ skills, or betting on a software company’s stock price.

[] Once the objective for the metric is established, the metric should be designed to measure the major factors that impact that objective and encourage actions that will move the code closer to the desired state with respect to that objective. In the case of C.R.A.P., the objective is to measure and help reduce the risks associated with code changes and software maintenance – especially when such work is to be performed by people other than the original developers. Based on our initial studies and research on metrics with similar aims (e.g., the Maintainability Index from CMU’s Software Engineering Institute) we decided that the formula for version 0.1 of the C.R.A.P. index should be based on method complexity and test coverage.

[] There are always corner cases, special situations, etc., and any metric might misfire on occasion. For example, C.R.A.P. takes into account complexity because there is good research showing that, as complexity increases, the understandability and maintainability of a piece of code decreases and the risk of defects increases. This suggests that measuring code complexity at the method/function level and making an effort to minimize it (e.g. through refactoring) is a good thing. But, based on our experience, there are cases where a single method might be easier to understand, test, and maintain than a refactored version with two or three methods. That’s OK. We know that the way we measure and use complexity is not perfect. We have yet to find a software metric that’s right in all cases. Our goal is to have a metric that’s right in most cases.

Let The C.R.A.P. Out Of The Bag

Now that you have some insight into our thinking, beliefs, and preferences, it’s time to unveil version 0.1 of C.R.A.P. for Java.

Given a Java method m, C.R.A.P. for m is calculated as follows:

C.R.A.P.(m) = comp(m)^2 * (1 – cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known, and widely used metric and it’s calculated as one plus the number of unique decisions in the method. For code coverage we uses basis path coverage.

Low C.R.A.P. numbers indicate code with relatively low change and maintenance risk – because it’s not too complex and/or it’s well-protected by automated and repeatable tests. High C.R.A.P. numbers indicate code that’s risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage to make sure you have not introduced any unintentional changes. We’ll cover in much more detail how the C.R.A.P. score should be interpreted and applied in Part III.

As the code coverage approaches 100%, the formula reduces to:

C.R.A.P.(m) = comp(m)

In other words, the change risk is linearly coupled with the complexity. We had a lot of heated discussions about this. Some very smart people we talked with believe that if you have 100% test coverage, your risk of change should be 0. Bob and I, however, believe that more complex code, even if it’s fully protected by tests, is usually riskier to change than less complex code. If you have 100% test coverage and your C.R.A.P. index is still too high, you should consider some refactoring (e.g., method extraction).

As the code coverage approaches 0%, the formula reduces to:

C.R.A.P.(m) = comp(m)^2 + comp(m)

In other words, if you have no tests, your change risk increases, roughly, as the square of the method complexity. This indicates that it’s time to write some tests.

Generally speaking, you can lower your C.R.A.P. index either by adding automated tests or by refactoring to reduce complexity. Preferably both (and it’s a good idea to write the tests firsts so you can refactor more safely).

Conclusion And Call For Feedback

We believe that unit/developer testing and refactoring are good practices and since either of these activities improves the C.R.A.P. index, we believe that we are on the right path – though we know there is still a LOT of room for improvement.

Software metrics have always been a very touchy topic; they are perfect can-of-worms openers and an easy target. When we started this effort, we knew that we’d be in for a wild ride, a lot of criticism, and lots of conflicting opinions. But I am hopeful that – working together and with an open-source mindset – we can fine tune the C.R.A.P. index and have a metric to will help reduce the amount of crappy code in the world.

OK. Time for some feedback – preferably of the constructive type so that C.R.A.P. 0.2 will be better than C.R.A.P. 0.1.

Preview of Part III

In Part III, I am going to spend some time discussing how to interpret and use the C.R.A.P. index. Is there a threshold where a piece of code turns into crappy code? If some code will never be changed, do you care what its C.R.A.P. index is? We are going to explore these and other interesting questions. I hope you come back for it.

Talk Back!

Have an opinion? Readers have already posted 36 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Alberto Savoia adds a new entry to his weblog, subscribe to his RSS feed.

Digg |

del.icio.us |

About the Blogger

Alberto Savoia is founder and CTO at Agitar Software, and he has been life-long agitator and innovator in the area of software development and testing tools and technology. Alberto's software products have won a number of awards including: the JavaOne's Duke Award, Software Development Magazine's Productivity Award, Java Developer Journal's World Class Award, and Java World Editor's Choice Award. His current mission is to make developer unit testing a broadly adopted and standar industry practice rather than a rare exception. Before Agitar, Alberto worked at Google as the engineering executive in charge of the highly successful and profitable ads group. In October 1998, he cofounded and became CTO of Velogic/Keynote (NASD:KEYN), the pioneer and leading innovator in Internet performance and scalability testing. Prior to Velogic, Alberto had 13-year career at Sun Microsystems where his most recent positions were Founder and General Manager of the SunTest business unit, and Director of Software Technology Research at Sun Microsystems Laboratories.


	Web Artima.com