Agitating Thoughts & Ideas
The Code C.R.A.P. Metric Hits the Fan - Introducing the crap4j Plug-in
by Alberto Savoia
October 2, 2007

Summary
After much talking, experimenting, and blogging, we implemented a prototype version of crap4j - a code Change Risk Analyzer and Predictor (i.e. CRAP) for Java. It is a tool to help you defend yourself against overly complex and untested code. Read all about it and find out how to download and install the crap4j Eclipse plug-in.

[Cue in sound of a giant can of worms being opened]

[Edit 10/18/07 - Given the warm reception and interest in Crap4j, we created a dedicated Crap4J website. Please use this website for getting plug-in updates, reporting bugs, and feature requests. Thanks, Alberto]

Disclaimer

The CRAP metric and the crap4j plug-in are, at this point, highly experimental in nature. The CRAP formula is based on what we believe are sound principles and ideas; but at this time the actual numerical values used in its calculation and interpretation should only be considered a starting point which we plan to refine as we gain experience. Of course, it's also possible that we'll "scrap the CRAP" if we determine that it's not useful.

If you are adventurous and care enough about the topic, read on, download the plug-in, run it on some code, and work with us to improve it (or at least give us feedback, good or bad). Otherwise, I suggest you check back in a few months - after we've gained some experience with this early prototype and done the first round of tuning and cleaning up.

Introduction

There is no fool-proof, 100% objective and accurate way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and empirical evidence – is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit a “This is crap!” response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into “Oh crap!”

Since writing automated tests (e.g., using JUnit) for complex code is particularly hard to do, crappy code usually comes with few, if any, automated tests. The presence of automated tests implies not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but it also means that the developers cared enough and had enough time to write tests – which is a good sign for the people inheriting the code.

Because the combination of complexity and lack of tests appears to be a good indicator of code that is potentially crappy – and a maintenance challenge – my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (CRAP) score uses cyclomatic complexity and code coverage from automated tests to help estimate the effort and risk associated with maintaining legacy code. We started working on an open-source experimental tool called “crap4j” that calculates the CRAP score for Java code. We need more experience and time to fine-tune it, but the initial results are encouraging and we have started to experiment with it in-house.

Crap4J is currently a prototype and it’s implemented as an Eclipse (3.2.1 or later) plug-in which finds and runs any JUnit tests in the project to calculate the coverage component. If you are interested in contributing to crap4j’s open-source effort to support other environments (e.g. NetBeans) and test frameworks (e.g. TestNG) and/or coverage tools (e.g. Emma) please let us know. Instructions for installing the crap4j plug-in are below, but first let’s re-introduce our first pass for the CRAP formula.

The CRAP Formula Version 0.1

Given a Java method m, CRAP for m is calculated as follows:

CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)

Where comp(m) is the cyclomatic complexity of method m, and cov(m) is the test code coverage provided by automated tests (e.g. JUnit tests, not manual QA). Cyclomatic complexity is a well-known and widely used metric and it’s calculated as one plus the number of unique decisions in the method. For code coverage we use basis path coverage. Low CRAP numbers indicate code with relatively low change and maintenance risk – because it’s not too complex and/or it’s well-protected by automated and repeatable tests. High CRAP numbers indicate code that’s risky to change because of a hazardous combination of high complexity and low, or no, automated test coverage.
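To make the arithmetic concrete, here is a minimal Java sketch of the Version 0.1 formula. The class and method names (CrapScore, crap) are made up for illustration and are not part of crap4j. The sketch assumes the cyclomatic complexity and the coverage percentage have already been measured by some other tool.

```java
public final class CrapScore {
    private CrapScore() {}

    /**
     * CRAP(m) = comp(m)^2 * (1 - cov(m)/100)^3 + comp(m)
     *
     * @param complexity      cyclomatic complexity of the method (1 or more)
     * @param coveragePercent automated test coverage of the method, 0 to 100
     */
    public static double crap(int complexity, double coveragePercent) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be >= 1");
        }
        if (coveragePercent < 0.0 || coveragePercent > 100.0) {
            throw new IllegalArgumentException("coverage must be in [0, 100]");
        }
        // Fraction of the method that the automated tests do NOT cover.
        double uncovered = 1.0 - coveragePercent / 100.0;
        double complexitySquared = (double) complexity * complexity;
        return complexitySquared * Math.pow(uncovered, 3) + complexity;
    }

    public static void main(String[] args) {
        System.out.println(crap(1, 100.0));  // prints 1.0
        System.out.println(crap(10, 50.0));  // prints 22.5
        System.out.println(crap(100, 0.0));  // prints 10100.0
    }
}
```

Note how the coverage term is cubed: with full coverage the first term vanishes and the score collapses to the complexity itself, while with no coverage the score grows with the square of the complexity.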

Generally speaking, you can lower your CRAP score either by adding automated tests or by refactoring to reduce complexity. Preferably both; and it’s a good idea to write the tests first so you can refactor more safely.

Like all software metrics, CRAP is neither perfect nor complete. We know very well, for example, that you can have great code coverage and lousy tests. In addition, sometimes complex code is either unavoidable or preferable; there might be instances where a single higher complexity method might be easier to understand than three simpler ones. We are also aware that the CRAP formula doesn’t currently take into account higher-order, more design-oriented metrics that are relevant to maintainability (such as cohesion and coupling). Since the perfect software metric does not exist and, regardless of design issues, overly complex methods and lack of tests are usually bad things, we decided that – even in its current state – the crap4j metric provides useful information that we should start experimenting with. This way we have something concrete to give us experience and data for further refinement.

You can get more details on the hows and whys of the current CRAP formula, along with other people’s opinions on it, by checking out the original blog:

http://www.artima.com/weblogs/viewpost.jsp?thread=210575

Interpreting Crap4j Results

For a given method, the CRAP number ranges from 1 (for a method of complexity 1 and 100% code coverage) to a very large number (e.g. a method of complexity 100 with 0% code coverage – we have seen such beasts – would score 10,100).

Individual Method Interpretation:

Bob Evans and I have looked at a lot of examples (using our code and many open source projects) and listened to a LOT of opinions. After much debate, we decided to INITIALLY use a CRAP score of 30 as the threshold for crappiness. Below is a table that shows the amount of test coverage required to stay below the CRAP threshold based on the complexity of a method:

Method’s Cyclomatic Complexity        % of coverage required to be
                                      below CRAPpy threshold
------------------------------        --------------------------------
0 – 5                                   0%
10                                     42%
15                                     57%
20                                     71%
25                                     80%
30                                    100%
31+                                   No amount of testing will keep methods
                                      this complex out of CRAP territory.

In other words, you can have high-complexity methods BUT you better have a lot of tests for them. Please note that we don’t recommend having zero tests – even for simple methods of complexity 5 and below. Every piece of code that does something non-trivial should have some tests but – after many discussions – we believe that the CRAP metric will be more useful if it highlights the truly crappy and risky code, not just the code that could be made better by refactoring or adding some tests.
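The coverage needed for a given complexity can be estimated by solving the CRAP formula for cov(m), which gives cov = 100 * (1 - cbrt((T - comp) / comp^2)) for a threshold T. The Java sketch below does just that. The names CrapThreshold and breakEvenCoverage are hypothetical, not part of crap4j. It returns the break-even coverage at which the score exactly equals the threshold; any lower coverage pushes the score above it. The results are unrounded and derived directly from the formula, so they need not match every row of the table above.

```java
import java.util.OptionalDouble;

public final class CrapThreshold {
    private CrapThreshold() {}

    /**
     * Coverage percentage at which CRAP(m) equals the given threshold.
     * Returns an empty result when the complexity alone already exceeds
     * the threshold, since no amount of coverage can help in that case.
     */
    public static OptionalDouble breakEvenCoverage(int complexity, double threshold) {
        if (complexity < 1) {
            throw new IllegalArgumentException("complexity must be >= 1");
        }
        if (complexity > threshold) {
            return OptionalDouble.empty();
        }
        // From comp^2 * u^3 + comp = T, where u is the uncovered fraction.
        double ratio = (threshold - complexity) / ((double) complexity * complexity);
        if (ratio >= 1.0) {
            // Even with 0% coverage the score does not exceed the threshold.
            return OptionalDouble.of(0.0);
        }
        return OptionalDouble.of(100.0 * (1.0 - Math.cbrt(ratio)));
    }

    public static void main(String[] args) {
        for (int c : new int[] {5, 10, 20, 25, 30, 31}) {
            OptionalDouble cov = breakEvenCoverage(c, 30.0);
            String text = cov.isPresent()
                    ? String.format("%.1f%%", cov.getAsDouble())
                    : "unreachable";
            System.out.println("complexity " + c + " -> " + text);
        }
    }
}
```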

Aggregate (Project-Level) Interpretation:

At the project level, we report the percentage of methods above the CRAP threshold (i.e., all methods with a CRAP score of 30 or higher). We understand that nobody is perfect and that in some cases people have good excuses for having a few methods that are more complex, or have fewer tests, than the ideal. Project wide, we allow up to 5% crappy methods before labeling the entire project as crappy. Some people will think that this is too generous (Bob and I think it is), while others will think that it’s too draconian – as we gain experience we’ll adjust accordingly or let people set their own thresholds.
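The project-level rule can be sketched in a few lines of Java. This is only an illustration of the rule as described here, not crap4j's actual code: the names ProjectCrap, crappyPercent, and isProjectCrappy are hypothetical, and the sketch assumes the per-method CRAP scores have already been computed. Following the text, a method counts as crappy at a score of 30 or higher, and a project counts as crappy once more than 5% of its methods are.

```java
import java.util.List;

public final class ProjectCrap {
    /** Per-method threshold: a score of 30 or higher counts as crappy. */
    public static final double CRAP_THRESHOLD = 30.0;
    /** Project-level allowance: up to 5% crappy methods is tolerated. */
    public static final double MAX_CRAPPY_PERCENT = 5.0;

    private ProjectCrap() {}

    /** Percentage of methods whose CRAP score is at or above the threshold. */
    public static double crappyPercent(List<Double> methodScores) {
        if (methodScores.isEmpty()) {
            return 0.0; // assumption: a project with no methods is not crappy
        }
        long crappy = methodScores.stream()
                .filter(score -> score >= CRAP_THRESHOLD)
                .count();
        return 100.0 * crappy / methodScores.size();
    }

    /** True when the share of crappy methods exceeds the 5% allowance. */
    public static boolean isProjectCrappy(List<Double> methodScores) {
        return crappyPercent(methodScores) > MAX_CRAPPY_PERCENT;
    }
}
```

Keeping the two thresholds as named constants makes it easy to experiment with stricter or looser values, which is the kind of tuning we expect to do as we gain experience.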

CRAP Load:

Crap4j also reports CRAP load, an estimate of the amount of work required to address crappy methods. It takes into account the amount of testing (with a small refactoring component) required for bringing a crappy method back into non-CRAP territory. Broadly speaking, a CRAP load of N indicates that you have to write N tests to bring the project below the acceptable CRAP threshold. The CRAP load metric is even more experimental than the rest, so we will not spend too much time on it at this point. I’ll blog more about it in the future – after getting some experience with it.

That’s it for now. If this is of interest to you, it’s time to download it and start experimenting. Let us know what you think, how you’d change the metrics, improve the plug-in, etc.

Download and Installation Instructions

If you already have Eclipse 3.2.1 or later (right now we don't support earlier versions), you can install the plug-ins from our update site at

http://www.junitfactory.com/crap4j/update/

Important Note: This prototype version of Crap4j uses JUnit Factory’s test runner because the default JUnit runner does not have the built-in code coverage information that is needed to calculate the CRAP score. JUnit Factory (www.junitfactory.com) is Agitar Labs’ free experimental web-based JUnit generator (originally meant for open-source projects, students, and test automation researchers, but open to anyone who does not mind sending their bytecode over the Internet). JUnit Factory’s test generation services are not downloaded, and are not needed to run crap4j. If you want to try JUnit Factory, use the update site: http://www.junitfactory.com/update/. Going forward, we will consider supporting other open-source coverage tools such as Emma – if you want to help with that and contribute to crap4j’s open-source effort, let us know.

Follow the steps below to install crap4j and the JUnit Factory runner with built-in code coverage.

1. In Eclipse, select Help > Software Updates > Find and Install
2. Choose Search For New Features to Install and select Next
3. Select New Remote Site
4. Enter a name for this server: Crap4J
5. Enter (or copy/paste) this url: http://www.junitfactory.com/crap4j/update/
6. Install all plug-ins in the Crap4j category and restart Eclipse

Usage Instructions:

Once installed, you should see a distinctive – if not exactly tasteful – toilet-paper crap4J icon in the Eclipse toolbar. Select an open Eclipse project (i.e. click on the top-level project icon) and click on the crap4j icon. Crap4j will automatically identify and run all the JUnit tests in the project, record the coverage information, and calculate the cyclomatic complexity for each method. After it’s done (it may take a while if you have a lot of tests to run), it will display the results in a new window. The results page shows high-level information and has links to more detailed pages (e.g. all methods sorted by complexity, coverage, CRAP, or CRAP load).

Remember that this is a prototype implementation intended primarily for our research. As such, we've only done limited testing using Eclipse 3.2.

Summary

Crap4j is the prototype for a planned open-source implementation of the CRAP metric.

Crap4j is currently implemented as an Eclipse plug-in downloadable from:

http://www.junitfactory.com/crap4j/update/

We plan to evolve and refine both the CRAP metric and Crap4j based on user feedback and experience using an open-source model both for the formula and the various implementations. We are getting things started with an implementation for Java/JUnit (done as an Eclipse plug-in), but it would be great if people are willing to port it to other languages, IDEs, and testing frameworks.

About the Blogger

Alberto Savoia is founder and CTO at Agitar Software, and he has been a life-long agitator and innovator in the area of software development and testing tools and technology. Alberto's software products have won a number of awards including: the JavaOne's Duke Award, Software Development Magazine's Productivity Award, Java Developer Journal's World Class Award, and Java World Editor's Choice Award. His current mission is to make developer unit testing a broadly adopted and standard industry practice rather than a rare exception. Before Agitar, Alberto worked at Google as the engineering executive in charge of the highly successful and profitable ads group. In October 1998, he cofounded and became CTO of Velogic/Keynote (NASD:KEYN), the pioneer and leading innovator in Internet performance and scalability testing. Prior to Velogic, Alberto had a 13-year career at Sun Microsystems where his most recent positions were Founder and General Manager of the SunTest business unit, and Director of Software Technology Research at Sun Microsystems Laboratories.

