In part 2 we wrote a set of characterization tests by hand. In part 3 we showed you how those tests helped you to catch unwanted or unintended changes in the behavior of the legacy code. In this part I want to have some fun and introduce you to a pet research project of mine: JUnit Factory.

JUnit Factory, JUF for short, is a free web-based characterization test generator for Java. You send JUF some Java code and, if it can make any sense out of it, it sends you back JUnit characterization tests.

JUF is one of Agitar Lab’s (http://www.agitar.com/company/agitarlabs.html) initiatives aimed primarily at test automation researchers and computer science students (it has already been used in a computer science class assignment at Carnegie-Mellon University), but anyone can use it (with the caveat that they have to be comfortable sending their code to the JUF servers over the Internet).

Before we jump into our example, I’d like be clear about one thing: I believe that developers should take responsibility for unit testing their code. Ideally, every piece of code, should be accompanied by unit tests. I see test automation tools playing a key role in making developers more efficient and effective at testing their own code, but I don’t believe in developers abdicating all responsibility for testing and expecting a push-button tool to do all their testing work for them (even if such a tool was possible – which it isn’t). In other words, I believe in the developer and the test automation tools working together, each doing what they are best at. This applies both to testing new code and legacy code.

I hope I can give you a flavor of what I mean by this developer + test automation tool cooperation with the examples that follow.

public class SalesUtil { final static double BQ = 10000.0; final static double BCR = 0.20; final static double OQM1 = 1.5; final static double OQM2 = OQM1 * 2; public static double calculateCommissionDue(double totSales) { if (totSales <= BQ) { return totSales * BCR; } else if (totSales <= BQ * 2){ return (BQ) * BCR + (totSales - BQ) * BCR * OQM1; } else { return (BQ) * BCR + (totSales - BQ) * BCR * OQM1 + (totSales - BQ * 2) * BCR * OQM2; } } }

public void testCalculateCommissionDue1() { assertEquals(200.0, SalesUtil.calculateCommissionDue(1000.0), 0.01); } public void testCalculateCommissionDue2() { assertEquals(5000.0, SalesUtil.calculateCommissionDue(20000.0), 0.01); } public void testCalculateCommissionDue3() { assertEquals(14000.0, SalesUtil.calculateCommissionDue(30000.0), 0.01); }

Let’s see what kind of characterization tests JUnit Factory comes up with. Since I have downloaded the Eclipse plug-in for JUF, all I have to do is press the JUF Generate Tests button and, a few seconds later, JUF sends back the following tests for the method calculateCommissionDue:

public void testCalculateCommissionDue() throws Throwable { double result = SalesUtil.calculateCommissionDue(9999.999); assertEquals("result", 1999.9998, result, 1.0E-6); } public void testCalculateCommissionDue1() throws Throwable { double result = SalesUtil.calculateCommissionDue(20000.001); assertEquals("result", 5000.0009, result, 1.0E-6); } public void testCalculateCommissionDue2() throws Throwable { double result = SalesUtil.calculateCommissionDue(10000.001); assertEquals("result", 2000.0003000000002, result, 1.0E-6); } public void testCalculateCommissionDue3() throws Throwable { double result = SalesUtil.calculateCommissionDue(10000.0); assertEquals("result", 2000.0, result, 1.0E-6); } public void testCalculateCommissionDue4() throws Throwable { double result = SalesUtil.calculateCommissionDue(20000.0); assertEquals("result", 5000.0, result, 1.0E-6); } public void testCalculateCommissionDue5() throws Throwable { double result = SalesUtil.calculateCommissionDue(19999.999); assertEquals("result", 4999.9997, result, 1.0E-6); } public void testCalculateCommissionDue6() throws Throwable { double result = SalesUtil.calculateCommissionDue(0.0); assertEquals("result", 0.0, result, 1.0E-6); }

Hey, these tests look pretty darn good – if I may say so myself – and the price is right. But, of course, I am biased. So let me tell you why I consider these generated tests to be useful, and also how they can be improved.

What I Like About The Generated Tests

For one thing, I like the fact that, in addition to testing for the basic values (e.g. 10000 and 20000), JUnit Factory applied boundary value analysis (a best practice in testing) and created test cases just above and below the boundary values (e.g. 9999.999 and 10000.001). I should have those values in my tests.

In addition to boundary value testing, JUF applied a classic testing heuristic and used 0.0 as an input value. I like this because: 1) using zero for a numerical input value is a always a good testing idea and, 2) because it got me thinking and helped me realize that the existing code will do the right thing with 0.0, but it will also gladly accept a negative number for totSales and will return a negative commission. Now, call me paranoid or over-protective, but this method begs for some input checking.

By looking at the tests, I also realize that with this code I have a real problem with fractional cents both in the input and the output. The double data type is not ideal for representing dollars. I knew this all along, but seeing code like the following,

assertEquals("result", 2000.0003000000002, result, 1.0E-6);

in the generated tests made the problem more real and the need for a proper solution more urgent

How The Generated Tests Could Be Even Better

I like the fact that the tests used 0.0 as an input. But why didn’t JUF use some negative values? I would have also liked to see some very large number; this would have helped me to think about putting a reality-check upper bound on the input – before we accidentally pay a commission of several million dollars.

Why doesn’t JUnit Factory generate the tests I just described? It’s not a technical problem. For us it’s very easy to add a heuristic to generate additional tests with negative and/or large input values for the totSales parameter. The answer is that we are trying to find the proper balance between bare minimum and overkill in the number and types of tests JUnit Factory generates. This is one of the aspects that makes it experimental, and there are other default behaviors that are up for debate. A few of the many things we are trying to decide are:

Should we make assertions on private fields? Some people believe that having to assert on private fields is an indication that there’s something wrong with your design. Others believe that testing trumps encapsulation.
If a method has an object parameter, should we always generate a test using a null value? Some think that testing some methods with null is a waste of time – at best: “A null will never make this far”. Others have seen too many unexpected NullPointerExceptions percolate up to the end-user, and believe that having such a test might help developers think more carefully about their null handling behavior.
If we can’t construct an object, should we automatically mock it? How far do we take mocking? Some believe that proper unit tests should make extensive use of mocks. Others believe that mocks are a weapon of last resort since they can hide serious problems between collaborating classes.

All good questions with strong arguments and proponents for both sides. Why don’t you give JUnit Factory a try yourself, with your own sample code, and let us know what you think. The simple web-based demo (http://www.junitfactory.com/demo/) gives you an opportunity to rate the generated tests (i.e. 1 to 5 stars) and also provide free-form text feedback. For the full JUnit Factory experience, you should download the Eclipse plug-in.

The way I see it, the best way to use automatically generated characterization tests is to consider them a starting point. These tests get my testing juices flowing and they make me think of cases I might have otherwise ignored. But, ultimately, I believe in taking control. Keep the test cases I like (and possibly edit them) and add my own. If some generated don’t make sense or don’t apply, I simply delete them from the set.

In this case, I combined my original tests with the generated tests and added a few of my own to characterize behavior for negative and very large input values. Below is the result:

public void testCalculateCommissionDue1() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(-10000.00);
    assertEquals("result", -2000.00, result, 0.01);
}

public void testCalculateCommissionDue2() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(-0.01);
    assertEquals("result", 0.0, result, 0.01);
}

public void testCalculateCommissionDue3() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(0.0);
    assertEquals("result", 0.0, result, 0.01);
}

public void testCalculateCommissionDue4() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(1000.00);
    assertEquals("result", 200.00, result, 0.01);
}

public void testCalculateCommissionDue5() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(9999.99);
    assertEquals("result", 2000.00, result, 0.01);
}

public void testCalculateCommissionDue6() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(10000.0);
    assertEquals("result", 2000.00, result, 0.01);
}

public void testCalculateCommissionDue7() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(10000.01);
    assertEquals("result", 2000.00, result, 0.01);
}

public void testCalculateCommissionDue8() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(19999.99);
    assertEquals("result", 5000.00, result, 0.01);
}

public void testCalculateCommissionDue9() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(20000.0);
    assertEquals("result", 5000.00, result, 0.01);
}

public void testCalculateCommissionDue10() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(20000.01);
    assertEquals("result", 5000.00, result, 0.01);
}

public void testCalculateCommissionDue11() throws Throwable {
    double result = SalesUtil.calculateCommissionDue(999999.99);
    assertEquals("result", 886999.99, result, 0.01);
}

Too many tests? Too few? Just right?

What about the pesky fractional cents problem? Is it acceptable to check the commission to the nearest cent?

Going forward, should we create a DollarAmount class instead of using a double type? Should we throw an exception for negative values or unrealistically large values?

As you can see tests, even those that are automatically generated, really help you think about current and potential problems. Sometimes they raise a lot of questions – good questions.

That’s it for this installment. I hope you found this detour into JUnit Factory interesting and that it motivated you to experiment with it yourself.

Also, if you have followed this series on characterization tests so far, please let me know what you think of it and what you’d like me to cover next.