Each line of Java code requires three or four lines of test code to achieve acceptable test coverage, according to Alberto Savoia. Savoia founded Agitar to help eliminate some of that developer tedium. In this interview with Artima, he explains the need for test automation, discusses testing trends, and shares his views on the future of unit testing.
Alberto Savoia is founder of Agitar Software, a company whose flagship product, Agitator, helps automate the creation and execution of unit tests. Prior to founding Agitar, Savoia worked on Google's ad server software, where he saw first-hand not only the benefits of good developer testing, but also the true effort involved in achieving and maintaining high code coverage.
In this interview, Savoia talks about various approaches to test automation, reducing the need for hand-written tests, major trends in developer attitudes to testing in the past five years, and his thoughts about the future of unit testing.
Frank Sommers: Why is there a need to automate unit testing beyond what JUnit and other testing tools already provide?
Alberto Savoia: When people start writing JUnit tests, they often realize that unit tests require maintenance just as much as the main code does. Roughly for every line of Java code, it takes between three and four lines of JUnit to achieve anything close to complete code coverage. That’s because testing is a combinatorial problem. Test code very quickly balloons, and few people want to write four hundred lines of JUnit code for a hundred lines of application code.
What happens is that people write only what are sometimes called the happy test cases, the normal cases. Tests for all the corner cases and conditions just don’t get done. While it’s great to involve developers in testing, if you don’t give them tools, it becomes very difficult to stick to good testing practices.
When I was at Google, I discovered that out of the one hundred people we got going on unit testing, only about thirty became test-infected and continued writing tests even after [the initial period]. The other seventy just went back to their old ways. They told me that while they realized that testing added value, writing and maintaining tests took a lot of time. We started Agitar to address that problem.
Some things humans do very well, but other things computers are better at—exhaustive searches, combinatorial explosion problems. I firmly believe the best way to approach the unit testing problem is to have a tool that is the moral equivalent of the spreadsheet for accountants. You feed it the formulas, provide some input, and then automagically hundreds of thousands of calculations happen behind the UI. Then you can look and refine your assumptions and the test data until you like the results.
Frank Sommers: How does Agitator help automate testing?
Alberto Savoia: Agitator performs a two-step process. First, it exercises the code, and tries to get every outcome, cover every line, every branch, every conditional statement. Then it presents you with what we call observations: properties of the code that are always true. They are also called class invariants or method invariants.
For example, if you have a checking account, one of the invariants that we may discover is that the account balance is always greater than zero. The developer can look at that observation and decide if the observation represents what the code should do. When executing the code, say, five hundred times, if the bank account always had a balance greater than zero, then [the developer] can decide whether to make that an assertion.
JUnit, too, has the concept of assertions, but Agitator’s concept of assertions is a bit different. In JUnit, assertions are things that must be true for a particular test case. Agitator assertions are properties [of the code] that must always be true.
To illustrate the difference, suppose you have a method called deposit(). To write a JUnit test for this [method], you could create a bank account with an initial balance of $500, and then you’d deposit $100, and write a statement such as assertEquals(600, bank.balance). That’s a valuable test, but it only tests a particular case. Agitator, by contrast, will try to create bank accounts with all possible legal values—one with a balance of $0, one with $0.1, one with $1,000,000—and then it would try to deposit $0, a negative number, $1,000,000, and a number of values in between. Computers are pretty tireless in coming up with crazy inputs and test cases that a human would normally not consider.
You are no longer focusing on just one test case, and Agitator comes out with a more generic assertion that must always be true: For instance, that whenever you make a deposit and an exception is not thrown, the balance of the account is equal to the previous balance amount plus the amount you deposited. It covers a much broader range of test values.
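The contrast Savoia describes can be sketched in plain Java. The BankAccount class and the input values below are invented for illustration (Agitator generates its inputs automatically; here a small hand-written sweep stands in for them), but the two styles of check match the interview: one exact-value test for one case, then one invariant checked across many generated inputs.

```java
// Hypothetical BankAccount, balances in cents to avoid floating point.
class BankAccount {
    private long balanceCents;
    BankAccount(long initialCents) {
        if (initialCents < 0) throw new IllegalArgumentException("negative balance");
        balanceCents = initialCents;
    }
    long balance() { return balanceCents; }
    void deposit(long amountCents) {
        if (amountCents <= 0) throw new IllegalArgumentException("non-positive deposit");
        balanceCents += amountCents;
    }
}

public class DepositInvariantDemo {
    public static void main(String[] args) {
        // JUnit style: one specific case, one expected value.
        BankAccount acct = new BankAccount(50_000);   // $500.00
        acct.deposit(10_000);                          // deposit $100.00
        if (acct.balance() != 60_000) throw new AssertionError("expected $600.00");

        // Agitator style: an invariant over many inputs -- whenever
        // deposit() does not throw, new balance == old balance + amount.
        long[] initials = {0, 1, 100_000_000};
        long[] deposits = {-500, 0, 1, 100_000_000};
        for (long init : initials) {
            for (long amt : deposits) {
                BankAccount a = new BankAccount(init);
                long before = a.balance();
                try {
                    a.deposit(amt);
                    if (a.balance() != before + amt)
                        throw new AssertionError("invariant violated");
                } catch (IllegalArgumentException expected) {
                    // illegal input rejected; the invariant holds vacuously
                }
            }
        }
        System.out.println("all checks passed");
    }
}
```

The invariant says nothing about any particular dollar amount, which is why it covers a much broader range of values than the single assertEquals.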
People also use this a lot for exploratory testing: discovering test cases that you didn’t think of when writing the code. If a developer doesn’t think of somebody possibly writing a check for a negative amount, you may end up depositing that check. With Agitator, you get to see all of the ranges that happen when you exercise those values. So you may see there is a test case that you didn’t consider, and you can then go and fix that.
Ideally, people should do code reviews to discover those sorts of problems. We know that code reviews and pair programming are the most efficient ways of finding problems of that kind. In reality, I know very few people who actually do code reviews. People say, “Yes, we should do code reviews,” everybody gets excited, they do one or two, and then realize that these are long and boring meetings, and then they drop them. Agitator is like a mini test partner—you write a piece of code, press an “agitate” button, and Agitator comes up with all these test cases and shows you possibilities that you can then consider. It helps you do code reviews without meetings.
Frank Sommers: To what extent can we automate testing? There are skeptics who believe that automated tests might miss some conditions that a developer can take into account when writing tests by hand.
Alberto Savoia: Some people still believe all tests should be written manually, and that you don’t want a tool to generate the tests for you because if you don’t write your own tests, then you’re not a good developer: You’re sending your laundry to the laundromat instead of doing it by hand.
Some people call them the Amish testers: they refuse to use any technology. I have respect for the Amish, but when it comes to testing, I think some automation is in order. But I agree with them in one respect: You don’t want to abdicate all responsibility and just press a button and have tests generated. You still need to be involved and the tool merely amplifies your efforts. I call that test amplification.
Let’s take an example. In order to test a bank account object, you need to create a bank account. And in order to create a bank account, you may need to know some magic number, a valid social security number, for instance. The tool can’t figure out what that magic number is, so in the worst case it generates zero-percent coverage. You can then write a test helper, three or four lines of code, that creates a valid social security number, for instance. Then you run the tests again, and now you get a hundred percent coverage, and maybe the tool generated four hundred lines of JUnit tests for your hundred-line Java class. So what you’re doing is a collaboration between the human and the machine that allows you to be very efficient.
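A minimal sketch of that helper idea, with everything invented for illustration: the CustomerAccount class, its SSN format check, and the value "123-45-6789" are all hypothetical. The point is that the three-line hand-written factory encodes the one piece of domain knowledge a generator cannot guess, after which generated tests can get past the constructor.

```java
// Hypothetical class whose constructor demands a "magic number".
class CustomerAccount {
    private final String ssn;
    CustomerAccount(String ssn) {
        // Random generated strings fail this check, yielding zero coverage
        // of everything behind the constructor.
        if (ssn == null || !ssn.matches("\\d{3}-\\d{2}-\\d{4}"))
            throw new IllegalArgumentException("invalid SSN");
        this.ssn = ssn;
    }
    String owner() { return ssn; }
}

public class SsnHelperDemo {
    // The short hand-written test helper: supplies a well-formed (made-up)
    // SSN so the tool can construct accounts and exercise the rest of the class.
    static CustomerAccount newTestAccount() {
        return new CustomerAccount("123-45-6789");
    }

    public static void main(String[] args) {
        // Without the helper: a naive generated input is rejected.
        boolean rejected = false;
        try { new CustomerAccount("xyz"); }
        catch (IllegalArgumentException e) { rejected = true; }

        // With the helper: the object is constructed and can be agitated.
        CustomerAccount acct = newTestAccount();
        System.out.println("naive input rejected: " + rejected);
        System.out.println("helper account owner: " + acct.owner());
    }
}
```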
Frank Sommers: Many developers think that tests should be written before the application code. How can Agitator work in test-first style development?
Alberto Savoia: First, I don’t think most developers think in terms of test-first. I think it’s a great idea. But if you go out and look at even people who are test-infected, I’d say it’s a single-digit percentage that writes tests first. In the Silicon Valley, we have a really skewed view of the world. Most of our customers are at financial institutions or insurance companies, for example, who just want to get the job done, and not all of them are the types who spend their nights at Borders Books to look for the latest JUnit and test practices books.
But if you do want to do test-first development with Agitator, you can just create a skeleton class, just the name of the class. We need something onto which to append the test assertions. Then you can write an assertion that says account.balance must be greater than zero. That’s going to be [shown] in red at first, because it will be a syntax error—you haven’t created the code. Then you have to write the code until the assertions turn green. And then you proceed to add the next assertion.
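The test-first flow Savoia describes can be sketched without the tool itself: state the invariant first, then implement until it holds. The Account class below is hypothetical, and plain Java checks stand in for Agitator's red/green assertion feedback.

```java
// Step 2 of the flow: the class, implemented until the invariant holds.
// (Step 1 would be just the empty skeleton "class Account {}".)
class Account {
    private long balanceCents;
    Account(long initialCents) {
        // This guard is what makes the invariant below pass; it was
        // written in response to the failing (red) assertion.
        if (initialCents < 0) throw new IllegalArgumentException("negative balance");
        balanceCents = initialCents;
    }
    long balance() { return balanceCents; }
}

public class TestFirstDemo {
    public static void main(String[] args) {
        // The invariant written before the implementation existed:
        // an account's balance must never be negative.
        for (long init : new long[] {0, 1, 42, 1_000_000}) {
            if (new Account(init).balance() < 0)
                throw new AssertionError("invariant failed for " + init);
        }
        System.out.println("invariant holds");
    }
}
```

Once this assertion is green, the next assertion is added and the cycle repeats, as described above.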
Frank Sommers: You recently released AgitarOne, a server-based version of your product. Why the need for server-based agitation?
Alberto Savoia: In theory, you should run your tests every time you make a substantive change to the code. But once you have a lot of tests, they start to take longer and longer to run, and take up CPU cycles on your system. Our customers have asked us to take the load off the developer systems, and to have the possibility to run the tests on multiple servers, executing the tests concurrently.
In-house, we use AgitarOne on twelve CPUs. That makes running tests a fast and fun experience—you press a button, and within a short time you get a bunch of tests created and executed. A fully-loaded programmer costs a lot more than a fully loaded PC does. It’s a no-brainer.
This goes hand-in-hand with another new feature in AgitarOne: the ability to automatically generate JUnit test code. This is something our customers have requested, so that when you look at a snapshot [of assertions] created by Agitator, JUnit tests for those assertions can be automatically generated. And the JUnit test cases we generate are possibly the best JUnit tests for your code because they take into account all the corner cases in inputs I alluded to earlier.
Frank Sommers: You started Agitar in 2001. What would you consider as significant changes in the testing and code quality landscape in the intervening six years? Where do you think unit testing is heading?
Alberto Savoia: When we started the company in 2001, there were no books on JUnit, and no talks on JUnit or unit testing at JavaOne. At that time, if you asked developers if they unit tested their code, they’d say, “Forget about that, that’s what QA is for.”
In 2004, I gave the first presentation at JavaOne, which was the last presentation on the last day, and I thought that nobody would show up. I was surprised that the room was filled to capacity, and there was an overflow room. I did another presentation in 2006, this time with Kent Beck. They gave us a much better time-slot, and we had 1,400 people. There are now lots of books on unit testing. Every company we talk to either has a top-down initiative, or some rebel group inside the company wants to start unit testing.
So unit testing is becoming mainstream. When we started, the number of developers interested in unit testing was in the single-digit percentage. Now at least twenty or thirty percent of developers working in organizations want to do unit testing. It’s an inexpensive way of getting things done and improving code quality.
Going forward, I think there are three possibilities. The first is that this was just a flash in the pan, where people say, “Oh this was kind of cool, but let’s go back to the old way.” I hope that doesn’t happen. I don’t think it will.
A more likely scenario is that unit testing remains a niche thing, where we have thirty percent of companies, and thirty percent of developers doing it, but the majority still ignores it. While this is a likely scenario, it’s also not the one that we’d want. The scenario we want is where unit testing goes from being a minority practice to becoming a de facto standard.
For that to happen, though, you need companies behind that momentum, developing technologies, investing in it, and providing support. I’m a big believer in tools. The evolution of humanity is shaped by tools. There are a lot of desirable practices that would not have become the norm, had people not built the tools [to support them]. We all like to smell fresh and wear a clean shirt every day, but until you have washing machines and running water, and until other people around you start smelling fresh, it’s not going to become a mainstream practice. The same thing applies to keeping your code clean.
The tools will also cause people to write code that’s more testable. In the early days of the Web, everyone did Web pages their own way. As search engines started to play an increasingly important role, people learned that if they wrote pages a certain way, and put some meta-tags on the page, search engines could do a better job figuring out what the page is about. Similarly, developers are finding that if they write code according to certain patterns, and with testability in mind, automated test generation tools can do a better job and with less human intervention. We see that people are starting to refactor with testing in mind. Ugly code leads to ugly tests, so testing will make you even more motivated to write clean code.
Just started working at a company using Agitator, and am really impressed with it. I'm still struggling to integrate it into my JUnit-only background, and get a better feel for when/how to use one vs the other. This article helped to clarify some of that:
Agitator - invariants that must hold true all the time
JUnit - specific scenarios and outcomes
Both are good:
- locking down invariants
- ensuring specific scenarios give desired results
The Web page cited by Roland Pibinger gives many articles which describe what 'testability' is and why it is valuable. Readers should definitely examine the material referenced by Pibinger. Using Agitator after coding to generate unit tests will not result in better testability.
Worse, using Agitator after coding to generate the unit tests means that the developer has given up on the most important opportunity in writing unit tests - the opportunity to write the tests in advance of the code, as in test driven development. By allowing the tests to drive the design of the code, the developer can not only ensure better testability, but also ensure better code design.
Simply having automatically generated unit tests is not nearly as valuable as following a test driven development methodology.
Tools like the one Savoia's company developed can help the development process, but just using them to generate unit tests is not as helpful as it might first seem, which is why Savoia's comments about 'test amplification' are spot on.