Testing Multithreaded Java Code

An Interview with Coverity's Andy Chou

by Frank Sommers
May 21, 2008

In this interview from JavaOne 2008, Coverity chief scientist Andy Chou discusses why traditional unit tests don't often help in uncovering concurrency-related errors, and why a combination of static and dynamic analysis yields better results when testing multithreaded code.

The advent of multi-core CPUs has generated much discussion about ways to make it easier to write concurrent Java applications. While new programming techniques can help create better code, vast amounts of existing Java code will be, or already is, run in a multi-core environment.

Moving an existing Java application to a multi-core environment may reveal hard-to-detect concurrency-related bugs. In the interview that follows, Chou explains why such errors tend to slip past traditional unit tests, and how static and dynamic analysis can be combined to catch them:

Many developers rely on a unit testing tool, such as JUnit, to ensure the correctness of their code. Although unit testing is still very important in the presence of multithreaded code, concurrent applications have characteristics that you cannot control through the test input alone, and the input is typically all you vary when writing unit tests.

You have new kinds of defects in multithreaded code: concurrency defects simply don't exist in single-threaded code. Deadlocks occur, for example, when you have multiple threads, and each of those threads holds a lock that some other thread is waiting for. The threads are then waiting for each other in a circle. Race conditions are another kind of concurrency defect: they happen when multiple threads access the same piece of data without proper synchronization, and at least one of them modifies it.
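
The circular wait described above can be reproduced in a few lines. The following is a minimal sketch, not taken from the interview: the class name `CircularWaitDemo` and its helpers are hypothetical, and a timed `tryLock` stands in for a plain `lock()` so that the demonstration terminates instead of hanging. Two latches force the unlucky timing that a real deadlock depends on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: two threads take the same two locks in opposite order.
class CircularWaitDemo {

    /** Returns how many of the two threads could not obtain their second lock. */
    static int run() {
        ReentrantLock lockA = new ReentrantLock();
        ReentrantLock lockB = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        Thread t1 = new Thread(() -> acquire(lockA, lockB, bothHoldFirst, bothAttempted, blocked));
        Thread t2 = new Thread(() -> acquire(lockB, lockA, bothHoldFirst, bothAttempted, blocked));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return blocked.get();
    }

    private static void acquire(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothAttempted,
                                AtomicInteger blocked) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();            // wait until the other thread holds its first lock
            // A plain second.lock() would block forever here; the timeout lets the demo finish.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();    // this thread is part of the circle
            }
            bothAttempted.countDown();
            bothAttempted.await();            // keep holding the first lock until both have tried
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads stuck waiting on each other: " + run());
    }
}
```

Without the latches, the two threads would usually run one after the other and the problem would stay hidden, which is exactly why such defects are easy to miss.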

Timing is very relevant for those errors to show up: deadlocks might not happen unless the timing for them to occur is just right. Race conditions are also very timing-dependent and, as a result, hard to reproduce.

For that reason, it's not easy to detect concurrency-related errors using a fixed test suite. You might be lucky and detect concurrency-related problems running unit tests, but doing that is neither easy nor reliable. Because concurrency-related problems are so hard to detect using traditional testing methods, such defects often manifest only in a production environment.
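
To illustrate why a fixed test is unreliable here, consider a small sketch (the class `LostUpdateDemo` is hypothetical and not from the interview). Several threads increment an unsynchronized counter and, for comparison, an `AtomicInteger`. The atomic counter always reaches the expected total; the plain counter may or may not, depending on how the threads happen to be scheduled in that run.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: a racy counter next to a thread-safe one.
class LostUpdateDemo {
    private int unsafeCount;                                   // ++ is a separate read, add, and write
    private final AtomicInteger safeCount = new AtomicInteger();

    /** Returns {unsafe total, safe total} after all worker threads have finished. */
    static int[] run(int threads, int incrementsPerThread) {
        LostUpdateDemo demo = new LostUpdateDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    demo.unsafeCount++;                        // race: updates can be lost
                    demo.safeCount.incrementAndGet();          // atomic: no update is lost
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { demo.unsafeCount, demo.safeCount.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 100_000);
        System.out.println("expected " + (4 * 100_000)
                + ", unsafe " + totals[0] + ", safe " + totals[1]);
    }
}
```

A unit test asserting that the unsafe total equals the expected value could pass on one machine and fail on another, or pass a hundred times and then fail once, without any change to the test input.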

Concurrency-related testing is not about varying the inputs [in tests], or about testing to ensure the correct output given some input. It's about how the system interleaves the execution of concurrent threads, and how those interleavings can cause strange behaviors.

Even in a single-threaded environment, there are often multiple paths through a program. The complexity of the number of different possible executions grows dramatically as you move to a multithreaded context. You have to think about the ways in which the multiple threads of execution can work with, or against, each other when they're accessing shared data.

I like to think of it as shuffling a deck of cards: you can shuffle the different threads together, and they then form a single execution, but the shuffle you get is different every time [the program runs].
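
The card-shuffling picture can be made concrete with a little arithmetic. As a simplifying assumption that is not part of the interview, treat each thread as a fixed sequence of atomic steps. Two threads with a and b steps can then be interleaved in C(a+b, a) ways, because a schedule is fully determined by choosing which of the a+b positions belong to the first thread. The hypothetical helper below computes that number.

```java
import java.math.BigInteger;

// Hypothetical helper: counts the schedules of two threads with a fixed number of atomic steps.
class InterleavingCount {

    /** Number of ways to interleave a steps of one thread with b steps of another: C(a+b, a). */
    static BigInteger interleavings(int a, int b) {
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= a; i++) {
            // After this iteration, result equals C(b+i, i); the division is always exact.
            result = result.multiply(BigInteger.valueOf(b + i)).divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(interleavings(2, 2));    // prints 6
        System.out.println(interleavings(10, 10));  // prints 184756
        System.out.println(interleavings(20, 20));  // prints 137846528820
    }
}
```

Even two threads of twenty steps each admit more than a hundred billion schedules, and a test run observes only one of them.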

In addition to unit testing, static analysis is another method to identify coding defects. Static analysis is great at covering all the different paths through the code and the entire code base, but it has a fixed level of precision, which is sometimes less than desirable. That's because static analysis lacks the information that is available only at runtime.

By contrast, dynamic analysis watches an execution, and during the execution of some code, it looks for potential defects, such as race conditions and deadlocks. While dynamic analysis only analyzes code that actually executed during testing or a manual run, it has very high precision, because it uses all the information available at runtime.
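
One well-known way a dynamic analysis can look for potential deadlocks is to record the order in which locks are acquired and then search for cycles in that order. The sketch below illustrates this general idea only; it is not a description of how Coverity's tool works. The class `LockOrderMonitor` and its methods are hypothetical, and the events are fed in by hand, whereas a real tool would obtain them by instrumenting the running program.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical monitor: builds a "held lock -> acquired lock" graph from observed events.
class LockOrderMonitor {
    private final Map<String, Set<String>> edges = new HashMap<>();
    private final Map<String, Deque<String>> heldByThread = new HashMap<>();

    /** Records that the given thread acquired the given lock. */
    synchronized void acquired(String thread, String lock) {
        Deque<String> held = heldByThread.computeIfAbsent(thread, t -> new ArrayDeque<>());
        for (String h : held) {
            if (!h.equals(lock)) {                       // ignore reentrant acquisition
                edges.computeIfAbsent(h, k -> new HashSet<>()).add(lock);
            }
        }
        held.push(lock);
    }

    /** Records that the given thread released the given lock. */
    synchronized void released(String thread, String lock) {
        Deque<String> held = heldByThread.get(thread);
        if (held != null) {
            held.remove(lock);
        }
    }

    /** True if the observed lock orders contain a cycle, i.e. a possible circular wait. */
    synchronized boolean hasPotentialDeadlock() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (cycleFrom(node, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; reaching a node that is still on the current path means a cycle.
    private boolean cycleFrom(String node, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (!visited.add(node)) {
            return false;
        }
        onPath.add(node);
        for (String next : edges.getOrDefault(node, Collections.emptySet())) {
            if (cycleFrom(next, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    public static void main(String[] args) {
        LockOrderMonitor monitor = new LockOrderMonitor();
        monitor.acquired("T1", "A");
        monitor.acquired("T1", "B");
        monitor.released("T1", "B");
        monitor.released("T1", "A");
        monitor.acquired("T2", "B");
        monitor.acquired("T2", "A");
        System.out.println(monitor.hasPotentialDeadlock());  // prints true
    }
}
```

In the usage shown in `main`, the two threads never actually block each other, yet the monitor still reports the inconsistent lock order. That is the appeal of watching an execution: a run that happened to succeed can still reveal a defect that a different timing would trigger.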

Static and dynamic analysis are very complementary techniques. When you put them together, they become more powerful than each would be by itself. That combination is especially useful in detecting concurrency-related coding errors.

In our most recent product, Coverity Thread Analyzer, we hooked our static and dynamic analysis tools together. The static analysis can provide hints to the dynamic analysis so you can avoid instrumenting certain locations of the code, and thereby avoid the slow-down that occurs with dynamic analysis. The static analyzer can also help identify what data is being shared and what is supposed to protect that data, so the dynamic analyzer knows what to watch for.

In the other direction, our dynamic analysis tool feeds information to the static analyzer, which then makes the static analysis more precise. We can infer some information at runtime about which locks are used in what way, and the static analysis can do a better job with that information.

Having that combined analysis really gives you the best of both worlds, and gives you benefits that neither has by itself. Running this combined static and dynamic analyzer helps you find defects that your regular testing would not necessarily find. 

Most dynamic analysis tools have a runtime memory overhead almost twice that of the original program. Profilers and some other tools have an even higher overhead, from ten to a hundred times [the program's]. Our combined dynamic and static analyzer has a very low overhead, so you can incorporate it into your testing environment easily and in a minimally intrusive fashion.

Another thing to note when testing concurrent code is that code coverage takes on a different meaning in that context. Providing good coverage is not just about executing your code base once, but about executing that code under different combinations of thread interleavings. Since it's hard to execute concurrent code in all possible combinations, an analysis tool that combines dynamic and static analysis can help shortcut that process and pinpoint errors even without you having to set up a massively concurrent testing environment.
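
To see what coverage over combinations means, take a toy model (an assumption for illustration, not something from the interview): two threads each increment a shared counter, and each increment consists of two atomic steps, a read and a write. The hypothetical class below enumerates every possible schedule of those four steps and counts how many of them lose an update.

```java
// Hypothetical explorer: enumerates all schedules of two non-atomic increments.
class InterleavingExplorer {

    /** Returns {number of schedules, number of schedules whose final counter is not 2}. */
    static int[] explore() {
        int[] counts = new int[2];
        step(0, 0, 0, 0, 0, counts);
        return counts;
    }

    // pcA and pcB are program counters: 0 = about to read, 1 = about to write, 2 = finished.
    private static void step(int shared, int pcA, int localA, int pcB, int localB, int[] counts) {
        if (pcA == 2 && pcB == 2) {
            counts[0]++;
            if (shared != 2) {
                counts[1]++;                                   // an increment was lost
            }
            return;
        }
        // Schedule thread A's next step, if it has one.
        if (pcA == 0) {
            step(shared, 1, shared, pcB, localB, counts);      // A reads the shared value
        } else if (pcA == 1) {
            step(localA + 1, 2, localA, pcB, localB, counts);  // A writes its value plus one
        }
        // Schedule thread B's next step, if it has one.
        if (pcB == 0) {
            step(shared, pcA, localA, 1, shared, counts);      // B reads the shared value
        } else if (pcB == 1) {
            step(localB + 1, pcA, localA, 2, localB, counts);  // B writes its value plus one
        }
    }

    public static void main(String[] args) {
        int[] counts = explore();
        System.out.println(counts[0] + " schedules, " + counts[1] + " lose an update");
        // prints "6 schedules, 4 lose an update"
    }
}
```

A single test run exercises every line of this code and so reaches full line coverage, yet it follows only one of the six schedules. Exhaustive enumeration like this is feasible only for tiny models, which is why tools that reason about interleavings without running all of them are attractive.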

About the author

Frank Sommers is Editor-in-Chief of Artima Developer. He also serves as chief editor of the IEEE Technical Committee on Scalable Computing's newsletter, and is an elected member of the Jini Community's Technical Advisory Committee. Prior to joining Artima, Frank wrote the Jiniology and Web services columns for JavaWorld.