The popular Java code analysis tool FindBugs reached its 1.1 release today. Artima asked project lead Bill Pugh about new features, about the types of bugs FindBugs can easily identify, as well as about the types of coding problems that are harder to automatically detect.
The FindBugs project has released version 1.1 of its popular Java static analysis tool. FindBugs looks for bug patterns in Java code that can arise for any of the following reasons:
Difficult language features
Misunderstood API methods
Misunderstood invariants when code is modified during maintenance
Garden variety mistakes: typos, use of the wrong boolean operator
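To make the last two categories concrete, here is a minimal Java sketch of a misunderstood API method and a wrong boolean operator, each next to a corrected version. The class and method names are hypothetical and do not come from the article or from FindBugs itself, and whether a particular FindBugs release reports these exact patterns is an assumption.

```java
// Hypothetical illustration class; not part of FindBugs.
final class BugPatternSketch {

    // Misunderstood API method: String is immutable, so trim() returns a
    // new string. The result is discarded here and the input comes back unchanged.
    static String normalizeBuggy(String input) {
        input.trim();
        return input;
    }

    // Corrected: use the value that trim() returns.
    static String normalizeFixed(String input) {
        return input.trim();
    }

    // Wrong boolean operator: '|' does not short-circuit, so s.length()
    // is evaluated even when s is null, and a NullPointerException follows.
    static boolean isBlankBuggy(String s) {
        return s == null | s.length() == 0;
    }

    // Corrected: '||' skips the right-hand side when s is null.
    static boolean isBlankFixed(String s) {
        return s == null || s.length() == 0;
    }
}
```

Both buggy methods compile without complaint, which is why this kind of mistake tends to survive until a tool or a failing run points it out.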
The FindBugs project notes that:
Static analysis means that FindBugs can find bugs by simply inspecting a program's code: executing the program is not necessary. This makes FindBugs very easy to use: in general, you should be able to use it to look for bugs in your code within a few minutes of downloading it. FindBugs works by analyzing Java bytecode (compiled class files), so you don't even need the program's source code to use it.
FindBugs project lead Bill Pugh noted in emails to Artima that the 1.1 release:
Reflects a lot of work over the summer to both find new bugs and reduce false positives. Among other things, FindBugs 1.1 now finds about twice as many null pointer bugs as FindBugs 1.0, without increasing the number of false positives...
We are [also] putting up Java Web Start versions of FindBugs preloaded with analysis results for Eclipse, NetBeans, GlassFish and JBoss... The Web Start versions not only show off a new GUI we wrote for FindBugs, but also provide source viewing for all the warnings so that you can see them in context.
The current release of FindBugs identifies the following types of coding problems:
Correctness bug: Probable bug - an apparent coding mistake resulting in code that was probably not what the developer intended. We strive for a low false positive rate.
Bad Practice: Violations of recommended and essential coding practice. Examples include hash code and equals problems, cloneable idiom, dropped exceptions, serializable problems, and misuse of finalize. We strive to make this analysis accurate, although some groups may not care about some of the bad practices.
Dodgy: Code that is confusing, anomalous, or written in a way that lends itself to errors. Examples include dead local stores, switch fall through, unconfirmed casts, and redundant null check of value known to be null. More false positives accepted. In previous versions of FindBugs, this category was known as Style.
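As an illustration of the "hash code and equals problems" listed under Bad Practice, the following sketch shows a class that overrides equals() without hashCode(), next to a version that overrides both. All names are hypothetical and the example is not taken from FindBugs or its documentation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of FindBugs.
final class HashCodeSketch {

    // Bad practice: equals() is overridden but hashCode() is not, so two
    // equal points usually land in different hash buckets.
    static final class BadPoint {
        final int x;
        final int y;

        BadPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof BadPoint)) return false;
            BadPoint p = (BadPoint) o;
            return x == p.x && y == p.y;
        }
        // hashCode() deliberately omitted to show the problem.
    }

    // Corrected: equal objects now produce equal hash codes.
    static final class GoodPoint {
        final int x;
        final int y;

        GoodPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof GoodPoint)) return false;
            GoodPoint p = (GoodPoint) o;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() { return 31 * x + y; }
    }

    // Looks up an equal-but-distinct GoodPoint in a HashSet.
    static boolean goodPointFound() {
        Set<GoodPoint> seen = new HashSet<>();
        seen.add(new GoodPoint(1, 2));
        return seen.contains(new GoodPoint(1, 2));
    }
}
```

With BadPoint, the same lookup would almost always fail even though equals() reports the two points as equal, because HashSet consults hashCode() first.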
While FindBugs now identifies more types of code defects than previous versions did, we asked Pugh what kinds of bugs are harder to automatically isolate with a tool:
It is easy to evaluate the false positives generated by a static analysis tool, harder to evaluate the false negatives. Many program errors are at a higher level—you implemented the wrong business logic—and finding these with static analysis is a much harder problem than what FindBugs tries to solve.
Right now, we don't really try to report index out of bounds errors. We plan to investigate whether those errors could be effectively found. I worked pretty seriously on array index analysis about a decade ago, and I know how hard it can be to solve exactly. The unanswered question is how many index error bugs can be found using simple techniques.
There are a lot of potential API specific bug patterns that we haven't really investigated yet. For example, I'm sure there are lots of Swing and J2EE bug patterns that we could find.
We don't do a lot of work to look for security problems related to untrusted data: stuff like cross-site scripting and SQL injection. You don't want to just find the security bugs that are easy to detect, you want to find as many of them as possible, and this requires deeper analysis than we support. Fortify Software, the sponsor of the FindBugs project, puts a lot of effort into being able to detect those kinds of bugs, and we are happy to delegate this problem area to them.
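For readers unfamiliar with the untrusted-data problems Pugh mentions, here is a minimal sketch of SQL injection using the standard JDBC API. The class name, method names, and the users table are hypothetical, and the example says nothing about how FindBugs or Fortify's tools actually analyze such code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical illustration class; the "users" table is an assumption.
final class UntrustedDataSketch {

    // Vulnerable: untrusted input is concatenated into the SQL text, so a
    // crafted value can change the structure of the query.
    static String unsafeQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    // Safer: the SQL text is fixed and the input is bound as a parameter,
    // so the driver treats it as data rather than as SQL.
    static PreparedStatement saferQuery(Connection conn, String userName)
            throws SQLException {
        PreparedStatement stmt =
                conn.prepareStatement("SELECT * FROM users WHERE name = ?");
        stmt.setString(1, userName);
        return stmt;
    }
}
```

Passing the input x' OR '1'='1 to unsafeQuery yields a WHERE clause that matches every row. In real code the untrusted value may pass through many methods before it reaches the query, which is one reason finding these bugs takes deeper analysis than local pattern matching.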
FindBugs shows the power of static analysis, but can we do better? What do you find missing from static analysis tools today? What do you think could be done with static analysis that isn't already being done?
I've tried FindBugs on a few small projects. I could not find any serious bugs that Eclipse (in REAL WARNING mode) could not find.
Since I use Eclipse to work on Java stuff, it would be useful to have FindBugs find serious stuff that Eclipse cannot find. (I'm sure there are lots of things that FindBugs can find that Eclipse 3.1 currently cannot.)
It would be nice if people could add a few lines to their build.xml (and a few .jars to their LIB) AND BORROW a config file from someone who has used (or uses) it on their live project to find hard-to-find bugs.
I understand that a lot of bugs are in user logic, validation and stuff like that. No, I don't expect FindBugs to help me find those. But if there are OTHER errors that FindBugs can find and IT DOES NOT take much effort on the build maintainer's part to add a decent FindBugs target to their weekly (or monthly) Ant build, then people might be tempted to!