The Artima Developer Community

James Gosling on Java, February 2002, Part II
A Conversation with Java's Creator, James Gosling
by Bill Venners
March 25, 2002

Page 3 of 5

Abstraction Versus Vagueness

Bill Venners: In a previous interview, I asked you how abstract we should make contracts when we're designing APIs. You said, "...as abstract as possible, because every commitment you make is a piece of flexibility that you've lost." Recently I encountered a problem involving a method called getBoolean in java.sql.ResultSet, which just retrieves a boolean value from a column. This method was being used in an API that I had purchased, and there was a bug. The code was calling getBoolean on an integer field in a PostgreSQL database, in which 0 was stored to mean false and 1 to mean true. The problem was that getBoolean was always returning false, even if the value in the database was 1.

If you look at the getBoolean contract, it just says it gets the column's value as a boolean. It doesn't say what to do if the database field is not actually a boolean, or if it is an integer field. The contract is abstract, but it's also vague. So I'm not sure where the bug is. It's probably in the driver. On the other hand, it may be a valid implementation to say, "Well, if it's not a boolean field because I don't have that in my database, I'll return false." The getBoolean contract doesn't disallow that interpretation.

James Gosling: I would guess that one was a bug in the driver, because that's a situation where it should have been tossing an exception.

Bill Venners: The getBoolean contract does say it should throw an SQLException if there's trouble accessing the database. In this case, though, there was no trouble accessing the database. The method could throw a runtime exception, but should it, and if so, which one?

Is there a difference between being abstract and being vague? Should you be abstract, but not vague? In this case, I don't think it makes sense to interpret the contract as saying it's OK to return false all the time. But the contract doesn't say that you can't do that, because it's abstract. This particular implementation of getBoolean in the PostgreSQL driver returns a boolean value. It just happens to always be false when the database field type is integer.

James Gosling: Yes. This is one place where it is the art of computer programming. You need to specify as much as necessary for people to be able to use it correctly, but you don't want to over-specify things. It's very hard to do complete specifications. Almost nobody actually does specifications that are close to accurate. The only one in the Java world that I think even comes close to rigorous completeness is the Java language spec, and that's probably because most of the words for the current edition came from Guy Steele and Gilad Bracha, who are well-known totally anal freaks.

It takes a very special mindset to write specifications, and even so, Guy still gets upset. He's always finding hidden vagueness. So, in some sense, vagueness is inescapable. We're human, and you always have to interpret anybody's documentation with a certain reasonable-person filter.

Bill Venners: You mean, I have to be a reasonable person when I interpret the contract?

James Gosling: You have to say, "What would the reasonable, correct interpretation of something be?" Invoking getBoolean on a field that happens to be an integer has to be wrong. Exactly how the system should respond to that is another question. People who actually implement it might have gone one way or another. But returning false always...

Bill Venners: ...is probably a bug.

James Gosling: I think anybody would agree it's the wrong thing to do.
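To make the getBoolean discussion concrete, here is a minimal sketch of a contract that stays abstract without being vague about non-boolean columns. Everything in it is an assumption made for illustration: the class StrictBooleanColumn and the helper toBoolean are hypothetical, they operate on an already-fetched column value rather than a real ResultSet, and the policy shown (map 0 and 1, reject everything else with an SQLException) is only one of the responses an implementer might reasonably choose. It is not the PostgreSQL driver's code.

```java
import java.sql.SQLException;

public class StrictBooleanColumn {

    /**
     * Converts a raw column value to a boolean.
     *
     * Contract (hypothetical, for illustration only):
     * - a Boolean is returned as-is;
     * - a number equal to 0 yields false, a number equal to 1 yields true;
     * - null (SQL NULL) yields false, as ResultSet.getBoolean documents;
     * - any other value causes an SQLException instead of a silent false.
     */
    static boolean toBoolean(Object columnValue) throws SQLException {
        if (columnValue == null) {
            return false;
        }
        if (columnValue instanceof Boolean) {
            return (Boolean) columnValue;
        }
        if (columnValue instanceof Number) {
            double n = ((Number) columnValue).doubleValue();
            if (n == 0) return false;
            if (n == 1) return true;
        }
        // Refuse to guess: the caller learns about the mismatch immediately.
        throw new SQLException("Cannot convert column value to boolean: " + columnValue);
    }

    public static void main(String[] args) throws SQLException {
        System.out.println(toBoolean(1)); // prints "true"
        System.out.println(toBoolean(0)); // prints "false"
        try {
            toBoolean(7);
        } catch (SQLException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The contract in the comment says what happens for each kind of input, but it does not say how the conversion is carried out, so an implementer keeps the freedom to change the mechanics later.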

On the other hand, you can also over-specify in subtle ways. For example, if the contract said, "getBoolean returns a new Boolean object that tells you whether the result is true or false," that has a piece of over-specification in it that can be damaging. It says it returns a new Boolean object. There are really only two values for Booleans in the world, true and false. So you can just have two Booleans in the entire universe, and you return a reference to one or the other. But all too often, because you have the word "new" there, you actually have to construct a new one. So, you have thousands or millions of instances of things that are all true Boolean objects.

Bill Venners: Or in my case, all false.

James Gosling: And that is an issue because the objects, besides having a value, also have an identity. So when you say, "This returns a new Boolean," you're also promising that each returned Boolean object has an identity distinct from any other identity. And that forces you into an implementation that consumes much more memory than necessary.
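The identity point can be seen in a small sketch. The Flag class and the two factory methods below are hypothetical, invented only to contrast the two wordings of a contract; the last line uses the real Boolean.valueOf, which is documented to return the shared Boolean.TRUE or Boolean.FALSE instance.

```java
public class BooleanIdentity {

    // Hypothetical value class, used only for illustration.
    static final class Flag {
        static final Flag TRUE = new Flag(true);
        static final Flag FALSE = new Flag(false);

        private final boolean value;

        Flag(boolean value) {
            this.value = value;
        }

        boolean value() {
            return value;
        }
    }

    // Contract A (over-specified): "returns a new Flag".
    // The word "new" promises a distinct identity, so every call must allocate.
    static Flag newFlag(boolean value) {
        return new Flag(value);
    }

    // Contract B: "returns a Flag representing the given value".
    // Nothing is promised about identity, so two shared instances suffice.
    static Flag sharedFlag(boolean value) {
        return value ? Flag.TRUE : Flag.FALSE;
    }

    public static void main(String[] args) {
        System.out.println(newFlag(true) == newFlag(true));                 // prints "false"
        System.out.println(sharedFlag(true) == sharedFlag(true));           // prints "true"
        System.out.println(Boolean.valueOf(true) == Boolean.valueOf(true)); // prints "true"
    }
}
```

Under contract A, a million calls produce a million objects that all mean the same thing; under contract B, the same million calls can hand back one of two references.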

Bill Venners: That makes sense. You have to find the right place where you're being as abstract as possible, but no more abstract than is appropriate.

James Gosling: It's like this old Einstein quote, "Everything should be as simple as possible, but no simpler."

Copyright © 1996-2014 Artima, Inc. All Rights Reserved.