Java's creator James Gosling talks with Bill Venners about many topics, including complexity, simplicity, inheritance, composition, JSPs, servlets, and more.
On Thursday, May 10, 2001 I pedaled my bicycle up to Sun Labs in Mountain View, California to interview Sun Microsystems Vice President and Fellow James Gosling at his Sun Labs office. In this interview, Gosling gives his thoughts on complexity and simplicity, inheritance and composition, JSPs and servlets, community design processes, and more.
Bill Venners: What have you been up to this past year?
James Gosling: Doing a lot less traveling. I'm not really going around preaching the gospel anymore. Folks are pretty converted these days. I hardly give talks at all now. I've almost exclusively switched over to doing long Q&A sessions where I get up on a stage and either people ask me questions directly, or I pull questions out of a fishbowl. For the last year I've been working at Sun Labs and ignoring all the usual corporate 'goo' that comes with working at a company.
I've mostly been working on developer tools. I took a little hacking vacation earlier this year to do a relatively sophisticated Web server. It's just a pile of JavaServer Pages that manage the labs and the corporate document archive. And that has mushroomed features like mad. But my real job has been developing tools. In particular, I'm interested in tools for people who have to write code.
Most developer tools try to shield you from actually writing code in constructing the GUI bits or the database bits. Yet when you do write code you usually get glass teletypes where high tech is keyword coloring. Really high tech gives you a bit of help as you're typing in names, but that's where it ends.
So my work lately has centered on refactoring ideas, where you view a program as an algebraic structure, and you start doing essentially algebraic transformations on the program. That's been a lot of fun.
Bill Venners: You mean algebraic transformations in the context of refactoring your program?
James Gosling: Yeah. One thing my tool will do right now is let you rename a class. Renaming a class at one level is really easy; you just change the name. But how do you change all the references to that class and all the imports? And what about renaming when it includes moving a class from one package to another? I did all this stuff to rederive all the import lists in order to deal with the various naming issues. But it's going even further. This is prototype number three, or maybe number four.
Bill Venners: Will this tool be seen outside of Sun?
James Gosling: Yes. If it gets to the state where it looks like it might have even a vague chance of being interesting. My hope is to throw it over the wall some day.
Bill Venners: In several recent interviews, you claim that the main challenge for programmers is complexity. Can you elaborate?
James Gosling: In some sense, that's what this tool I am building is all about. How do you write a complex application? How do you deal with an application that's a million lines long? How do you even come close to understanding it? How do you make a change to a system like that? How can you cope with it?
Another axis of complexity exists as you lay the application out on a network. One of the things that Java is good at is giving you this homogeneous view of a reality that's usually very heterogeneous. One of the things going gangbusters recently is the cell phone business. The last time I heard a number, it was like 60,000 Java cell phones were being shipped every day. The numbers have become staggering. And that's not the highest volume Java platform. The highest is actually smart cards, and that's a really big number, although I don't know the exact number.
People are building these applications that span the network from edge to center to edge, where you have part of the application in whatever the edge device is -- whether it's desktop or cell phone or PDA -- some bits and pieces in the infrastructure, and some bits and pieces in the back ends and the databases. One person can work from end to end, but how do you manage the complexity of that?
We don't have really good ways to deal with that issue. We just barely have tools that let us look at what one system is doing and map that across an end-to-end architecture. There are people who have these embedded debuggers for dealing with things like J2ME devices. How do you debug code that's inside your Oracle database? How do you look at the whole thing in totality? That's the hard and interesting problem. The systems being built get more and more complicated every day.
I spent a lot of time a couple of years ago on the whole real-time effort. The motivation for that came entirely from the real-time community. The main issue was that programmers are not writing a few thousand lines of assembly code anymore. These systems are getting really huge. And Java has proven to be pretty successful in building large reliable systems.
Bill Venners: I'm glad I asked that question because I thought you might have been talking more about the complexity of large, monolithic programs.
James Gosling: Yes, even isolated things. Some of these isolated applications that sit on one machine are a million lines of code. How do you deal with that? Most people have no way to wrap their head around it. There are all kinds of tools available, from the organizational tools you get from object-oriented methodology, to some of the tools that are based on that, like UML modeling.
But of course the more tools you build to cope with such complexity, the more complex things become. We always strain at the limits of our ability to comprehend the artifacts we construct -- and that's true for software and for skyscrapers.
Bill Venners: Increasing complexity is also being driven by hardware that's getting cheaper and more powerful.
James Gosling: Absolutely. One of my favorite lines is that computers are driven by Moore's Law, which is an exponential process, and human beings are driven by Darwinism.
Bill Venners: And that's a linear process or what?
James Gosling: One way to look at it is as a Monte Carlo process. In Monte Carlo algorithms, the precision of the result doubles as you quadruple the amount of time. So you're getting better on a square root kind of a curve, whereas computers are going on a two-to-the-N kind of curve. Their curvature points in opposite directions. How do people with skulls of a limited size deal with that?
Bill Venners: The opposite of complexity is simplicity. I have often heard you describe your philosophy when designing Java in the early days: you didn't put something in Java unless five people screamed at you and demanded it. In one interview, you told this really good story about moving to a new apartment and something about keeping things in boxes.
James Gosling: That's actually a general principle for life that works really well. When you move to a new apartment, don't unpack. Just sort of move in, and as you need things, pull them out of the boxes. After you've been in the apartment for a couple of months, take the boxes -- don't even open them -- and just leave what's in there and throw them out.
Bill Venners: The 'don't even open them' part is important because it's very hard to throw things away once you know what they are.
James Gosling: Right, because if you open them, you say, 'oh, I can't part with that.'
Bill Venners: So would you say that simplicity is a general philosophy programmers should always have when designing programs?
James Gosling: I think in any kind of design, you must drive for simplicity all the time. If you don't, complexity will nail you. Dealing with complexity is hard enough.
In programming language design, one of the standard problems is that the language grows so complex that nobody can understand it. One of the little experiments I tried was asking people about the rules for unsigned arithmetic in C. It turns out nobody understands how unsigned arithmetic in C works. There are a few obvious things that people understand, but many people don't understand it.
So one of the most important criteria for judging a design for me is the manual. Is the manual out of control, or is it reasonably concise? You can write a pretty decent Java manual in less than 100 pages. The current Java language spec is pretty thick, but that's because it's probably the most detailed language spec ever written. It goes through all of the details. I couldn't write the Java language spec.
Bill Venners: When asked what you might do differently if you could recreate Java, you've said you've wondered what it would be like to have a language that just does delegation.
James Gosling: Yes.
Bill Venners: And we think you mean maybe throwing out class inheritance, just having interface inheritance and composition. Is that what you mean?
James Gosling: In some sense I don't know what I mean because if I knew what I meant, I would do it. There are various places where people have completed delegation-like things. Plenty of books talk about style and say delegation can be a much healthier way to do things. But specific mechanisms for how you would implement that tend to be problematic. Maybe if I was in the right mood, I'd blow away a year and just try to figure out the answer.
Bill Venners: But by delegation, you do mean this object delegating to that object without it being a subclass?
James Gosling: Yes -- without an inheritance hierarchy. Rather than subclassing, just use pure interfaces. It's not so much that class inheritance is particularly bad. It just has problems.
One of the metaphors I like to use for the whole design process is this game you see in arcades, particularly the older ones. If you go to the Santa Cruz Beach Boardwalk, there is this game called Whack-a-Mole. It's a table with 16 holes, and this little mechanical mole sticks his head out of one of the holes for a second. You whack it with a bat. He pops his head up, and you whack him, and he pops up someplace else.
Engineering design is like playing Whack-a-Mole. You have a problem sticking up over there. You go and whack it, and it goes away. But have you really fixed it, or has it just moved somewhere else? It's often hard to tell whether you've solved the problem or moved the problem. And more often than not, when people say they've solved the problem, they've just moved the problem. So one of my big issues with things like delegation, although I feel like there's a right answer in there, is that delegation is going to have its own problems too. But it hasn't been used as extensively, so does it have more problems?
It's almost a truism that you never actually find a perfect answer to a problem. You just find the answer that has the least problems.
Bill Venners: Given that we have both class and interface inheritance in Java, do you have any guidelines you would recommend to people trying to figure out which one of these they should use? When is it appropriate to use class extension, and what is the trade-off versus interface implementation and composition? Or is that too general of a question?
James Gosling: No, it's not too general of a question. I just wish I had some good rules because it always gets kind of vague for me. I personally tend to use inheritance more often than anything else.
Bill Venners: Inheritance meaning class extension?
James Gosling: Class extension. I tend to use classes a lot more than interfaces, and I'm not sure why. I'll use interfaces for things that need to be really abstract and really clean -- something runnable or printable. It's almost like if the class name ends in a-b-l-e, then maybe it ought to be an interface. That tends to be the way I operate, but I suspect that I don't use interfaces as much as one probably should. And delegation is something that I do a lot under the sheets. One of the nice things about delegation styles is that the user of the contract can't tell.
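As a rough illustration of the delegation style discussed above, the following sketch shows one object forwarding to another through a pure interface instead of subclassing. The names Printable, PlainPrinter, and BracketingPrinter are hypothetical, invented for this example; nothing here is taken from Gosling's own code.

```java
// Hypothetical names throughout; a sketch of delegation, not Gosling's design.
interface Printable {
    String print(String text);
}

// A simple implementation of the interface.
final class PlainPrinter implements Printable {
    @Override
    public String print(String text) {
        return text;
    }
}

// Adds behavior by delegating to another Printable rather than extending a class.
final class BracketingPrinter implements Printable {
    private final Printable delegate;

    BracketingPrinter(Printable delegate) {
        this.delegate = delegate;
    }

    @Override
    public String print(String text) {
        // Forward the call, then decorate the result.
        return "[" + delegate.print(text) + "]";
    }
}

class DelegationDemo {
    public static void main(String[] args) {
        Printable p = new BracketingPrinter(new PlainPrinter());
        System.out.println(p.print("hello")); // prints "[hello]"
    }
}
```

A caller holding a Printable cannot tell whether the implementation delegates or inherits, which is the property of delegation styles mentioned in the answer above.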
Bill Venners: You said that for several months you've been working on the document archive and using JSPs. I was wondering if you have any words of advice for people who are using JSPs, servlets, tag libraries, and so on.
James Gosling: Probably the most important thing that whacked me in the face was that with the basic JSP model, you have a Webpage template, and you're filling in the blanks. I found that there were a number of places where that worked reasonably well; but for the most important things -- the most complicated and sophisticated pages -- I really only had one. It's like the whole system revolves around this one Webpage that morphs into just about everything, like the portal page at Excite or Yahoo. The page isn't a template into which you plug little things because everything is plugged together; everything is computed.
It's computed on the outside, and you piece together little fragments. If you try to use JSPs for that kind of a page, you'd find it feels like it's inside out. What you want to do is assemble little fragments by the computing results of database queries, and you have the user's profile to provide page layouts. For more sophisticated pages, JSPs tend to get in your way. Doing a servlet is much simpler. There's a lot more mechanism in a JSP to help you and support you. Servlets tend to be simpler in terms of what they provide you, but that simplicity comes with a huge amount of flexibility. My central Webpage quickly devolved into three or four lines, where almost everything is done in one line that invokes one method, and that one method constructs the whole page.
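The "one method constructs the whole page" approach might look something like the sketch below. It is written in plain Java without the servlet API so that it stands alone; PageBuilder, buildPage, and the parameters are hypothetical stand-ins for a user profile and the results of database queries, not the code of the archive described above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the page is computed by assembling fragments,
// rather than by filling blanks in a template. No servlet API is used here.
class PageBuilder {
    // Builds the whole page from a user name and a list of section titles.
    // A real page would also escape user-supplied text; that is omitted here.
    static String buildPage(String userName, List<String> sections) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>");
        html.append("<h1>Welcome, ").append(userName).append("</h1>");
        for (String section : sections) {
            html.append("<div>").append(section).append("</div>");
        }
        html.append("</body></html>");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildPage("guest", Arrays.asList("News", "Archive")));
    }
}
```

In a servlet, a method of this shape would typically be called once and its result written to the response, which matches the "three or four lines" description.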
Bill Venners: Two years ago, I asked you how the proliferation of network-connected embedded devices was going to change software, and you said, 'you can't sit alone in a room anywhere and write software. It's more of a social thing.' I've recently been involved in the Jini community, defining the process of how standard APIs will be agreed upon. In the very early days of Java, you were the benevolent dictator. And now the Java Community Process is running that show.
James Gosling: Right. These days I'm the fly on the wall.
Bill Venners: From your perspective, as the fly on the wall then, is there anything you could relate to us in the Jini community about process? What has worked in the Java Community Process? What hasn't? What can be improved?
James Gosling: One of the most important things is the notion of a reference implementation. The ink isn't dry on a spec until somebody has built it.
One continuing problem is the tension between having a bunch of engineers in a room designing and the organizations with their political agendas that turn it into this competitive marketing thing that quickly becomes dysfunctional. Generally, if the engineers can just sit down and do their thing, the end result will be higher quality. Then there's this problem that the word 'quality' has a point of view because various industrial concerns tend to look at it and say, 'well, it's not perfect for me.'
Bill Venners: I see. I don't know if it was Ron Goldman and Dick Gabriel, the guys who wrote the first draft of the Jini constitution, or where it came from, but from Jini's beginning there were two houses. One's a commercial house, which represents people who have invested real money in the technology. They want to have a voice in its evolution. And the other house is nerds. They simply care about technology -- they are the designers in a room. For a spec to be approved, it has to pass both houses. It's like an experiment to keep an appropriate balance of commercial and technical concerns. I'm not sure how well it will work.
James Gosling: It's pretty hard, but certainly the computer industry is filled with examples of places where the only people in the room were really the politicians, and the bits of technology that came out were really goofy. It's hard. I wish there was a nice answer, but anytime you have more than one person in the room, politics is everywhere.
Bill Venners: It becomes politics. Even in the room of engineers, you have to deal with personalities.
James Gosling: Yes. That's one of the things they never teach in school, and it ends up being the hardest part of any engineer's job -- the whole interpersonal thing. No matter where you are, you're dealing with people.
Bill Venners: Does an interface imply a contract?
James Gosling: Yeah, an interface certainly does imply a contract, though what one's notion of a contract is is pretty variable. The part of the contract that is embodied in the interface in Java is really about typing and what the parameters are. There are lots of other extensions to that. You could go into things like what Eiffel does with pre- and post-conditions. You can get into some pretty serious behavioral modeling.
One time I had actually done a bunch of stuff to try to figure out how to do that in Java, long before Java was actually launched. And I got hung up on wanting the contract specifications to be ones that are actually analyzable, rather than just a Boolean expression that gets evaluated. I wanted something you could actually do some theorem proving about.
Bill Venners: In other words, look at the formal contract specification and then look at the code and say, yeah, this code does that?
James Gosling: Well, actually doing that all the way is--that's the fantasy of the folks in the formal verification field. And by and large, those people have been butting their heads against the wall so hard for so long that they have given up on the ultimate fantasy. But there are things on the way there that feel pretty valuable. I never got happy with doing something like that, and then we had to ship something.
Bill Venners: So the contract for Java classes and interfaces ended up being a human language description.
James Gosling: It's pretty much text, yeah.
Bill Venners: So as a programmer, I have to read that text and understand it. If I implement that contract myself, I have to make sure I do it correctly.
James Gosling: Right.
Bill Venners: And then if someone writes to that interface, they have to make sure they are not making any assumptions that are beyond the scope of the contract.
James Gosling: Right, right. The Javadoc stuff I put into the original compiler in the very beginning. It's funny, you talk to tech writers about Javadoc and they say, "it's so horrible." A good tech writer can always do a better job than Javadoc can. And the answer is yes, that's absolutely true. A good tech writer can organize things and structure things. But having the text of the spec associated with the code deals with all kinds of synchronization issues. When you update the code, the fact that the spec is sitting there in front of you kind of forces you to be aware of the spec as you're editing. And heaven forbid, you might actually decide to conform to it, or edit it if it's changing.
That is one of the central problems with all of the mechanisms that people put together for doing interface contracts. How do you give people the discipline to actually maintain them? How do you have any idea that they are correct?
Bill Venners: That the contracts are correct?
James Gosling: That the contracts are correct. And in particular, how do you express all kinds of really subtle things? Just a Boolean equation that says, "stack depth is greater than zero," doesn't help much when you get to more complex APIs. And so English language text in many senses is woefully inadequate, but it does have the advantage of being clear, being in your face, having some hope of being updated regularly, and being very flexible.
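To make the split concrete, here is a minimal sketch of an interface whose types are checked by the compiler while the behavioral contract sits in Javadoc text, with the "stack depth is greater than zero" condition checked at run time. IntStack and ArrayIntStack are hypothetical names invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical interface: the compiler checks the types,
// while the behavioral contract lives in the Javadoc text.
interface IntStack {
    /** Pushes a value. Afterwards, depth() is one greater than before. */
    void push(int value);

    /**
     * Removes and returns the most recently pushed value.
     * Precondition: depth() is greater than zero.
     * @throws IllegalStateException if the stack is empty
     */
    int pop();

    /** Returns the number of values currently on the stack. */
    int depth();
}

final class ArrayIntStack implements IntStack {
    private final Deque<Integer> values = new ArrayDeque<>();

    @Override
    public void push(int value) {
        values.push(value);
    }

    @Override
    public int pop() {
        // The Boolean part of the contract, checked at run time.
        if (values.isEmpty()) {
            throw new IllegalStateException("stack depth must be greater than zero");
        }
        return values.pop();
    }

    @Override
    public int depth() {
        return values.size();
    }
}
```

The Boolean check covers only the precondition; the last-in, first-out ordering is stated only in prose, which is the kind of subtlety the answer above is pointing at.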
Bill Venners: Last year when I interviewed you, you said, "by golly, you have to fulfill the contract." Certainly if the next version of Java came out and they changed the meanings of all the methods in class String, it would break everybody's code. You're not supposed to ever break those contracts. The idea of a contract is: it's a contract. But are there situations in which I can sometimes bend a contract? Such as if nobody is actually using something, or it doesn't hurt too many people?
James Gosling: This is messy, right? If you are writing a bunch of code and the API is not exposed to people in the outside world, if you have this contained universe where you can actually see all the bits and pieces that reference something, then you're actually in a really great situation. Because you can do the kind of thing you just talked about. You can say, "This guy was never getting used."
One of the nice things about Java is that, at least for the language parts, people were using Java for five years before we released it. And it went through a lot of really big changes, because although there was a developer community of about 20 people, I could find every piece of source code in the world. And that let me do experiments to find things out.
One of the things I did was there were some issues with goto. Java had a goto at one point. I did this study of what people were doing with goto. And based on that study of a half million lines of code, I just got rid of it.
At one point I was doing a revision to the set of bytecodes. So I found every constant and figured out what integers people use commonly, what floating point numbers people use. When you have a controlled universe, a contained universe, that is such a luxury. And that's why I have this tendency to not release things real soon. Because as soon as you have users, all of a sudden, you have this problem.
In your question, you asked, isn't it okay to change things if nobody is using it? That's fine. You should be able to change things if nobody's using it. The hard part is, how do you know if nobody's using it? And how do you evaluate the consequences of that change? And once your user community becomes this kind of unknowable group out there, then you have published this contract. You don't actually know which are the parts you get to dicker with and which are the parts you don't. You know what you've promised the world, and by God you've got to live by it.
Bill Venners: Bertrand Meyer, in his Object-Oriented Software Construction book, talks about his philosophy of contract. If I'm a method and I say in my documentation -- or my preconditions, in the case of Eiffel -- that you have to pass me a positive int, then if you pass me a negative number, Meyer says it's okay if I give you a wrong answer back. Because you have broken the contract. What he wanted to avoid is you checking to make sure you're passing a positive number and me checking to make sure you're passing a positive number.
My take on Java's philosophy on that, however, is no, I should check for a negative value and throw you back an exception, so it's predictable what I do if you break the contract.
James Gosling: Yes, and actually what I think the correct answer is--and this is what I was trying to do -- was to make the contract evaluatable as much as possible at compile time. So that, where possible, you could actually say, "the value you're passing me is going to be negative or could be negative. Bad, bad, bad." Because when a contract violation is indicated by passing an exception, you don't actually see that until the code is actually running, which gives you this testing problem.
Whenever you're testing something, you have to make sure you actually exercise all these bits and pieces. And the world is filled with hunks of code that were written to handle some exceptional situation, and they have never actually been tested.
It may be that if you have a parameter check, "Is this a negative number?" it almost never actually happens that it's negative. It's the Ariane 5 problem. Remember the Ariane 5 failure?
Bill Venners: You have a link to that off your webpage.
James Gosling: Yeah. It like blew up. And that was because they had taken this one piece of software with a parameter in it that had to do with the trajectory. It was built for the Ariane 4, but the Ariane 5 had bigger engines, and the rocket was getting into the flat part of its trajectory while it was still accelerating. So numbers were coming out of range.
I believe it is completely beyond the state of the art to try to do that kind of analysis statically, that particular one, but there are all kinds of analyses that you could actually do. You don't declare failure if you can't do everything statically, but everything that you can push into the static analysis phase of the system to get earlier and earlier is yet another source of reliability to the system.
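The run-time alternative raised in the question, checking the argument and throwing an exception, can be sketched as follows. Contracts and integerSqrt are hypothetical names chosen for this example; the precondition "n must not be negative" is likewise an assumption made for illustration.

```java
// Hypothetical method: the documented precondition is "n must not be negative".
class Contracts {
    /**
     * Returns the integer square root of n (the largest r with r * r <= n).
     * @throws IllegalArgumentException if n is negative
     */
    static int integerSqrt(int n) {
        // Run-time contract check: a violation is reported predictably,
        // but only when this line actually executes.
        if (n < 0) {
            throw new IllegalArgumentException("n must not be negative: " + n);
        }
        return (int) Math.sqrt(n);
    }
}
```

Because the check fires only when the code runs, a test has to pass a negative value deliberately to exercise that branch, which is the testing problem described above.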
Bill Venners: Interesting. Given what we have, then -- we have Java as it is -- what should we do to prevent our rockets from exploding? Should I, at the beginning of my method, just make sure you pass me good data and throw an exception? In either case we have a problem, especially if this is controlling a rocket, whether I give you a wrong result or throw an exception. Or let me ask it this way: Bertrand Meyer says an exception indicates a broken contract. What do you think of that?
James Gosling: I guess in the Java world, exceptions can mean lots of different things. Some of them are broken contracts. In the world of Throwable things that can be tossed around by the exception mechanism, there are two classes. There are the ones for which the checking algebra happens. I think those are predominantly the contract violations. But, for example, in the file I/O system, when you go to open a file and the file isn't there, you get an IOException. In that case, it seems to me that the exception is a part of the contract and you can actually handle it.
There are things that can go wrong for which it's actually reasonable to expect the application to be able to handle. And even outside of that, one of the things that people do a lot in Java if they are building a command dispatch loop, and it's invoking things that are plugged in, is wrap that call with a try catch Throwable. That will catch all kinds of errors, even the ones that are contract violations. That can be a really great way to increase the reliability of the system, because it gives you a way of saying, "Okay, this piece of the system just died horribly because it had a contract violation, but damn it, I've got to carry on."
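A command dispatch loop of the kind described might be sketched like this. Dispatcher and runAll are hypothetical names, and plain Runnable objects stand in for whatever plug-in interface a real system would use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical dispatch loop: each plugged-in command runs inside
// try/catch Throwable so one failure does not stop the loop.
class Dispatcher {
    // Runs every command and returns one status line per command.
    static List<String> runAll(List<Runnable> commands) {
        List<String> log = new ArrayList<>();
        for (Runnable command : commands) {
            try {
                command.run();
                log.add("ok");
            } catch (Throwable t) {
                // Record the failure and carry on with the next command.
                log.add("failed: " + t.getClass().getSimpleName());
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<Runnable> commands = Arrays.asList(
            () -> { },
            () -> { throw new IllegalArgumentException("bad input"); },
            () -> { }
        );
        System.out.println(runAll(commands)); // prints "[ok, failed: IllegalArgumentException, ok]"
    }
}
```

Catching Throwable also catches serious errors such as OutOfMemoryError, so whether carrying on is the right response remains a judgment call for each system.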
So, in the Ariane 5 case, the problem wasn't actually that they had this range problem, because the range problem was only in this diagnostic system. The problem was that when the out of range value happened (and the out of range value happened on all of the computers, because they all had the same code) then they dumped ASCII -- I don't know if it was a stack trace or what -- out on this main serial line. And that serial line happened to go to the gimbals that controlled the motors. And so, instead of getting pointing angles for the engines, it was getting a stack trace and the gimbals went erh! erh! The engine went erh! And the thing just tore itself apart.
Bill Venners: That makes sense. It's a good justification for throwing an exception, because then I can catch Throwable. If you just give me the wrong answer, I'm not going
James Gosling: Right. And in systems that have to be long lived and reliable, they have to have a comprehensive strategy for dealing with failure, because failure always happens. There will always be bugs, there will always be pieces of equipment that get smacked. There will always be alpha particles that hit busses. Things go weird and the average answer, which is to just roll over and die, is not a useful one. And particularly as you get into the systems like flight avionics, you just don't get to crash.
Bill Venners: You don't get to crash.
James Gosling: From the software's point of view, you have to keep on going, you have to do something sensible. You can't just say, "Oops, no, I'm not going to work anymore."
Bill Venners: In 1.4, assertions are due to come into Java.
James Gosling: Yeah.
Bill Venners: I'm curious when you would use assertions versus when you would use exceptions.
James Gosling: Assertions are just a syntax for generating exceptions. When an assertion fails, an exception gets tossed. So, an assertion in Java is essentially: if this funny condition isn't true, then throw an exception. At its heart, that's what an assertion is.
Standard assert macros in C and C++ have all done basically that. So the assert statement is not actually adding anything really new. What it's adding is a somewhat easier notation, and there's some other stuff under the sheets for enabling and disabling them.
Bill Venners: If the main purpose of assertions is to streamline the process of me checking for things and throwing exceptions, should I remove that at run time?
James Gosling: One of the nice things about the way Java works is that in general people ship around Java class files and the compilation phase happens just in time. These JIT compilers are all over the place. One of the really nice things you can do with that, which the assertion facility uses very heavily, is the code you ship around has the assertions in it. Assertions look like: if my gnarly test fails, then throw the exception. It gets turned into roughly: if assertions are enabled and my test fails, throw the exception.
And the first clause in there, the "if assertions are enabled," is generated in such a way that most just-in-time compilers will go, "Oh, that is statically false, so false and anything is false, so I don't have to evaluate the assertion condition, and since the if is false, the then clause can never be executed." So the whole thing just drops out.
You can deliver your code with all the assertion tracking enabled in it, then, and at run time when you go to launch the app, you can say, "Turn off assertions."
Bill Venners: To improve performance?
James Gosling: To improve performance. And one of the great beauties of that is you can deliver one library to people rather than a debug library and a regular library. And they can decide.
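The "if assertions are enabled and my test fails" shape can be sketched by hand next to the assert statement itself. The flag CHECKS_ENABLED below is hypothetical and exists only to show the structure; the real switch is set per class by the JVM at launch, and a compiler-generated flag would be a constant so that a JIT compiler can fold the test away.

```java
// Sketch of what an assert statement roughly amounts to. The flag name
// CHECKS_ENABLED is hypothetical; the real mechanism is controlled by the
// JVM's -ea / -da switches, not by a field like this.
class AssertionSketch {
    static boolean CHECKS_ENABLED = true;

    // Using the assert statement (Java 1.4 and later).
    static int halveWithAssert(int n) {
        assert n % 2 == 0 : "n must be even: " + n;
        return n / 2;
    }

    // Roughly equivalent hand-written form: the enabled flag is tested first,
    // so when it is false the condition is never evaluated.
    static int halveByHand(int n) {
        if (CHECKS_ENABLED && !(n % 2 == 0)) {
            throw new AssertionError("n must be even: " + n);
        }
        return n / 2;
    }

    // Common idiom for detecting whether assertions are enabled in this class:
    // the assignment inside the assert only runs when they are.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true;
        return enabled;
    }
}
```

Assertions are disabled by default; launching with java -ea turns them on and java -da turns them off, so the same class files serve as both the debug and the regular build.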
Bill Venners: In an interview I read, you were asked what's the most common mistake Java programmers make, and you replied, "creating too many little objects." I wanted to get more information about what is too little and what is too often instantiated for an object. If I am making this big API or writing a program, to what extent should I worry about how big my objects are and how many of them I'm creating? And isn't that a performance versus ease of use tradeoff?
James Gosling: Often systems that have lots of little objects are easier to understand, easier to maintain, a lot cleaner. That's often how people get there is they have taken courses on building nicely structured systems. They do that, and the system is nicely structured. But then a little later, you do some performance benchmarks and almost always, there is not a problem. But every now and then, you find that holy mackerel, I went to do something really simple and I'm 27 levels deep in methods. There was an app I was looking at the other day where the guy was building this data structure and the way he did it, he eventually doubled the storage consumption.
Bill Venners: How about immutables? When should I use immutables versus non-immutables?
James Gosling: I would use an immutable whenever I can.
Bill Venners: Whenever you can, why?
James Gosling: From a strategic point of view, they tend to more often be trouble free. And there are usually things you can do with immutables that you can't do with mutable things, such as cache the result. If you pass a string to a file open method, or if you pass a string to a constructor for a label in a user interface, in some APIs (like in lots of the Windows APIs) you pass in an array of characters. The receiver of that object really has to copy it, because they don't know anything about the storage lifetime of it. And they don't know what's happening to the object, whether it is being changed under their feet.
You end up getting almost forced to replicate the object because you don't know whether or not you get to own it. And one of the nice things about immutable objects is that the answer is, "Yeah, of course you do." Because the question of ownership, who has the right to change it, doesn't exist.
One of the things that forced Strings to be immutable was security. You have a file open method. You pass a String to it. And then it's doing all kinds of authentication checks before it gets around to doing the OS call. If you manage to do something that effectively mutates the String, after the security check and before the OS call, then boom, you're in. But Strings are immutable, so that kind of attack doesn't work. That precise example is what really demanded that Strings be immutable.
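The check-then-use attack described above can be sketched in a few lines, added here for illustration. Everything in it is hypothetical: `ImmutablePathDemo`, `isAllowed`, `openUnsafe`, and `openSafe` are invented names, the "/tmp/" policy is an assumption, no real file is opened, and the `Runnable` merely stands in for another thread racing between the check and the use.

```java
final class ImmutablePathDemo {
    // Hypothetical policy (an assumption for this sketch): only /tmp/ is allowed.
    static boolean isAllowed(CharSequence path) {
        return path.toString().startsWith("/tmp/");
    }

    // Unsafe: the check and the use both read the same mutable buffer.
    static String openUnsafe(StringBuilder path, Runnable betweenCheckAndUse) {
        if (!isAllowed(path)) throw new SecurityException("denied");
        betweenCheckAndUse.run();   // stands in for a racing thread
        return path.toString();     // the pretend OS call sees the current contents
    }

    // Safe: a String cannot change, so the value checked is the value used.
    static String openSafe(String path, Runnable betweenCheckAndUse) {
        if (!isAllowed(path)) throw new SecurityException("denied");
        betweenCheckAndUse.run();
        return path;
    }

    public static void main(String[] args) {
        StringBuilder buf = new StringBuilder("/tmp/harmless");
        String unsafe = openUnsafe(buf, () -> buf.replace(0, buf.length(), "/etc/passwd"));
        System.out.println("mutable buffer opened:   " + unsafe);

        StringBuilder buf2 = new StringBuilder("/tmp/harmless");
        String safe = openSafe(buf2.toString(), () -> buf2.replace(0, buf2.length(), "/etc/passwd"));
        System.out.println("immutable String opened: " + safe);
    }
}
```

With the mutable buffer the pretend open ends up on a path that was never checked; with the String the swap has no effect on what is opened.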
Bill Venners: How about the wrapper types? I was asked a couple of weeks ago, "Why are wrapper types immutable?" I wasn't exactly sure what to say.
James Gosling: Same answer.
Bill Venners: Because you can, and it helps with caching?
James Gosling: Yes.
Bill Venners: Okay. The tradeoff seems to be that sometimes I may end up creating a lot more little objects.
James Gosling: You may. One of the proposals that keeps surfacing -- I actually wrote up a proposal for this four or five years ago. It's something that happened after Java got released, and so it gets really hard. And this one has really resurfaced with a vengeance lately: this notion of strengthening the support for immutables.
Another thing about immutable objects: if you have a class that's final and whose fields are final, except for one nasty problem, the optimizers can do really cool things with them, because they don't necessarily have to allocate them in the heap. They can have pure stack lifetime. You can copy them at will. You can replicate them at will, which is what happens with primitives.
That's one of the reasons that primitives are not objects, because it is so nice to be able to just replicate them. When you pass an integer to a method, you don't have to pass the pointer to that integer. You can just replicate that integer and push it on the stack. So, it's a different integer. The same value, but a different integer and you can't actually tell. And look at a complex number class in C++ versus complex numbers in Fortran: in Fortran, they do all kinds of goofy things allocating complex numbers to registers, which really doesn't work in C++. And that mostly has to do with the fact that in C++, they are still objects and they have an identity. It's this whole platonic thing about identity. The nit that causes problems with optimizers and immutable objects is that as soon as you have a notion of identity that is independent of the value, then various things get really hard.
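The difference between value and identity can be made concrete with a small sketch, added for illustration. The `Complex` class below is a hypothetical immutable value class, not a standard library type. Two `int` copies cannot be told apart, while two equal `Complex` objects can still be distinguished with `==`, and that observable identity is what constrains how freely an optimizer may copy them.

```java
import java.util.Objects;

final class IdentityDemo {
    // Hypothetical immutable value class: final class, final fields.
    static final class Complex {
        final double re;
        final double im;

        Complex(double re, double im) {
            this.re = re;
            this.im = im;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Complex)) return false;
            Complex c = (Complex) o;
            return Double.compare(re, c.re) == 0 && Double.compare(im, c.im) == 0;
        }

        @Override
        public int hashCode() {
            return Objects.hash(re, im);
        }
    }

    static boolean sameValue(Complex a, Complex b) {
        return a.equals(b);   // compares the numbers
    }

    static boolean sameIdentity(Complex a, Complex b) {
        return a == b;        // compares the references
    }

    public static void main(String[] args) {
        int x = 42;
        int y = x;            // a copy; nothing can distinguish it from x
        System.out.println("ints equal:     " + (x == y));

        Complex a = new Complex(1.0, 2.0);
        Complex b = new Complex(1.0, 2.0);
        System.out.println("same value:     " + sameValue(a, b));
        System.out.println("same identity:  " + sameIdentity(a, b));
    }
}
```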
Bill Venners: Why are there primitive types in Java? Why wasn't everything just an object?
James Gosling: Totally an efficiency thing. There are all kinds of people who have built systems where ints and that are all objects. There are a variety of ways to do that, and all of them have some pretty serious problems. Some of them are just slow, because they allocate memory for everything. Some of them try to do objects where sometimes they are objects, sometimes they are not (which is what the standard LISP system did), and then things get really weird. It kind of works, but it's strange.
Just make it such that there are primitives and objects, and they're just different, and you solve a whole lot of problems.
Bill Venners: Earlier you said "the whole platonic thing about identity." What does Plato have to do with identity?
James Gosling: The notion of identity is something that the philosophers have argued about for a gazillion years. One of Plato's things, I don't remember the names, but the basic story is you have Bob the fisherman. He's out there fishing. His boat springs a leak. He goes to see his friend Fred, who fixes boats. Fred pulls a plank off of Bob's boat and puts a new plank in. Bob's now happy and he goes away. His fishing boat is now fine.
A month later, the same thing happens again. He goes to his friend and after several years of doing this, every last plank in Bob's boat has been replaced by Fred. And so, now you have a whole new boat. Is this still the same boat? Then the trick to it is, unbeknownst to Bob, Fred has been saving all the little bits of wood that he has taken off of Bob's boat. And he's put them back together again.
Bill Venners: And he wants to swap.
James Gosling: So, which is Bob's boat? That with different names is one of Plato's things.
Bill Venners: Another question that I was recently asked is, "Why are there no static methods or non-public methods in interfaces?"
James Gosling: There almost were. At least, they almost were static methods. And non-public probably would make sense. I actually don't think there is a strong rationale now. At one time, there was kind of a delirious period, at least for the requirement to do public.
Static is kind of different in that one of the things I was trying to get to was that interfaces were sort of purist specification, no behavior. And it felt to me like there was a sort of a cleanliness to saying, "interface purely." And actually, that has often worked out pretty well.
There is still part of me that says, maybe interfaces should never have existed. People should just use classes for interfaces. But there turned out to be some nice things that get done with interfaces that are different. There's an interesting performance difference that most people never think about, which is that interfaces need to do a kind of a dynamic dispatch, whereas strict classes don't. The class model in Java is really rather reactionary. It's almost exactly the original class model from Simula 67. And one of the nice things about that model is you can make method dispatch just fiendishly fast.
There are all kinds of tricks for doing interface-style dispatches, flexible multiple-inherited dispatches, pretty fast. But they are always a couple of instructions longer at least and maybe more, depending on how you do it. Although there are techniques that trade it off against memory. You can actually get it to the same performance as single inheritance, if you are willing to basically spend more RAM building tables. But one of the unsung nice things about the difference between classes and interfaces is that it is statically knowable whether you can do dynamic versus static dispatching.
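The "statically knowable" point shows up directly in compiled code; the following sketch is added for illustration. `Shape`, `Square`, and `DispatchDemo` are hypothetical names. The same `area()` call compiles to a different bytecode instruction depending only on the declared type of the reference, which can be inspected with `javap -c DispatchDemo`. How much faster one form is than the other depends on the JVM and is not measured here.

```java
interface Shape {
    double area();
}

class Square implements Shape {
    final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class DispatchDemo {
    // Receiver is declared as a class type: javac emits invokevirtual.
    static double viaClass(Square s) {
        return s.area();
    }

    // Receiver is declared as an interface type: javac emits invokeinterface.
    static double viaInterface(Shape s) {
        return s.area();
    }

    public static void main(String[] args) {
        Square sq = new Square(3.0);
        System.out.println(viaClass(sq));
        System.out.println(viaInterface(sq));
    }
}
```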
Bill Venners: You were talking earlier about complexity being the major challenge for programmers. The main other thing I think of as being a major challenge for programmers is dealing with the uncertain future and the changes it will require.
James Gosling: I guess one of my deals is that the word complexity covers a whole lot of ground for me, including the complexity of dealing with the future. The future becomes this great unknowable. One other thing we talked about earlier was the luxurious situation that some people are in when they are building an application, and they are the only people who will see the code. They are running it for themselves and that's it. When it comes to dealing with the future, those folks have a better situation. They can speculate the directions in which they might want to go. So they build the foundations for things they might do, given that they have the guts of the thing open right now.
But where life gets really hard is for people who are building APIs for public consumption, which is most of what happens in the Java software organization. All of the people who are doing APIs in the JCP, APIs like Swing. One of the reasons that Swing is so incredibly complex is that it has to serve 3 or 4 million masters. There are all these folks doing all kinds of different things, and when you are coming up with the API, if you come up with some speculation of the form, "Gee, I think somebody might want to do this," you're probably right. Even for arbitrarily bizarre values of "this." And it can be really difficult to keep things small when the domain that it's going to be applied to is so very large.
Bill Venners: One thing you seemed to really care a lot about in Java was making it possible to make changes once you have frozen a particular API in stone. You've released it. You've thrown it over the wall. Now you really can't change it in a way that will break those peoples' code, but there are a bunch of changes you can make that are binary compatible.
James Gosling: That are binary compatible, right. I was searching for everything I could find, anything I could come up with a decent way to implement, that would allow you to evolve and still maintain binary compatibility.
Probably the central thing in that is in the virtual machine spec. The central thing is really about dynamic binding rather than static binding. The whole notion of having a just-in-time compiler has a whole lot of different aspects to it. One of them is this issue that if you look at a Java binary, it doesn't have fixed offsets for fields. It doesn't have slot numbers for virtual function tables. It's got symbolic references. And they get bound essentially as late as possible. And that gives you a terrific amount of flexibility. Some things seem like they ought to be obvious, that you ought to be able to do them, like adding private members to classes. But in something that does static determination of offsets of fields, you can't do that. You have to do all of that stuff dynamically. It really makes the issue of evolution work a lot better.
Bill Venners: To go back to the subject of complexity, I think there's an algorithm for dealing with complexity that in general people use and in software in particular it's used. If you have this massive complex thing, you chop it up in pieces to start with.
James Gosling: Right. Divide and conquer. It's been with us for years.
Bill Venners: And each piece is then an understandable amount of complexity. And that's one of the ways I think about what a class or an object should encapsulate. It should encapsulate an amount of complexity that if there is someone working on the guts, working on the implementation, they can understand it.
James Gosling: Right.
Bill Venners: And then you have interfaces to those parts. And in those interfaces, you raise the level of abstraction. You abstract away some of the details.
James Gosling: Right.
Bill Venners: For example, if someone peels off the back of my television set, they could focus on and understand the guts of the TV. They don't have to think about and understand the entire television broadcast system all at once.
James Gosling: Right.
Bill Venners: But when I use my TV, I just push on. And maybe when I push on, 30 things happen inside the TV. And maybe over on this other TV when I push on, 50 things happen inside it that are different. I don't have to know all that. I just think in more abstract terms of on - off. The external interface to the TV abstracts away much of the details of its implementation. So, if you think about pulling all these pieces together again, I can understand this complex whole in terms of the simpler, more abstract interfaces to its parts.
James Gosling: Right. It's the pair of intellectual tools that people have used since the dawn of time. One is the divide and conquer, splitting things up in pieces, so they are individually understandable. And then there is hierarchical composition, where you take these things and you wrap them and put them together in these more and more abstract concepts.
There are often flaws in that, though. There's a fancy word in here that I'm forgetting. But there is often this view that with large complex systems, you can deconstruct them, decompose them into their parts and then analyze them individually. In biology, that kind of view is one of the things that led to a lot of major mistakes. Because you have to take this ecological view, of viewing the entire system. Because if you just look at a leaf or a frog, it's almost impossible to understand that leaf in isolation. You have to understand the whole ecology surrounding the leaf.
And so, there's this dilemma. You really do need to split things down into pieces, but when you split them down into pieces, you have to understand that they don't make any sense in isolation. You can't forget the ecology within which they live.
And so this business of on the one hand decomposition, dividing and conquering and hierarchical composition, they actually go hand in hand because hierarchical composition, putting things together, is really the only way that we see large systems of things, because it lets us encapsulate their behavior. And in order to understand a little piece, we have to understand the rest of it and so, you may end up with this funny way of working where you never forget the whole system, but you might have some parts that you're thinking of decomposed all the way down to the iron. Other parts are conceptually more aggregated, so that you don't lose the perspective of how this thing fits into the world.
A certain amount of that manifests itself in the way that a lot of designs become iterative. You do one design, you see how it all comes together and you say, nyeah, I don't think so and you go back again. Because in an awful lot of systems, they might be these big hairy systems, but there's a nut in the middle, around which it all kind of rotates. There's one little technological detail that kind of makes it all work. 99% of the code you ever write is kind of do-whatever-you-want code, but there's something in the middle that's really crucial. You just have to get that one algorithm, that one data structure, absolutely right. And often you find that these things pervade entire systems.
Bill Venners: What things pervade? That little kernel of importance?
James Gosling: That one little kernel of importance. If you're building a text editor it's often either like the document representation or the redisplay algorithm. And the rest of the work rotates around that.
Bill Venners: Looking at the complexity of distributed systems, then, each process on the network is like one piece of this whole system. And the interface between those pieces has traditionally been a protocol. One of the things Jini does is use mobile objects to be a nexus between the client and the service. And that lets you raise the level of abstraction between the client and service. Now, when we define a protocol as our interface, there is some handshaking going on and then there's just data structures. And nowadays, XML is all the rage for structuring data. But in my protocol spec I have to talk pretty much about all the details of that data structure. The client-service contract is in terms of information. Whereas in an API, I'm talking about code, and the contract is more in terms of behavior.
James Gosling: Right.
Bill Venners: And the thing I think I can do with behavior is I can be more vague. I don't have to say how, just what.
James Gosling: Right. Almost always, the less you can say, the better off you are.
Bill Venners: Because?
James Gosling: Because every commitment you make is a piece of flexibility you've lost.
There is this duality between network protocols and interfaces. They are both kind of the same thing. They are a way for these two parties to communicate. When you deal at the level of a network protocol, you don't have to specify what the program interfaces on the other side look like. So that gives you the freedom to write a C API here and a Pascal API there and an Ada API there, and they can be completely different. They just have to arrange the bits on the wire exactly the same.
And then the dual of that is when what you specify is the API. And you say, this is what it is, and how things get between here and there is, who knows? And that gives you a tremendous flexibility to have different ways of moving things back and forth. So for instance with RMI, one of the reasons that it's so popular is you can play all kinds of games in that space in between. One of the things you can do is, if the two sides happen to be on the same box or in the same address space, all the stuff in the middle collapses out and it's just a direct call. Whereas, in these other systems, often you'll find that you're encoding and decoding even though you're not moving the bits anymore.
But you can also play games with how you encode it. In RMI once upon a time you could put in all kinds of different encoders. You could put in a binary wire encoder. It was actually possible to write an XML encoder or an IIOP encoder. People don't do that much, but the ability to drop in your own encoder is basically there.
You get independence of what the actual transport mechanism is. And to a certain extent, you can blend these. Because if you are doing the interface version, you can decide to target a particular binary encoding or not. So, you can get hybrid ways of building things. Nobody actually does that, although the RMI over IIOP stuff is kind of in that space.
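The idea of fixing the API and leaving the transport open can be sketched as follows, added for illustration. This is not RMI: `Greeter`, `LocalGreeter`, and `WireGreeter` are hypothetical names, and the "wire" is just a round trip through a UTF-8 byte array in the same process. The point is only that the caller is written against the interface and cannot tell whether the call was direct or went through an encode/decode step.

```java
import java.nio.charset.StandardCharsets;

// The contract: behavior only, nothing about how bytes move.
interface Greeter {
    String greet(String name);
}

// Same address space: the call is just a direct call.
final class LocalGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Hypothetical stand-in for a stub: encodes the argument, decodes it on the
// "far side", then delegates. A different encoder could be dropped in here.
final class WireGreeter implements Greeter {
    private final Greeter target;
    int bytesEncoded = 0;

    WireGreeter(Greeter target) {
        this.target = target;
    }

    @Override
    public String greet(String name) {
        byte[] request = name.getBytes(StandardCharsets.UTF_8);
        bytesEncoded += request.length;
        String decoded = new String(request, StandardCharsets.UTF_8);
        return target.greet(decoded);
    }
}

final class TransportDemo {
    // Client code depends only on the interface.
    static String run(Greeter g) {
        return g.greet("Duke");
    }

    public static void main(String[] args) {
        System.out.println(run(new LocalGreeter()));
        System.out.println(run(new WireGreeter(new LocalGreeter())));
    }
}
```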
Bill Venners: Or for example, a Jini service proxy isn't necessarily doing remote procedure calls, so it could actually implement the service locally. Or it could talk to something across the network using any protocol. There are all these web service things that will probably be coming up in the next few years. You could wrap one of those in an object that has an interface. And when a client talked to that object, it would be using that XML-based protocol.
James Gosling: Right. So that gives you pretty tremendous flexibility.
Bill Venners: Another thing you mentioned when talking about complexity was how do I understand the behavior of a large distributed system end to end when it's very complex. And I think it's worse than that, in that often the whole system doesn't exist. Because there are pieces that haven't even been imagined yet or created yet, that will eventually hook into this.
James Gosling: Absolutely.
Bill Venners: So the whole system can't be tested together because parts of it don't exist yet. And another reason large systems can't be tested as a unit is that the parts that do exist can simply be too vast. The test matrix is too big. Too many clients and too many servers that want to talk to each other. So, one of the tools that I think Java itself uses and I think will be important in getting things to work together that can't be tested together is testing against the contract. Testing each piece against its contract.
James Gosling: Right. And that's fine as far as it goes, the problem being that you probably don't actually know what the contract is in some sense. There is no such thing as a complete contract. You can write more and more, but you'll always find that there is some little thing that wasn't really expressed in the contract, yet it is an artifact of the way one system behaves that some other piece depends on.
A standard thing that nobody ever puts in their contract is stuff about timing. You have a container class of some sort. It just happens to be that this container that we use over here does insertions really really quickly. Something else that might be totally compatible with it might have been optimized so you can do lookups really quickly, but insertions are kind of slow. And a lot of the time, that won't matter at all. Some of the times, it will be a killer.
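A familiar instance of this, sketched for illustration, is `java.util.List`: `ArrayList` and `LinkedList` both satisfy the same interface and produce the same results, but inserting at the front shifts every element in an `ArrayList` and only relinks a node in a `LinkedList`, while indexed lookup favors the `ArrayList`. `ContractTimingDemo` and `fillAtFront` are hypothetical names, and the timings printed will vary by machine and JVM, so none are shown.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ContractTimingDemo {
    // Written purely against the List contract; says nothing about speed.
    static List<Integer> fillAtFront(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(0, i);   // front insertion
        }
        return list;
    }

    public static void main(String[] args) {
        int n = 50_000;

        long t0 = System.nanoTime();
        List<Integer> array = fillAtFront(new ArrayList<>(), n);
        long t1 = System.nanoTime();
        List<Integer> linked = fillAtFront(new LinkedList<>(), n);
        long t2 = System.nanoTime();

        // Same observable contents, possibly very different cost.
        System.out.println("same contents: " + array.equals(linked));
        System.out.println("ArrayList  ms: " + (t1 - t0) / 1_000_000);
        System.out.println("LinkedList ms: " + (t2 - t1) / 1_000_000);
    }
}
```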
Bill Venners: Yeah. You said there will always be failure in the code. In the contracts, there will always be some ambiguity. There will be some question that I have when I am working on my application, and I go look at the spec and it's not addressed. So I don't know what the answer is. So I have to guess. Someone else on the other side of the world may have the same question, and make a different guess. And now our stuff won't work together.
James Gosling: Right. Big problem. And in some sense, that is completely unavoidable, but you do what you can to make that happen less and less. I wish there were such a thing as a perfect answer to that question. There is this whole thing about mathematical specification of behavior and that ran into all kinds of troubles. One of its problems was that it didn't actually solve the problem, it moved the problem.
Bill Venners: Because now you have to figure out what the semantics of the mathematical formal specifications are?
James Gosling: Right. So, it doesn't actually provide an actual accurate specification. What it provides you is an alternate specification written in a different way. At least you can cross check the two and see if they make sense, but in no way is it necessarily complete.
Bill Venners: And that's how I think of test suites, as an alternate form of specification. I could go out and buy the Java virtual machine spec and look at the Javadoc, and that's a specification for the Java Platform in a human language. And I can go code it up. But I can't call it Java unless it passes this huge battery of tests.
And I think the combination of having a spec as well defined as people can make it -- which is never going to be 100% -- plus the tests, gets you, not 100%, but pretty far along the way. I write a servlet on Windows and I deploy it on Linux and I deploy it on Solaris, and it works.
James Gosling: Yeah, it works. It's actually pretty amazing.
Bill Venners: And my servlet and those Java Platform implementations were never tested together until I ran my servlet, but all those platform vendors ran their implementations through these tests against the platform contract.
James Gosling: Right. And that actually is one of the big sticking points that starts arguments between various folks. There are people who care about interoperability and people who don't. And by and large, we actually care. So we are pretty particular that if somebody wants to call something Java, by God it ought to actually run Java programs.
Bill Venners: I think there's a third category of people. Those who care about inoperability.
James Gosling: Yeah, well. There's this... Let's not go there.
Bill Venners: Okay. New subject. We've talked about complexity and performance and change. Another thing that matters, to some extent, is the productivity of the programmer. I was wondering to what extent programmer productivity was a concern or goal, how it entered your thinking, as you were designing Java. And to what extent did you make tradeoffs in programmer productivity versus program performance?
James Gosling: It's funny how some of these things like performance and reliability actually fit hand in hand with developer productivity. There's a folk theorem out there that systems with very loose typing are very easy to build prototypes with. That may be true. But the leap from a prototype built that way to a real industrial strength system is pretty vast.
By and large, I wasn't really concerned about how quickly you could slap together a demo. I was much more concerned about how quickly you could build a real system. And boy, strong typing is a great thing there. Anything that tells you about a mistake earlier not only makes things more reliable because you find the bugs, but the time you don't spend hunting bugs is time you can spend doing something else. I don't know what fraction of my life I blew away hunting down obscure memory corruption bugs in random bags of C code, but it's a lot. You can get a huge amount of developer productivity just out of making things less error prone.
Bill Venners: That's interesting, because the next thing I was going to ask about was weak typing and Python. Have you ever done anything with Python?
James Gosling: A little bit. Things like Python can be pretty nice. One of the issues with weakly typed systems is that it tends to be very hard to get them up to really high performance. In a lot of systems, for example, when integers overflow, they turn into doubles or big ints, or whatever. That means that adding two numbers together is not just an add instruction. It's: "Gee, are these guys both integers? Okay. Then I guess I can do an add. Let's get the integers. Oh, but this guy's a bigger number than that."
So an add isn't one instruction any more. I cared pretty deeply that a = b + c should almost always be compiled into one instruction on just about any architecture. And you look at any of the JITs today and they pretty much do that, optimized in the registers. And that's what it is most of these days.
So you pay an awful lot for weak typing.
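The contrast between the two kinds of add can be sketched in Java itself, added for illustration. `fixedAdd` is the plain `int` addition, which wraps silently on overflow. `promotingAdd` is a hypothetical imitation of an add that checks types and promotes to a big integer on overflow; it is not how any particular language runtime implements it, only a way to see the extra checks and branches such an add implies.

```java
import java.math.BigInteger;

final class AddDemo {
    // Fixed-width add: a single int addition that wraps on overflow.
    static int fixedAdd(int b, int c) {
        return b + c;
    }

    // Hypothetical promoting add: type checks first, then an overflow check,
    // then a fallback to arbitrary precision.
    static Number promotingAdd(Number b, Number c) {
        if (b instanceof Integer && c instanceof Integer) {
            try {
                return Math.addExact(b.intValue(), c.intValue());
            } catch (ArithmeticException overflow) {
                // fall through to the big-integer path
            }
        }
        return toBig(b).add(toBig(c));
    }

    // Only Integer and BigInteger inputs are expected in this sketch.
    private static BigInteger toBig(Number n) {
        return n instanceof BigInteger ? (BigInteger) n : BigInteger.valueOf(n.longValue());
    }

    public static void main(String[] args) {
        System.out.println(fixedAdd(Integer.MAX_VALUE, 1));       // wraps around
        System.out.println(promotingAdd(Integer.MAX_VALUE, 1));   // promoted
        System.out.println(promotingAdd(2, 3));                   // stays an Integer
    }
}
```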
Bill Venners: In performance?
James Gosling: In performance. There are certain forms of weak typing which can actually be pretty useful. If you have something where you are worried about numeric range overflow, or you have something that is mostly a small integer but occasionally turns into a 500 bit number, it's kind of nice to have it roll over. Statistically speaking, those are pretty hard to find. Most people are unwilling to give up their performance for that flexibility, at least in the production systems.
In a lot of systems, however, it just doesn't matter if you spend 10-20 times as long doing an integer wrap. And also, a lot depends on how much you can vary in the libraries. So, for instance, PERL is in some sense a horrific language in terms of performance, because everything is based around strings. But if the stuff you're doing really is string processing, the weak-typing polymorphic dispatch of everything doesn't matter because all the computation you're doing is done in the string matching algorithms, and they have optimized the hell out of that.
And so, you can end up with systems that perform just fine, so long as the real work is being done in libraries and what you're doing in the language is stitching things together. But it does mean that the language then becomes something that you can't actually use for everything. You can't use it for writing a string matching algorithm in something like PERL. Without the built-in string matching algorithms, it would be pretty slow. Whereas in Java, the string matching libraries are all written in Java. In the world of mathematics, it's a property generally known as completeness. How much of a system can you describe in itself, without having to go outside the system? That actually works pretty well in Java. You can write all kinds of funky low level stuff and the performance is good enough.
Bill Venners: What are the challenges of using mobile objects?
James Gosling: It's the standard fallacies of distributed programming. When people use mobile objects straight out of the gate, they think mobile objects are like any regular objects. But you have to absolutely take into account things like the performance is different. At least 2, probably 3 or 4 orders of magnitude slower. And that's not just a minor annoyance at that level. That deeply affects your architecture.
Bill Venners: You mean, if that object is doing something back on the network?
James Gosling: Yeah, so if you have an array or an object that represents an array, pinging that thing for every element can be expensive. It might actually be cheaper just to make a local copy of the whole frigging thing in one transaction.
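The cost difference can be seen by counting round trips rather than measuring time; the sketch below is added for illustration. `FakeRemoteArray` is a hypothetical, purely local stand-in for an object whose every method call would cross the network; it just increments a counter per call. Summing element by element costs one round trip per element plus one for the length, while copying the whole array costs one.

```java
final class RemoteArrayDemo {
    // Hypothetical stand-in for a remote array: each call counts as one round trip.
    static final class FakeRemoteArray {
        private final int[] data;
        int roundTrips = 0;

        FakeRemoteArray(int[] data) {
            this.data = data.clone();
        }

        int length() {
            roundTrips++;
            return data.length;
        }

        int get(int i) {
            roundTrips++;
            return data[i];
        }

        int[] copyAll() {
            roundTrips++;
            return data.clone();
        }
    }

    // Chatty access: one round trip per element.
    static long sumElementByElement(FakeRemoteArray remote) {
        long sum = 0;
        int n = remote.length();
        for (int i = 0; i < n; i++) {
            sum += remote.get(i);
        }
        return sum;
    }

    // Bulk access: fetch a local copy in one transaction, then work locally.
    static long sumWithBulkCopy(FakeRemoteArray remote) {
        long sum = 0;
        for (int v : remote.copyAll()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) values[i] = i;

        FakeRemoteArray chatty = new FakeRemoteArray(values);
        System.out.println(sumElementByElement(chatty) + " in " + chatty.roundTrips + " round trips");

        FakeRemoteArray bulk = new FakeRemoteArray(values);
        System.out.println(sumWithBulkCopy(bulk) + " in " + bulk.roundTrips + " round trip");
    }
}
```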
But also, there are issues like errors. If it's not a distributed object, not a remote object, people tend to -- and fairly legitimately can -- ignore all kinds of errors. And yet when it is a remote object, you have to pay attention to the fact that the network can go down. Things that are equivalent to that. The remote host can be hit by a meteorite. Or there are a vast number of things which are similar to being hit by a meteorite. Anybody who does programming in California these days...
Bill Venners: ...could be hit by a blackout.
James Gosling: I run Solaris on my desktop here and for the last 10 years, the #1 source of reliability problems has been the frigging power supply to the building. And when you are building a distributed system, talking to somebody else, there's this question of is he dead or is he slow? You can't tell the difference. Or did the network just go away and it will come back again? Or was it just a glitch in the network? Did it just drop a few packets?
Bill Venners: That's what Jini really tries to address with leases and the fact that you can put RemoteException in your throws clause to say, "This method may do something over the network, therefore, it might be slower or it might fail." And that way I'll know because it will throw me back this checked exception that I have to deal with.
James Gosling: Right. And that's really built into RMI.
Bill Venners: It's a philosophy of not trying to paper over the fact that there may be a network there.
James Gosling: Right, because if you try to paper it over, you're fundamentally kidding yourself. You just can't paper over either the performance issue or the error issue.
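A minimal sketch of that convention, added for illustration: `java.rmi.Remote` and `java.rmi.RemoteException` are the real RMI types, while `Thermometer`, `RemoteFailureDemo`, and `describe` are hypothetical. No network is involved; the two lambdas simply play a healthy service and an unreachable one. Because `RemoteException` is checked, the caller cannot compile without deciding what to do when the network fails.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical remote interface: the throws clause advertises the network.
interface Thermometer extends Remote {
    double read() throws RemoteException;
}

final class RemoteFailureDemo {
    // The compiler forces this caller to handle the failure case.
    static String describe(Thermometer t) {
        try {
            return "reading: " + t.read();
        } catch (RemoteException e) {
            return "unavailable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Thermometer healthy = () -> 21.5;
        Thermometer unreachable = () -> {
            throw new RemoteException("network down");
        };

        System.out.println(describe(healthy));
        System.out.println(describe(unreachable));
    }
}
```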
Bill Venners: Another programmer challenge is the heterogeneity of hardware being programmed. This is particularly a challenge when creating mobile objects. The Java platform of course provides a great deal of homogeneity, but Java platforms are heterogeneous to the extent that different versions exist, different flavors exist (J2ME, J2SE, etc.), different API profiles exist, and so on. I'd like to be able to send out a Jini service proxy and when it lands, have it look around and dynamically load an implementation that makes sense. Could you speak to this and other strategies programmers could use to deal with heterogeneity?
James Gosling: The environmental differences from place to place get difficult to deal with. There have been a number of things that people have tried to do. This isn't exactly what you are talking about, but it's close, which is WebStart. In that, this mobile object, to use your term, comes along with a specification of the versions of the things it depends upon. It depends on a particular VM version at least. It needs the following add-on libraries. It needs the point of sale terminal and 3D rendering libraries. And that actually has been working pretty smoothly.
Bill Venners: I see. We used the same describe-the-requirements approach in the Jini Service UI spec. We don't actually send UI objects along with the Jini service proxy, we send a list of descriptions of available UI objects, including describing required packages.
James Gosling: Yeah.
Bill Venners: The client looks through the list and says, "This is a JFrame. I can't use that because I don't have Swing, but this here is a Frame. I do have AWT, so I could use this Frame, except that over here it says it requires the speech APIs, and I don't have that." So the client looks through the list and picks a best-fit UI object. But a Jini service proxy, that one, it just goes and needs to work wherever it lands. Well, it doesn't have to run everywhere, it just has to run everywhere you care about.
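The selection step described here can be sketched as a simple filter, added for illustration. This is not the Jini Service UI API: `UiDescriptor`, `pickFirstUsable`, and the descriptor names are hypothetical, the third "Plain Frame UI" entry is invented so that something matches, and "first usable in the offered order" is an assumed stand-in for whatever best-fit rule a real client would apply.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class UiPickDemo {
    // Hypothetical description of a UI object: a name plus the packages it needs.
    static final class UiDescriptor {
        final String name;
        final Set<String> requiredPackages;

        UiDescriptor(String name, String... requiredPackages) {
            this.name = name;
            this.requiredPackages = new HashSet<>(Arrays.asList(requiredPackages));
        }
    }

    // Pick the first offered UI whose requirements the client can satisfy.
    static Optional<UiDescriptor> pickFirstUsable(List<UiDescriptor> offered, Set<String> available) {
        for (UiDescriptor d : offered) {
            if (available.containsAll(d.requiredPackages)) {
                return Optional.of(d);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<UiDescriptor> offered = Arrays.asList(
                new UiDescriptor("JFrame UI", "javax.swing"),
                new UiDescriptor("Speech Frame UI", "java.awt", "javax.speech"),
                new UiDescriptor("Plain Frame UI", "java.awt"));

        // This client has AWT but neither Swing nor the speech APIs.
        Set<String> available = new HashSet<>(Arrays.asList("java.awt"));

        System.out.println(pickFirstUsable(offered, available).map(d -> d.name).orElse("none"));
    }
}
```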
James Gosling: Right.
Bill Venners: Lastly, is there anything that people never ask you in interviews that you wish they would ask you?
James Gosling: People never ask me?
Bill Venners: Something that you would enjoy talking about?
James Gosling: In some sense, there are zillions of things that people never ask me. Generally, in interviews, people never ask me really geeky questions. By and large, I'm happy you're talking about really geeky things. This interview has been particularly geeky on the interview scale.
Bill Venners: I'll take that as a compliment.
James Gosling: So it's been a lot more interesting. Nobody ever asks me questions about tree attribution algorithms or what's a good representation for a parse tree -- basically things that nobody's interested in.
Bill Venners: Things the readers of the interviews probably don't really care about.
James Gosling: They don't care. It's not relevant to most people, but it's what I spend my days doing. Those are the kinds of things I find more interesting.
Portions of this interview were first published under the name A Conversation with James Gosling in JavaWorld, a division of Web Publishing, Inc., June 2001.