JavaSpaces: Data, Decoupling, and Iteration

A Conversation with Ken Arnold, Part V

by Bill Venners
October 7, 2002

Summary
Ken Arnold, the original lead architect of JavaSpaces, talks with Bill Venners about the data-driven nature of JavaSpaces, how JavaSpaces facilitates decoupling, and why iteration isn't supported in the JavaSpace interface.

Ken Arnold has done a lot of design in his day. While at Sun Microsystems, Arnold was one of the original architects of Jini technology and was the original lead architect of JavaSpaces. Prior to joining Sun, Arnold participated in the original Hewlett-Packard architectural team that designed CORBA. While at UC Berkeley, he created the Curses library for terminal-independent screen-oriented programs. In Part I of this interview, which is being published in six weekly installments, Arnold explains why there's no such thing as a perfect design, suggests questions you should ask yourself when you design, and proposes the radical notion that programmers are people. In Part II, Arnold discusses the role of taste and arrogance in design, the value of other people's problems, and the virtue of simplicity. In Part III, Arnold discusses the concerns of distributed systems design, including the need to expect failure, avoid state, and plan for recovery. In Part IV, Arnold describes the basic idea of a JavaSpace, explains why fields in entries are public, why entries are passive, and how decoupling leads to reliability. In this fifth installment, Arnold talks about the data-driven nature of JavaSpaces, how JavaSpaces lets you "throw in a grain and watch it grow," and why iteration isn't supported in the JavaSpace interface.

Type versus State

Bill Venners: In both the Jini lookup service and in JavaSpaces, you can look up objects by both type and state. For Jini lookups, you can specify the types of service you want plus the types and values of attribute entries. In JavaSpace reads and takes, you specify an entry type and field values. How do the kinds of questions you ask in your query differ when you look up objects by type versus by state?

Ken Arnold: The difference is function. If you are writing code that talks to some data source, you can ask it questions. Method calls let you ask the data source for the information you want, and get the answer back. You can view a particular set of method calls as a particular socket shape. So you've written code that translates when you compile it into these method invocations. If those method invocations are all resolved locally, you would have another class to plug into that socket. With a Jini lookup service, you actually go on a network and ask: Is there anything on the network that plugs into this socket? Data is not the point there. It is only methods that matter in that part of the query.
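The "socket shape" idea can be sketched in plain Java. This is only an illustration under stated assumptions: ToyLookup and PrintService are hypothetical names invented here, the registry lives in one process rather than on a network, and none of it is the Jini lookup API. The point it shows is that the query is a type, and any registered object whose methods fit that type satisfies it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface: the "socket shape" a client compiles against.
interface PrintService {
    String print(String document);
}

// Toy in-process registry (an assumption of this sketch, not the Jini lookup service).
class ToyLookup {
    private final List<Object> services = new ArrayList<>();

    void register(Object service) {
        services.add(service);
    }

    // Returns the first registered object that plugs into the requested type, or null.
    <T> T lookup(Class<T> type) {
        for (Object service : services) {
            if (type.isInstance(service)) {
                return type.cast(service);
            }
        }
        return null;
    }
}

class ToyLookupDemo {
    public static void main(String[] args) {
        ToyLookup lookup = new ToyLookup();
        lookup.register((PrintService) document -> "printed:" + document);

        // The question asked is purely about type; no data values are involved.
        PrintService printer = lookup.lookup(PrintService.class);
        System.out.println(printer.print("report"));
    }
}
```

Only the interface matters to the match here. The state of the registered object never enters the query, which is the contrast with the data-driven matching discussed next.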

JavaSpaces tries to accomplish something rather different. Yes, JavaSpaces is data-driven. It is object-oriented in the sense that the entries have type and you can match subtypes, and that entry fields can be object types. But at some point, you get past objects in any system. At some point, you call a method with an integer value. In some languages that integer is still logically an object, but it doesn't contain other objects. At some point, you hit the bottom. JavaSpaces is a bottom point in the sense that it is a way to make an asynchronous method call. You can consider that the entry fields are like the parameters to the method call. The fact that those are data shouldn't bother you, because that is when you hit bottom.

Asking Questions of a Space

Bill Venners: How is it that I can take an entry template and populate it with a few objects, and that forms a question? How do you get from bits and bytes to high-level conceptual questions?

Ken Arnold: When you query a JavaSpace with an entry containing some filled-in fields, you are saying, "These are the pieces I care about, please fill in the rest." When you write an entry, each field is serialized separately. When you do a read or take with an entry template, each field that you specify is serialized separately. The JavaSpace compares the template fields with stored entries. For each field in the entry, if something is specified in the template, the space compares the fields' serialized forms. The space returns the first stored entry it finds in which all fields specified in the template match the corresponding field in the entry.
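The matching rule described above can be sketched as a small in-memory model. Everything named here (ToySpace, TaskEntry, the matches helper) is hypothetical and exists only for illustration; it is not the JavaSpace interface. The sketch also assumes plain Java serialization for the byte comparison and returns matches in insertion order, neither of which the interview specifies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical entry type with public, object-typed fields.
class TaskEntry implements Serializable {
    public String kind;
    public Integer priority;

    TaskEntry(String kind, Integer priority) {
        this.kind = kind;
        this.priority = priority;
    }
}

// Toy in-memory stand-in for a space; this is NOT the JavaSpace interface.
class ToySpace {
    private final List<Object> stored = new ArrayList<>();

    void write(Object entry) {
        stored.add(entry);
    }

    // Returns the first stored entry that matches the template, or null if none does.
    Object read(Object template) {
        for (Object candidate : stored) {
            if (matches(template, candidate)) {
                return candidate;
            }
        }
        return null;
    }

    static boolean matches(Object template, Object candidate) {
        // The candidate must be of the template's type or a subtype of it.
        if (!template.getClass().isInstance(candidate)) {
            return false;
        }
        try {
            for (Field field : template.getClass().getFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue;
                }
                Object wanted = field.get(template);
                if (wanted == null) {
                    continue; // unspecified field: "please fill in the rest"
                }
                Object actual = field.get(candidate);
                // Specified fields are compared by their separately serialized forms.
                if (actual == null || !Arrays.equals(serialize(wanted), serialize(actual))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this model, a template such as new TaskEntry(null, 2) asks for any task with priority 2, whatever its kind, and the returned entry carries the kind the template left blank.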

Serialization and Private Data

Bill Venners: Somebody once told me he didn't like serialization because it breaks encapsulation and you can see the private data.

Ken Arnold: I think he misunderstood the purpose of private data. Most private data doesn't need to be private for you to hide it from other people because those people should never know about it. Most private data needs to be private because if you let people touch it, they will screw around with it. They'll think they know the right thing to do with the data. Private data is a way of protecting yourself to allow future change. What if somebody serializes an object, plays with the resulting bits, and then deserializes the object to get another object? That is like saying, objects in C++ don't matter because somebody can put a pointer to your private data and muck with it. Although in some abstract logical sense that is true, it is not really the point.

The real point of private data is that you can prevent those people who are not trying to screw you over, but who are just trying to know too much, from knowing too much. If someone goes to that much trouble to interfere with your objects' internals, you should ignore them. The value of private data is that you can release a second version and all the existing client code should still work, because it doesn't rely on internal implementation details.

Say you change the internal structure in your product's second version. If somebody's code breaks because the code has serialized objects—because the person messed with the private stuff and then reserialized it—I don't know how sympathetic you are likely to be. It seems like that person has stepped outside the bounds and got what he or she deserved. So few private things need to be private to be secret. Most of them just need to be private as a language-enforced way of saying hands off.

Trusting Actors with Your Data

Bill Venners: To me, JavaSpaces has always felt like shared memory between processes on different hosts.

Ken Arnold: Yes, it is often associated with shared memory.

Bill Venners: But JavaSpaces lets you share objects, not just data. So JavaSpaces has a weird dual personality. It is about sharing objects, but to a great extent, it is about sharing data too. And since data is shared, don't I have to trust that every actor is correct and well behaved? Is a JavaSpace, therefore, appropriate only in environments in which every actor trusts all the other actors?

Ken Arnold: You can view JavaSpaces as an alternative way to design distributed systems compared to RPC (remote procedure call) mechanisms. In either approach, you have to design a set of interactions. You must make tradeoffs about complexity and trust in those designs. You could set up systems that do or do not detect someone mucking with the system. Using a JavaSpace, at least one that other people can access, you probably cannot achieve certain kinds of security, like data security. Can someone read a particular entry in a space? If someone has access to the space, depending on the space's security model, he could probably read the entry. But everything has its ups and downs, and its tradeoffs. You can certainly design algorithms that are robust in the face of others behaving incorrectly. It had better be possible, because the difference between a bug and a security hole is intention. At this point, most uses of JavaSpaces live behind the firewall. Just as a form of entertainment, I am currently writing a poker game that uses a JavaSpace to communicate between the participants and the decision maker. One question is: What happens if somebody comes in and screws around with your data? There are ways you can deal with that.

Decoupling Request from Response

Bill Venners: Here's a quote from JavaSpaces: Principles, Patterns, and Practice: "Message passage remote method invocations barricade data structures behind one manager process. Processes must wait in line, but the space enables concurrent access." Why isn't a JavaSpace yet another manager process managing the data?

Ken Arnold: You can describe things in a million ways with all sorts of metaphors, and each way is partially true. So if you want to think of JavaSpaces as a manager, you can. The distinction has to do with decoupling sending the request from receiving the response. If I make a synchronous call to you and ask you to do something, the job's processing is barricaded behind you. It doesn't mean other people can't send you requests simultaneously, or that I can't send you another request simultaneously in a separate thread. But this particular interaction is blocked on that one call. Whereas with a JavaSpace, if I had 70 things to do, I can write 70 requests into the space and just wait for the results, because the making of the request and the receiving of the result are completely separate operations. In a normal RPC-style system, they are consequent operations. They are serialized one after the other.
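The 70-requests scenario can be sketched with standard Java concurrency classes. This is a minimal illustration, not JavaSpaces code: two in-memory BlockingQueue instances stand in for request and result entries in a space, and the SquareFarm class, its Request and Result types, and the squaring "job" are all assumptions made up for the example. What it shows is that writing the requests and collecting the results are separate steps, with any number of workers in between.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class SquareFarm {
    // Hypothetical request and result "entries" for this sketch.
    static final class Request {
        final int id;
        final int input;
        Request(int id, int input) { this.id = id; this.input = input; }
    }

    static final class Result {
        final int id;
        final long output;
        Result(int id, long output) { this.id = id; this.output = output; }
    }

    // Writes requestCount requests, lets workerCount workers process them,
    // and returns the sum of all results.
    static long run(int requestCount, int workerCount) {
        // Two queues stand in for the space (an assumption of this sketch).
        BlockingQueue<Request> requests = new LinkedBlockingQueue<>();
        BlockingQueue<Result> results = new LinkedBlockingQueue<>();

        ExecutorService workers = Executors.newFixedThreadPool(workerCount);
        for (int w = 0; w < workerCount; w++) {
            workers.execute(() -> {
                try {
                    while (true) {
                        Request r = requests.take();
                        results.put(new Result(r.id, (long) r.input * r.input));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdownNow() ends the worker
                }
            });
        }

        try {
            // Step 1: write every request; nothing here waits for an answer.
            for (int i = 1; i <= requestCount; i++) {
                requests.put(new Request(i, i));
            }
            // Step 2: a separate operation collects results in whatever order they arrive.
            long sum = 0;
            for (int i = 0; i < requestCount; i++) {
                sum += results.take().output;
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(SquareFarm.run(70, 4));
    }
}
```

The requester never names a worker and never learns how many there are. Changing the worker count, or how the work is divided, requires no change to the code that writes requests and reads results.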

By "barricaded," the authors Eric Freeman and Susanne Hupfer basically mean that I make the call to you. If the processing is going to be broken down into partial pieces, you have to break it down. It is all behind you as far as I am concerned. My interface to solving the problem is invoking a method on you. And if the right way to do that is to do a little bit here, a little bit there, and a little bit over there, then you have to handle that. If instead I write an entry into a space and the processing can be broken down, an actor can take out the request and write a partial result back.

Today you could decide that those partial processing tasks have to happen in order: A happens before B, which happens before C. Tomorrow you may figure out a way to do the tasks in parallel. So if something is already working on one request's part A, and a new request comes in, something else can start on the B part without waiting for the A task to complete. And then tomorrow a new theory can arrive. It is not mediated by my interaction with someone processing my request. Instead, I am tossing the grain into the space and watching it grow. It grows in the sense that the response comes back. All that other stuff is not mediated by anything with which I directly interact. It is mediated by the request-satisfying process, whichever way the entities processing the request are configured to solve the problem.

Iterating over a JavaSpace

Bill Venners: Once people start using JavaSpaces, they often find themselves wanting to iterate over matches. But they can't do that directly via the JavaSpace interface. Why doesn't the JavaSpace interface support iteration?

Ken Arnold: Basically the problem is there isn't one iteration model that satisfies everything. It sounds like there should be and your instinct says there should be. But you can ask all sorts of questions about ordering and about interaction with transactions. If someone adds an entry after I start iterating, am I guaranteed to see it? Can someone remove an entry if I am past it in an iterator? Is there a way to go backwards? If you take all these factors and put them together—to make up a number off the top of my head—there are maybe 80 possible iterators. And if I choose one, who is to say that one is right for you?

Instead, we provide you a set of tools. It is like a RISC (reduced instruction set computing) instruction set. This is ancient history, but there used to be a system called a Vax. Some people's hair will turn gray when they hear that. Actually, their hair will already be gray, but otherwise would turn gray. Anyway, the Vax had the world's largest instruction set as far as I know. It had, for example, a solve quadratic equation instruction. This was the reductio ad absurdum of instruction sets, to which RISC was a response. A RISC instruction set doesn't even necessarily have a divide instruction. How can you live without a divide instruction? Well, if you combine some three instructions, you get this kind of divide. If you combine some other five instructions, you get this other kind of divide.

Just because the JavaSpace interface doesn't offer a way to iterate doesn't mean there aren't legitimate reasons to want to iterate. It is a question of whether the space should be picking winners or losers. It is a question of whether you can even pick a winner that can satisfy all people.

Everybody who has said they want iteration thinks they are asking for the same thing. But when you break it down, different people want different things. Everyone wants iteration, but if I give you one kind of iterator, it may not satisfy you. It may only satisfy 10 percent of the people. Do I really want to add a method that only satisfies 10 percent of the people?

The problem with the Vax instruction set was people had complaints like, "You didn't solve the quadratic equation the way I wanted to do it." Or, "You're inefficient in the kind of quadratic equation that I have." The JavaSpace interface doesn't offer a way to iterate, but you can create a utility class, where you hand it the space and say, "Iterate!" And that class is designed based on certain principles. And you could have another utility class that iterates based on other principles.

Iteration is one of those things that sounds grand, like peace. Everybody wants peace, brotherhood, and love. But how do you get there from here? What do you mean by peace? Everybody has different visions. Granted, iteration over a JavaSpace is much less important than peace, but the same issue is there. Iteration is a word that covers a multitude of sins. It is better—and completely possible—to build iteration structures on top of a JavaSpace. And then you can provide different APIs for iterators with different properties.
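To make the "different people want different things" point concrete, here is a sketch of two iteration utilities layered on one toy space. ToyBag and SpaceIterators are hypothetical names, the bag holds plain strings in memory, and its copy and takeIfExists methods are conveniences of this model rather than operations of the JavaSpace interface. The two utilities answer the question "if someone adds an entry after I start iterating, do I see it?" in opposite ways.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy in-memory stand-in for a space of strings; NOT the JavaSpace interface.
class ToyBag {
    private final List<String> entries = new ArrayList<>();

    synchronized void write(String entry) {
        entries.add(entry);
    }

    // Removes and returns one entry, or null if the bag is empty.
    synchronized String takeIfExists() {
        return entries.isEmpty() ? null : entries.remove(0);
    }

    synchronized List<String> copy() {
        return new ArrayList<>(entries);
    }
}

class SpaceIterators {
    // Policy 1: iterate over a snapshot. Entries written later are never seen,
    // and the bag itself is left untouched.
    static Iterator<String> snapshot(ToyBag bag) {
        return bag.copy().iterator();
    }

    // Policy 2: iterate by taking. Entries written during iteration are seen,
    // but every entry visited is removed from the bag.
    static Iterator<String> draining(ToyBag bag) {
        return new Iterator<String>() {
            private String pending;

            @Override
            public boolean hasNext() {
                if (pending == null) {
                    pending = bag.takeIfExists();
                }
                return pending != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String result = pending;
                pending = null;
                return result;
            }
        };
    }
}
```

If an entry is written after both iterators are created, the snapshot iterator skips it and the draining iterator returns it. Neither behavior is wrong; each suits a different caller, which is the argument for keeping such policies in utility classes rather than in the core interface.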

Resources

Perfection and Simplicity, A Conversation with Ken Arnold, Part I:
http://www.artima.com/intv/perfect.html

Taste and Aesthetics, A Conversation with Ken Arnold, Part II:
http://www.artima.com/intv/taste.html

Designing Distributed Systems, A Conversation with Ken Arnold, Part III:
http://www.artima.com/intv/distrib.html

Sway with JavaSpaces, A Conversation with Ken Arnold, Part IV:
http://www.artima.com/intv/sway.html

You can obtain information about Linda from here:
http://www.cs.yale.edu/Linda/linda.html

Ken Arnold first mentioned idempotency in Part III of this interview:
http://www.artima.com/intv/distrib.html

JavaSpaces: Principles, Patterns, and Practice by Eric Freeman, Susanne Hupfer, and Ken Arnold, the book from which Bill Venners reads quotes in this article, is at Amazon.com at:
http://www.amazon.com/exec/obidos/ASIN/0201309556/

The Jini Community, the central site for signers of the Jini Sun Community Source License to interact:
http://www.jini.org

Download JavaSpaces from:
http://java.sun.com/products/javaspaces/

Design objects for people, not for computers:
http://www.artima.com/apidesign/object.html

Make Room for JavaSpaces, Part I - An introduction to JavaSpaces, a simple and powerful distributed programming tool:
http://www.artima.com/jini/jiniology/js1.html

Make Room for JavaSpaces, Part II - Build a compute server with JavaSpaces, Jini's coordination service:
http://www.artima.com/jini/jiniology/js2.html

Make Room for JavaSpaces, Part III - Coordinate your Jini applications with JavaSpaces:
http://www.artima.com/jini/jiniology/js3.html

Make Room for JavaSpaces, Part IV - Explore Jini transactions with JavaSpaces:
http://www.artima.com/jini/jiniology/js4.html

Make Room for JavaSpaces, Part V - Make your compute server robust and scalable with Jini and JavaSpaces:
http://www.artima.com/jini/jiniology/js5.html

Make Room for JavaSpaces, Part VI - Build and use distributed data structures in your JavaSpaces programs:
http://www.artima.com/jini/jiniology/js6.html

About the author

Bill Venners is president of Artima Software, Inc. and editor-in-chief of Artima.com. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project that produced the ServiceUI API. The ServiceUI became the de facto standard way to associate user interfaces to Jini services, and was the first Jini community standard approved via the Jini Decision Process. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community. He currently devotes most of his energy to building Artima.com into an ever more useful resource for developers.