The Artima Developer Community

Weblogs Forum
Going all in...

22 replies on 2 pages. Most recent reply: Aug 16, 2005 3:47 PM by Mike Petry

Jim Waldo

Posts: 34
Nickname: waldo
Registered: Sep, 2002

Going all in... (View in Weblogs)
Posted: May 10, 2005 11:12 AM
Summary
More on objects and networks, but this time I try to make my own opinions clear, if not coherent...
This particular thread has gotten far more interest than I had originally expected, and has gone off in some interesting directions. But, as Mike Petry points out, it is hard to see where I'm going with all this. I'm not sure that I knew (or know) myself, but after the last couple of rounds of betting, perhaps it is time to go all in and show my cards.

So, let's go back to the first assertion that I made, which is that the object model that you use is tied to the language in which that object model is presented. From this, a lot of you concluded that it was therefore a bad idea to try to pass objects from one place to another while doing distributed computing, because you certainly don't want to tie yourself to a single language. After all, we all know that different languages are good for different things (or, just as important, are preferred by different people). So to keep from being tied to a particular language, we had better confine ourselves to simply passing data from one process to another while doing distributed computing.

To paraphrase a former president, you could do this, but it would be wrong (here is where I turn my cards face up). First of all, it means that you need to have a language-independent mechanism for passing data from one place to another. But there is no more a language-independent way of expressing data than there is a language-independent way of expressing objects. If I tell you that a data type is an int, you don't know its size unless you know if it is an int in Java, or C (well, even then you won't necessarily know), or C++. You do know that I'm not talking about COBOL; but if you want to talk COBOL, you need to know how to translate my int into your PIC 9(x) for some value of x.

Sure, I will admit that the number of data types that need to be translated from language to language is smaller than the number of object types that might need to be so translated. But in both cases you are going to need to translate from one language to another. The task is essentially the same, whether you are translating objects or data.

What it really comes down to is that doing distributed computing of any form requires that you begin with a base set of conventions that everyone buys into to make the communication possible. One set of conventions has to do with the way you translate the wire representation into a set of data types that can be understood by the two (or more) participants. An alternative convention is to agree on a language that will be used on the network (as stated earlier by Bill Venners). And if that network language is both object-oriented and a language in which you can implement the client or the server, you get a bunch of nice properties in your system.

For example, if you are sending Java objects over the wire, you can ensure that sets of invariants that should govern the data in those objects will be respected on the other side (or at least checked). You can, if necessary, download new code for new objects (the thing that makes Java RMI and the Jini networking system so powerful). This last ability lets you change the system over time in interesting ways, which can't be done if all you are doing is passing data. It even lets you change the way communication happens, since some of the objects that can be passed are the proxy objects that are used to do the communication.
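
A minimal sketch of what that invariant checking can look like, using a hypothetical Temperature value class (not from Jim's post); the class re-validates its own rule as it is deserialized on the receiving side:

import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.Serializable;

// Hypothetical value class: the invariant (no temperature below absolute
// zero) is enforced at construction time and again on deserialization.
public class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double kelvin;

    public Temperature(double kelvin) {
        if (kelvin < 0.0)
            throw new IllegalArgumentException("temperature below absolute zero");
        this.kelvin = kelvin;
    }

    // Runs on the receiving side, so a hand-crafted or corrupted stream
    // cannot smuggle in a value that violates the invariant.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (kelvin < 0.0)
            throw new InvalidObjectException("temperature below absolute zero");
    }

    public double kelvin() {
        return kelvin;
    }
}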

Does this mean I'm saying you shouldn't program in anything but the Java language? Hardly; I fully agree with those who claim that different languages are useful for different things. There is now, and always will be, a need for mechanisms to translate information expressed in one language into another language. My only claim here is that this shouldn't be confused with the need to transmit the information across the network. You are much better off doing the translation within an address space (or on a single machine in different address spaces), where you can control all of the mechanisms. Trying to do it on the network is the equivalent of saying that the phone system should translate any language into an intermediate form and then reconstruct it into the language of the receiver on the other end. A much better approach, I would claim, is to carry a single language on the wire and then have a translator on the other end when needed.


Maarten Hazewinkel

Posts: 32
Nickname: terkans
Registered: Jan, 2005

Re: Going all in... Posted: May 11, 2005 2:51 AM
I agree with your point on having a unified language at the network level. That would help make heterogeneous distributed computing possible.

I am, however, wondering whether Java (or any other current language) is sufficient for the job. Committing to a network language is a significant, and not easily reversible, step.

I expect that, with some extra translation work, Java can carry any required data types around, though there is going to be extra work if something like 128-bit integers appears in a new language.

My main question is, can Java carry the semantics of other languages?

Sticking to static, class-based OO languages will probably be fine. But what about OO that is not class-based? For instance, prototype-based languages such as Self and JavaScript, or languages that allow run-time modification/addition of methods to classes or objects, such as the popular 'scripting' languages Python, Ruby, etc.

Something to ponder...

indranil banerjee

Posts: 11
Nickname: indranilb
Registered: Feb, 2004

Re: Going all in... Posted: May 11, 2005 5:45 AM
Sounds suspiciously like CORBA...

Kay Schluehr

Posts: 302
Nickname: schluehk
Registered: Jan, 2005

Re: Going all in... Posted: May 12, 2005 11:03 PM
> Sounds suspiciously like CORBA...

But CORBA again forces each language to conform to an IDL. It does not meet a language where it is; it says that before the language can speak to objects of another language, it has to translate itself into the IDL. The advantage of this scheme is that the translation effort grows linearly and complexity is reduced. In the human world of translation and interpretation, the effort grows quadratically with the number of languages, but it has the advantage that each speaker can articulate thoughts in his own language and the interpreter translates them for him. The interpreter takes on an active role and is not simply a mock-up of a transparent medium. I suspect that SOA/web services are nothing but a resurrection of CORBA in terms of XML, a "service" that forces conformance. This is the empowerment of the secondary.

A real translator would adapt to the needs of his customers, but I don't think that this is part of the mindset of big software vendors, which think the other way round.

Regards,
Kay

Dan Creswell

Posts: 49
Nickname: dancres
Registered: Apr, 2003

Re: Going all in... Posted: May 13, 2005 1:53 AM
> Sounds suspiciously like CORBA...


No, I don't think so. Here's what Jim said:

"Trying to do this on the network is the equivalent of saying that the phone system should translate any language into an intermediate form and then re-construct into the language of the receiver on the other end"

This is what I call the "translation everywhere" category. IMHO, this includes the CORBA/Web Services proposition. It's a "lowest common denominator" approach where you have to ensure that all data passed can be interpreted/translated by all systems/languages that encounter the data, which provides interoperability but at the expense of expressiveness.

I think Jim is suggesting an approach toward the other end of the spectrum, where one uses a single language, and all the expressiveness it brings, for most communication, but inserts translators at endpoints where a different language is being used. Something like the adapter pattern, maybe.

In the real world that might mean using a lowest common denominator approach for communication between enterprises whilst, internally, each uses their language of choice be it Java, C# or whatever and communicates using the data structures available in that language.
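
A rough sketch of what such an endpoint translator might look like, with invented names (InventoryService for the interface carried on the network, LegacyCobolGateway for the non-Java system behind it):

// The interface the Java side of the network programs against.
public interface InventoryService {
    int unitsInStock(String productCode);
}

// Hypothetical handle to the non-Java system (say, a socket to a COBOL host).
interface LegacyCobolGateway {
    String send(String fixedWidthRecord);
}

// The adapter lives at the endpoint; the rest of the network never sees
// the legacy record format.
class LegacyInventoryAdapter implements InventoryService {
    private final LegacyCobolGateway gateway;

    LegacyInventoryAdapter(LegacyCobolGateway gateway) {
        this.gateway = gateway;
    }

    public int unitsInStock(String productCode) {
        // Build the fixed-width request the legacy system expects...
        String reply = gateway.send(String.format("%-20sSTOCKQRY", productCode));
        // ...and turn its numeric reply back into a Java int.
        return Integer.parseInt(reply.trim());
    }
}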

Warren Strange

Posts: 1
Nickname: warren
Registered: May, 2005

Re: Going all in... Posted: May 13, 2005 8:15 AM
> This is what I call the "translation everywhere" category.
> IMHO, this includes the CORBA/Web Services proposition.
> It's a "lowest common denominator" approach where you have
> to ensure that all data passed can be interpreted/translated
> by all systems/languages that encounter the data, which
> provides interoperability but at the expense of expressiveness.

So Web Services is kinda like the Esperanto of computing.

Politically correct, but not terribly useful :-)

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Going all in... Posted: May 13, 2005 11:44 AM
> Sounds suspiciously like CORBA...

I've gotten into this discussion late, and made some comments this morning on posts in Jim's previous blog entry...

I think there is a growing preference for the view that inter-business transactions between disparate systems might be carried out most effectively with CORBA-like or XML-like transactions, because the predominant need is "getting the data".

I personally don't believe this to be the right choice in all cases. I've found that the use of direct, language-based interfaces such as the RMI programming model makes many things a lot easier.
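
(For readers who haven't used it, here is a minimal sketch of that programming model; the OrderStatus interface and the registry URL are invented for illustration.)

import java.rmi.Naming;
import java.rmi.Remote;
import java.rmi.RemoteException;

// The contract both sides program against; the RMI runtime carries the call.
public interface OrderStatus extends Remote {
    String statusOf(String orderId) throws RemoteException;
}

class OrderStatusClient {
    public static void main(String[] args) throws Exception {
        // Fetch the stub (or smart proxy) and call it like any local object.
        OrderStatus service =
                (OrderStatus) Naming.lookup("rmi://orders.example.com/OrderStatus");
        System.out.println(service.statusOf("A-1001"));
    }
}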

In the end, poor system designs can sink the ship no matter what programming language/platform is used. CORBA's initial audience seemed to include exactly such disparate-systems communication. What happened was that programming problems the RMI model is better at became more common. CORBA came to seem like a bad idea because it couldn't solve the dynamic-granularity problems that mobile code can.

And now we have XML and SOAP as the new data-only exchange paradigm in distributed applications. There is still a large class of problems that can't be solved with data-only exchanges.

There are also many standard services, such as directories and registries, which need running code somewhere. The result is that standards have to be invented to handle these.

With Jini, the Lookup Service is a Jini service that is located via APIs found in the platform. Because it's a language-based platform, it can provide such things for everyone. Beyond that, everything else is a discovered service implementing a well-known programmatic interface, as well as carrying attributes that separate same-interface services into groups as needed by a deployment.
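
(A hedged sketch of what that discovery looks like with the Jini 2.0 utility classes, reusing the hypothetical OrderStatus interface from the RMI sketch above; security policy, codebase setup and error handling are omitted.)

import net.jini.core.lookup.ServiceItem;
import net.jini.core.lookup.ServiceTemplate;
import net.jini.discovery.LookupDiscovery;
import net.jini.discovery.LookupDiscoveryManager;
import net.jini.lease.LeaseRenewalManager;
import net.jini.lookup.ServiceDiscoveryManager;

class FindOrderStatus {
    public static void main(String[] args) throws Exception {
        // Discover lookup services in any group...
        ServiceDiscoveryManager sdm = new ServiceDiscoveryManager(
                new LookupDiscoveryManager(LookupDiscovery.ALL_GROUPS, null, null),
                new LeaseRenewalManager());

        // ...then ask for any registered service that implements the
        // well-known programmatic interface, waiting up to ten seconds.
        ServiceTemplate template =
                new ServiceTemplate(null, new Class[] { OrderStatus.class }, null);
        ServiceItem item = sdm.lookup(template, null, 10000L);

        OrderStatus service = (OrderStatus) item.service;
        System.out.println(service.statusOf("A-1001"));
    }
}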

The simplifications that the Jini model brings to the deployment and use of services are very powerful. It is only through the use of mobile code, a unified object model, and a single "language" platform that this power comes so easily.

John Bayko

Posts: 13
Nickname: tau
Registered: Mar, 2005

Re: Going all in... Posted: May 13, 2005 5:38 PM
How does this apply to databases and ORM? I've always suspected that trying to keep a separate "universal" data format in a database, and mapping it to a nice object model supported by a full language, is kind of missing the point. It would be a step up if there were at least *one* language that would work well, and transparently, with a relational database, rather than none, which is the case now.

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Going all in... Posted: May 13, 2005 6:19 PM
> How does this apply to databases and ORM? I've always
> suspected that trying to keep a separate "universal" data
> format in a database, and mapping it to a nice object
> model supported by a full language, is kind of missing the
> point. It would be a step up if there were at least *one*
> language that would work well, and transparently, with a
> relational database, rather than none, which is the case
> now.

I am really curious if a majority think that object oriented programming is just about embedding data in an object and providing access methods.

I consider that to be a structure, not an object. An object more often includes code that does useful things with the data, besides setting and getting data content.

indranil banerjee

Posts: 38
Nickname: indranil
Registered: Nov, 2004

Re: Going all in... Posted: May 14, 2005 4:47 AM
> In the real world that might mean using a lowest common
> denominator approach for communication between enterprises
> whilst, internally, each uses their language of choice be
> it Java, C# or whatever and communicates using the data
> structures available in that language.


That was what CORBA tried to be: you write business logic in whatever language you want, and CORBA looks after the low-level communications. In practice it didn't live up to expectations.

Alexander Jerusalem

Posts: 36
Nickname: ajeru
Registered: Mar, 2003

Re: Going all in... Posted: May 15, 2005 1:27 AM
> For example, if you are sending Java objects over the wire,
> you can ensure that sets of invariants that should govern
> the data in those objects will be respected on the other
> side (or at least checked).

The negative consequence of doing this is, of course, that what is sent over the wire will be closely tied to implementation details of the Java class unless the serialization format is deliberately managed.

My argument against sending Java objects even between two Java applications is versioning. If you just send Java objects back and forth without paying attention to the serialization format, you have to make sure that class (and J2SE) versions are in sync at all times. If you don't want that, you have to deliberately design and manage the serialization format. And if you do that, you might as well use a format that is interoperable beyond Java.

Another argument against sending Java objects is the weakness of Java's type system that forces us to express a lot of constraints as part of the implementation that really belong to the public interface. I'm talking about things such as string lengths, number ranges, referential constraints, etc.

And this is not just a problem with Java. It is the general neglect of data constraints in object oriented languages. The philosophy is that if you have encapsulation and you marry procedural code and data, then you don't have a need for declarative constraints on data. It turns out that this assumption makes object oriented languages utterly unsuitable for loosely coupled distributed systems. It causes no end of versioning problems and it makes interoperability a nightmare.

I'm convinced that successful distributed, loosely coupled systems must be based on a flexible generic data representation and a powerful declarative constraints language. But I'm also convinced that people will go on trying to make objects roam the world. They never will.

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Going all in... Posted: May 15, 2005 7:43 AM
> > For example, if you are sending Java objects over the wire,
> > you can ensure that sets of invariants that should govern
> > the data in those objects will be respected on the other
> > side (or at least checked).
>
> The negative consequences of doing this are of course that
> what is sent over the wire will be closely linked to
> implementation details of the Java class unless the
> serialization format is deliberately managed.

But how is this any different from deciding on the design of, and managing the implementation of, any other data conveyance mechanism?

> My argument against sending Java objects even between two
> Java applications is versioning. If you just send Java
> objects back and forth without paying attention to the
> serialization format, you have to make sure that class
> (and J2SE) versions are in sync at all times. If you don't
> want that, you have to deliberately design and manage the
> serialization format. And if you do that, you could as
> well use a format that is interoperable beyond Java.

If your system ever changes, you will have versions of your data that go across the wire and need to be managed. Liberal use of interfaces helps eliminate class version issues, and you can trivially write deserialization and serialization code to manage versioning explicitly. The serialVersionUID field is a hash in its default implementation, which only serves to reveal that the visible structure of the class has changed. It should be set to an incrementing version number instead; then it becomes completely obvious to users of the class which iteration of it they are dealing with. Long-term storage should never use serialization, so that data versioning issues are a short-lived problem.
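
A minimal sketch of that kind of hand-managed versioning, using a hypothetical Account class whose money field changed representation between iterations; an explicit payload version is written in-band so that streams from older senders remain readable:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Account implements Serializable {
    // Pinned by hand so a recompile cannot silently change it.
    private static final long serialVersionUID = 1L;

    private transient String owner;
    private transient long balanceCents;   // current form: integral cents

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(2);                    // explicit payload version
        out.writeUTF(owner);
        out.writeLong(balanceCents);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int payloadVersion = in.readInt();
        owner = in.readUTF();
        if (payloadVersion >= 2) {
            balanceCents = in.readLong();
        } else {
            // Version 1 senders wrote the balance as a double.
            balanceCents = Math.round(in.readDouble() * 100.0);
        }
    }
}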

> Another argument against sending Java objects is the
> weakness of Java's type system that forces us to express a
> lot of constraints as part of the implementation that
> really belong to the public interface. I'm talking about
> things such as string lengths, number ranges, referential
> constraints, etc.

It doesn't matter how you express the data items. If you create poor data designs that don't encapsulate the information that is mandatory for proper representation, you will have problems.

> And this is not just a problem with Java. It is the
> general neglect of data constraints in object oriented
> languages. The philosophy is that if you have
> encapsulation and you marry procedural code and data, then
> you don't have a need for declarative constraints on data.
> It turns out that this assumption makes object oriented
> languages utterly unsuitable for loosely coupled
> distributed systems. It causes no end of versioning
> problems and it makes interoperability a nightmare.

I have no problem with the Java serialization model across the network. I am not sure what you mean by loosely coupled systems. Any system that uses the data of another system is coupled to that system. HTML browsers are coupled to HTML data streams. When that data stream is versioned, they have to be changed where applicable. If they don't protect themselves against the assumptions they are making, there are problems.

> I'm convinced that successful distributed, loosely coupled
> systems must be based on a flexible generic data
> representation and a powerful declarative constraints
> language. But I'm also convinced that people will go on
> trying to make objects roam the world. They never will.

Please provide some more practical words for your definition of 'loosely coupled.' It's not clear to me whether you are talking about a text editor loading an HTML document for editing, or a 1995 version of Mosaic loading a 2005-generated HTML document. I.e., are you complaining about loose data-type coupling, loose version coupling, or something completely different?

Alexander Jerusalem

Posts: 36
Nickname: ajeru
Registered: Mar, 2003

Re: Going all in... Posted: May 16, 2005 7:32 AM
> Please provide some more practical words for your
> definition of 'loosely coupled.'

What I mean is something like the way web browsers communicate with web servers. There are many different browser versions and many different HTTP servers implemented in many different programming languages, running on many different platforms. The whole thing keeps working because each participant acts upon the HTTP headers and HTML elements it understands and tries to ignore the rest as gracefully as possible. That's a desirable quality in many loosely coupled systems, but we all know that it also causes lots of trouble and that it's not good enough for stricter, business-oriented purposes where you need to make sure some things (like money amounts) are exact.

You ask why interfaces + serialVersionUID are not enough. The answer is that serialVersionUID allows you to make things break visibly when needed, and interfaces allow you to define some aspects of a formal contract, but neither provides a language for describing the wire format, including all the rules about the data the service requires. Many rules that must be part of the contract are hidden away in the Java class implementation and are not part of the interface. I'll give an example of what I mean; maybe it becomes clearer as a result:

interface OrderItem {
 
   String getProductCode();
   void setProductCode(String c);
 
   int getQuantity();
   void setQuantity(int q);
}
 
 
class OrderItemImpl implements OrderItem {
 
   private String productCode;
   private int quantity;
   
   public void setProductCode(String c){
      if (c == null)
         throw new OrderException("product code must not be null");
      if (c.length() > 20)
         throw new OrderException("Invalid product code");
         
      productCode = c;
   }
   
   public void setQuantity(int q){
      if (q < 1 || q > 1000)
         throw new OrderException("quantity must be between 1 and 1000");
      
      quantity = q;
   }
   
   //... getters and the rest
}


If someone sees only the interface, he doesn't know that the product code must not be null, that its length must not be more than 20 and that the quantity must be a number between 1 and 1000. All this knowledge is hidden away in the implementation instead of being expressed in the interface where it really belongs. Of course it would be sensible to write it into the documentation as well, but that's not a good replacement for a formal definition of all the requirements of the service's contract, because documentation cannot be automatically enforced.

Now, you could argue that what's needed is a ProductCode and an ItemQuantity class, so the OrderItem interface becomes more expressive. But by doing so, you just defer the problem to the constructors or setters of these classes. There is simply no way to formally express not null, number range and string length constraints in the public parts of a Java class or interface. It's impossible. And that's why I'm saying that Java is not a good language to express contracts.
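
For concreteness, the deferred version might look like the sketch below; the check still lives in executable code rather than in anything a caller can see in the public signature (ProductCode is just the made-up name from the paragraph above):

public final class ProductCode {
    private final String value;

    public ProductCode(String value) {
        // The "non-null, at most 20 characters" rule is enforced here,
        // but nothing in the constructor's signature declares it.
        if (value == null)
            throw new IllegalArgumentException("product code must not be null");
        if (value.length() > 20)
            throw new IllegalArgumentException("product code longer than 20 characters");
        this.value = value;
    }

    public String value() {
        return value;
    }
}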

So far, this has little to do with versioning. You say that by managing the Java serialisation mechanism, you can cope with versioning. That's true. But there are many people who think that using POJOs (e.g. RMI) to communicate within a distributed system is somehow simpler than using XML web services. However, if we accept that the wire format has to be managed, then that argument breaks down. It's never simple to manage the versioning of serialisation formats, or any data format for that matter. It boils down to the question: what is the best language to define message formats that provide some degree of backward and forward compatibility, so that older senders can communicate with more recent receivers and vice versa?

My argument is that Java simply doesn't help in all of this and that all the Java/XML serialisation stuff like JAXB makes things much more complicated than dealing with XML messages directly. I have a suspicion that most people who come up with the simplicity argument, simply don't manage their serialisation format, because they build closely coupled systems where they have central control over versioning and deployment anyway.

Thankfully, you have already accepted that Java's default serialisation is unsuitable if you need to manage versioning in a distributed system without central control, so I don't have to come up with an example for that one ;-)

Gregg Wonderly

Posts: 317
Nickname: greggwon
Registered: Apr, 2003

Re: Going all in... Posted: May 16, 2005 8:46 AM
> > Please provide some more practical words for your
> > definition of 'loosely coupled.'
>
> What I mean is something like the way web browsers
> communicate with web servers. There are many different
> browser versions and many different HTTP servers
> implemented in many different programming languages,
> running on many different platforms. The whole thing keeps
> working because each participant acts upon the HTTP
> headers and HTML elements it understands and tries to
> ignore the rest as gracefully as possible.

Yes, this is true. I'll add that if the browser instead got mobile code from the server, the mobile code would render the document perfectly without worries about content versions. That is one of the powers of mobile code. Versioning is explicitly dealt with by the downloaded code instead of the client always having to be in tune with data format versioning.

> That's a
> desireable quality in many loosely coupled systems, but we
> all know that it also causes lots of trouble and that it's
> not good enough for stricter business oriented purposes
> where you need to make sure some things (like money
> amounts, etc) are exact..

I believe mobile code is a better solution for these business solutions because it hides the data and presents only the public interface to that data, no matter how the data evolves.

> You ask why interfaces + serialVersionUID are not enough.
> The answer is because serialVersionUID allows you to make
> things break visibly when needed and interfaces allow you
> to define some aspects of a formal contract but neither
> provides a language for describing the wire format
> including including all the rules about the data the
> service requires. Many rules that must be part of the
> contract are hidden away in the Java class implementation
> and are not part of the interface. I'll give an example of
> what I mean, maybe it becomes clearer as a result:

This example does show how stricter data-format requirements can create problems around rigid data ranges. I don't think this is an unsolvable problem. I'm not saying that construction exceptions are perfect, but they do at least provide some checkpoints that will eventually reveal problems. Javadoc commentary on ranges should be used, and there are cases where an enum should now be used instead of a small integer range.
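
A small sketch of both mitigations on a hypothetical OrderLine class: the quantity range is at least documented where tools can see it, and a small integer priority code is replaced by an enum:

public class OrderLine {
    /** Allowed priorities, formerly a "1..3" integer code. */
    public enum Priority { LOW, NORMAL, URGENT }

    private Priority priority = Priority.NORMAL;
    private int quantity;

    /**
     * Sets the quantity for this line.
     *
     * @param q the quantity; must be between 1 and 1000 inclusive
     * @throws IllegalArgumentException if q is out of range
     */
    public void setQuantity(int q) {
        if (q < 1 || q > 1000)
            throw new IllegalArgumentException("quantity must be between 1 and 1000");
        quantity = q;
    }

    public void setPriority(Priority p) {
        priority = p;
    }
}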

> So far, this has little to do with versioning. You say
> that by managing the Java serialisation mechanism, you can
> cope with versioning. That's true. But there are many
> people who think that using POJOs (e.g. RMI) to
> communicate within a distributed system is somehow simpler
> than using XML web services.

I am one of those people. It is vital to understand why. It has to do with how many times the data changes form, and the amount of bandwidth and text-based manipulation the data goes through. Tools are evolving, but I still don't perceive a significant value for JVM-to-JVM communications. The RMI programming model, and more importantly the JERI stack provided in Jini 2.0, make a lot of things possible that JRMP really can't do easily, if at all.

> However, if we accept that
> the wire format has to be managed then that argument
> breaks down. It's never simple to manage the versioning of
> serialisation formats or any data format for that matter.
> It boils down to the question, what is the best language
> to define message formats that provide some degree of
> backward and forward compatibility, so that older senders
> can communicate with more recent receivers and vice versa.

I think you have to be ready to decide whether the design is so wrong that the new version will be difficult to manage. If the difficulty is high and the likelihood of breakage dominates the conversations, then it might be best to decide that there will be no compatibility. Mobile code makes this decision much easier, because you can send the client a smart proxy that provides the same interface contract but understands only the new data stream format.

When your data and your interface are the same, then you are stuck with versioning the data in a backward compatible way. This makes the problem much harder to solve!
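
A hedged sketch of that smart-proxy idea: the downloaded class implements the agreed interface, and it alone needs to understand the current wire framing (all of the names here, QuoteService and so on, are invented for illustration):

import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

// The interface contract the client programs against; it never changes
// just because the wire format does.
public interface QuoteService {
    double latestPrice(String symbol) throws RemoteException;
}

// The raw back-end endpoint; its framing can change between versions.
interface RawQuoteTransport extends Remote {
    byte[] fetch(byte[] request) throws RemoteException;
}

// Downloaded to the client along with the transport stub it wraps.
class QuoteServiceProxyV2 implements QuoteService, Serializable {
    private static final long serialVersionUID = 2L;

    private final RawQuoteTransport transport;

    QuoteServiceProxyV2(RawQuoteTransport transport) {
        this.transport = transport;
    }

    public double latestPrice(String symbol) throws RemoteException {
        // Only this downloaded class knows the version-2 framing; the
        // client just calls latestPrice() against the interface.
        byte[] reply = transport.fetch(encodeV2Request(symbol));
        return decodeV2Price(reply);
    }

    // Placeholder framing: a real proxy would encode/decode the actual
    // version-2 message layout here.
    private byte[] encodeV2Request(String symbol) {
        return symbol.getBytes();
    }

    private double decodeV2Price(byte[] reply) {
        return Double.parseDouble(new String(reply));
    }
}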

> My argument is that Java simply doesn't help in all of
> this and that all the Java/XML serialisation stuff like
> JAXB makes things much more complicated than dealing with
> XML messages directly. I have a suspicion that most people
> who come up with the simplicity argument, simply don't
> manage their serialisation format, because they build
> closely coupled systems where they have central control
> over versioning and deployment anyway.

And this is probably the predominant use of the RMI model and the JRMP protocol. There are times when it makes sense. There are few public interfaces that need continued maintenance. HTML/HTTP is the most predominant one, and thus the most often chosen transport. It is popular because of APIs such as java.net.URLConnection.

However, I strongly believe that this easy interface to HTTP is keeping people from using simpler solutions such as those provided by the RMI programming model. It is harder to get started with mobile code, but only because we don't have deployment automation tools sitting around that build jars, deploy them to web servers, and plug codebase URLs into servers as needed.

> Thankfully, you have already accepted that Java's default
> serialisation is unsuitable if you need to manage
> versioning in a distributed system without central
> control, so I don't have to come up with an example for
> that one ;-)

It is unsuitable if you think it should be used for long-term storage of data exchanged by disparately managed systems where change is common. It works fine for a local system's long-term storage, and for disparately managed systems that never change.

Put another way, if you use simple valued data items, serialization works fine.

If you have some strange belief that you can manage infinite change on a class over an extended time where the change rate is high, you'll be disappointed with the performance of your system with regard to versioning.

But, if you use mobile code, instead of just data streams, you'll find that you can make better choices on the evolution of your data, and probably minimize the number of versions you have to deal with being active simultaneously.

Mobile code and interface-based programming reduce your dependencies to just the operations that are needed, instead of a data format for local use plus code to use that data.

Kay Schluehr

Posts: 302
Nickname: schluehk
Registered: Jan, 2005

Re: Going all in... Posted: May 16, 2005 12:22 PM
> class OrderItemImpl implements OrderItem {
>
>    private String productCode;
>    private int quantity;
>
>    public void setProductCode(String c){
>       if (c == null)
>          throw new OrderException("product code must not be null");
>       if (c.length() > 20)
>          throw new OrderException("Invalid product code");
>
>       productCode = c;
>    }
>
>    public void setQuantity(int q){
>       if (q < 1 || q > 1000)
>          throw new OrderException("quantity must be between 1 and 1000");
>
>       quantity = q;
>    }
>
>    //... getters and the rest
> }
>
> If someone sees only the interface, he doesn't know that
> the product code must not be null, that its length must
> not be more than 20 and that the quantity must be a number
> between 1 and 1000. All this knowledge is hidden away in
> the implementation instead of being expressed in the
> interface where it really belongs. Of course it would be
> sensible to write it into the documentation as well, but
> that's not a good replacement for a formal definition of
> all the requirements of the service's contract, because
> documentation cannot be automatically enforced.

Except for almost trivial cases, it is not reasonable to publish contracts as interfaces. If your input depends on previous input or on various other parameters, or has to be parsed first, the only practical solution may be testing. And this is already the answer: write a test spec and test code and publish them together with the more abstract API description. The user of the API will be grateful, because he receives tons of examples of the most relevant dos and don'ts.

Kay
