I've been an active user of Jini and Java for the life of these platforms. The disruption that Web Services is causing in the software industry is amazing. Is this an investment for good, or a cost that we'll wish we'd skipped?
> But there is a more fundamental thought. Java, C# are not XML, these
> languages are object oriented and you will likely use internal object
> models in your implementation. You will have to map your XML to
> those object models.
> On the other side of the service, you will likely use a relational
> database. Guess what, Java, C# are not relational minded either. You
> will end up with a mapping between the Java code and the database too.
What is most disturbing about this issue, to me, is that the primary focus for web services is to use so many impedance-laden tools to solve a problem that is more easily solved without such tools. The problem is not an impedance problem. I think it's a wrong-tool problem.
Everyone seems dead set on investing in XML to create an impedance match at the network level. The problem is that it creates huge impedance mismatches with the programming languages, which are being patched from all directions.
Don't get me wrong, I am not against using XML where it makes sense. I'm just wondering if we are going after the right solution. For unrelated IT infrastructures, selecting a common transport that is easily mapped to standards makes sense. So all of the new "business" relationships that the internet is enabling can perhaps benefit from using XML document exchange. But is that the only solution for distributed communications? Or is XML just what is visible in the industry press, while people are doing lots of other things besides?
I'm using Java Serialization through Jini-based, RMI-style RPC. Of course there is no impedance mismatch for object-to-object calls. JERI lets you plug in an alternate transport, so technically you can talk from Jini in Java to some other language environment, including XML web services. I don't do this, but others do.
Java's dynamic code downloading and the associated late binding make it easy to create powerful RPC-based applications at whatever level of granularity you need. If you need to send large amounts of data, you can use a smart proxy to stream data under the covers and then hand it to the client with an object representation if needed. You can turn a stream into an SQL result set, etc. There's also the ability to join multiple services into a new, single API at the client: rather than changing the services, you can deploy a smart proxy as a new service that coalesces multiple services of differing granularities.
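As a rough sketch of the smart-proxy idea, here is plain Java using `java.lang.reflect.Proxy` to stand in for downloaded Jini proxy code; the `OrderService` interface and `BackEnd` class are invented for illustration, and a real Jini deployment would export the back end over JERI rather than call a local object:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

public class SmartProxyDemo {
    // Hypothetical service interface the client programs against.
    interface OrderService {
        List<String> openOrders(String customer);
    }

    // Stand-in for the remote back end; in Jini this call would travel
    // over a JERI transport, here it is a local object for illustration.
    static class BackEnd implements OrderService {
        public List<String> openOrders(String customer) {
            return List.of(customer + ":order-1", customer + ":order-2");
        }
    }

    // The "smart proxy": code that runs in the client VM and can adapt,
    // cache, stream, or coalesce calls before they reach the back end.
    static OrderService smartProxy(OrderService backEnd) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Client-side logic could decide here whether to stream a
            // large result, retry, or compute locally instead.
            return method.invoke(backEnd, args);
        };
        return (OrderService) Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class }, handler);
    }

    public static void main(String[] args) {
        OrderService svc = smartProxy(new BackEnd());
        System.out.println(svc.openOrders("acme"));
    }
}
```

The point is that the client only ever sees `OrderService`; the proxy's implementation details can change and redeploy without the client changing at all.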
What's everyone else doing to manage distributed communications infrastructure?
If you could have a new tool to solve a problem you're having with developing your distributed application what would it do?
infinite cost. the notion that XML is some kind of universal-better-than-anything data format is bred of either ignorance or stupidity. tagged text data files date from the 1960s. there's nothing new here; especially as it's being used.
one still needs file specific code to interpret the data file. at that point, it matters not whether the file has been processed by an XML parser, or simply read in as CSV. the latter is easier. add in the convolutions of XQuery/XPath to emulate RDBMS data, and we see a burgeoning clusterf*k.
In my humble opinion, the hidden cost is not the cost of switching to a new transport, be it XML/SOAP or something else.
To me, the big cost comes from the combination of the hype of web services (the kind you read in CIO magazines) and their implementations, at a stage where a given code base may not be ready for web services.
There is a huge difference between a working piece of code and the same code able to be part of a web service: able to stay up and running for a long period of time, able to accept concurrent requests while maintaining an acceptable quality of service. Managers and the like want to be able to say that they have an SOA architecture, but they do not realize that it is not just a matter of taking a bunch of code, right-clicking in some kind of M$ IDE, and deploying it on some kind of web server to have it running forever.
The hidden cost lies in the following questions:
- How do we achieve scalability? (SOAP requests are not the most efficient invention, especially when the services are fine-grained.)
- How do we achieve failover when Tomcat or IIS crashes?
- How do we make sure that a buggy request does not screw up the next request?
- What do we do when network connectivity is lost?
- How do we preserve the same security levels (how do we prevent DoS attacks, for example)?
All of these issues do not arise with standalone applications to the same extent. It is all a matter of knowing what web services give you. If it is just to avoid having to include a jar file or link a library, then it is definitely not worth it. If it is to provide a 24x7 service to outside customers, then the cost is higher, but the gain is there. The problem is always the same: there is no silver bullet in software development, but people (developers and managers) have the same tendency, over and over, to overuse the buzzword. I have seen millions of USD spent on various switches, from mainframe to client/server using proprietary protocols, then from proprietary protocols to some kind of RPC, then CORBA, and now web services, for more or less the same level of functionality. :-).
SOAP and Web Services are so full of hot air. There is no need for so much cumbersome syntax and packaging. We implemented a Web Service layer in our home-built Servlet App Server in a few days. Distilling the minimum necessary for a Web Service server from the mountains of fluff was a trial, but worth it in the end.
"Simple Object Access Protocol" - yeah, right. There's nothing simple about it.
My vote is, in most cases, for infinite cost.

There is already a standard for exchanging relational data. People have been working on it for decades, and it's easy to learn yet very powerful. It's called SQL. It will let you basically get any answer you could possibly get from the available data.
When a "web service" is put on top of the database, all that really happens is that the interface is severely limited (I won't even start whining about XML data bloat and horrible performance). With an SQL query, you can ask the database for a list of customers who have not been paying their bills on time lately. With a web service, you can only do that if that particular question was in the requirement list of the web service (Department.GetEvilCustomers() or something to that effect). If not, the best you can do is probably retrieve huge lists of customers and payments and try to draw conclusions from that.
Of course there are sensible reasons to place a shield in front of the database. For one, you may not want your IBM customers to be able to view Intel's orders. In all database systems I've ever worked with, you cannot place access restrictions on rows, so you will have to use some intermediate code for that. Another reason is that you don't want to put a big load on the machine by having it run 50 sequential table scans.
Most projects, however, deal only with a few Windows clients on the intranet and a bunch of web servers spewing out reports on demand. The amount of useful code that goes into the "business layer" (like, for example, preventing customers from seeing other customers' orders) is laughably little compared to the tens of thousands of lines that go into the user interfaces. Usually, the code needed to get to the SOAP/XML/RPC implementation is much more than the functionality it implements. As for code reuse, we have been using libraries for that for ages.
> My vote is, in most cases, for infinite cost.
>
> There is already a standard for exchanging relational
> data. People have been working on it for decades, and it's
> easy to learn yet very powerful. It's called SQL. It will
> let you basically get any answer you could possibly get
> from the available data.
And for data which isn't relational or questions for which sensible SQL queries don't exist? There is no sensible SQL query that will return the shortest/cheapest path between A and B even if you have the road network in a relational database.
> > My vote is, in most cases, for infinite cost.
> >
> > There is already a standard for exchanging relational
> > data. People have been working on it for decades, and it's
> > easy to learn yet very powerful. It's called SQL. It will
> > let you basically get any answer you could possibly get
> > from the available data.
>
> And for data which isn't relational or questions for which
> sensible SQL queries don't exist? There is no sensible SQL
> query that will return the shortest/cheapest path between
> A and B even if you have the road network in a relational
> database.
This is another good question to ask. What types of remote services are people developing that are interfaced with XML transports? I'd guess that it's the whole gamut. There seem to be a lot of people talking about intra-business data exchange. There are also process-oriented services, those that enable business processes to actually work. Those are more internal things rather than "web-like" services.
If the shortest path between A and B is calculated by Google, you get back some HTML describing the outcome, including graphics and driving instructions. But is that the only use of that service? Is the service just used by people, or should the results come out in a different form, with a "connector" doing the transformation to a human-consumable form? Are people defining services with these layers to help manage consumption by unanticipated users?
In the Jini world, we can use JERI Exporters to export the service with a multitude of transports and constraints (security or otherwise). JMX provides the connector notion so that a JMX client can be a program, or a person.
What are people doing to provide this flexibility with services that have web services connectivity? Is that the only connectivity available?
Even for cases where plain SQL queries do not provide a solution, the SQL world provides one: the stored procedure.
If you insist that the shortest-path calculation run in the data center and not on the client, write the algorithm as a stored procedure (many DBMSs will let you write stored procedures in a variety of languages, usually including C).
The main point of my argument is that in many projects, we end up with interfaces like (pseudo-code):
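The pseudo-code referred to here did not survive in this copy of the thread. Based on the surrounding discussion, the shape was presumably something like the following, where every name (`OrderFacade`, `getOrders`, the query text) is invented for illustration:

```java
import java.util.List;

public class OrderFacadeSketch {
    // Hypothetical web-service facade of the kind the post describes:
    // every new question becomes yet another method or parameter set.
    interface OrderFacade {
        List<String> getOrders(String customerId);
        List<String> getOrdersInState(String customerId, String state);
        List<String> getOrdersInStateOlderThan(String customerId, String state, int maxDays);
        // ...and so on, one signature per anticipated question.
    }

    // The ad-hoc SQL equivalent stays one statement, whatever the question.
    static String unpaidQuery(int maxDays) {
        return "SELECT * FROM orders WHERE customer_id = ? "
             + "AND state = 'BILLED_UNPAID' "
             + "AND billed_date < NOW() - INTERVAL '" + maxDays + "' DAY";
    }

    public static void main(String[] args) {
        System.out.println(unpaidQuery(14));
    }
}
```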
If you just look at the sheer amount of code required to get the list of open orders using this via SOAP, and compare it to the SQL code, you'll find that the latter will be much, much smaller and easier to understand.
Then somebody asks "I just want the orders that were in the BILLED_UNPAID state for more than two weeks," and we get another getOrders method that takes a "maxInterval" parameter. And then somebody wants this for all customers, not just for one. And so on. In the end, the interface has about 50 different getOrders calls, 30 of which are actually never used because they are obsolete.
Another argument for web interfaces is that you could modify the data model without affecting the interface. Well, dream on. That may be true when you change the name of a column or (de)normalize some data, but in plain SQL, a view, or a stored procedure if things get complicated, will solve that problem too. Imagine what happens if we realize that orders are being delivered in multiple phases, so we need to move the delivery data from the order table into another entity. No way is any client ever going to cope with that complexity without changing the interface - how would it ever be able to handle an order that has been delivered on two different dates at different addresses?
Using the stored procedure escape route can also be an impedance mismatch. Running shortest path calculations on a server seems a perfectly good idea to me --- the road map information may run to hundreds of megabytes which is tedious to distribute to clients, yet the query and result data are both very small (a few hundred bytes, growing to a few thousand with more detail in the returned path).
Suggesting that everything go via SQL is no more sensible than requiring XML everywhere.
> No way is any client ever going to cope with that complexity
> without changing the interface - how would it ever be able
> to handle an order that has been delivered on two
> different dates on different addresses?
One of the reasons I use Jini and Java based solutions is that I can deal with this issue using a smart proxy with additional interfaces on it. That downloaded code can coerce the varied implementation details into an appropriate API for the client. It can ask the service how big the data is, and then ask the service to do the work on the server side if the client has bandwidth or data storage issues. If it's a small problem, the client can do it locally.
You can also send some of the data with the smart proxy to reduce server and network latency involved in the second connection back to the server after the smart proxy starts doing work for the client.
This makes bandwidth and interface issue resolution a standard part of the support cycle of the software. It doesn't have to be a version upgrade that the client needs to deal with! Instead, I can change the Jini service's proxy implementation details, redeploy it, and the client will see the update.
With typically deployed Web Services, you have to be able to predict all of this up front. You have to anticipate all the uses and users of your code and accommodate their usage patterns. Now, it's also possible to use Java and Jini based solutions that talk via web services protocols to other entities in your network. Then you can still take advantage of the dynamic, late binding of downloaded code, while having access to multiple transport protocols via the JERI stack.
There are some interesting mixes of all this technology.
> Using the stored procedure escape route can also be an
> impedance mismatch.
this old chestnut. in real processors, there is one copy of the method text (text segment in *nix); even if it's really a function under the covers. (J)VMs behave the same way. put simply, what makes an instance an instance is the instance data. this is perfectly stored in a SQL database. the method text can be stored anywhere. one can duplicate this to each instance, but that would be foolish. each instance is populated from the DB. local variables get built out from the method text.
used properly, a SQL[relational] database provides way cool more data protection than any OO "model" yet proposed.
> > Using the stored procedure escape route can also be an
> > impedance mismatch.
>
> this old chestnut. in real processors, there is one copy
> of the method text (text segment in *nix); even if it's
I wasn't thinking of the performance issues, which as you say may be negligible, but of the human comprehension of what is being done. After all, if you end up passing some of the arguments as strings full of complex XML because a mapping to standard SQL types is awkward or contrived, haven't we ended up where this discussion started?
How about merely passing coordinates --- well one ought to also pass a description of what coordinates. That ends up something like this:
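The example that originally followed here did not survive. A plausible reconstruction of the shape it describes, with the `shortest_path` procedure name and the XML vocabulary invented purely for illustration:

```java
public class CoordinateArg {
    // Build one coordinate argument: the numbers alone are not enough,
    // so the reference system and units ride along as XML.
    static String point(double lat, double lon) {
        return "<point srs=\"WGS84\">"
             + "<lat unit=\"deg\">" + lat + "</lat>"
             + "<lon unit=\"deg\">" + lon + "</lon>"
             + "</point>";
    }

    // The stored-procedure call ends up taking XML strings as arguments.
    static String shortestPathCall(String from, String to) {
        return "CALL shortest_path('" + from + "', '" + to + "')";
    }

    public static void main(String[] args) {
        System.out.println(shortestPathCall(point(52.3731, 4.8922),
                                            point(48.8566, 2.3522)));
    }
}
```

At which point the stored-procedure arguments are themselves complex XML strings, which is exactly the situation the discussion started from.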
the implication: hierarchical data structures (starting with IMS, really) were never as cool as some thought, and are being replaced. the article doesn't expressly endorse the RM, sadly, but a cogent argument contra-XML is always heartwarming.
I don't dispute the power of the relational model and searching over hierarchical organisation, but there are always likely to be exceptions. Some, like the example in my last post, are preordained de facto standards --- one has to live with them.