The Artima Developer Community
Java Community News
Gavin King: In Defence of the RDBMS

195 replies on 14 pages. Most recent reply: Jun 20, 2007 6:11 AM by James Watson

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 6:05 AM
> So, the only feasible way to get more
> throughput is with inherently parallel code. That's the
> database. The PC becomes just an Xterm, or 3270.

This is a non-sequitur. Grid computing is an obvious contradiction to this assertion.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 6:11 AM
> > > > > Some (OK, a lot) of OO programmers dislike my
> > > > > approach because it is different from what they
> > > > > have been taught. One table per class? No real OO
> > > > > programmer would do that! It is not "pure" enough!
> > > > > That's all BS as far as I am concerned. My approach
> > > > > works, and works well, and knocks the spots off
> > > > > many other methods I have seen.
> > > >
> > > > It seems to me that this only works if the
> > > > programmer has control of the database schema.
> > >
> > > Correct. I have given up working with other people's
> > > second rate application designs with 3rd rate database
> > > designs, so nowadays I only work on applications where
> > > I have total control over everything. That means new
> > > applications (which I design), new databases (which I
> > > design) using a RAD framework (which I designed and
> > > built).
> >
> > You must be every employer's dream. Do you bring this
> > up in interviews?
> >
>
> It's not a big deal to normalize a schema: all that is
> needed is an application which takes the data from one
> database, which are unnormalized, and puts it in a new
> database, which is normalized. And it can happen while a
> company does not work, i.e. at night, at weekends, at
> public holidays.
>
> Personally, I prefer to migrate the data to a better
> schema rather than continue work with a bad schema. I
> think that's beneficial, in the long run, for an
> enterprise, to get rid of the old things that hinder its
> development (things here refer to applications/data).

I never suggested this is the hard part. That's the easiest part of all. It boggles my mind that someone would believe this was my concern.

The hard part is modifying everything that depends on the schema. In the case of what I was responding to, the poster stated that code would be directly generated from the schema. Normalizing the schema would then change all these classes in non-trivial ways. I'm waiting for someone to explain to me how all the dependencies on those classes will be rectified.

Kay Schluehr

Posts: 302
Nickname: schluehk
Registered: Jan, 2005

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 8:14 AM
> That's because it's almost apples and oranges: OO is about
> organizing code, DBs are about organizing data. But
> objects will contain data at some point, so these objects
> have to be organized like databases.

I do think you missed the point a bit here. An object unifies code and data. Data is treated as the internal state of the object, as an object attribute / property, and shall be accessed by the object only. Relational programming intentionally separates both and provides a framework/infrastructure for dealing with pure data, something OO lacks by principle [1]. In your example you have shown lots of pure data classes. These are tables in OO clothes. Of course languages like Ruby, Python, Lisp etc. don't care much about the boundaries of paradigms. They even look at themselves as data. Organizational principles are adapted to the particular task.

[1] There is no doubt that paradigms are mixed in all general purpose languages. Haskell promotes imperative style programming by means of monads, Java is a reflective programming language and supports FP concepts using e.g. the strategy design pattern, etc.

Frank Wilhoit

Posts: 21
Nickname: wilhoit
Registered: Oct, 2003

Color me shocked Posted: May 30, 2007 8:28 AM
Gavin King is a very bright guy; yet he writes this:

"At core, the reason we need [object-relational] mapping technology is that data and data models last longer than applications, longer even than programming languages."

This is wrong: utterly, monstrously, discreditingly wrong.

The reason we need ORM is because data is hierarchal and the optimal physical representations of data hierarchy 1) in persistent store, 2) in memory, and 3) in flight are completely different.

But each one of these three kinds of representations is at least fairly well understood: the relational model is optimal for persistence, the object model is optimal for in-memory, and we^H^Hsome of us are beginning to figure out how to stream the object model onto a wire in the form of self-sufficient, self-descriptive messages.

All of these transformations are in principle deterministic although they can be arbitrarily complex. There should be no issue.

I think one reason why there seems to be an issue is premature optimization. Correctness, generality and efficiency form one of those triangles with a forbidden center. You have to pick two out of three. To me it seems obvious that you must start with correctness--else you have nothing and will never have anything--and then decide which one of generality or efficiency you need to face towards. But the creators of many ORM frameworks seem to quickly lose themselves grabbing for both generality and efficiency, apparently thinking that correctness will take care of itself. The mere fact that there is any debate about the basic principles shows that correctness will NOT take care of itself.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 8:34 AM
> The hard part is modifying everything that depends on the
> schema. In the case of what I was responding to, the
> poster stated that code would be directly generated from
> the schema. Normalizing the schema would then change all
> these classes in non-trivial ways. I'm waiting for
> someone to explain to me how all the dependencies on those
> classes will be rectified.

Go to the Andromeda site. He explains it.

http://www.andromeda-project.org/

As I said, one of many database driven frameworks. It's no different from MDA or Executable UML or BPEL or ... All derive from the same principle: declarative specification drives code generation.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Color me shocked Posted: May 30, 2007 8:54 AM
> All of these transformations are in principle
> deterministic although they can be arbitrarily complex.
> There should be no issue.

I'm not sure I buy that. It seems to me the issue with O-to-R and O-to-XML (or other format) is that both the database and XML have much richer descriptions of static structures than most (all?) OO languages allow. The only language (OO or procedural) that I've worked with that has a similarly complex static structure syntax is COBOL.

It's not that you cannot represent these structures in these languages, it just requires executable logic to do so. For example, in Java, I can't declare a variable called parent in such a way that the compiler will ensure that it will be set to a non-null value. I must write logic (i.e. code) to ensure that. On the other hand, there are some constraints that I can create in a programming language like Java that the database cannot ensure (at least not in a simple way) with its static structures.
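
To make the contrast concrete, here is a minimal sketch in Python with SQLite rather than Java, so treat it as an illustration of the idea and not of any particular ORM. The table names (`parent_node`, `child_node`) and the `Child` class are hypothetical. The database rejects a missing or dangling parent purely from the declared schema, while the in-memory class only enforces the rule because someone wrote the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent_node (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE child_node ("
    " id INTEGER PRIMARY KEY,"
    " parent_id INTEGER NOT NULL REFERENCES parent_node(id))"
)
conn.execute("INSERT INTO parent_node (id, name) VALUES (1, 'root')")


def try_insert(parent_id):
    """Return True if the schema accepts the row, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO child_node (parent_id) VALUES (?)", (parent_id,))
        return True
    except sqlite3.IntegrityError:
        return False


class Child:
    """In-memory counterpart: the rule exists only because this code checks it."""

    def __init__(self, parent):
        if parent is None:
            raise ValueError("parent must not be None")
        self.parent = parent


# Valid parent, missing parent (NOT NULL), unknown parent (foreign key).
db_results = [try_insert(1), try_insert(None), try_insert(99)]
```

Under these assumptions the first insert is accepted and the other two are refused by the declared constraints alone, with no application logic involved.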

IMO, the problem with ORM is the same as with every tool that tries to oversimplify complex problems. It starts out as being inadequate and then bloats rapidly as it tries to address those inadequacies. By the time it's covered all the bases, it's become just as complex as the solution it was supposed to simplify.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Color me shocked Posted: May 30, 2007 9:02 AM
> Gavin King is a very bright guy; yet he writes this:
>
> "At core, the reason we need [object-relational] mapping
> technology is that data and data models last longer than
> applications, longer even than programming languages."
>
> This is wrong: utterly, monstrously, discreditingly
> wrong.
>
> The reason we need ORM is because data is hierarchal and
> the optimal physical representations of data hierarchy 1)
> in persistent store, 2) in memory, and 3) in flight are
> completely different.

Data does persist beyond code; although COBOL refuses to die. But languages do come and go. The data they support remains, although the paradigm does tend to be extended bifurcation: old system/old language/old data and the new system/current language/current data.

And the notion that data *is* hierarchical is completely misinformed. It is not. period. full stop. Some folks insist on forcing all data into hierarchical data stores. Dr. Codd proved the foolishness of that. The XML crowd have re-invented IMS, which Dr. Codd demolished. Most of them likely don't even know they're reliving 1964.

Oracle was the first commercial SQL database, and not from IBM. Dr. Codd took some heat from IBM PHBs for demonstrating the failure of IMS. Only later did IBM release DB2; and then only as a thin veil over VSAM.

"Those that ignore history are condemned to repeat it". Or similar.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 9:12 AM
> > The hard part is modifying everything that depends on
> > the schema. In the case of what I was responding to, the
> > poster stated that code would be directly generated
> > from the schema. Normalizing the schema would then change
> > all these classes in non-trivial ways. I'm waiting for
> > someone to explain to me how all the dependencies on
> > those classes will be rectified.
>
> Go to the Andromeda site. He explains it.
>
> http://www.andromeda-project.org/

Where specifically?

> As I said, one of many database driven frameworks. It's
> no different from MDA or Executable UML or BPEL or ...
> All derive from the same principle: declarative
> specification drives code generation.

It's not the same at all. By this standard, Java is also a declarative specification that drives code generation as is any other compiled language.

Andromeda seems fine for a fairly trivial application that is one-to-one with the database. Personally, that seems like the perfect use case for an OODBMS. If I develop my app with Andromeda and all the PhDs next door create a bunch of reports and statistical models around the database, will Andromeda fix that for me? Creating a web page is a pretty simple thing to do. Can Andromeda do anything else?

Ravi Venkataraman

Posts: 80
Nickname: raviv
Registered: Sep, 2004

Re: Color me shocked Posted: May 30, 2007 9:16 AM
> Gavin King is a very bright guy; yet he writes this:
>
> "At core, the reason we need [object-relational] mapping
> technology is that data and data models last longer than
> applications, longer even than programming languages."
>
> This is wrong: utterly, monstrously, discreditingly
> wrong.
>

Frank, here's what I think Gavin King was trying to say: We know that data stays for a very long time. For example, if I have a banking account with a bank for 30 years, then the data must stay for quite a significant part of that period, if not all. During this time, people may have accessed it using COBOL, C++, Java, dotNet, or whatever. In the future this data may be accessed by other types of applications. During each stage, different paradigms of data-handling are involved at the application level. Hence the problem of mapping data between application and database will always be there. Currently, since OO technologies are in vogue, we see OR mapping being discussed. But the fundamental fact is that there has to be a mapping layer between the application and the database when these two layers have different paradigms.

In that sense, I do not think that Gavin's statement is wrong in any way at all.

PS: I'm not a fan of Gavin King or ORM tools. In fact, a few years ago, when I first read some of Gavin's comments about relational databases, I realized that he knew very little about them, at that time. Since then he seems to have educated himself quite well indeed.

Frank Wilhoit

Posts: 21
Nickname: wilhoit
Registered: Oct, 2003

Terminology? Posted: May 30, 2007 11:45 AM
When I say that data is hierarchal, I am referring to things like the ubiquitous header/detail pattern, e.g. an invoice with line items. If that is not hierarchy, what would be a better word for it?

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Terminology? Posted: May 30, 2007 12:27 PM
> When I say that data is hierarchal, I am referring to
> things like the ubiquitous header/detail pattern, e.g. an
> invoice with line items. If that is not hierarchy, what
> would be a better word for it?

I'm not sure who misunderstood you or what disconnect there was, but in the context of this discussion I think people might mean how the data is structured. In XML, it's stored hierarchically, while in an RDBMS the hierarchy is not (necessarily) inherent in the structure of the data but is a relationship between records.
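
The distinction between nesting and a relationship between records can be sketched with the invoice example from the quoted post. This is only an illustration: the row layouts, element names, and the helper functions `rows_to_xml` and `xml_to_rows` are all made up for the purpose. The same data is held once as flat rows linked by an invoice id and once as nested XML, and each form can be rebuilt from the other.

```python
import xml.etree.ElementTree as ET

# Relational shape: two flat sets of rows, linked only by the invoice id.
invoice_rows = [(1, "acme")]                        # (invoice_id, customer)
line_item_rows = [(1, "bolt", 5), (1, "nut", 3)]    # (invoice_id, part, qty)


def rows_to_xml(invoices, line_items):
    """Build the hierarchical shape: line items physically nested in their invoice."""
    root = ET.Element("invoices")
    for invoice_id, customer in invoices:
        inv = ET.SubElement(root, "invoice", id=str(invoice_id), customer=customer)
        for item_invoice_id, part, qty in line_items:
            if item_invoice_id == invoice_id:
                ET.SubElement(inv, "item", part=part, qty=str(qty))
    return root


def xml_to_rows(root):
    """Flatten the nesting back into rows; the parent link becomes a key column."""
    invoices, line_items = [], []
    for inv in root.findall("invoice"):
        invoice_id = int(inv.get("id"))
        invoices.append((invoice_id, inv.get("customer")))
        for item in inv.findall("item"):
            line_items.append((invoice_id, item.get("part"), int(item.get("qty"))))
    return invoices, line_items


xml_text = ET.tostring(rows_to_xml(invoice_rows, line_item_rows), encoding="unicode")
round_trip = xml_to_rows(ET.fromstring(xml_text))
```

In the XML form the parent is implied by position; in the row form it is just another value, which is why other relationships can be added without restructuring anything.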

Ravi Venkataraman

Posts: 80
Nickname: raviv
Registered: Sep, 2004

Re: Terminology? Posted: May 30, 2007 12:40 PM
> When I say that data is hierarchal, I am referring to
> things like the ubiquitous header/detail pattern, e.g. an
> invoice with line items. If that is not hierarchy, what
> would be a better word for it?

Could you please find the hierarchy in this common situation?

I have 2 bank accounts: one that I share with my wife as joint account holders, and the other my very own savings account. Please give me a simple hierarchical model using Customer and Account classes.

Not all data is hierarchical. A lot of information can be modelled as a hierarchy, but that doesn't make everything hierarchical.

Remember that a model is not necessarily the same as the thing itself. An OO model, generally, contains hierarchical types. It does not mean that the underlying objects are inherently hierarchical. Hierarchical representation is just one of several possible ways to model reality.
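
For what it's worth, the joint-account case above can be sketched relationally in a few lines. This is a hypothetical schema, not anything proposed in the thread: the table names (`customer`, `account`, `account_holder`) and the sample names are assumptions. A junction table records who holds what, and neither Customer nor Account has to be the parent of the other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE account  (id INTEGER PRIMARY KEY, kind TEXT NOT NULL);
-- Junction table: a many-to-many link, so neither side owns the other.
CREATE TABLE account_holder (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    account_id  INTEGER NOT NULL REFERENCES account(id),
    PRIMARY KEY (customer_id, account_id)
);
INSERT INTO customer VALUES (1, 'husband'), (2, 'wife');
INSERT INTO account  VALUES (10, 'joint'), (20, 'savings');
INSERT INTO account_holder VALUES (1, 10), (2, 10), (1, 20);
""")


def accounts_of(name):
    """Navigate from a customer to the accounts they hold."""
    rows = conn.execute(
        "SELECT a.kind FROM account a"
        " JOIN account_holder h ON h.account_id = a.id"
        " JOIN customer c ON c.id = h.customer_id"
        " WHERE c.name = ? ORDER BY a.kind",
        (name,),
    )
    return [kind for (kind,) in rows]


def holders_of(kind):
    """Navigate the other way, from an account to its holders."""
    rows = conn.execute(
        "SELECT c.name FROM customer c"
        " JOIN account_holder h ON h.customer_id = c.id"
        " JOIN account a ON a.id = h.account_id"
        " WHERE a.kind = ? ORDER BY c.name",
        (kind,),
    )
    return [name for (name,) in rows]
```

Both directions are the same kind of query, which is the point: a tree would have to pick one of them as the "real" structure.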

Mark McConkey

Posts: 2
Nickname: markmcc
Registered: May, 2007

Re: Gavin King: In Defence of the RDBMS Posted: May 30, 2007 1:11 PM
I have a question for the OO folks. The need to denormalize tables in order to speed up queries can and does occur when using an RDBMS. You may have many terabytes of data and some very complex search requirements across many tables, and if your queries are still slow, you have the option of denormalizing your tables. How would you accomplish this using an OO database/language?

I also have a question for the RDBMS folks. How quickly could you produce a small application, install it, etc. with no DBA involved?

Please don't answer, my point is as follows:

I think one of the key underlying problems throughout the discussions here is that each author is concerned about their own context, which has a high likelihood of differing greatly from that of the next author. Does the application need to scale? Is it a fairly simple application? Will complex and currently unknown searches against the data need to be performed in the future? Will the application need to change drastically and quickly due to the business you are supporting? The tools we choose should be based upon the context of the given problem.

"there is no silver bullet" - Fred Brooks

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Terminology? Posted: May 30, 2007 2:27 PM
> When I say that data is hierarchal, I am referring to
> things like the ubiquitous header/detail pattern, e.g. an
> invoice with line items. If that is not hierarchy, what
> would be a better word for it?

Not to say I Told You So, but...

The BOM processing problem as implemented in IMS was why Dr. Codd rethought data, and devised the relational model. This is not to assert that past or current SQL databases are fully RM. Most are pretty close. MySQL being a notable exception.

Now, as to header/detail line items. They are not hierarchically related, in that the line item is related to: the order, the invoice, the part, the customer, the supplier, etc. This was what Dr. Codd figured out. Using the IMS query language, being able to deal with *each* of the relationships was a mess. If one defined any one as *the* relation, the rest became a rat's nest. At the time, IBM had devised this hierarchical database just because it didn't want to be under the thumb of standards, CODASYL, which had the network model database. So, they took a step backwards, even then.

There is a reason the relational database was devised. It wasn't just to be new and spiffy. It was a mathematical answer to the question of how to relate data.
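
A rough sketch of the line-item point, under made-up assumptions: the tables (`invoice`, `part`, `line_item`), their columns, and the sample data are hypothetical, and SQLite stands in for any SQL database. The same line-item rows are summarized by customer, by part, and by supplier, and no one of those relationships has to be declared the primary one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
CREATE TABLE part (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, supplier TEXT NOT NULL
);
-- A line item points at an invoice AND a part; neither link is privileged.
CREATE TABLE line_item (
    invoice_id INTEGER NOT NULL REFERENCES invoice(id),
    part_id    INTEGER NOT NULL REFERENCES part(id),
    qty        INTEGER NOT NULL
);
INSERT INTO invoice VALUES (1, 'acme'), (2, 'globex');
INSERT INTO part VALUES (100, 'bolt', 'supplier_a'), (200, 'nut', 'supplier_b');
INSERT INTO line_item VALUES (1, 100, 5), (1, 200, 3), (2, 100, 7);
""")

# Quantity ordered, seen through the invoice relationship.
qty_by_customer = conn.execute(
    "SELECT i.customer, SUM(l.qty) FROM line_item l"
    " JOIN invoice i ON i.id = l.invoice_id"
    " GROUP BY i.customer ORDER BY i.customer"
).fetchall()

# The same rows, seen through the part relationship.
qty_by_part = conn.execute(
    "SELECT p.name, SUM(l.qty) FROM line_item l"
    " JOIN part p ON p.id = l.part_id"
    " GROUP BY p.name ORDER BY p.name"
).fetchall()

# And once more, grouped by the part's supplier.
qty_by_supplier = conn.execute(
    "SELECT p.supplier, SUM(l.qty) FROM line_item l"
    " JOIN part p ON p.id = l.part_id"
    " GROUP BY p.supplier ORDER BY p.supplier"
).fetchall()
```

Each query is the same shape of one-liner; in a strictly hierarchical store, only the path chosen as the parent link would be this direct.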

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Terminology? Posted: May 30, 2007 2:59 PM
> When I say that data is hierarchal, I am referring to
> things like the ubiquitous header/detail pattern, e.g. an
> invoice with line items. If that is not hierarchy, what
> would be a better word for it?

Oh, and I collect quotes for my email signature here in CubeLand (mostly javBOL coders, so it's loads of fun).

Here's one that's apropos:

Codd had a bunch of ...fairly complicated queries, and since I'd been studying CODASYL (the language used to query navigational databases), I could imagine how those queries would have been represented in CODASYL by programs that were five pages long that would navigate through this labyrinth of pointers and stuff [presaging XML datastores]. Codd would sort of write them down as one-liners. ...(T)hey weren't complicated at all. I said, 'Wow.' This was kind of a conversion experience for me. I understood what the relational thing was about after that.
-- Don Chamberlin/1995

What's depressing about this quote is that Chamberlin defined (mostly he, anyway) SQL, which paid lip service to the Relational Model at the time he did it. He's also responsible (mostly he, anyway) for XQuery. For those who don't get it, XQuery is a re-creation of those five page navigational programs. And the network model (CODASYL) was better adapted to dealing with multiple relationships than the IMS/hierarchical method. I guess his conversion was faux.

The time he's talking about is not 1995, but about 1975, likely earlier.


Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use