The Artima Developer Community
Sponsored Link

Weblogs Forum
Database Constraints: Use Them or Lose Them?

56 replies on 4 pages. Most recent reply: Oct 8, 2009 9:07 AM by John Zabroski

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 56 replies on 4 pages [ « | 1 2 3 4 | » ]
John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 1:26 PM
Reply to this message Reply
Advertisement
Bill,

You're forgetting to mention the Artima forums basically have never received a significant upgrade. You hardly qualify as a case study for enterprise development, even if the slogan is "best practices in enterprise software development".

Where I work, we have a name for this paradox/joke: "The all-in-one solution is itself not an all-in-one solution." Sure, I'm an MDA weenie, but I spend half a year just knitting.

Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 2:10 PM
Reply to this message Reply
Hi John,
>
> You're forgetting to mention the Artima forums basically
> have never received a significant upgrade. You hardly
> qualify as a case study for enterprise development, even
> if the slogan is "best practices in enterprise software
> development".
>
> Where I work, we have a name for this paradox/joke: "The
> all-in-one solution is itself not an all-in-one solution."
> Sure, I'm an MDA weenie, but I spend half a year just
> t knitting
.
>
I'm not sure I get your point. The point I was trying to make is simply that if you're feeling like you're being forced to duplicate code, a little metaprogramming may be able to eliminate it. It isn't just database layers, but anything. However in this discussion some folks sounded like they felt they had to choose between dealing with integrity in the database, or in the OO language, or if in both places, be forced to duplicate code. I've felt that pain multiple times in my programming career and have created a code generators to address it. I can think of three generated database layers that I've done, one in C, another in C++, and the Artima one in Java.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 2:37 PM
Reply to this message Reply
Bill,

Understood, but most metaprogramming doesn't scale well. You need to understand how to write compilers, and know how to tune the platform you are targeting, to really eliminate the duplicate code.

In particular, if you're building a software factory or software product line (like we do), then you generally want something fancier than just a database layer code generator... or fancier than ActiveRecord.

Otherwise you end up repeating the sins Frank's original article discusses.

Simple as that, really. If all this talk of factories and lines is intimidating, well, it can be dumbed down trivially. It an acid test: are you managing labor, or are you eliminating it?

Bill Venners

Posts: 2284
Nickname: bv
Registered: Jan, 2002

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 3:57 PM
Reply to this message Reply
> Understood, but most metaprogramming doesn't scale well.
> You need to understand how to write compilers, and know
> w how to tune the platform you are targeting, to really
> eliminate the duplicate code.
>
> In particular, if you're building a software factory or
> software product line (like we do), then you generally
> want something fancier than just a database layer code
> generator... or fancier than ActiveRecord.
>
> Otherwise you end up repeating the sins Frank's original
> article discusses.
>
> Simple as that, really. If all this talk of factories and
> lines is intimidating, well, it can be dumbed down
> trivially. It an acid test: are you managing labor, or
> are you eliminating it?
>
I hear a lot of people complain about code generation. I have had problems with the kind where they generate code for you, leaving little marked up sections that you are supposed to fill in, then later they try and avoid overwriting those customized sections when regenerating. That I also find problematic. But a simple grammer parsed with a parser generated by a code generator for parsers, like ANTLR, seems pretty easy to understand and maintain. I.e., one that there's just one source in the external DSL you've defined, and you don't touch the generated code. Plus that generated code would be source you'd otherwise write by hand, so it should be readable and understandable I would think. My experience with these things has always been small teams, but I'd think the main factor in whether such an approach would be more trouble than its worth is how well the code generator itself is designed and documented.

Anyway, I think we're straying a bit off topic. I myself like having validation in both the app and the database. I like it in the database because I sometimes go in there with SQL. I.e., there isn't just one monolithic app that will be accessing that data, so I'm more comfortable if the database is helping keep the data clean. But I don't like the duplication of effort, and and in the past have dealt with that with static metaprogramming that generates the duplicate code artifacts.

Knut Mork

Posts: 1
Nickname: knutm
Registered: Sep, 2009

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 4:15 PM
Reply to this message Reply
Quite a few DBA's I know of would agree that the db constraints should be used more often, because for them the database is the center of the universe. And if so, then that does make perfectly sense. However, that's of course not always true. I would say it depends on the architecture.

Historically relational databases have been a shared resource between multiple applications or other type of clients. It's also been used as an integration medium between multiple systems. When several clients may insert or update data directly to the same database, and you have decided that data integrity constraints is required, then db constraints is a valid technique. This will adhere to the DRY principle, avoiding duplication of integrity logic. And if you don't control the clients, then you really don't have any other options.

In other scenarios were the database is wrapped within an API of some sort (application layer etc), such that the only access to the data is by using the APIs, then I do not see the benefits of having db constraints. The following arguments come to mind:
1. The type of possible constraints are limited. For example how to express that "if field A is null field B cannot be null and vice versa" ?
2. General programming languages are more expressive. It can be more elegant, more precise, less duplication, richer, etc.
3. Easier to test and change. The database is used for persisting data, not for fragmented domain logic. Possible to test logic without database.
4. Handling violations to the data constraints must be done in the application logic (if you want more than a general cryptic db-specific exception in return). Keeping the constraint checks and the action upon the result from those checks very separate from each other is not ideal.
5. In OO systems, integrity is part of the objects responsibilities, and that logic should thus be contained within the objects themselves.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 24, 2009 4:47 PM
Reply to this message Reply
> Anyway, I think we're straying a bit off topic.

No, no.

It is the heart of the matter.

Topic: "Database Constraints: Use Them or Lose Them?"

Constraints are tests. So the basic question is, how can we make sure our tests always run?

Bruno Vernay

Posts: 4
Nickname: brunov
Registered: Sep, 2009

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 4:44 AM
Reply to this message Reply
> Understood, but most metaprogramming doesn't scale well.
> You need to understand how to write compilers, and know
> w how to tune the platform you are targeting, to really
> eliminate the duplicate code.
>
> In particular, if you're building a software factory or
> software product line (like we do), then you generally
> want something fancier than just a database layer code
> generator... or fancier than ActiveRecord.
>
> Otherwise you end up repeating the sins Frank's original
> article discusses.
>
> Simple as that, really. If all this talk of factories and
> lines is intimidating, well, it can be dumbed down
> trivially. It an acid test: are you managing labor, or
> are you eliminating it?

I don't understand. Do you mean that code auto-generated from a model is generally to simple to match your requirement ? Isn't it just because you have complicated requirement ?

It seems that for simple constraint like "unique", non-null, foreign key and even constraint involving multiple fields as long as their are expressed in a simple SQL query. Well code generation will eliminate the labor of coding it in the database and in the application(s).

So you only have to express your constraints in the model, then you even mentioned that AndroMDA provides free generators.

Are you saying that code generation isn't there yet or that it isn't the right solution to constraint management ???

Rinie Kervel

Posts: 26
Nickname: rinie
Registered: Oct, 2005

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 5:34 AM
Reply to this message Reply
> For instance, setting a
> constraint so that the <code>salary</code> field can't
> accept a value less than 5000 and greater than 1000000
> will cause a database to return an error code when

This value seems like a 'business logic' constraint, that may change in time without modifications to the database schema.
(More a data entry min/max validation that should not be in code either).

In my opinion DB constraints should be used to guarantee your database integrity from a DBA point of view. NOT NULL constraints, Foreign key / REF constraints are examples that are hard to duplicate in code. Unfortunately active record patterns and orm tend to deal with a single table at a time, wherease the true power of a RDBMS comes from joins and views

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 7:17 AM
Reply to this message Reply
> In my opinion DB constraints should be used to guarantee
> your database integrity from a DBA point of view. NOT NULL
> constraints, Foreign key / REF constraints are examples
> that are hard to duplicate in code. Unfortunately active
> record patterns and orm tend to deal with a single table
> at a time, wherease the true power of a RDBMS comes from
> joins and views

I agree with this. Business rules are far too malleable to be implemented in the database. One lesson I've learned over time is that the persistence layer should actually be pretty lenient and only ensure the core constraints of the data. Too many times I've found that what seemed to be an obvious constraint was really functionally incorrect. In addition, if you are too strict, you'll end up with no data instead of partial data. You need to consider the frustration of the people who need to put something into the database that your constraint will not allow.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 7:49 AM
Reply to this message Reply
> You need to consider the frustration of the people who need
> to put something into the database that your constraint
> will not allow.

And it could also be that your DBMS can't perform writes with integrity checks fast enough to handle the write-load against your database.

So, again, you need a way to not tightly couple things to tiers, including the storage tier.

With models, there is no reason why you can't move duplication checks into an a posteriori process. This is what accounting systems effectively do when they "close the books" on a period of time/business activity.

In a distributed, federated system, you have much bigger problems than these, though. You need to also consider the frustration of customers who are looking at their account and still see a charge for $x dollars even though they paid the charge yesterday. This cost goes directly to your client's bottom line, as a customer service rep now has to go call up somebody, IT maybe, to reconcile the customer's record. Now you're paying somebody $15-20 / hour to straighten out something the system can't handle correctly. There at some industries I know of where this is a pervasive problem, and in the WSJ you hear about this industry weekly, where the business is either "doing it right" or "doing it wrong". But the WSJ almost always is clueless as to why.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 8:00 AM
Reply to this message Reply
> Unfortunately active record patterns and orm tend to deal
> with a single table at a time, wherease the true power of a
> RDBMS comes from joins and views

This usually leads to anti-patterns such as "Data Access Object Within Data Access Object". It's one reason why you often see classes suffixed with things like DAO and DTO, because the metaprogramming isn't smart enough to avoid row-by-agonizing-row (RBAR) class models. So this way the programmer can look at classes and how they talk to each other, recognize a bad pattern, and say "this is going to blow up in our face, let's refactor it". Most ORM tools force you to also configure these things such that you treat your class model as tokens to blit your data stream on.

Hibernate/NHibernate has all kinds of knobs you can turn to blit things in different ways.

So, to rephrase your comments, what you are saying is Active Record tightly couples the data access pattern to the data dictionary. You're right. It just turns out that for most applications, and average hardware and average loads, it doesn't matter whatsoever.

I'm not criticizing Hibernate or Rails or whatever. I'm just explaining, programmatically, how these things work and what happens when you use them. People get rich using these tools, so it's not like you can't.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 8:22 AM
Reply to this message Reply
> So you only have to express your constraints in the model,
> then you even mentioned that AndroMDA provides free
> generators.

I have never used AndroMDA. Its name bears coincidental resemblance to a SourceForge/GoogleCode project called Andromeda Project, or simply "Andro" for short. Andromeda is written in PHP. I also don't use it. Awhile back I also got kicked off the mailing list for being a colossal jerk (posting code reviews that would make Al Viro of the linux kernel mailing list blush). It was a situation where I probably should've banned myself.

What Andromeda does well is it has a simple, additive model for security. It also decouples columns from tables, and tables from schemas. This means you only define everything once, including columns. Columns are "placed into" tables, as opposed to being defined as "part of a" table. Tables are therefore composed additively, as are the security constraints on them. Tables actually aren't even directly grouped into schemas, but instead grouped into an Andromeda Data Dictionary concept called a "module" (effectively a namespace). Modules can in turn have security settings. Security itself can be composed out of groups, which map to roles.

Other code generators, like Oslo, have a slightly different way of doing things, but they also don't allow you to directly specify security settings and other compositional properties. Oslo is still very immature, and I personally dislike the language they came up with. It was designed by the same folks who thought COM/CORBA was a good idea. The saving grace is probably that the same folks who invented VB are also on the project. IMHO, looking at Oslo, it's not as complicated as COM, but it's not targeted for the layman and will require several iterations and facade patterns to cover up the more complicated portions of it.

The scope of Oslo is also huger. For instance, with Oslo, I can use MS Researcher's Common Compiler Infrastructure: Metadata project as a browser for basically all my software artifacts that MSBuild uses.

> Are you saying that code generation isn't there yet or that
> it isn't the right solution to constraint management ???

I'm saying millions of people have done code generation before. What matters is how well you do it, or how low your expectations are.

If you're building software product lines, things can get very complicated quickly, and its not something you want to do ad-hoc, or you'll be writing millions of lines of Java code (which SpringSource is rooting for you to do). For instance, Minnesota State requires that you not show employees full birthdate (but you are allowed to show the month and day). So now your model has to accomodate a legal requirement, and you've got to be able to verify you're meeting these legal requirements.

James Watson

Posts: 2024
Nickname: watson
Registered: Sep, 2005

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 8:34 AM
Reply to this message Reply
> In a distributed, federated system, you have much bigger
> problems than these, though. You need to also consider
> the frustration of customers who are looking at their
> account and still see a charge for $x dollars even though
> they paid the charge yesterday. This cost goes directly
> to your client's bottom line, as a customer service rep
> now has to go call up somebody, IT maybe, to reconcile the
> customer's record.

I've seen many cases where constraints cause this kind of problem. For example, I once was doing my taxes using a personal tax software package. The company I had just left was purchased by a Canadian company and when I went to enter my W-2 info, it told me that the postal code could only contain numbers.

Obviously this is no at the DB level but the issue is the same. Constraints can hurt as much as they help. And the lower the level you implement them, the more havoc they cause if they are not correct. My take on DB constraints is that they are generally good if they enforce rules that are essential to downstream processing working properly and they are bad if they are an attempt to implement business logic.

John Zabroski

Posts: 272
Nickname: zbo
Registered: Jan, 2007

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 8:40 AM
Reply to this message Reply
> The company I had just left was purchased by a Canadian
> company and when I went to enter my W-2 info, it told me
> that the postal code could only contain numbers.

It's also the case that there aren't 50 two-letter abbreviation codes for the US. There's way more - commonwealths, military, etc. I think there is even one reserved for "unknown".

H&R Block TaxCut Pro had a similar constraint problem for me this year. I was worried it would increase the chances I would get audited, too. So far that has not been the case. Any time you switch jobs or move, tax software sucks, in my experience.

In this particular scenario, H&R Blocks model sucks. They should decouple the printing stage from the integrity check stage, and allow me a "Screw you H&R block, i know what I am doing" stage. But they really don't allow this, probably for legal reasons. All I want is H&R Block to put me on "Rails" and give me an ActiveRecord-like convention over configuration eliminate-redundancy yada-yada-goodness to complete my taxes as soon as possible.

robert young

Posts: 361
Nickname: funbunny
Registered: Sep, 2003

Re: Database Constraints: Use Them or Lose Them? Posted: Sep 25, 2009 11:29 AM
Reply to this message Reply
Frank, I luv you. I think. Not sure. Will have to decide whether to make the effort to educate the reactionary heathen yet again; and it's only been a few days since the last database thread. Has Artima gotten Religion?

Flat View: This topic has 56 replies on 4 pages [ « | 1  2  3  4 | » ]
Topic: Speaking at FOSDEM Previous Topic   Next Topic Topic: Speaking At Developer Day in Boulder

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use