This post originated from an RSS feed registered with PHP Buzz
by Alan Knowles.
Original Post: SQL change management done right.
Feed Title: Smoking toooo much PHP
Feed URL: http://www.akbkhome.com/blog.php/RSS.xml
Feed Description: More than just a blog :)
I'm slowly going through mtrack at present, breaking the HTML rendering into templates, and making nice clean classes for the fetching of underlying data fetching. With quite a few plans ahead for it once it's in a state to be easily changed.
The more deeply I go through mtrack, there are parts I find that you look at and think, "That's a really smart way to handing that problem". Like the change control auditing, where each component has a ID for the last updated (and created), which maps to a event table id containing who/when data. Much cleaner than the more common practice of using two datetime and user fields.
However, as always, there are parts of the code where you want to pull you hair out and say, No way should you solve a problem like this. Mtrack's SQL change management is one of those areas.
It's approach is not uncommon, and I've seen it before. Basically it uses an XML file to define the schema and then has a series of files schema1.xml,schema2.xml,schema3.xml... and so on.
The installer works out what the current version is, then compares schema's between the current and selected version. and works out how to update the database. There is a driver for PostgreSQL and SQLite.
My fundamental issue with all this is that while on the face of things it seems like not a bad idea, it ignores the plain simple fact that there is already a language used for Database Schema definitions and modifications, and it's called SQL!
Anyone who uses SQL can quickly read a SQL file and understand what it's doing, even if it's in a different dialect (SQLite etc...), but putting that type of information in a XML file just adds so much noise and involves learning and remembering a small subset of information, that is only relevant to a tiny problem domain, along with being just annoying to read.
For my projects I'm luckly only having to deal with a single database type (MySQL usually), and have been keeping a single SQL file that contains all the definitions, and later changes which can just be run with '-f' (force) at the MySQL command line to update any server and synchronize it with the project code when needed.
The big problem however is that SQL is for that for all these great benefits has turned out to be horribly incompatible between database vendors. If I carry on with the above idea of keeping my database modifications in a SQL file then I would have to have one for each of the databases I want to support, along with making sure I kept them up-to-date, not really a good long term plan.
So rather than keep multiple files up-to-date, why not convert the SQL schema changes from one dialect to another, and just keep an SQL file as I currently do, and make it feaasible for anyone to install using the database of their choice.
This is one of those 'why has no-one done this before moments', but for the life of me, I could not find anything that came up to quickly on google. So I had a look at what was 'close enough' for this idea to work, and what a supprise, most of the code for this is already in PEAR.
The SQL Parser package, as very basically introduced here, http://www.sjhannah.com/blog/?p=16, provides pretty much all the backend code for a solution to this. However, there was not previously any code in either the parser, or writer/compiler to actually deal with DDL commands like alter table etc.
I've just commited the changes for this, so you can now very easily extend the current SQL_Parser_Compiler class to output your favourite dialect of SQL. based on reading an SQL file containing the base changes.
For a example of how to use it in real life, here's a nice simple example from the forked mtrack codebase.