In which I return to this blog after a slow start and a multi-year gap
It has been so long since I wrote anything on this weblog that it is probably worth saying hello again. A lot has changed since I was last here.
I left Scotland and moved to London to do web archiving at the British Library. Life in the public sector was a serious culture shock. The library has a lot of brilliant people. I also got to meet a lot of interesting people in external organizations, including the likes of the BBC and the ubiquitous Cory Doctorow, and I learned a lot about how our public sector works here in the UK. There was a serious downside though (apart from the low wages): the library has what I can only describe as a 'can't do' attitude. When you are looking after books and manuscripts there is no great rush; in fact, going slow is a positive advantage. Not so in the digital world, and sadly they were happy to ignore the advice of the technical people they had hired (including me) and bumble about wasting public money.
I did learn an enormous amount about web archiving though. I, and a small group of like-minded people, became convinced that not only was web archiving possible at a scale only hinted at by the public institutions, but that it was also possible to make it available to the world at large (unlike, and complementary to, the sterling efforts of the Internet Archive, who primarily archive broad snapshots of the web). To this end we started the Hanzo project. We have a beta version of our first service available now: hanzo
I am the main technical person behind this project and have pretty much implemented it single-handedly, and I hope to talk about some of that work here. The service has been guided by a very big name in the world of web archiving and we are just at the start of our journey. We have many, many exciting features to come.
I am still doing a day job to pay the bills and I will be talking about one or two of the challenges I face there as well.
Technology-wise, I am currently working in Python (Django and Twisted) and, in the day job, a mixture of Python, SQL Server and, for the last day or two, .NET.
As someone who's worked for the government in the past (briefly), I can tell you that speed is not a requirement. You just have to use that to your advantage: more time for the things the public sector likes, such as better documentation and presentations. Basically, anything that gives the higher-ups confirmation of where the money went, shows that putting money into the project was the right decision, and makes the case that you need more time and money to expand on these ideas (with your boss's consultation, of course). Working in the government is all about making the people you work with look good. It's a fine line though. Because of the extra time, you tend to do too much at first and make others look bad (or rather, show that you just don't get it). That's where the "can't do" attitude comes from. Progress has to balance the money put into the project. If the project is completely successful, then they have no argument for more money and their budget will get cut.
Then again, it's been a while. Things may have changed since and I don't know how it works in the UK.