The Artima Developer Community

Ruby Code & Style
Linux Clustering with Ruby Queue: Small is Beautiful
by Ara Howard
October 10, 2005

Page 1 of 3  >>


Summary
The Ruby Queue software package lowers the barriers scientists need to overcome in order to realize the power of Linux clusters. It provides an extremely simple, economical, and easy-to-understand tool that harnesses the power of many CPUs while simultaneously allowing researchers to shift their focus away from the mundane details of complicated distributed computing systems and back to the task of actually doing science. The tool set is designed with a K.I.S.S., research-focused philosophy that enables any ordinary (non-root) user to set up a zero-admin Linux cluster in 10 minutes or less. It is currently being used successfully in such diverse fields as biochemical research at the University of Toronto, geomechanical modeling at IGEOSS, and studying the nighttime lights of the world at the National Geophysical Data Center.

My friend Dave Clements is always game for a brainstorming session—especially if I'm buying the coffee. We met at the usual place and I explained my problem to him over the first cup: We had a bunch of Linux nodes sitting idle and we had a stack of work lined up for them, but we had no way to distribute the work to them. And the deadline for completion loomed over us. Over the second cup, I related how I had evaluated several packages such as Open Mosix [0] and Sun's Grid Engine [1], but had ultimately decided against them. It all came down to this: I wanted something leaner than everything I'd seen, something fast and easy, not a giant software system that would take weeks of work to install and configure.

After the third cup of coffee, we had it: Why not simply create an NFS [2] mounted priority queue and let nodes pull jobs from it as fast as they could? No scheduler. No process migration. No central controller. No kernel mods. Just a collection of compute nodes working as quickly as possible to complete a list of tasks. But there was one big question: Was concurrently accessing an NFS-mounted queue possible to do safely? Armed with my favorite development tools (a brilliant IDE named vim [3] and the Ruby programming language [4]), I aimed to find out.

History

I work for the National Geophysical Data Center's (NGDC) [5] Solar-Terrestrial Physics Division (STP) [6] in the Defense Meteorological Satellite Program (DMSP) group [7]. My boss, Chris Elvidge, and the other scientists of our group study the nighttime lights of Earth from space. The data we receive helps researchers understand changes in human population and the movement of forest fires, among other things. The infrastructure required to do this sort of work is astounding. This image:

Figure 1. Average intensity of nighttime lights over part of North America.

showing the average intensity of nighttime lights over part of North America, required over 100 gigabytes of input data and 142 terabytes of intermediate files to produce. Over 50,000 separate processes spread across 18 compute nodes and a week of wall clock time went into its production.

Linux clusters have become the new supercomputers. The economics of teraflop performance built on commodity hardware are impossible to ignore in the current climate of dwindling research funding. However, one critical aspect of cluster-building, namely orchestration, is frequently overlooked by the people doing the buying. The problem facing a developer with clustered systems is analogous to the one facing a home buyer who can only afford a lot and some bricks: He's got a lot of building to do.

Building a small brick house on a shoestring

Yukihiro Matsumoto, a.k.a. Matz, has said that "The purpose of Ruby is to maximize programming pleasure," and experience has taught me that enjoying the creative process leads to faster development and higher quality code. Ruby features powerful object-oriented abstraction techniques, extreme dynamism, ease of extensibility, and an armada of useful libraries. It is a veritable "Swiss Army machete," precisely the sort of tool one should bring into uncharted territory like this.

Laying the foundation

The first task was to work out the issues with concurrent access to NFS shared storage, and the first bridge I had to cross was how to accomplish NFS-safe locking from within Ruby. Ruby has an fcntl(2) interface similar to Perl's and, just like Perl's, the interface requires you to pack a buffer with the struct arguments. This is perfectly safe, but, unfortunately, non-portable. I've wondered about this oversight before and decided to address it by writing a little C extension, "posixlock", which extends Ruby's built-in File class with a method to apply fcntl(2), or POSIX-style, advisory locks to a File object. Here is most of the code from posixlock.c:

static int
posixlock (fd, operation)
     int fd;
     int operation;
{
  struct flock lock;
  switch (operation & ~LOCK_NB)
    {
    case LOCK_SH:
      lock.l_type = F_RDLCK;
      break;
    case LOCK_EX:
      lock.l_type = F_WRLCK;
      break;
    case LOCK_UN:
      lock.l_type = F_UNLCK;
      break;
    default:
      errno = EINVAL;
      return -1;
    }
  lock.l_whence = SEEK_SET;
  lock.l_start = lock.l_len = 0L;
  return fcntl (fd,
    (operation & LOCK_NB) ? F_SETLK :
    F_SETLKW, &lock);
}

static VALUE
rb_file_posixlock (obj, operation)
     VALUE obj;
     VALUE operation;
{
  OpenFile *fptr;
  int ret;
  rb_secure (2);
  GetOpenFile (obj, fptr);
  if (fptr->mode & FMODE_WRITABLE)
    {
      fflush (GetWriteFile (fptr));
    }
retry:
  TRAP_BEG;
  ret =
    posixlock (fileno (fptr->f),
         NUM2INT (operation));
  TRAP_END;
  if (ret < 0)
    {
      switch (errno)
        {
        case EAGAIN:
        case EACCES:
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
        case EWOULDBLOCK:
#endif
          return Qfalse;
        case EINTR:
#if defined(ERESTART)
        case ERESTART:
#endif
          goto retry;
        }
      rb_sys_fail (fptr->path);
    }
  return INT2FIX (0);
}

void
Init_posixlock ()
{
  rb_define_method (rb_cFile, "posixlock",
        rb_file_posixlock, 1);
}

Granted, it's a bit ugly, but C code always is. One of the things that's really impressive about Ruby is that the code for the interpreter itself is very readable. The source includes array.c, hash.c, and object.c, files that even I can make some sense of. In fact, I was able to steal about ninety-eight percent of the above code from Ruby's File.flock implementation defined in file.c.

Along with posixlock.c I needed to write an extconf.rb (extension configure) file which Ruby auto-magically turns into a Makefile. Here is the complete extconf.rb file used for the posixlock extension:

require 'mkmf' and create_makefile 'posixlock'

Usage of the extension mirrors Ruby's own File.flock call, but is safe for NFS mounted files. The example below can be run simultaneously from several NFS clients:

require 'socket'
require 'posixlock'

host = Socket::gethostname
puts "test running on host #{ host }"

File::open('nfs/fcntl_locking.test','a+') do |file|
  file.sync = true
  loop do
    file.posixlock File::LOCK_EX
    file.puts "host : #{ host }"
    file.puts "locked : #{ Time::now }"
    file.posixlock File::LOCK_UN
    sleep 0.42
  end
end

A "tail -f" of the NFS-mounted file "fcntl_locking.test" will show it being concurrently accessed in a safe fashion. Notice the lack of error checking: Ruby is an exception-based language, so any method that does not succeed will raise an error, and a detailed stack trace will be printed on standard error.
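The C code above returns false instead of blocking when LOCK_NB is set and someone else holds the lock. Ruby's built-in File#flock follows the same calling convention, so the non-blocking pattern can be sketched with it on a local file. This is only an illustration: the helper name try_exclusive is made up for this sketch, and flock is used as a stand-in because the posixlock extension has to be compiled first. On NFS, the article's posixlock method would take the place of flock.

```ruby
# Hypothetical helper: try to take an exclusive lock without blocking.
# Uses the built-in File#flock as a local stand-in for posixlock.
def try_exclusive(file)
  # With LOCK_NB, flock returns false when another holder has the lock
  # and 0 (truthy in Ruby) when the lock was acquired.
  got = file.flock(File::LOCK_EX | File::LOCK_NB)
  return false unless got

  begin
    yield file
  ensure
    file.flock(File::LOCK_UN)   # always release, even if the block raises
  end
  true
end
```

A worker could call this in a loop and sleep briefly whenever it returns false, rather than queueing up behind a blocking lock.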

One of the things to note about this extension is that I was able to actually add a method to Ruby's built-in File class. Ruby's classes are open: you can extend any class at any time, even the built-in ones. Obviously, extending the built-in classes should be done with great care, but there is a time and a place for it, and Ruby does not prevent you from doing so where it makes sense. The point here is not that you have to extend Ruby but that you can. And it is not difficult.
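The same openness is available from pure Ruby, without a C extension. As a small sketch, the snippet below reopens File and adds a method named append_locked. That name is invented for this example and is not part of posixlock or Ruby Queue. It uses the built-in flock, so it is only meant for local files.

```ruby
# Reopen the built-in File class and add a hypothetical helper method.
class File
  # Append one line while holding an exclusive flock on the file.
  def append_locked(line)
    flock(File::LOCK_EX)
    seek(0, IO::SEEK_END)   # make sure we write at the current end of file
    puts line
    flush                   # push the data out before the lock is released
  ensure
    flock(File::LOCK_UN)
  end
end
```

Every File object in the program gains the method, which is exactly why this kind of extension should be used sparingly.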

Having resolved my locking dilemma, the next design choice I had to make was what format to store the queue in. Ruby has the ability to serialize any object to disk and also includes a transaction-based, file-backed object storage class, PStore, which I have used successfully as a "mini database" for many CGI programs. I began by implementing a wrapper on this class that used the posixlock module to ensure NFS-safe transactions and which supported methods like "insert_job", "delete_job", and "find_job." Right away I started feeling like I was writing a little database.
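To give a feel for why this starts to look like a little database, here is a rough sketch of such a wrapper. The class name JobStore, the record layout, and the keys :jobs and :next_jid are all assumptions made for illustration. The original wrapper is not shown in the article, and this sketch leaves out its posixlock-based NFS safety entirely.

```ruby
require 'pstore'

# Hypothetical PStore-backed job store; not the article's actual code.
class JobStore
  def initialize(path)
    @store = PStore.new(path)
  end

  # Add a job and return its id. The whole update happens in one transaction.
  def insert_job(command)
    @store.transaction do
      @store[:jobs] ||= {}
      @store[:next_jid] ||= 0
      jid = (@store[:next_jid] += 1)
      @store[:jobs][jid] = { 'jid' => jid, 'command' => command, 'state' => 'pending' }
      jid
    end
  end

  # Look a job up by id in a read-only transaction; nil if it does not exist.
  def find_job(jid)
    @store.transaction(true) { (@store[:jobs] || {})[jid] }
  end

  # Remove a job and return its record, or nil if it was not there.
  def delete_job(jid)
    @store.transaction { (@store[:jobs] || {}).delete(jid) }
  end
end
```

Ordering by priority, querying by state, and similar features would each need more hand-written code on top of this, which is the point at which an embedded SQL database becomes attractive.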

Not being one to reinvent the wheel (or at least not too often!) I decided to utilize the SQLite [8] embedded database and the excellent Ruby bindings for it written by Jamis Buck [9] as a storage back end. This really helped get the project moving as I was freed from writing a lot of database-like functionality.

Many database APIs have made the choice of returning either a hash or an array to represent a database tuple (row). With tuples represented as hashes you can write very readable code like:

ssn = tuple['ssn']

and yet are unable to write natural code like

sql =
  "insert into jobs values ( #{ tuple.join ',' } )"

or

primary_key, rest = tuple

While with an array representation you end up with undecipherable code like:

field = tuple[7]

Now, what was field "7" again?

When I first started using the SQLite binding for Ruby, all its tuples were returned as hashes and I had a lot of slightly verbose code converting tuples from hash to array and back again. Anyone who's spent much time working with Ruby will tell you that its elegance inspires you to make your own code more elegant. All this converting was not only inelegant, but inefficient. What I wanted was a tuple class that was an array, but one that allowed keyword field access for readability and elegance.

For Ruby this was no problem. I wrote a pure Ruby module, ArrayFields, that allowed any array to do exactly this. In Ruby a module is not only a namespace but can be "mixed-in" to other classes to impart functionality. The effect is similar to, but less confusing than, multiple inheritance. In fact, not only can Ruby classes be extended in this way, but instances of Ruby objects themselves can be dynamically extended with the functionality of a module, leaving other instances of that same class untouched. Here's an example using Ruby's Observable module, which implements the Publish/Subscribe design pattern:

require 'observer'
class Publisher; end   # placeholder class so the example runs on its own
publisher = Publisher::new
publisher.extend Observable

In this example only this instance of the Publisher class is extended with Observable's methods.
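To make the effect concrete, the sketch below wires up one subscriber and checks that a second Publisher instance is left untouched. Publisher and Subscriber are placeholder classes invented for this example. The Observable calls (add_observer, changed, notify_observers) come from Ruby's standard observer library.

```ruby
require 'observer'

class Publisher; end   # placeholder class with no behavior of its own

# Placeholder observer: Observable calls #update on each registered observer.
class Subscriber
  attr_reader :received

  def initialize
    @received = []
  end

  def update(*args)
    @received << args
  end
end

publisher = Publisher.new
publisher.extend Observable        # only this one object gains the methods

subscriber = Subscriber.new
publisher.add_observer(subscriber)
publisher.changed                  # mark state as changed, or nothing is sent
publisher.notify_observers('job finished')

other = Publisher.new              # a plain Publisher, never extended
```

Here subscriber.received holds the single notification, while other.respond_to?(:add_observer) is false.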

Jamis was more than happy to work with me to add ArrayFields support to his SQLite package. The way it works is simple: if the ArrayFields module is detected at runtime the tuples returned by a query will be dynamically extended to support named field access. No other Array objects in memory are touched—only those Arrays returned as tuples are extended with ArrayFields.
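The idea can be sketched in a few lines of Ruby. The module below, NamedFields, is a hypothetical stand-in written for this illustration. It is not the ArrayFields source, and the real library's interface may differ. It shows how a single Array instance can be extended to accept field names while remaining an ordinary array in every other respect.

```ruby
# NamedFields is a hypothetical stand-in for the idea behind ArrayFields;
# it is not the library's actual implementation.
module NamedFields
  attr_accessor :fields   # ordered list of column names

  # String keys are translated to positions; anything else goes to Array#[].
  def [](key, *rest)
    key.is_a?(String) ? super(position_of(key)) : super
  end

  def []=(key, *rest)
    key.is_a?(String) ? super(position_of(key), *rest) : super
  end

  private

  def position_of(name)
    fields.index(name) or raise IndexError, "no field named #{name}"
  end
end

tuple = [42, 'ls -l', 'pending']
tuple.extend NamedFields            # only this array is affected
tuple.fields = %w[jid command state]

tuple['state'] = 'finished'         # keyword access
jid, command = tuple                # still destructures like any array
```

Because the tuple is still an Array, calls such as join, map, and positional indexing keep working alongside the named access.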

Finally I was able to write readable code like:

require 'arrayfields'
require 'sqlite'

...

query = 'select * from jobs order by submitted asc'
tuples = db.execute query

tuples.each do |job|
  jid, command = job['jid'], job['command']
  run command
  job['state'] = 'finished'

  # quoted list of job's fields
  values = job.map{|val| "'#{ val }'" }.join ','
  sql = "insert into done values( #{ values } )"
  db.execute sql
end

and elegant code like:

tuples.sort_by{ |tuple| tuple['submitted'] }.reverse

This is no mere convenience; using arrays over hashes is faster, requires about 30% less memory, and makes many operations on tuples more natural to code. Allowing keyword access to the arrays makes the code more readable and frees the developer from remembering field positions or, worse, having to update code if a change to the database schema should change the order of fields. Finally, a reduction in lines of code almost always aids both development and maintenance.
