Creating DSLs with Ruby

by Jim Freeze

March 16, 2006

Summary

Broadly speaking, there are two ways to create a DSL. One is to invent a syntax from scratch, and build an interpreter or compiler. The other is to tailor an existing general-purpose language by adding or changing methods, operators, and default actions. This article explores using the latter method to build a DSL on top of Ruby.

A DSL, or domain specific language, is a (usually small) programming or description language designed for a fairly narrow purpose. In contrast to general-purpose languages designed to handle arbitrary computational tasks, DSLs are specific to particular domains. You can create a DSL in two basic ways:

1. Invent a DSL syntax from scratch, and build an interpreter or compiler.
2. Tailor an existing general-purpose language by adding or changing methods, operators, and default actions.

An advantage to the second approach is that you save time because you don't have to write and debug a new language, leaving you more time to focus on the problem confronting the end-user. Among the disadvantages is that the DSL will be constrained by the syntax and the capabilities of the underlying general-purpose language. Furthermore, building on another language often means that the full power of the base language is available to the end-user, which may be a plus or minus depending on the circumstances. This article explores using the second approach to build a DSL on top of Ruby.

Describing Stackup Models

At my job, where I work as an interconnect modeling engineer, we needed a way to describe the vertical geometric profile of the circuitry on a semiconductor wafer. Descriptions were kept in a stackup model file. (We coined the word stackup because the metal wires are constructed by stacking layers on top of each other in the fabrication process.) The issue was that each vendor had its own format for describing a stackup, but we wanted a common format so we could convert between the various file types. In other words, we needed to define a common stackup DSL and write a program that could export from our stackup format to any of the vendors’ stackup formats.

The vendors did not use a sophisticated DSL. Instead, their languages contained only static data elements in a mostly flat textual database. Their file formats did not allow for parameterized types, variables, constants, or equations. Just static data. Further, the formats were very simple: each was either line based or block based with one level of hierarchy.

We started out describing our stackup format with limited ambition since we only needed to meet the vendors’ level of implementation. But once we saw the benefit of having a more expressive language, we quickly augmented our format. Why were we able to do this, but not the vendors? I believe it was because we used Ruby as a DSL, rather than starting from scratch in C as the vendors did. Granted, other languages could have been used, but I don’t think the finished product would have been as elegant; the selection of the general-purpose language is a critical step.

I also believe the vendors’ development speed was handicapped by using C, prompting them to keep their stackup syntax simple to parse. Perhaps not coincidentally, many of the vendors’ file formats used simple syntax constructs that are common to many DSLs. Because these constructs occur so often, we will first look at how to mimic them in Ruby before moving on to more sophisticated language constructs.

Line-based and Block-level DSL Constructs

Line-based constructs are a way to assign a value or a range of values to a parameter. Among the vendors’ files we were looking at, the following formats were used:

1. parameter = value
2. parameter value
3. parameter min_value max_value step_value

Formats 1 and 2 are equivalent except for the implied ‘=’ in 2. Format 3 assigns a range of values to the parameter.

The more complicated formats contained a block construct. The two that we encountered could be manually parsed with a line-based parser and a stack, or a key-letter and word parser with a stack. These two formats are illustrated below:

begin
  type = TYPE
  name = NAME
  param1 = value1
  param2 = value2
  ...
end

One of the block formats used “C” style curly braces to identify a block, but parameter/value pairs were separated by white space.

TYPE NAME {param1 = value1 param2 = value2 }

Third Time’s a Charm

When we were building our DSL for the stackup file we solved the problem three times. First, we wrote our own parser and decided it was too much work to maintain: not only the code, but also the documentation. Because our DSL was sufficiently complicated, it wasn’t obvious how to use all its features, so it had to be copiously documented.

Next, for a short period, we implemented the DSL in XML. This removed the need for us to write our own parser, as XML is universally understood, but it contained too much noise and obscured the contents of the file. Our engineers found it too difficult to mentally task-switch between thinking about the meaning of the stackup and mentally parsing XML. For me, the lesson learned was that XML is not to be read by humans and probably a bad choice for a DSL, regardless of the parsing benefits.

Finally, we implemented the DSL in Ruby. The implementation was quick since Ruby provides the parsing. Documentation on the parser (i.e. Ruby) was not required since it is already available. And the final DSL was easily understood by humans, yet compact and versatile.

So, let’s build a DSL in Ruby that lets us define ‘parameter = value’ statements. Consider the following hypothetical DSL file.

% cat params_with_equal.dsl 
name = fred
parameter = .55

This is not valid Ruby code, so we need to modify the syntax slightly so Ruby accepts it. Let’s change it to:

% cat params_with_equal.dsl
name      = "fred" 
parameter = 0.55

Once we get the DSL to follow valid Ruby syntax, Ruby does all the work to parse the file and hold the data in a way that we can operate on it. Now let’s write some Ruby code to read this DSL.

First we want to encapsulate these parameters somehow. A good way is to put them into a class. We’ll call this class MyDSL.

% cat mydsl.rb
class MyDSL
  ...
end#class MyDSL

From the developer’s perspective, we want a simple and straightforward way to parse the DSL file. Something like:

my_dsl = MyDSL.load(filename)

So, let’s write the class method load:

def self.load(filename)
    dsl = new
    dsl.instance_eval(File.read(filename), filename)
    dsl
end

The class method load creates a MyDSL object and calls instance_eval on the DSL file (params_with_equal.dsl above). The second argument to instance_eval is optional and allows Ruby to report a filename on parse errors. An optional third argument (not shown) gives you the ability to provide a starting line number for parse error reporting.
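To see what the second and third arguments buy you, here is a minimal sketch. The class name Sketch, the file name example.dsl, the helper first_error_location, and the DSL text are all made up for illustration; only instance_eval and its optional arguments come from the discussion above.

```ruby
# Hypothetical stand-in for MyDSL; the DSL text and file name are invented.
class Sketch; end

# Two lines of "DSL"; the second one deliberately fails.
source = <<DSL
name = "fred"
raise "bad value"
DSL

# Evaluate the text and return where Ruby says the failure happened.
def first_error_location(source, filename, lineno)
  Sketch.new.instance_eval(source, filename, lineno)
  nil
rescue RuntimeError => e
  e.backtrace.first # the frame inside the evaluated DSL text
end

# Line numbering starts at 1: the failure is reported on line 2 of example.dsl.
default_start = first_error_location(source, "example.dsl", 1)

# Line numbering starts at 10: the same failure is reported on line 11.
shifted_start = first_error_location(source, "example.dsl", 10)

puts default_start
puts shifted_start
```

The starting line number is mostly useful when the DSL text is embedded inside a larger file, so that reported positions match what the user sees in an editor.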

Will this load method work? Let’s see what happens:

% cat dsl-loader.rb
require 'mydsl'

my_dsl = MyDSL.load(ARGV.shift) # put the DSL filename on the command line
p my_dsl
p my_dsl.instance_variables
% ruby dsl-loader.rb params_with_equal.dsl
#<MyDSL:0x89cd8>
[]

What happened? Where did name and parameter go? Well, since name and parameter are on the left-hand side of the equals sign, Ruby thinks they are local variables. We can tell Ruby otherwise by writing self.name = "fred" and self.parameter = 0.55 in the DSL file, or we can require the user to mark them as instance variables with the '@' symbol:

@name      = "fred" 
@parameter = 0.55

But that is kind of ugly and, to me, about the same as if we had written

$name      = "fred" 
$parameter = 0.55
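The difference between a bare assignment, an explicit self receiver, and the '@' form can be checked with a small sketch. The Settings class and the value "wilma" are hypothetical and exist only for this illustration.

```ruby
# Hypothetical class with an ordinary accessor, standing in for MyDSL.
class Settings
  attr_accessor :name
end

settings = Settings.new

# A bare assignment creates a local variable inside the evaluated text;
# the object itself is left untouched.
settings.instance_eval('name = "fred"')
after_plain = settings.name

# An explicit receiver makes Ruby call the name= method.
settings.instance_eval('self.name = "fred"')
after_self = settings.name

# The '@' form sets the instance variable directly, bypassing the accessor.
settings.instance_eval('@name = "wilma"')
after_ivar = settings.name

p after_plain, after_self, after_ivar
```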

Another way to let Ruby know the context of these methods is to declare the scope explicitly by yielding self (the MyDSL object instance) to a block. To do this, we will need to add a top level method to jump start our DSL and put the contents inside of the attached block. Our modified DSL now looks like:

% cat params_with_equal2.dsl
define_parameters do |p|
  p.name      = "fred" 
  p.parameter = 0.55
end

where we have defined define_parameters as an instance method:

% cat mydsl2.rb
class MyDSL
  def define_parameters
    yield self
  end

  def self.load(filename)
    dsl = new
    dsl.instance_eval(File.read(filename), filename)
    dsl
  end
end#class MyDSL

And we change the require in dsl-loader to use the new version of the MyDSL class in mydsl2.rb:

% cat dsl-loader.rb
require 'mydsl2'

my_dsl = MyDSL.load(ARGV.shift)
p my_dsl
p my_dsl.instance_variables

Theoretically, this should work, but let’s test it out just to make sure.

% ruby dsl-loader.rb params_with_equal2.dsl
params_with_equal2.dsl:2:in `load': undefined method `name=' for #<MyDSL:0x26300> (NoMethodError)

Oops. We forgot the accessors for name and parameter. Let’s add those and look at the complete program:

% cat mydsl2.rb
class MyDSL
  attr_accessor :name, :parameter

  def define_parameters
    yield self
  end

  def self.load(filename)
    # ... same as before
  end
end

Now, let's test it again.

% ruby dsl-loader.rb params_with_equal2.dsl
#<MyDSL:0x25ec8 @name="fred", @parameter=0.55>
["@name", "@parameter"]

Success! This now works, but we have added two extra lines to the DSL file and have added some noise with the ‘p.’ notation. This notation is better suited to files with multiple levels of hierarchy, where there is an actual need for, and a benefit from, explicitly specifying context. In our simple case we can implicitly define context and leave no doubt for Ruby that name and parameter are methods. We do this by removing the ‘=’ sign and writing the DSL file as

% cat params.dsl
name      "fred" 
parameter 0.55

Now we need to define a new type of accessor for name and parameter. The trick here is to realize that name without an argument is a reader for @name, and name with one or more arguments is a setter for @name. (Note: it is convenient to use this methodology even when multiple levels of hierarchy are present and context is explicitly declared.) We define the accessors for name and parameter by removing the attr_accessor line and adding the following code:

% cat mydsl3.rb
class MyDSL
  def name(*val)
    if val.empty?
      @name
    else
      @name = val.size == 1 ? val[0] : val
    end
  end

  def parameter(*val)
    if val.empty?
      @parameter
    else
      @parameter = val.size == 1 ? val[0] : val
    end
  end

  def self.load(filename)
    # ... same as before
  end
end#class MyDSL

If either name or parameter is called without arguments, it returns its current value. If a single argument is present, that value is assigned; if multiple arguments are present, they are assigned as an array of values.

Let’s run our sample parser (changed to require the file mydsl3.rb) to test our handiwork:

% ruby dsl-loader.rb params.dsl
#<MyDSL:0x25edc @parameter=0.55, @name="fred">
["@parameter", "@name"]

Success again! But defining these accessors explicitly is a pain. So let’s define a custom DSL accessor and make it available to all classes. We do this by putting the method in the Module class.

% cat dslhelper.rb
class Module
  def dsl_accessor(*symbols)
    symbols.each { |sym|
      class_eval %{
        def #{sym}(*val)
          if val.empty?
            @#{sym}
          else
            @#{sym} = val.size == 1 ? val[0] : val
          end
        end
      }
    }
  end
end

The above code simply defines the dsl_accessor method which creates our DSL specific accessors. We now plug it into the application and use dsl_accessor instead of attr_accessor to get:

% cat mydsl4.rb
require 'dslhelper'

class MyDSL
  dsl_accessor :name, :parameter

  def self.load(filename)
    # ... same as before
  end
end#class MyDSL

Again, we update the require statement in dsl-loader.rb to load the mydsl4.rb file and run the loader:

% ruby dsl-loader.rb params.dsl 
#<MyDSL:0x25edc @parameter=0.55, @name="fred">
["@parameter", "@name"]
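The dsl_accessor helper builds its methods from a string passed to class_eval. As a hedged alternative, the same reader/setter behavior can be sketched with define_method, which avoids evaluating generated source text. The module name DslAccessors, the class Sample, and the range parameter are assumptions made for this sketch; it is not the helper used in the rest of the article.

```ruby
# Hypothetical module; extend a class with it to get dsl_accessor.
module DslAccessors
  def dsl_accessor(*symbols)
    symbols.each do |sym|
      ivar = "@#{sym}"
      define_method(sym) do |*val|
        if val.empty?
          # No arguments: act as a reader.
          instance_variable_get(ivar)
        else
          # One argument is stored as-is, several are stored as an array.
          instance_variable_set(ivar, val.size == 1 ? val[0] : val)
        end
      end
    end
  end
end

class Sample
  extend DslAccessors
  dsl_accessor :name, :range
end

sample = Sample.new
sample.name "fred"      # like "parameter value"
sample.range 1, 10, 2   # like "parameter min_value max_value step_value"

p sample.name
p sample.range
```

Extending individual classes rather than reopening Module keeps the helper out of classes that do not need it, at the cost of one extra line per DSL class.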

This is all well and good, but what if we don’t know the parameter names in advance? Depending on the use cases for the DSL, parameter names may be generated by the user. Never fear. With Ruby, we have the power of method_missing. A two-line method added to MyDSL will define a DSL attribute with dsl_accessor on demand. That is, if a value is to be assigned to a (thus far) non-existent parameter, method_missing will define the getters and setters and assign the value to the parameter.

% cat mydsl5.rb
require 'dslhelper'

class MyDSL
  def method_missing(sym, *args)
    self.class.dsl_accessor sym
    send(sym, *args)
  end

  def self.load(filename)
    # ... Same as before
  end
end

% head -1 dsl-loader.rb
require 'mydsl5'

% ruby dsl-loader.rb params.dsl
#<MyDSL:0x25b80 @parameter=0.55, @name="fred">
["@parameter", "@name"]

Wow! Doesn't that make you feel good? With just a little bit of code, we have a parser that can read and define an arbitrary number of parameters. Well, almost. What if the end-user doesn't know Ruby and uses parameter names that collide with existing method calls? For example, what if our DSL file contains the following:

% cat params_with_keyword.dsl 
methods %w(one two three)
id      12345

% ruby dsl-loader.rb params_with_keyword.dsl 
params_with_keyword.dsl:2:in `id': wrong number of arguments (1 for 0) (ArgumentError)

Oh, how embarrassing. Well, we can fix this (mostly) in short order with a little help from a class called BlankSlate [0], which was initially conceived by Jim Weirich [1]. The BlankSlate class used here is a little different than the one introduced by Jim simply because we want to keep a little more functionality around. So we keep seven methods. You can experiment with these to see which ones are absolutely required and which ones we are using just to visualize the contents of our MyDSL object.

% cat mydsl6.rb 
require 'dslhelper'

class BlankSlate
  # instance_methods returns symbols on Ruby 1.9 and later, so compare names as strings
  instance_methods.each { |m| undef_method(m) unless %w(
       __send__ __id__ send class 
       inspect instance_eval instance_variables 
       ).include?(m.to_s)
  }
end#class BlankSlate

# MyDSL now inherits from BlankSlate
class MyDSL < BlankSlate
  # ... nothing new here, move along...
end#class MyDSL

Now when we try to load the DSL file that is loaded with keywords, we should get something a little more sensible:

% head -1 dsl-loader.rb 
require 'mydsl6'

% ruby dsl-loader.rb params_with_keyword.dsl 
#<MyDSL:0x23538 @id=12345, @methods=["one", "two", "three"]>
["@id", "@methods"]

And sure enough, we do. This is good news: we can remove spurious methods and free up more possible parameter names for our end-users. However, note that we can't give end-users a completely unrestrained license to use any name for a parameter. This is one of the downsides of using a general-purpose programming language as a DSL, but I think that an end-user being prohibited from using class as a parameter name has only a small risk of being a deal killer.
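Ruby 1.9 and later ship a built-in BasicObject class that has very few methods of its own, so on those versions a similar effect can be had without a hand-rolled BlankSlate. The following is only a sketch under that assumption: the class name CleanSlateDSL, the __values__ reader, and the choice to keep values in a hash are all inventions for this example, not part of the article's MyDSL.

```ruby
# Hypothetical DSL container built on BasicObject instead of BlankSlate.
class CleanSlateDSL < BasicObject
  def initialize
    @values = {}
  end

  # Any unknown name is treated as a parameter: with arguments it is
  # a setter, without arguments it is a reader (nil when never set).
  def method_missing(sym, *args)
    return @values[sym] if args.empty?
    @values[sym] = args.size == 1 ? args[0] : args
  end

  # Underscored name so it is unlikely to collide with a parameter name.
  def __values__
    @values
  end
end

source = <<~DSL
  methods %w(one two three)
  id      12345
DSL

dsl = CleanSlateDSL.new
dsl.instance_eval(source)
p dsl.__values__
```

Reserved words such as class are still off limits here, because they are rejected by the Ruby parser before any method lookup takes place.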

Getting More Sophisticated

We are now ready to look at more complex DSL features. Instead of a DSL for manipulating data, let’s look at one that performs a more concrete action. Imagine that we are tired of manually creating a common set of directories and files whenever we start a new project. It would be nice if we had Ruby do this for us. It would be even nicer if we had a small DSL such that we could modify the project directory structure without editing the low-level code.

We begin this project by defining a DSL that makes sense for this problem. The file below is our version 0.0.1 of just such a DSL.

% cat project_template.dsl 
create_project do
  dir "bin" do
    create_from_template :exe, name
  end

  dir "lib" do
    create_rb_file name
    dir name do
      create_rb_file name
    end
  end

  dir "test" 

  touch :CHANGELOG, :README, :TODO
end

In this file, we create a project and add three top-level directories and three top-level files. Inside the ‘bin’ directory we create an executable file with the same name as the project using the :exe template. In the ‘lib’ directory, we create a .rb file and a directory, both named after the project. Inside that inner directory, we create another .rb file with the same name as the project. Next, back at the top level, the ‘test’ directory is created, and, finally, three empty files are created.

The methods needed for this DSL are: create_project, dir, create_from_template, create_rb_file and touch. Let’s look at these methods one by one.

The create_project method is our top level wrapper. This method provides scope by letting us put all the DSL code inside a block. (Complete listings are at the end of the article.)

  def create_project()
    yield
  end

The dir method is the workhorse. This method not only creates the directory, it also maintains the current working directory in the @cwd instance variable. Here, the use of ensure allows us to trivially maintain the proper state of @cwd.


  def dir(dir_name)
    old_cwd = @cwd
    @cwd    = File.join(@cwd, dir_name)

    FileUtils.mkdir_p(@cwd)
    yield self if block_given?
  ensure
    @cwd = old_cwd
  end
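The value of ensure shows up when the block fails partway through. The sketch below reuses the dir method as written above inside a hypothetical DirTracker class and runs it in a temporary directory; the class name, the cwd reader, and the simulated failure are assumptions added for the demonstration.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical minimal host for the dir method shown above.
class DirTracker
  attr_reader :cwd

  def initialize(root)
    @cwd = root
  end

  def dir(dir_name)
    old_cwd = @cwd
    @cwd    = File.join(@cwd, dir_name)
    FileUtils.mkdir_p(@cwd)
    yield self if block_given?
  ensure
    # Runs whether the block returned normally or raised.
    @cwd = old_cwd
  end
end

restored = nil
created  = nil

Dir.mktmpdir do |root|
  tracker = DirTracker.new(root)
  begin
    tracker.dir("lib") { raise "simulated failure inside the block" }
  rescue RuntimeError
    # Swallowed on purpose; we only care about the state left behind.
  end
  restored = (tracker.cwd == root)
  created  = File.directory?(File.join(root, "lib"))
end

puts "cwd restored: #{restored}, lib created: #{created}"
```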

The touch and create_rb_file methods are the same except that the latter adds ‘.rb’ to the filename. These methods may be given one or more filenames where the names can be either strings or symbols.

  def touch(*file_names)
    file_names.flatten.each { |file| 
      FileUtils.touch(File.join(@cwd, "#{file}")) 
    }
  end

Finally, the create_from_template method is just a quick dash to illustrate how one may put some actual functionality into a DSL. (See the source listings for the complete code.)
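The core idea behind the template step is placeholder substitution: text between percent signs is evaluated as Ruby in the context of the builder object. A minimal sketch of that idea, separate from the full listing, might look like this. The class TemplateDemo, its TEMPLATE constant, and the render method are hypothetical names used only here.

```ruby
# Hypothetical class showing %expression% substitution in isolation.
class TemplateDemo
  attr_reader :name

  TEMPLATE = "require '%name%'\nclass %name.capitalize%App\nend\n"

  def initialize(name)
    @name = name
  end

  def render
    # gsub returns a new string, so TEMPLATE itself is never modified.
    # m is the whole match including the percent signs; m[1..-2] strips them.
    TEMPLATE.gsub(/%[^%]+%/) { |m| instance_eval(m[1..-2]).to_s }
  end
end

rendered = TemplateDemo.new("fred").render
puts rendered
```

Because the placeholder text is evaluated as code, this approach is only appropriate for templates that come from a trusted source.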

To run and test the ProjectBuilder code, we build a small test application:

% cat create_project.rb
require 'project_builder'

project_name = ARGV.shift
proj = ProjectBuilder.load(project_name)
puts "== DIR TREE OF PROJECT '#{project_name}' =="
puts `find #{project_name}`

And the results are:

% ruby create_project.rb fred
== DIR TREE OF PROJECT 'fred' ==
fred
fred/bin
fred/bin/fred
fred/CHANGELOG
fred/lib
fred/lib/fred
fred/lib/fred/fred.rb
fred/lib/fred.rb
fred/README
fred/test
fred/TODO

% cat fred/bin/fred 
#!/usr/bin/env ruby

require 'rubygems'
require 'commandline'
require 'fred'

class FredApp < CommandLine::Application
  def initialize
  end

  def main
  end
end#class FredApp

Wow! It worked! And with not much effort.

Summary

I work on many projects that require a rather detailed control flow description. For every project, this used to make me pause and consider how to get all this detailed configuration data into the application. Now, Ruby as a DSL is near the top of the list of possibilities, and usually solves the problem quickly and efficiently.

When I was doing Ruby training, I would take the class through a problem solving technique where we would describe the problem in plain English, then in pseudo code, and then in Ruby. But, in some cases, the pseudo code would be valid Ruby code. I think that the high readability quotient of Ruby makes it an ideal language for use as a DSL. And as Ruby becomes known by more people, DSLs written in Ruby will be a favorable way of communicating with an application.

Code listing for the ProjectBuilder DSL:

% cat project_builder.rb 
require 'fileutils'

class ProjectBuilder
  PROJECT_TEMPLATE_DSL = "project_template.dsl"

  attr_reader :name

  TEMPLATES = {
      :exe =>
<<-EOT
#!/usr/bin/env ruby

require 'rubygems'
require 'commandline'
require '%name%'

class %name.capitalize%App < CommandLine::Application
  def initialize
  end

  def main
  end
end#class %name.capitalize%App
EOT
    }

  def initialize(name)
    @name          = name
    @top_level_dir = Dir.pwd
    @project_dir   = File.join(@top_level_dir, @name)
    FileUtils.mkdir_p(@project_dir)
    @cwd = @project_dir
  end

  def create_project
    yield
  end

  def self.load(project_name, dsl=PROJECT_TEMPLATE_DSL)
    proj = new(project_name)
    # instance_eval returns the value of the DSL's last expression, so don't reassign proj
    proj.instance_eval(File.read(dsl), dsl)
    proj
  end

  def dir(dir_name)
    old_cwd = @cwd
    @cwd    = File.join(@cwd, dir_name)
    FileUtils.mkdir_p(@cwd)
    yield self if block_given?
  ensure
    @cwd = old_cwd
  end

  def touch(*file_names)
    file_names.flatten.each { |file| 
      FileUtils.touch(File.join(@cwd, "#{file}")) 
    }
  end

  def create_rb_file(*file_names)
    # Accept one or more names, as strings or symbols, just like touch
    file_names.flatten.each { |file| touch("#{file}.rb") }
  end

  def create_from_template(template_id, filename)
    File.open(File.join(@cwd, filename), "w+") { |f|
      # Use gsub (not gsub!) so the shared TEMPLATES string is left unmodified
      str = TEMPLATES[template_id].gsub(/%[^%]+%/) { |m| instance_eval m[1..-2] }
      f.puts str
    }
  end
end#class ProjectBuilder

# Execute as:
# ruby create_project.rb project_name

Resources

[0] BlankSlate is a Ruby class designed to create method-free objects.
http://onestepback.org/index.cgi/Tech/Ruby/BlankSlate.rdoc

[1] Jim Weirich is the creator of BlankSlate, as well as other notable Ruby tools and libraries.
http://onestepback.org

