Ruby Buzz Forum - Planting the seeds

Recently, the DHH announced that 3.0 would include a standardized way for loading seed data into your application so that you didn’t have to clutter your database migrations. We’ve tried a variety of approaches over the past several years and decided that we’d look into mirroring what 3.0 would include, but we’ve been working on a 2.3.x application the past month.

Instead of rolling our own solution, we decided to take a peak at what Rails 3.0 would include and realized it pretty much consisted of just one rake task that loaded a ruby file that we could execute some code in. Simple enough.

So, Carlos and I copied the following from Rails 3.0 and tossed it into lib/tasks/seed.rake.

namespace :db do
  desc 'Load the seed data from db/seeds.rb'
  task :import_seed_data => :environment do
    seed_file = File.join(Rails.root, 'db', 'seeds.rb')
    load(seed_file) if File.exist?(seed_file)
  end
end

This just looks for a file named seeds.rb in the db/ directory. We opted to not conflict with the 3.0 naming of the task file for now and went with db:import_seed_data for whenever we decided to upgrade the application we could test it out before switching to db:seed.

Populating Seed Data Approaches

The db/seeds.rb file is your playground. We’ve been evolving our seed file on a new project and it’s been great at allowing us to populate a really large data. Here are a few approaches that we’ve taken to diversify our data so that when we’re working on UI, we can have some diversified content.

Use the names of real people

We’re using the Octopi gem to connect to github, collect all the names of people that follow me there, and using their names to seed our development database.

@robby_on_github = Octopi::User.find('robbyrussell')

# add a bunch of semi-real users
@robby_on_github.followers.each do |follower|
  github_person = Octopi::User.find(follower)
  next if github_person.name.nil?

  # split their name in half... good enough (like the goonies)
  first_name = github_person.name.split(' ')[0]
  last_name = github_person.name.split(' ')[1]
  new_person = Person.create(:first_name => first_name, :last_name => last_name, :email => Faker::Internet.email,
                             :password => 'secret', :password_confirmation => 'secret',
                             :github_username => follower, :website_url => github_person.blog)
  # ...
end

Use Faker for Fake data

You may have noticed in the previous code sample, that I used Faker in that code. We are using this a bunch in our seed data file. With Faker, you can generate a ton of fake data really easy.

person.links.create(:title => Faker::Lorem.words(rand(7)+1).join(' ').capitalize,
                    :url => "http://#{Faker::Internet.domain_name}/",
                    :description => Faker::Lorem.sentences(rand(4)+1).join(' '))

We might toss something like that into a method so that we can do the following:

@people = Person.find(:all)

500.times do
  generate_link_for(@people.sort_by{rand}[0])
end

...and we’ll get 500 links added randomly across all of the people we added to our system. You can get fairly creative here.

For example, we might even wanted random amounts of comments added to our links.

def generate_link_for(person)
  link = person.links.create(:title => Faker::Lorem.words(rand(7)+1).join(' ').capitalize,
                             :url => "http://#{Faker::Internet.domain_name}/",
                             :description => Faker::Lorem.sentences(rand(4)+1).join(' '))

  # let's randomly add some comments...
  if link.valid?
    rand(5).times do
      link.comments.create(:person_id => @people.sort_by{rand}[0].id,
                           :body => Faker::Lorem.paragraph(rand(3)+1))
    end
  end
end

It’s not beautiful, but it gets the job done. It makes navigating around the application really easy so that we aren’t having to constantly input new data all the time. As mentioned, it really helps when we’re working on the UI.

Your ideas?

We’re trying a handful of various approaches to seed our database. If you have some fun ways of populating your development database with data, we’d love to hear about it.


	Web Artima.com