Create A Sitemap For Your Rails Application

John Topley

Posted: Mar 10, 2010 10:23 AM

This post originated from an RSS feed registered with Ruby Buzz by John Topley.
Original Post: Create A Sitemap For Your Rails Application
Feed Title: John Topley's Weblog
Feed URL: http://johntopley.com/posts.atom
Feed Description: John Topley's Weblog - some articles on Ruby on Rails development.

Google introduced the Sitemap protocol in 2005, and all of the major search engines now support it. Unrelated to a traditional website sitemap navigation page, it defines an XML schema for listing the URLs within a site, along with metadata such as when each URL was last updated, which allows search engines to crawl the site more intelligently.
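
To give a feel for the format before involving Rails, here is a minimal plain-Ruby sketch that builds a Sitemap document by hand. The helper names (xml_escape, sitemap_xml) and the example.com URL are assumptions made for illustration; only the loc element is required by the protocol, so lastmod is treated as optional here.

```ruby
# Characters that must be entity-escaped inside XML text.
XML_ESCAPES = {
  '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
  '"' => '&quot;', "'" => '&apos;'
}.freeze

# Hypothetical helper: escape a value for use inside an XML element.
def xml_escape(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

# Hypothetical helper: entries is an array of hashes with :loc and,
# optionally, :lastmod (a YYYY-MM-DD string).
def sitemap_xml(entries)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>',
           '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
  entries.each do |entry|
    lines << '  <url>'
    lines << "    <loc>#{xml_escape(entry[:loc])}</loc>"
    lines << "    <lastmod>#{entry[:lastmod]}</lastmod>" if entry[:lastmod]
    lines << '  </url>'
  end
  lines << '</urlset>'
  lines.join("\n")
end

puts sitemap_xml([{ :loc => 'http://example.com/posts?id=1&page=2',
                    :lastmod => '2010-03-10' }])
```

Note that the ampersand in the URL comes out as &amp;amp; in the generated loc element, which the protocol requires.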

Using the Sitemap protocol does not guarantee that web pages are included in search engine indexes, but it gives web crawlers hints that help them crawl your site more effectively. It’s easy to add support for a dynamically generated Sitemap to a Rails application. This post documents how I went about it for this site, where blog posts are stored as instances of a Post model. You’ll likely need to adapt some of these instructions for your own application.

The first step is to set up a dedicated route and controller for the sitemap. Add the following route towards the bottom of config/routes.rb:

map.sitemap '/sitemap.xml', :controller => 'sitemap'

This routes all requests for /sitemap.xml to a controller dedicated to serving the sitemap. Next, create app/controllers/sitemap_controller.rb:

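class SitemapController < ApplicationController
  layout nil # The XML output needs no HTML layout

  def index
    headers['Content-Type'] = 'application/xml'
    # The last post (by primary key) supplies the cache validators.
    latest = Post.last
    # Render only when the client's cached copy is out of date;
    # otherwise stale? responds with 304 Not Modified.
    if stale?(:etag => latest, :last_modified => latest.updated_at.utc)
      respond_to do |format|
        format.xml { @posts = Post.sitemap.published }
      end
    end
  end
end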

The index action gets the latest Post model instance and then checks the request for staleness using ActionController’s stale? method, which compares the ETag and Last-Modified values computed for the response against the If-None-Match and If-Modified-Since headers sent by the client. This ensures that the Sitemap is only sent in full if the client’s cached copy is out of date; otherwise an HTTP 304 Not Modified status is returned. The @posts instance variable is set to the result of chaining the sitemap and published named scopes within the Post model. This is how those named scopes are defined in app/models/post.rb:
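
To make the staleness check concrete, here is a minimal plain-Ruby sketch of the kind of comparison a conditional GET involves. It is not Rails’ implementation: the helper names (etag_for, stale_request?) are hypothetical, the ETag is simply an MD5 digest of an assumed cache key, and the request headers are modelled as a plain hash.

```ruby
require 'digest/md5'
require 'time'

# Hypothetical helper: derive a quoted ETag value from any cache key.
def etag_for(cache_key)
  %("#{Digest::MD5.hexdigest(cache_key.to_s)}")
end

# Returns true when the full response should be sent (200), and false
# when the client's cached copy is still fresh (304 Not Modified).
def stale_request?(request_headers, etag:, last_modified:)
  client_etag  = request_headers['If-None-Match']
  client_since = request_headers['If-Modified-Since']

  # No validators means the client has nothing cached.
  return true if client_etag.nil? && client_since.nil?

  client_since = Time.httpdate(client_since) if client_since

  # Every validator the client sent must match for the copy to be fresh.
  etag_ok  = client_etag.nil?  || client_etag == etag
  since_ok = client_since.nil? || client_since >= last_modified

  !(etag_ok && since_ok)
end

updated = Time.utc(2010, 3, 10, 10, 23, 0)
etag    = etag_for("post-#{updated.to_i}")

first_visit = {}
revisit     = { 'If-None-Match' => etag,
                'If-Modified-Since' => updated.httpdate }

puts stale_request?(first_visit, etag: etag, last_modified: updated) # true
puts stale_request?(revisit, etag: etag, last_modified: updated)     # false
```

In the controller above, a false result corresponds to the branch where the view is never rendered and only the 304 status goes back to the client.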

class Post < ActiveRecord::Base
  named_scope :published, :conditions => { :published => true }
  named_scope :sitemap, :select => 'slug, created_at, updated_at',
              :limit => 49999 # +1 for About page to make 50,000
end

The sitemap named scope only selects the slug, created_at and updated_at columns because they’re all that’s required within the generated XML. I also limit it to 49,999 results because, as you’ll see shortly, the view template includes a hard-coded reference to my site’s static About page. The Sitemap protocol specifies that each Sitemap file contain no more than 50,000 URLs, hence the limit. I’ll worry about how to handle more than 50,000 posts in the extremely unlikely event that I write that many!
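
For sites that do outgrow the limit, the protocol allows several Sitemap files to be listed in a Sitemap index file. This site doesn’t do that, so the following is only a rough plain-Ruby sketch of the idea: the helper names and the sitemap-N.xml naming scheme are hypothetical, and example.com stands in for a real host.

```ruby
MAX_URLS_PER_SITEMAP = 50_000

# Split a list of URLs into groups no larger than the protocol limit.
def sitemap_chunks(urls, limit = MAX_URLS_PER_SITEMAP)
  urls.each_slice(limit).to_a
end

# Build a Sitemap index that points at one Sitemap file per chunk.
# The "sitemap-N.xml" file names are an assumption for illustration.
def sitemap_index_xml(base_url, chunk_count)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>',
           '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
  (1..chunk_count).each do |n|
    lines << '  <sitemap>'
    lines << "    <loc>#{base_url}/sitemap-#{n}.xml</loc>"
    lines << '  </sitemap>'
  end
  lines << '</sitemapindex>'
  lines.join("\n")
end

urls   = (1..120_001).map { |i| "http://example.com/posts/#{i}" }
chunks = sitemap_chunks(urls)

puts chunks.map(&:size).inspect # [50000, 50000, 20001]
puts sitemap_index_xml('http://example.com', chunks.size)
```

Each chunk would then be rendered as its own Sitemap file, and the index is what gets submitted to the search engines.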

The final piece of the puzzle is the view template. Although I originally used Builder for view generation, I switched to Haml (HTML Abstraction Markup Language) because it’s simpler and faster. Haml is based on the idea of removing all duplication from markup and of using meaningful indentation to describe structure. This is what the app/views/sitemap/index.xml.haml file looks like:

- base_url = "http://#{request.host_with_port}"
!!! XML
%urlset{:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9"}
  - for post in @posts
    %url
      %loc #{base_url}#{post.permalink}
      %lastmod=post.last_modified
      %changefreq monthly
      %priority 0.5
  
  -# About page
  %url
    %loc #{base_url}/about
    %lastmod 2009-08-28
    %changefreq monthly
    %priority 0.5

This small quantity of Haml generates a Sitemap XML file that looks like the extract below. Job done!


  
  
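<?xml version='1.0' encoding='utf-8' ?>
<urlset xmlns='http://www.sitemaps.org/schemas/sitemap/0.9'>
  <url>
    <loc>http://johntopley.com/2010/02/02/the-apple-ipad</loc>
    <lastmod>2010-02-02</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://johntopley.com/2010/01/14/the-best-of-twitter-2009</loc>
    <lastmod>2010-01-22</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  ...
  <url>
    <loc>http://johntopley.com/about</loc>
    <lastmod>2009-08-28</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>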

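The template calls post.permalink and post.last_modified, which aren’t shown in this post. Judging only by the URLs and dates in the generated Sitemap, they might look something like the sketch below. This is a guess rather than the site’s actual code: PostRow is a hypothetical stand-in for the Post model so that the example runs without Rails, and the timestamps are made up.

```ruby
# Stand-in for the ActiveRecord model, holding only the three columns
# that the sitemap named scope selects.
PostRow = Struct.new(:slug, :created_at, :updated_at) do
  # Assumed format: /YYYY/MM/DD/slug, based on the creation date.
  def permalink
    created_at.getutc.strftime('/%Y/%m/%d/') + slug
  end

  # Assumed format: a W3C date (YYYY-MM-DD) for the lastmod element.
  def last_modified
    updated_at.getutc.strftime('%Y-%m-%d')
  end
end

post = PostRow.new('the-best-of-twitter-2009',
                   Time.utc(2010, 1, 14, 9, 0, 0),   # made-up time of day
                   Time.utc(2010, 1, 22, 18, 30, 0)) # made-up time of day

puts post.permalink     # /2010/01/14/the-best-of-twitter-2009
puts post.last_modified # 2010-01-22
```

Both methods depend only on slug, created_at and updated_at, which would be consistent with the sitemap named scope selecting just those columns.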