Archive for July, 2007

Rails fast test suite

Tuesday, July 31st, 2007

Rails convention suggests model classes that extend ActiveRecord::Base, with the corresponding unit tests depending on the database. I prefer to separate the business layer from the data access layer: a few ActiveRecord children handle persistence (not unlike repositories), while the bulk of the application logic resides in classes that are unaware of the database.
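As a rough sketch of that separation (all class names here are hypothetical and not taken from a real application), the business class below receives its data from a collaborator and never touches the database. In a Rails application the collaborator's role would be played by an ActiveRecord child; an in-memory stand-in keeps the example self-contained.

```ruby
# Persistence side: hypothetical in-memory stand-in for an ActiveRecord child.
class InMemoryOrders
  def initialize(amounts_by_customer)
    @amounts_by_customer = amounts_by_customer
  end

  # Returns the order amounts (in cents) for a customer, or an empty list.
  def amounts_for(customer_id)
    @amounts_by_customer.fetch(customer_id, [])
  end
end

# Business side: plain Ruby, unaware of where the amounts come from.
class CustomerStatement
  def initialize(orders)
    @orders = orders
  end

  # Total in cents, with a 10% discount once the net total reaches 100_00 cents.
  def total_for(customer_id)
    net = @orders.amounts_for(customer_id).inject(0) { |sum, amount| sum + amount }
    net >= 100_00 ? net - net / 10 : net
  end
end

orders = InMemoryOrders.new(42 => [60_00, 50_00])
puts CustomerStatement.new(orders).total_for(42)  # prints 9900
```

Because CustomerStatement only relies on an object responding to amounts_for, its tests can pass in a stand-in like the one above and run without a database connection.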

I find it useful to add a separate test suite for those classes by creating a new directory, e.g. test/fastunit, and a Rake test task (lib/tasks/fastunit.rake) that runs the tests in it.

namespace :test do
  Rake::TestTask.new('fastunit') do |t|
    t.pattern = 'test/fastunit/*_test.rb'
  end
end

Rake::Task[:test].prerequisites << 'test:fastunit'

Adding test:fastunit to the prerequisites of the main test task ensures the suite will be run as part of the full test build.

One of the advantages of this approach is the instant feedback of running rake test:fastunit: these tests never touch the database, so they are inherently faster. I run them often whilst developing to ensure things are going smoothly and only run the full build before checkins.
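For illustration, a test in that directory might look like the sketch below. The Slug class and the file name test/fastunit/slug_test.rb are hypothetical, and the class is defined inline only to keep the example self-contained; in a real project the test would require the class's own file instead. I am assuming here that the test requires test/unit directly rather than a helper that loads the whole Rails environment.

```ruby
require 'test/unit'

# Hypothetical database-free class under test.
class Slug
  # Lowercases the title and collapses anything non-alphanumeric into dashes.
  def self.for(title)
    title.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-|-\z/, '')
  end
end

# Would live in test/fastunit/slug_test.rb so that the *_test.rb pattern picks it up.
class SlugTest < Test::Unit::TestCase
  def test_replaces_non_alphanumerics_with_dashes
    assert_equal 'rails-fast-test-suite', Slug.for('Rails fast test suite')
  end

  def test_strips_leading_and_trailing_dashes
    assert_equal 'hello', Slug.for('  Hello! ')
  end
end
```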

This technique renders the built-in test:units task slightly ambiguous, as ActiveRecord tests now look more like functional tests. That is not entirely incorrect: they do, after all, hit the database.

Unobtrusive AJAX with jQuery and Rails

Sunday, July 29th, 2007

Besides having become one of the de facto practices for rich web-based user experiences, AJAX is a valuable method for optimizing web application performance. In this article, I discuss using jQuery alongside Rails to create fast, responsive AJAX operations while keeping the javascript as unobtrusive to the application’s markup as possible.

Let’s start by creating a new Rails project with one model, Bookmark, that has one property, link.

rails -d sqlite3 bookmarks
cd bookmarks
script/generate model Bookmark link:string
script/generate controller Bookmarks
rake db:migrate

We create app/views/layouts/bookmarks.rhtml, the layout for the Bookmarks Controller where we can include the javascript libraries needed by the application. These are jQuery, the jQuery Form Plugin and application.js which will contain any custom javascript we will be writing.

<html>
<head>
  <meta http-equiv="Content-type" content="text/html; charset=utf-8">
  <title>index</title>
  <script type="text/javascript" src="/javascripts/jquery-1.1.3.1.pack.js"></script>
  <script type="text/javascript" src="/javascripts/jquery.form.js"></script>
  <script type="text/javascript" src="/javascripts/application.js"></script>
</head>
<body>
<%=yield%>
</body>
</html>

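Next, we add app/views/bookmarks/index.rhtml and one partial, app/views/bookmarks/_bookmarks_list.rhtml, which contains the list of bookmarks that will be updated with AJAX calls to the controller’s methods. The form and the bookmarks-list div below belong to index.rhtml; the list markup that follows them belongs to the partial.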

<form method="post" action="/bookmarks/add" id="add-bookmark">
  <label for="bookmark-link">Bookmark:</label>
  <input type="text" name="bookmark[link]" id="bookmark-link"/>
  <input type="submit" value="Add">
</form>
<div id="bookmarks-list">
  <%= render :partial => 'bookmarks_list' %>
</div>

<% unless @bookmarks.empty? %>
<ul>
  <% for b in @bookmarks %>
  <li>
    <a href="<%= b.link %>"><%= b.link %></a>
    <a href="/bookmarks/delete/<%= b.id %>" class="delete">delete</a>
  </li>
  <%end%>
</ul>
<% end %>

Below is a simplified version of the Controller that handles server side support for add and delete operations.

class BookmarksController < ApplicationController
  def index
    @bookmarks = Bookmark.find(:all)
  end

  def add
    if Bookmark.create(params[:bookmark]).valid?
      @bookmarks = Bookmark.find(:all)
      render :partial => "bookmarks_list"
    else
      render :text => "Oops…", :status => 500
    end
  end

  def delete
    Bookmark.destroy(params[:id])
    @bookmarks = Bookmark.find(:all)
    render :text => ""
  end
end

By rendering partials we update only the targeted sub-section of the markup, cutting the response content down to a bare minimum, which should give a performance boost. Specifying a 500 HTTP status code when things go wrong allows our javascript to recognize a response as problematic.

Finally, here’s the javascript for adding and deleting bookmarks and displaying error messages.

function hijackDeleteBookmarkLinks() {
  $('#bookmarks-list a.delete').bind('click', function() {
    var deleteLink = $(this)
    $.ajax({
      type: 'POST',
      url: deleteLink.attr('href'),
      success: function(){deleteLink.parent().remove()}
    })
    return false
  })
}

function hijackAddBookmarkForm() {
  $('#add-bookmark').submit(function() {
    $(this).ajaxSubmit({
      target: '#bookmarks-list',
      clearForm: true,
      success: hijackDeleteBookmarkLinks,
      error: displayError
    })
    return false
  })
}

function displayError(request, errorType) {
  var msg = '<div class="error">'+request.responseText+'(click to close)</div>'
  $('#bookmarks-list').append(msg)
  $('.error').click(function(){$(this).hide()})
}

$(function() {
  hijackAddBookmarkForm()
  hijackDeleteBookmarkLinks()
})

The hijackDeleteBookmarkLinks function intercepts click events on any link with class delete inside the bookmarks-list div and makes an asynchronous call to the link’s original URL. After a successful response, we dynamically remove the corresponding list entry from the markup.

It is worth noting the url option available to any of our AJAX calls. It allows us to point the request at any URL we like, meaning that we can have different actions for AJAX and non-AJAX calls, so the application works as expected even if javascript is not available on the client. I have omitted this step for the sake of simplicity.

The target option in hijackAddBookmarkForm specifies the element to be updated with the contents of the response to the AJAX call. Also, we need to call hijackDeleteBookmarkLinks on the success of the AJAX call to ensure that any newly created links are bound by the function.

Issues to consider

The example has been simplified for demonstration purposes.

The proposed architecture tightly couples the client-side with the server-side implementation of the application. We are writing actions intended to be used solely by javascript and the javascript itself expects partial HTML to be returned by the responses. The API is nowhere near being RESTful.

We could have separate actions responsible for deleting, creating and listing bookmarks. Those actions could also return something more flexible, like JSON. The reason I chose to have the add action return the updated list of bookmarks as part of its response is to avoid the second request-response roundtrip that would be incurred if creation and listing were separated. I favored partials over JSON to avoid having to operate on the response. This allows for simpler javascript.
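To make the trade-off concrete, here is a minimal sketch of what the JSON alternative could produce, using Ruby's standard json library. The Bookmark struct is a hypothetical stand-in for the ActiveRecord model and bookmarks_as_json is an illustrative helper, not part of the original code. With a response like this, the javascript would have to build the list items itself, which is exactly the extra work the partial avoids.

```ruby
require 'json'

# Hypothetical stand-in for the ActiveRecord model, to keep the sketch self-contained.
Bookmark = Struct.new(:id, :link)

# Serializes a list of bookmarks to a JSON array of {id, link} objects.
def bookmarks_as_json(bookmarks)
  bookmarks.map { |b| { 'id' => b.id, 'link' => b.link } }.to_json
end

puts bookmarks_as_json([Bookmark.new(1, 'http://example.com')])
```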

It pays to consider the purpose of the API and, based on that, decide to compromise some core values in favor of others. When writing the piece of code that inspired this article, my goal was not to create a public RESTful API. This code was UI driven and the intention was to create a rich, fast user interface that would work as expected with javascript turned on or off.

As a side-note, I chose not to use Rails’ respond_to method because I prefer actions (methods in general) that are responsible for doing one thing. This might introduce some duplication and any maintenance issues that come along with it, but in my case the actions had enough differences to justify breaking them up. This is a personal preference and not meant to discourage anyone from using respond_to.

Amazon S3 persistent Ruby objects

Saturday, July 7th, 2007

I have occasionally participated in conversations around the subject of the database as a product with an expiry date, destined to eventually be replaced by highly distributed data storage models. Given the current state of technology, this sounds much like a science fiction scenario, but services like AWS S3 bring the idea closer to science and further from fiction.

Although S3’s data storage and retrieval model looks presently better suited for larger units of data (e.g. media content), it would be interesting to investigate how it could be applied as an Object persistence service.

In the following example, we will use Ruby’s AWS::S3 library to create a class resembling Ruby on Rails’ ActiveRecord::Base, allowing Objects to be persisted to and retrieved from an S3 Bucket.

Objects need to be serialized and de-serialized in order to be stored in and retrieved from S3. YAML is one of the standard means of object serialization in Ruby, so we will be making use of it.
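As a quick illustration of the round trip, independent of S3, the sketch below serializes a plain object to YAML and restores it. The Genre class here is a simplified, hypothetical version of the one used later. One assumption worth flagging: recent versions of Ruby's YAML library refuse to restore arbitrary objects through YAML.load by default, so the sketch uses YAML.unsafe_load where it is available and falls back to YAML.load otherwise.

```ruby
require 'yaml'

# Simplified, hypothetical class used only to demonstrate the round trip.
class Genre
  attr_accessor :id, :name
end

rock = Genre.new
rock.id = 1
rock.name = 'rock'

# to_yaml stores the class name and the instance variables as text.
serialized = rock.to_yaml

# Newer YAML versions only restore basic types via YAML.load; unsafe_load
# restores arbitrary objects and should only be used on trusted data.
restored = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(serialized) : YAML.load(serialized)

puts restored.name
```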

require 'yaml'
require 'aws/s3'

class S3Record
  attr_accessor :id

  def initialize(attrs = {})
    # Assign each attribute through its writer method, e.g. self.id = 1.
    attrs.each { |k, v| send("#{k}=", v) }
  end
end

Requiring YAML provides S3Record with, among other functionality, a to_yaml instance method.

Next, we add the ability to persist an instance of S3Record to S3.

def create
  AWS::S3::S3Object.find(@id, self.class.name)
  raise "Object with key [#{@id}] already exists"
rescue AWS::S3::NoSuchKey
  AWS::S3::S3Object.store(@id, self.to_yaml, self.class.name)
end

The first parameter to the AWS::S3::S3Object.find method is the unique identifier by which the object is keyed when stored, and the one used to find it again. The second parameter is the name of the bucket in which the object is stored. Here, we use the name of our class as the bucket name. This implies that a bucket with a name matching that of our class must exist before we can start storing objects.

The AWS API will raise a NoSuchKey error in the case where the specified key does not exist in the specified bucket. We make use of this in order to ensure that we will not be overwriting any existing objects. Also, note the call to self.to_yaml. This is the actual data of the Object as it is being stored in S3.

Next, we provide the ability to retrieve objects.

def self.find(id)
  YAML.load(AWS::S3::S3Object.find(id, self.name).value)
end

def self.find_all(options = {})
  bucket = AWS::S3::Bucket.find(self.name, options)
  bucket.objects.map { |s3_obj| YAML.load(s3_obj.value) }
end

We retrieve one object by its identifier and the name of its bucket (AWS::S3::S3Object.find(id, self.name)) and return it in its de-serialized form. The same applies to finding many objects from one bucket. The options Hash accepts the following parameters: :max_keys - the maximum number of keys to retrieve, :prefix - restrict the response to results that begin with a specified prefix, and :marker - restrict the response to results that occur alphabetically after this value (see the documentation for AWS::S3::Bucket.find).

Methods to update, delete and count should be self-explanatory.

def update
  AWS::S3::S3Object.store(@id, self.to_yaml, self.class.name)
end

def self.delete(id)
  AWS::S3::S3Object.delete(id, self.name)
end

def self.count
  AWS::S3::Bucket.find(self.name).objects.size
end

In action, we could operate on objects we would like to persist on S3 in a way similar to the following.

class Genre < S3Record
  attr_accessor :name
end

rock = Genre.new(:id => 1, :name => "rock")
rock.create

rock = Genre.find(1)
rock.name = "heavy rock"
rock.update

#etc…
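To try the same flow without an AWS account, here is a minimal sketch in which a Hash-backed store stands in for AWS::S3::S3Object. MemoryStore, MemoryRecord and the STORE constant are hypothetical names introduced for this illustration only; they mimic the find and store calls used above but are not part of the AWS::S3 library. As an assumption about newer Ruby versions, YAML.unsafe_load is used where available, because YAML.load no longer restores arbitrary objects by default there.

```ruby
require 'yaml'

# Hypothetical stand-in for AWS::S3::S3Object: buckets are Hashes of key => value.
class MemoryStore
  class NoSuchKey < StandardError; end

  def initialize
    @buckets = Hash.new { |hash, name| hash[name] = {} }
  end

  def find(key, bucket)
    @buckets[bucket].fetch(key) { raise NoSuchKey, "no key [#{key}] in #{bucket}" }
  end

  def store(key, value, bucket)
    @buckets[bucket][key] = value
  end
end

STORE = MemoryStore.new

# Same shape as S3Record, but persisting to the in-memory store.
class MemoryRecord
  attr_accessor :id

  def initialize(attrs = {})
    attrs.each { |key, value| send("#{key}=", value) }
  end

  # Refuses to overwrite an existing key, mirroring the create method above.
  def create
    STORE.find(@id, self.class.name)
    raise "Object with key [#{@id}] already exists"
  rescue MemoryStore::NoSuchKey
    STORE.store(@id, to_yaml, self.class.name)
  end

  def update
    STORE.store(@id, to_yaml, self.class.name)
  end

  def self.find(id)
    yaml = STORE.find(id, name)
    YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
  end
end

class Genre < MemoryRecord
  attr_accessor :name
end

rock = Genre.new(:id => 1, :name => 'rock')
rock.create

found = Genre.find(1)
found.name = 'heavy rock'
found.update

puts Genre.find(1).name  # prints heavy rock
```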

What about transactions? Indexing? More elaborate querying? All things databases are well established for? Bandwidth issues?

There are probably no definitive answers to any of these questions, although one could suggest that transaction management is not that hard to implement, indexing can happen - often more efficiently - outside the database (see Lucene, Ferret) and bandwidth will not be an issue forever.

One reason the above example is not yet realistic is the present S3 billing model ($0.01 per 1,000 PUT or LIST requests, $0.01 per 10,000 GET and all other requests). It does not seem financially attractive for an application that needs to store and retrieve vast numbers of small resources at great frequency.
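To put those prices in perspective, here is a small worked calculation. The workload figures are hypothetical and chosen only to make the arithmetic easy to follow; the prices are the ones quoted above.

```ruby
# Prices quoted above, expressed in dollars per single request.
PUT_OR_LIST_PRICE = Rational(1, 100) / 1_000   # $0.01 per 1,000 PUT or LIST requests
GET_PRICE         = Rational(1, 100) / 10_000  # $0.01 per 10,000 GET requests

# Returns the request charges in dollars for a number of writes and reads.
def request_cost(writes, reads)
  writes * PUT_OR_LIST_PRICE + reads * GET_PRICE
end

# Hypothetical workload: one million object writes and ten million reads.
cost = request_cost(1_000_000, 10_000_000)
puts format('$%.2f', cost.to_f)  # prints $20.00
```

The figure covers request charges only; storage and data transfer would be billed on top of it.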

Part of the cost disappears if the application is hosted on Amazon’s EC2 (Elastic Compute Cloud), as data transferred between Amazon S3 and Amazon EC2 is free of charge. As far as I can tell, the per-request charges above still apply.