Archive for January, 2009

Live component rotation

Thursday, January 22nd, 2009

Many applications consist of a number of components, most of which are shared with other parts of the system. Different parts of the system exercise their collaborators in different ways. Think of a website where jobs periodically process data and store it in a database, while presentation modules render that data in ways meaningful to end users. Sharing a resource can degrade performance: when one component is pushed too hard by part of its workload, every piece of the system that depends on it suffers. In the shared database example, the website might respond slowly while database-heavy processing jobs are running.

One way around this problem is to create more than one instance of the shared resource. One instance is considered “live”: it is the one the system’s clients interact with. Expensive operations run on a copy, which itself becomes live the moment those operations conclude. This solution does not apply to every situation, but it can be useful where real-time data is not a concern. In the example website’s case, we can create a copy of the database and run the processing jobs on it. The front end components run off the live, slightly “stale” database copy, whose performance is not affected by the jobs. Once the jobs complete, we switch databases and repeat the live component rotation process as needed. Live component rotation also lends itself nicely to distribution, as component copies can exist on different physical hosts.
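
The rotation itself can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions: `RotatingResource` is a hypothetical name, an in-memory hash stands in for the database, and a deep copy stands in for cloning it.

```ruby
# Minimal sketch of live component rotation (all names are hypothetical).
class RotatingResource
  def initialize(live)
    @live = live
    @lock = Mutex.new
  end

  # Clients always read through the current live copy.
  def live
    @lock.synchronize { @live }
  end

  # Copy the live resource, run the expensive work on the copy,
  # then promote the copy to live in a single swap.
  def rotate
    standby = Marshal.load(Marshal.dump(live)) # deep copy stands in for a database clone
    yield standby
    @lock.synchronize { @live = standby }
  end
end

db = RotatingResource.new({ "guitars" => ["SG", "Tele"] })
before = db.live

# The expensive job only ever touches the copy.
db.rotate do |copy|
  copy["guitars"] << "Strat"
end

after = db.live
```

While the block passed to `rotate` is running, readers calling `db.live` still get the old, untouched copy. They see the new data only after the swap.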

Virtualization and cloud computing make this method all the more interesting. Imagine hosting a database server on Amazon EC2 with its data stored on an EBS volume. We can snapshot the EBS volume, fire up a new EC2 instance, create a volume from the snapshot and attach it to the new instance, run the jobs there and rotate live database instances once they complete. Most parts of the system never have to worry about the costly operations taking place.

Code on demand

Saturday, January 10th, 2009

Code-on-demand on the web is commonly encountered in the form of JavaScript or applets. As we examine the web as a platform for services spanning beyond the typical server/browser interaction, it’d be interesting to further explore the code-on-demand constraint from a service integration perspective.

One of the advantages of offering executable code alongside a service’s data is client simplification by code reuse. For example, we can distribute a library that’s specific to the data on offer, so interested clients can make use of that functionality and avoid having to re-implement it. Another advantage is distributing computational load, which would otherwise have to be handled by the server, to clients.

To put things into perspective, consider a simplistic web API call that lists guitar models. Much like a JavaScript include, the response to http://example.com/guitars contains a line which advertises a guitar model Ruby library available at /libguit.rb.

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <title>Guitars</title>
    <script type="text/ruby" charset="utf-8" src="/libguit.rb"></script>
  </head>
  <body>
    <ul id="guitars">
      <li>SG</li>
      <li>Les Paul</li>
      <li>Tele</li>
      <li>Strat</li>
    </ul>
  </body>
</html>

The libguit library has one method for iterating over an alphabetically sorted list of guitars.

module LibGuit
  class List
    def initialize(guitars)
      @guitars = guitars
    end

    def each_guitar_alphabetically(&block)
      @guitars.sort.each(&block)
    end
  end
end

Interested clients can load and use the library together with the retrieved data. Code-on-demand is an optional constraint, so clients that cannot interpret the code, Ruby in this case, or are not interested in using the library can safely ignore it without side effects.

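require "rubygems"
require "hpricot"
require "open-uri"

# Fetch and parse the guitar listing.
doc = Hpricot(open("http://example.com/guitars"))

# Find the advertised Ruby library, download it and evaluate it.
# Note that eval runs whatever code the server returns.
libguit_address = (doc / 'script[@type="text/ruby"]')[0][:src]
libguit_src = open("http://example.com#{libguit_address}").read
eval(libguit_src)

# Use the downloaded library on the retrieved data.
guitars = (doc / "#guitars li").map { |e| e.html }
LibGuit::List.new(guitars).each_guitar_alphabetically { |g| puts g }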

This is a superficial example, but imagine a service that advertises an e-commerce website’s daily updated product catalog. Instead of making queries like /products.xml?category=sports&sort=price, clients could download a zipped version of the day’s entire catalog once a day, together with a library for manipulating its entries. This relieves the service of any further requests. It also spares clients maintenance costs when the data’s structure changes, as long as the on-demand library abstracts that structure well.
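
To make the catalog idea more concrete, here is a rough sketch of what such an on-demand library might look like. Everything in it is an assumption made for illustration: the `LibCatalog` module, the `Product` fields and the row layout are hypothetical, and an in-memory array stands in for the unzipped daily download.

```ruby
# Hypothetical on-demand library shipped with the daily catalog download.
module LibCatalog
  Product = Struct.new(:name, :category, :price)

  class Catalog
    # `rows` is the raw downloaded data; its layout is an assumption.
    # Only this constructor needs to change if the layout changes.
    def initialize(rows)
      @products = rows.map do |row|
        Product.new(row["name"], row["category"], row["price"])
      end
    end

    # Local equivalent of /products.xml?category=...
    def in_category(category)
      @products.select { |product| product.category == category }
    end

    # Local equivalent of /products.xml?category=...&sort=price
    def in_category_by_price(category)
      in_category(category).sort_by(&:price)
    end
  end
end

# Stand-in for the unzipped daily catalog.
rows = [
  { "name" => "Football",    "category" => "sports", "price" => 25.0 },
  { "name" => "Amp",         "category" => "music",  "price" => 300.0 },
  { "name" => "Tennis ball", "category" => "sports", "price" => 3.5 }
]

catalog = LibCatalog::Catalog.new(rows)
cheapest_first = catalog.in_category_by_price("sports").map(&:name)
```

Clients only call `in_category` and `in_category_by_price`, so the service could reshape the underlying rows and ship an updated library without clients changing their own code.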

At this point many would voice well-founded objections based on the security implications. Although one could propose a security system reminiscent of that of applets, I would opt for a controlled environment where trust is already granted, such as services offered and consumed between departments inside a company. Also, in an Internet where many of us store our private email on Gmail or trust Amazon’s S3 with mission-critical data, I wouldn’t have a problem dynamically loading code provided by, say, Amazon. It’s not very difficult to put basic safeguards in place to avoid catastrophic effects and, in any case, every option is viable as long as the benefits outweigh the costs.
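
One possible basic safeguard, offered as a sketch rather than a recommendation, is to evaluate downloaded code only when its digest matches one obtained from the provider through a trusted channel. The `load_verified` helper and the way the digest gets published are assumptions made for this example. The check only confirms the code is what the provider intended to ship; it does not sandbox it.

```ruby
require "digest"

# Hypothetical safeguard: only evaluate code whose SHA-256 digest matches
# one obtained from the provider through a trusted channel.
def load_verified(source, trusted_digest)
  actual = Digest::SHA256.hexdigest(source)
  raise SecurityError, "digest mismatch, refusing to eval" unless actual == trusted_digest
  eval(source, TOPLEVEL_BINDING)
end

# Stand-in for library source downloaded from the service.
libguit_src = <<~RUBY
  module LibGuit
    def self.sorted(guitars)
      guitars.sort
    end
  end
RUBY

# Stand-in for a digest the provider would publish separately.
trusted_digest = Digest::SHA256.hexdigest(libguit_src)

load_verified(libguit_src, trusted_digest)
sorted = LibGuit.sorted(["SG", "Les Paul"])

# Source that was altered in transit fails the check and is never evaluated.
tampered = libguit_src + "\nraise 'this line should never run'"
rejected = begin
  load_verified(tampered, trusted_digest)
  false
rescue SecurityError
  true
end
```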