Nov 08 2008

Rack cache headers

Rack is an interface between web servers and Ruby web frameworks. The HTTP protocol, amongst other things, defines requirements on HTTP caches in terms of header fields that control cache behavior. The purpose of this article is to demonstrate a possible implementation of a piece of Rack Middleware which enables web application developers to configure a web application's resource cache related headers in a non obtrusive, centralized manner.

Rack supports the notion of Middleware, pieces of code that sit between the HTTP request and response life cycle. Rack::Lint, for example, validates an application's requests and responses according to the Rack specification.

Rack::Handler::Mongrel.run(
  Rack::Lint.new(app), :Host => "0.0.0.0", :Port => 9999
)

Similarly, if we were to implement a cache header producing layer on top of Rack we'd end up with a construct similar to the following.

Rack::Handler::Mongrel.run(
  Rack::Lint.new(
    Rack::CacheHeaders.new(app)
  ), :Host => "0.0.0.0", :Port => 9999
)

Here's a possible way of configuring how an application provides HTTP caching headers based on URL path patterns.

Rack::CacheHeaders.configure do |cache|
  cache.max_age("/rock", 3600)
  cache.expires("/metal", "16:00")
end

Following is a potential implementation for the above.

module Rack
  class CacheHeaders
    def initialize(app)
      @app = app
    end

    def call(env)
      result = @app.call(env)
      header = Configuration[env['PATH_INFO']].to_header
      result[1][header.key] = header.value
      result
    end

    def self.configure(&block)
      yield Configuration
    end

    class Configuration
      def self.max_age(path, duration)
        paths[path] = MaxAge.new(duration)
      end

      def self.expires(path, date)
        paths[path] = Expires.new(date)
      end

      def self.[](key)
        paths[key]
      end

      def self.paths
        @paths ||= {}
      end
    end
    
    class MaxAge
      def initialize(duration)
        @duration = duration
      end
      
      def to_header
        Header.new("Cache-Control", "max-age=, must-revalidate")
      end
    end
    
    class Expires
      def initialize(date)
        @date = date
      end
      
      def to_header
        Header.new("Expires", Time.parse(@date).httpdate)
      end
    end
    
    class Header < Struct.new(:key, :value);end
  end
end

The code below is a minimal Rack based application.

require "rubygems"
require "rack"

app = proc {|env| [200, {"Content-Type" => "text/plain"}, "hello"]}

Rack::Handler::Mongrel.run(
  Rack::Lint.new(
    Rack::CacheHeaders.new(app)
  ), :Host => "0.0.0.0", :Port => 9999
)

In order to observe the caching related headers the application's responses are decorated with we can use curl or something similar, i.e curl -I http://0.0.0.0:9999/rock or curl -I http://0.0.0.0:9999/metal. Output should look something like the following.

air:~ gmalamid$ curl -I http://0.0.0.0:9999/rock
HTTP/1.1 200 OK
Connection: close
Date: Sat, 08 Nov 2008 00:51:23 GMT
Cache-Control: max-age=3600, must-revalidate
Content-Type: text/plain
Content-Length: 5

air:~ gmalamid$ curl -I http://0.0.0.0:9999/metal
HTTP/1.1 200 OK
Connection: close
Date: Sat, 08 Nov 2008 00:51:16 GMT
Content-Type: text/plain
Expires: Sat, 08 Nov 2008 16:00:00 GMT
Content-Length: 5

Understanding and employing HTTP cache configuration not only enables harnessing the power of tools like Varnish or Squid, it also makes good citizens in a diverse ecosystem of HTTP aware browsers and caches outside an application's knowledge or control.