Jul 29 2008

Cache watch

Web frameworks like Merb or Rails provide convenient ways to cache output in static files or other stores in order to improve a web application's performance. Caching is typically handled inside controller classes. With merb-cache, for example, we can cache an entire page with something along the lines of:

class Foo < Merb::Controller
  cache_page :index
end

Expiring cached data is handled with a number of instance methods available to controllers, such as expire_page(key) or expire_all_pages. This implies that cache expiration has to be coded explicitly inside actions.
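
To make that coupling concrete, here is a minimal sketch of an action that has to remember to expire the page itself. StubController is a hypothetical stand-in, not Merb::Controller, and its expire_page only records the key instead of touching a real cache; only the method name comes from merb-cache.

```ruby
# Hypothetical stand-in for a controller base class; NOT Merb::Controller.
class StubController
  # Keys that have been "expired" so far.
  def self.expired
    @expired ||= []
  end

  # Records the key instead of deleting a cached page.
  def expire_page(key)
    StubController.expired << key
  end
end

class Foos < StubController
  def update(attrs)
    # ... persist attrs here ...
    expire_page(:index) # every writing action has to repeat this call
    "updated"
  end
end

Foos.new.update(:name => "bar")
```

Every action that writes data needs the same call, and a forgotten one leaves a stale page behind.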

The most common event signifying the need for cache expiration is the modification of the underlying data that has at some point been cached. More often than not, this means some sort of write (insert, update, delete) storage operation, which in turn means that cache expiration sits closer to the storage-aware parts of the application than to controllers. With this in mind, it would be useful to be able to configure cache expiration in a manner similar to that of cache creation, for example:

class Foo < Merb::Controller
  cache_page :index
  cache_watch :foo_store, :bar_store
end

The cache_watch :foo_store, :bar_store line signifies that any cached artifacts associated with this controller need to be expired whenever a data-altering operation takes place in the context of the FooStore or BarStore classes.

Treating data-altering operations as events makes a good case for the Observer pattern: cache expiration can be triggered whenever such an event takes place. ActiveRecord, for instance, lets us hook into the life cycle methods of persistent objects by means of Observers.

class FooObserver < ActiveRecord::Observer
  def after_save(foo)
    expire_cache
  end
end
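
Stripped of ActiveRecord, the same idea can be sketched in plain Ruby. FooStore, PageCacheObserver and the temporary cache file below are all hypothetical names made up for illustration; the store simply calls after_save on whatever observers have been registered with it.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical store that announces writes to its registered observers.
class FooStore
  def self.observers
    @observers ||= []
  end

  def save
    # ... write to storage here ...
    self.class.observers.each { |observer| observer.after_save(self) }
    true
  end
end

# Hypothetical observer that removes one cached file on every save.
class PageCacheObserver
  def initialize(path)
    @path = path
  end

  def after_save(_record)
    FileUtils.rm_f(@path)
  end
end

# Simulate a previously cached page in a temporary directory.
CACHE_FILE = File.join(Dir.mktmpdir, "index.html")
File.open(CACHE_FILE, "w") { |f| f.write("<html>cached</html>") }

FooStore.observers << PageCacheObserver.new(CACHE_FILE)
FooStore.new.save # the cached file is gone after this call
```

The store knows nothing about caching; it only notifies, which is what keeps expiration out of the controller actions.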

Putting it all together, we can create a module that enables configuring cache expiration declaratively inside controllers, in a way reminiscent of how cache creation is handled.

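require 'set'
require 'fileutils'

module CacheInvalidator
  # Gives any including controller the class-level cache_watch declaration.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def cache_watch(*models)
      CacheInvalidator.watch(self, *models)
    end
  end

  # Registers one entry per (controller, model) pair.
  def self.watch(controller, *models)
    @entries ||= Set.new
    models.each {|model| @entries << Entry.new(controller, model)}
  end

  # Defines and instantiates one observer class per registered entry.
  def self.activate!
    (@entries || []).each do |entry|

      # Skip this entry only; a return here would abandon the remaining ones.
      next if Object.const_defined?(entry.class_name)

      entry.log

      observer = Class.new(ActiveRecord::Observer) do
        include CacheInvalidator
        observe(entry.model)
        define_method(:entry) {entry}
      end

      Object.const_set(entry.class_name, observer)
      observer.instance
    end
  end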

  def after_save(model)
    destroy_cache
  end

  def after_destroy(model)
    destroy_cache
  end

  private

  def destroy_cache
    FileUtils.rm_f(entry.file_path) if File.file?(entry.file_path)
    FileUtils.rm_r(entry.dir_path) if File.directory?(entry.dir_path)
  end

  class Entry

    attr_reader :controller, :model

    def initialize(controller, model)
      @controller, @model = controller, model
    end

    def class_name
      (controller.name.gsub(/\:\:/, '') + model.to_s.camelize + "CacheObserver").intern
    end

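    # Set membership relies on eql? and hash, so define them alongside ==.
    def ==(other)
      other.is_a?(Entry) && controller == other.controller && model == other.model
    end
    alias_method :eql?, :==

    def hash
      [controller, model].hash
    end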

    def file_path
      "#{dir_path}.xml"
    end

    def dir_path
      "#{APP_ROOT}/public/cache/#{@controller.name.underscore}"
    end

    def log
      # Entry has no logger of its own, so go through Merb's.
      Merb.logger.info "Cache-watching #{model.to_s.camelize} for #{controller}"
    end
  end
end
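
The part of activate! that is easiest to misread is the generation of one named class per entry. The sketch below isolates that technique in plain Ruby; BaseObserver stands in for ActiveRecord::Observer, and define_observer and the constant names are hypothetical helpers made up for illustration.

```ruby
# Hypothetical base class standing in for ActiveRecord::Observer.
class BaseObserver
  def after_save(_record)
    "expired #{entry}"
  end
end

# Defines a named subclass of BaseObserver bound to one entry,
# or returns the existing class if the constant is already taken.
def define_observer(const_name, entry)
  return Object.const_get(const_name) if Object.const_defined?(const_name)

  observer = Class.new(BaseObserver) do
    # The block closes over the local variable, so each generated
    # class keeps a reference to its own entry.
    define_method(:entry) { entry }
  end

  # Assigning the anonymous class to a constant gives it a name.
  Object.const_set(const_name, observer)
end

define_observer(:FoosFooStoreCacheObserver, "foos/foo_store")
define_observer(:FoosBarStoreCacheObserver, "foos/bar_store")
```

Using define_method rather than def is what lets the generated class see entry: def opens a new scope, while a block keeps the surrounding local variables.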

By including the CacheInvalidator module we can declare cache invalidation rules inside controllers.

class FooController < Merb::Controller
  include CacheInvalidator
  cache_page :index
  cache_watch :foo_store, :bar_store
end
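
A plain include only adds instance methods, so a class-level declaration such as cache_watch needs the module to extend the including class as well. One common way to do that is the included hook, sketched here in plain Ruby; Watchable and its registry are hypothetical names, and FooController is an ordinary class rather than a Merb controller.

```ruby
module Watchable
  # Hypothetical registry of [class, model] pairs.
  def self.registry
    @registry ||= []
  end

  # Called by Ruby whenever the module is included somewhere.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Available at class level in every class that includes Watchable.
    def cache_watch(*models)
      models.each { |model| Watchable.registry << [self, model] }
    end
  end
end

class FooController
  include Watchable
  cache_watch :foo_store, :bar_store
end
```

Because cache_watch runs with self set to the controller class, the controller never has to pass itself in explicitly.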

Cache watching can then be activated wherever app initialization tasks are kept, such as init.rb in Merb.

Merb::BootLoader.after_app_loads do
  CacheInvalidator.activate!
end