Feb 01 2009

State separation

Web applications commonly serve content that is specific to a user's session. This makes web caching harder to implement, because content meant for one particular user must not be cached and accidentally served to others. Some HTTP accelerators, such as Varnish, by default decline to cache responses that involve cookies at all. However, not all content is tied to a user's session, and if that content doesn't change in real time, it makes sense to cache the parts that are common to all users in order to improve efficiency. With this in mind, one logical split can be made between the parts of the system that are globally cache-friendly and the ones that aren't.

Consider online retail websites, which usually operate in two modes: one for visitors and one for logged-in users. Logged-in users are presented with a customized, session-specific experience, yet data such as the product catalog is essentially the same whether or not one is logged in, so it makes sense for everyone to access the same cached copy of that common resource.
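As a rough illustration of that split, the sketch below picks a Cache-Control header based on the request path. The path prefixes, the function name and the five-minute lifetime are assumptions made for the example, not something a particular framework or retailer prescribes.

```python
# Hypothetical path prefixes; a real site would define its own split.
SHARED_PREFIXES = ("/catalog/", "/static/")


def cache_headers(path: str, max_age: int = 300) -> dict[str, str]:
    """Return response headers describing how a path may be cached."""
    if path.startswith(SHARED_PREFIXES):
        # Same content for every user: shared caches may store it.
        return {"Cache-Control": f"public, max-age={max_age}"}
    # Session-specific content: ask caches not to store it at all.
    return {"Cache-Control": "private, no-store"}
```

A catalog page such as /catalog/shoes would then be marked as shareable for five minutes, while something like /account/orders would be kept out of caches.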

A possible solution involves creating two separate web applications: one entirely dedicated to stateless interactions and one meant for pages that are rendered as part of a user's session. This might seem like overkill, but it clearly enforces the divide between what can and what can't be cached. It also promotes reuse of the system's web caching layer, which now serves content to site "visitors" as well as to the stateful components. The stateful application can delegate requests for potentially cached content to its stateless counterpart via the caching layer and decorate the responses with session-specific data.

[Figure: split_by_state]
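The delegation described above can be sketched in a few lines. This is a toy, in-process model: CachingLayer stands in for a real HTTP cache such as Varnish, it never expires entries, and every name in it (render_catalog_page, render_for_session, the "<!--user-->" placeholder) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict


def render_catalog_page(path: str) -> str:
    """Stateless application: the output depends only on the path."""
    return f"<h1>Catalog: {path}</h1><!--user-->"


class CachingLayer:
    """Stand-in for an HTTP cache, keyed by path only (no expiry)."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self.backend = backend
        self.store: Dict[str, str] = {}
        self.backend_calls = 0

    def get(self, path: str) -> str:
        if path not in self.store:
            # Cache miss: ask the stateless application and keep the result.
            self.backend_calls += 1
            self.store[path] = self.backend(path)
        return self.store[path]


def render_for_session(cache: CachingLayer, path: str, username: str) -> str:
    """Stateful application: fetch shared content, then decorate it."""
    shared = cache.get(path)
    # The decoration happens after the cache, so it is never stored there.
    return shared.replace("<!--user-->", f"<p>Welcome back, {username}</p>")


cache = CachingLayer(render_catalog_page)
visitor_page = cache.get("/shoes")                       # anonymous visitor
alice_page = render_for_session(cache, "/shoes", "alice")
bob_page = render_for_session(cache, "/shoes", "bob")
```

The point of the sketch is that the visitor and both logged-in users share one cached copy of the catalog page, while the session-specific greeting is added downstream of the cache and therefore cannot leak from one user to another.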

Web caching is but one way to cache data that remains static for predefined periods of time. Apart from harnessing proven existing tools, this form of caching has the advantage that its policies are universally understood, so it can significantly improve a website's efficiency in ways beyond the maintainer's control, for example through intermediary proxies and browser caches. Retrofitting web caching into an application that wasn't designed with it in mind can be difficult, so it is worth logically separating cacheable and non-cacheable resources early on.