Cacheable HTTP search query results
I have worked on a number of web applications which required searching catalogs of data based on filtering criteria. The most common implementation I see involves issuing a GET request to a search service, providing the search criteria as part of the request’s query string.
http://example.com/search?category=music&subcategory=rock&page=7
This approach does not easily lend itself to static resource caching, one of the most effective ways to improve a web app’s performance. However well the application code is optimized, the database queries are tuned, or a cache like memcached is applied, a request that reaches the application server is unlikely to be served as efficiently as one handled entirely by a high-performance HTTP server like Nginx.
If we treat search queries as RESTful HTTP resources, each uniquely identified by a URI, rather than as RPC-style commands, we should be able to cache the results the first time a search request is processed.
http://example.com/search_results/someuniqueidentifier
The unique identifier part of the URI can take the form of an encoded string which, when decoded, provides the application with the filter criteria for the search. This assumes that the client and server share a common protocol which defines how the identifier is constructed. For example, the criteria should appear in a fixed, expected order. Searches for {category : music, subcategory : rock} and {subcategory : rock, category : music} produce the same results, but if both orderings are allowed, the resource will be cached twice under two separate URIs, wasting cache space and forcing the application to process the same search twice.
One potential solution is to Base64 encode a string that is constructed in a predefined format from the filter criteria, and to decode it on the server.
require 'cgi'
CGI.unescape(identifier).unpack('m')[0] # => "music,rock,,,,7,30"
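The full round trip might look something like the sketch below. It is only an illustration under assumptions: the seven-field order is inferred from the decoded string above, the names slot3 to slot5 and per_page are placeholders for fields the example leaves unnamed, and encode_identifier and decode_identifier are hypothetical helpers rather than part of any library. Because the fields are always emitted in the same order, the order of keys in the criteria does not affect the resulting URI.

```ruby
require 'cgi'

# Hypothetical field order shared by client and server. Only category,
# subcategory and page are named in the article; the rest are placeholders.
FIELDS = [:category, :subcategory, :slot3, :slot4, :slot5, :page, :per_page].freeze

# Build the identifier from a criteria hash, whatever its key order.
def encode_identifier(criteria)
  raw = FIELDS.map { |field| criteria[field].to_s }.join(',')
  CGI.escape([raw].pack('m0')) # 'm0' is Base64 without line breaks
end

# Recover the criteria from an identifier taken from the URI.
# All values come back as strings; unused fields are empty strings.
def decode_identifier(identifier)
  raw = CGI.unescape(identifier).unpack('m')[0]
  FIELDS.zip(raw.split(',', -1)).to_h # -1 keeps trailing empty fields
end
```

A value that itself contains a comma would break this format, so a real protocol would need to escape or forbid the separator.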
This method will not be useful for websites with a plain HTML front end. It requires a client capable of dynamically constructing URIs from the filter criteria. JavaScript, ActionScript or generic web service consumer applications are all good candidates.

July 15th, 2008 at 9:00 pm
Hey George,
Why is your first URI “less cacheable” than the second? They’re both just identifiers for resources after all. There’s nothing inherently “more RESTful” about the second than the first.
Jim
July 15th, 2008 at 10:15 pm
Hi Jim,
Granted about the relative RESTfulness. However, the default behavior of most web servers is to ignore the query string if the requested URL corresponds to a file (the cached resource) and not a program. This technique is useful if you want Nginx or Apache to deliver the content from disk and would prefer to avoid getting your hands dirty with URL rewrite rules.