Archive for August, 2009

VCS practices over features

Saturday, August 29th, 2009

I’ve often heard people I know and respect say that git is leaps and bounds better than Subversion. I was a relatively early adopter of git, and it has been my VCS of choice for almost two years now. Even though I find it superior to most of the competition, I struggle to justify the “leaps and bounds” claim and would more modestly call it “a step forward”.

This is probably due to the practices we find benefit our development process. Git puts great emphasis on branching, something we generally tend to avoid (to clarify, I’m not referring to local branching). We concentrate on feedback based on the usage of our applications. This means that we strive to commit as often as possible and, most importantly, deploy to production at a constant rate. Grossly simplified, the process is: identify a small coherent feature, build it, commit it to the master branch and deploy. No part of the codebase is owned by a subdivision of the team; everyone works on everything.

By far the most popular git commands we issue are git pull, git add and git push, not that different to svn update and svn commit.

When I first started using git I wondered whether I had developed a fear of branching because of Subversion’s inefficiencies in that area. In reality, I think that an environment where every developer constantly has an up-to-date understanding of the codebase, and especially a current grasp of the design and overall vision, will always be more efficient than working remotely and having merge checkpoints, no matter how cleverly the VCS handles branching. This is why I think a faster, distributed VCS that is better at merging is a desirable step forward, but nothing more dramatic than that.

Hello world nginx module

Saturday, August 15th, 2009

Several times over the past few months I made short-lived attempts at delving into the mechanics of nginx modules. Although an invaluable resource to anyone seriously interested in the subject, Emiller’s Guide To Nginx Module Development doesn’t, at the time of this writing, include a quick-start example I could hack together and see in action. Getting something to run as quickly as possible is my preferred way of starting to study new things, and every time I caught myself searching the web for a “Hello world nginx module”.

I will not go into any detail; Emiller’s Guide does an excellent job of that. I’m only going to mention the steps I believe are absolutely necessary to write, compile and run an nginx handler module that responds to every request for a configured location with the string “hello world”.

At least two files are required for writing an nginx module. The first should be called config and looks something like this:

ngx_addon_name=ngx_http_hello_world_module
HTTP_MODULES="$HTTP_MODULES ngx_http_hello_world_module"
NGX_ADDON_SRCS="$NGX_ADDON_SRCS $ngx_addon_dir/ngx_http_hello_world_module.c"

The second is the module’s implementation in C. nginx convention suggests a name like ngx_http_modulename_module.c, in this case ngx_http_hello_world_module.c.

#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>

static char *ngx_http_hello_world(ngx_conf_t *cf, ngx_command_t *cmd, void *conf);

static ngx_command_t  ngx_http_hello_world_commands[] = {

  { ngx_string("hello_world"),
    NGX_HTTP_LOC_CONF|NGX_CONF_NOARGS,
    ngx_http_hello_world,
    0,
    0,
    NULL },

    ngx_null_command
};

static u_char  ngx_hello_world[] = "hello world";

static ngx_http_module_t  ngx_http_hello_world_module_ctx = {
  NULL,                          /* preconfiguration */
  NULL,                          /* postconfiguration */

  NULL,                          /* create main configuration */
  NULL,                          /* init main configuration */

  NULL,                          /* create server configuration */
  NULL,                          /* merge server configuration */

  NULL,                          /* create location configuration */
  NULL                           /* merge location configuration */
};

ngx_module_t ngx_http_hello_world_module = {
  NGX_MODULE_V1,
  &ngx_http_hello_world_module_ctx, /* module context */
  ngx_http_hello_world_commands,   /* module directives */
  NGX_HTTP_MODULE,               /* module type */
  NULL,                          /* init master */
  NULL,                          /* init module */
  NULL,                          /* init process */
  NULL,                          /* init thread */
  NULL,                          /* exit thread */
  NULL,                          /* exit process */
  NULL,                          /* exit master */
  NGX_MODULE_V1_PADDING
};

static ngx_int_t ngx_http_hello_world_handler(ngx_http_request_t *r)
{
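  ngx_int_t     rc;
  ngx_buf_t    *b;
  ngx_chain_t   out;

  r->headers_out.content_type.len = sizeof("text/plain") - 1;
  r->headers_out.content_type.data = (u_char *) "text/plain";

  /* Allocate the buffer from the request pool; fail the request if that is not possible. */
  b = ngx_pcalloc(r->pool, sizeof(ngx_buf_t));
  if (b == NULL) {
    return NGX_HTTP_INTERNAL_SERVER_ERROR;
  }

  out.buf = b;
  out.next = NULL;

  /* sizeof() counts the terminating NUL, so subtract 1 to send only the text. */
  b->pos = ngx_hello_world;
  b->last = ngx_hello_world + sizeof(ngx_hello_world) - 1;
  b->memory = 1;
  b->last_buf = 1;

  r->headers_out.status = NGX_HTTP_OK;
  r->headers_out.content_length_n = sizeof(ngx_hello_world) - 1;

  /* Stop here on error or when only headers were requested (e.g. HEAD). */
  rc = ngx_http_send_header(r);
  if (rc == NGX_ERROR || rc > NGX_OK || r->header_only) {
    return rc;
  }

  return ngx_http_output_filter(r, &out);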
}

static char *ngx_http_hello_world(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
{
  ngx_http_core_loc_conf_t  *clcf;

  clcf = ngx_http_conf_get_module_loc_conf(cf, ngx_http_core_module);
  clcf->handler = ngx_http_hello_world_handler;

  return NGX_CONF_OK;
}

Both config and ngx_http_hello_world_module.c should be placed in the same directory, let’s say /etc/ngxhelloworld. Modules are compiled into the nginx binary. To do so, download the nginx source, uncompress it, and in the nginx source directory run:

./configure --add-module=/etc/ngxhelloworld
make
sudo make install

Finally, add the module’s hello_world directive to nginx’s configuration (the default is /usr/local/nginx/conf/nginx.conf) to enable the module for a location.

location = /hello {
  hello_world;
}

At this point we can start nginx, and navigating to http://localhost/hello will yield the result of all this labor.

Alongside Emiller’s Guide, I also found reading the code of third-party nginx modules helpful.

Asynchronous session content injection

Thursday, August 6th, 2009

Applying a clear distinction between stateless and stateful content when designing a web application is tricky but worth tackling early, so that content not specific to user sessions can benefit from web caching. The technique we are trying out for scramble.com reminds me of what I described in State separation. It was introduced to me by Mike Jones, who was inspired by the Dynamically Update Cached Pages chapter in Advanced Rails Recipes.

The idea involves serving non-session-specific resources independently of personalized content and using AJAX calls to inject session-specific content into the page.

require 'rubygems'
require 'sinatra'
require 'json'

configure do
  enable :sessions
end

get '/' do
  headers['Cache-Control'] = 'max-age=60, must-revalidate'
  erb :index
end

get '/userinfo' do
  if session[:user]
    JSON.dump(:user => session[:user])
  else
    halt 401
  end
end

get '/login' do
  session[:user] = 'rock'
  redirect '/'
end

get '/logout' do
  session.clear
  redirect '/'
end

Notice some of the headers for '/':

$ curl -I http://localhost:4567/
Cache-Control: max-age=60, must-revalidate
Set-Cookie: rack.session=BAh7AA%3D%3D%0A; path=/

The Cache-Control policy instructs a web cache to keep this version of the resource for 60 seconds before requesting a fresh one. Set-Cookie, however, will usually cause a web cache to never store the response and always query its back end.
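
To make the interaction between the two headers concrete, here is a minimal Ruby sketch of the decision a shared cache might make. The method name shared_cache_ttl and its rules are assumptions made for illustration: it mimics the common default of refusing to store a response that sets a cookie, and it is not Varnish's actual logic.

```ruby
# Hypothetical helper: returns the number of seconds a shared cache may keep
# a response, or nil when the response should not be stored at all.
# Simplified model; real caches consider many more headers and directives.
def shared_cache_ttl(headers)
  # Normalise header names so lookups are case-insensitive.
  normalized = headers.each_with_object({}) { |(k, v), h| h[k.downcase] = v }

  # A response that sets a cookie is treated as user specific.
  return nil if normalized.key?('set-cookie')

  cache_control = normalized['cache-control'].to_s
  return nil if cache_control =~ /\b(private|no-store)\b/

  # Without an explicit max-age this sketch declines to cache.
  match = cache_control.match(/\bmax-age=(\d+)/)
  match ? match[1].to_i : nil
end

with_cookie = {
  'Cache-Control' => 'max-age=60, must-revalidate',
  'Set-Cookie'    => 'rack.session=BAh7AA%3D%3D%0A; path=/'
}
without_cookie = { 'Cache-Control' => 'max-age=60, must-revalidate' }

puts shared_cache_ttl(with_cookie).inspect    # cookie present: not stored
puts shared_cache_ttl(without_cookie).inspect # cookie absent: stored for 60 seconds
```

In this model the same Cache-Control header yields a usable lifetime only once the cookie is gone, which is why the cookie has to be removed somewhere between the application and the cache.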

The following configuration tells Varnish to throw away the cookie from any request or response that doesn’t match one of the URLs that require authorization, so that it honors the response’s cache policy.

sub vcl_recv {
  if (req.url !~ "^(/login|/logout|/userinfo)") {
    unset req.http.cookie;
  }
}

sub vcl_fetch {
  if (req.url !~ "^(/login|/logout|/userinfo)") {
    unset obj.http.set-cookie;
  }
}
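
The regular expression used in both subroutines is anchored only at the start of the URL, so it behaves as a prefix match. The Ruby sketch below mirrors that check to show which paths would keep their cookies. keeps_cookie? is a hypothetical helper written for illustration, and Ruby's regex engine is only standing in for Varnish's.

```ruby
# Same pattern as the VCL above, expressed as a Ruby regular expression.
# \A anchors at the start of the string, as ^ does for a single-line URL.
SESSION_PATHS = %r{\A(/login|/logout|/userinfo)}

# Hypothetical helper: true when the VCL would leave the cookie untouched.
def keeps_cookie?(url)
  url.match?(SESSION_PATHS)
end

['/', '/userinfo', '/login?next=/', '/loginhelp', '/about'].each do |url|
  puts "#{url} keeps cookie: #{keeps_cookie?(url)}"
end
```

Note that a path such as /loginhelp also keeps its cookie because the pattern has no end anchor; a stricter pattern would be needed if such a URL were meant to be cached.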

A snippet from the HTML response for '/':

<h1>Hi</h1>
<div id="nav">
  <a href="/login" class='login-control'>Login</a>
</div>

… and the JavaScript for asynchronously injecting session data into the page:

$(function() {
  $.getJSON('/userinfo', function(data) {
    $('h1').text('Hi ' + data.user);
    $('#nav .login-control').attr('href', '/logout').html('logout');
  })
})

In summary, it is likely that a website will have significant amounts of content that is intended for everyone without the need for personalization. The performance of serving that content can benefit from web caching, but that becomes difficult as many websites’ user experience depends on the presence of user sessions. Separating stateless from session specific content at the resource level and using a combination of HTTP and AJAX to merge the results of requests for both types of resources will make stateless content cacheable by decoupling it from the unnecessary cookie dependency.

Runnable code example: http://pastie.org/573878