Archive for May, 2008

DataMapper without a database

Thursday, May 29th, 2008

DataMapper is fast becoming a credible contender in the Ruby ORM field. The first - and only at this early stage - thing that temporarily disappointed me was the following scenario.

class Foo
  include DataMapper::Resource

  property :id, Integer, :serial => true
  property :title, String
end

Running this produces ArgumentError: Unknown adapter name: default, suggesting that a database connection needs to be set up in order to use any objects that include the DataMapper::Resource module. This is something I would rather not have to do for my dependency neutral test suite, in which all calls to ORM objects are simulated using mocks.

I soon realized that DataMapper doesn’t require a database connection to be present, but needs to know which adapter to use. If we’re not interested in interacting with the database, using DataMapper::Adapters::AbstractAdapter does the trick.

DataMapper.setup(:default, "abstract::")

class Foo
  include DataMapper::Resource

  property :id, Integer, :serial => true
  property :title, String
end

Foo.new(:title => "metal").title # => "metal"

Synthesis visualizations

Thursday, May 29th, 2008

Synthesized testing is about accurately simulating object interactions and verifying that each end point of every interaction has been tested to work. A code base tested with this strategy ends up with a suite that doubles as a specification of the application’s ecosystem in terms of object communication.

Danilo has recently been contributing some excellent work around visual representations of the above. The code is being developed on the Synthesis experimental branch on GitHub.

Consider the Synthesis test_project example.

class DataBrander
  BRAND = "METAL"

  def initialize(storage)
    @storage = storage
  end

  def save_branded(data)
    @storage.save "#{BRAND} - #{data}"
  end

  def dont_do_this
    @storage.ouch!
  end
end

class Storage
  def initialize(filename)
    @filename = filename
  end

  def save(val)
    File.open(@filename, 'w') {|f| f << val}
  end

  def ouch!
    raise Problem
  end
end

class Problem < Exception;end

Below are the complete specs for the above implementation.

describe DataBrander do
  it "should save branded to storage" do
    storage = Storage.new("")
    storage.should_receive(:save).with("METAL - rock")
    DataBrander.new(storage).save_branded("rock")
  end

  it "should delegate problem" do
    storage = Storage.new("")
    storage.should_receive(:ouch!).and_raise(Problem.new)
    proc {DataBrander.new(storage).dont_do_this}.should raise_error(Problem)
  end
end

describe Storage do
  it "should save to file" do
    begin
      Storage.new("test.txt").save("rock")
      File.read("test.txt").should == "rock"
    ensure
      FileUtils.rm_f("test.txt")
    end
  end

  it "should raise problem on ouch!" do
    proc { Storage.new("").ouch! }.should raise_error(Problem)
  end
end

A Synthesis run using the DOT formatter produces:

dot-synthesis-passing

Removing the "should save to file" spec will cause the Synthesis task to fail.

dot-synthesis-failing
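The reason for the failure can be illustrated with a deliberately simplified sketch. This is not how Synthesis is implemented; it only models the idea that every interaction expected by a mocked spec must also be exercised by a concrete spec. The constant EXPECTED_BY_MOCKS and the method untested_interactions are hypothetical names, and the sets are written by hand rather than collected from a test run.

```ruby
require "set"

# Each interaction is a [receiver class name, method name] pair.
# EXPECTED_BY_MOCKS lists what the mocked DataBrander specs rely on.
EXPECTED_BY_MOCKS = Set[["Storage", :save], ["Storage", :ouch!]]

# Return the mocked interactions that no concrete spec has exercised.
def untested_interactions(expected, exercised)
  (expected - exercised).to_a
end

# With the full Storage specs, every expectation is backed by a real call.
full_suite = Set[["Storage", :save], ["Storage", :ouch!]]
p untested_interactions(EXPECTED_BY_MOCKS, full_suite)     # []

# Without "should save to file", Storage#save is mocked but never really run.
reduced_suite = Set[["Storage", :ouch!]]
p untested_interactions(EXPECTED_BY_MOCKS, reduced_suite)  # [["Storage", :save]]
```

A non-empty result is what corresponds to a failing run in this simplified model.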

Below is what a real (relatively small) project looks like.

full-project

Being able to inspect how our application is modeled through such a representation is a very appealing benefit on top of the confidence in our system that Synthesis already provides. The DOT formatter will become part of the Synthesis gem as soon as we iron out the few remaining glitches.

Using Bazaar with RubyForge

Tuesday, May 27th, 2008

Bazaar is a distributed version control system written in Python, similar to Git. Bazaar places particular focus on usability: it is easy and natural to use, especially for those visiting or migrating from the world of Git.

One of Bazaar’s striking features is the ability to publish branches with sftp, provided there is a web server available. RubyForge project accounts come with support for both, so publishing a Bazaar branch is as easy as:

bzr push --create-prefix sftp://you@rubyforge.org/var/www/gforge-projects/your-project/bzr

Developers can create their own copy of the branch with:

bzr branch http://your-project.rubyforge.org/bzr

Erlang eval and dynamic dispatch

Sunday, May 18th, 2008

Ruby’s Object#send method offers an elegant way of invoking a method whose name is only known at runtime, translating a symbol into a method dispatch. I was looking for similar functionality in Erlang and here’s what I came up with.
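As a reminder of the Ruby behaviour being imitated, here is a minimal sketch of symbol-based dispatch. The Greeter class and its hello method are hypothetical names introduced purely for illustration; send and public_send are standard Ruby.

```ruby
# Hypothetical class used only to illustrate dispatch by name.
class Greeter
  def hello(who = "world")
    "hello, #{who}"
  end
end

greeter = Greeter.new

# The method to call is chosen at runtime from a symbol.
command = :hello
puts greeter.send(command)          # no arguments
puts greeter.send(command, "rock")  # arguments follow the method name

# public_send dispatches the same way but refuses to call private methods.
puts greeter.public_send(command, "metal")
```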

First, let’s see how we can achieve eval functionality in Erlang, i.e. evaluate strings as Erlang code at runtime.

-module (meta).
-export ([eval/2]).

eval(Code, Args) ->
  {ok, Scanned, _} = erl_scan:string(Code),
  {ok, Parsed} = erl_parse:parse_exprs(Scanned),
  Bindings = lists:foldl(fun ({Key, Val}, BindingsAccumulator) ->
    erl_eval:add_binding(Key, Val, BindingsAccumulator)
  end, erl_eval:new_bindings(), Args),
  {value, Result, _} = erl_eval:exprs(Parsed, Bindings),
  Result.

erl_scan is Erlang’s token scanner module. The string function tokenizes a list of characters. erl_parse is the Erlang parser module and the parse_exprs function parses a list of tokens, each Token representing an expression. It returns a list of the abstract forms of the parsed expressions, ready to be used with erl_eval, the Erlang meta interpreter. An arbitrary list of bindings can be provided alongside the parsed expressions to erl_eval:exprs.

With the meta:eval function in place, we can evaluate arbitrary strings of code at runtime.

Eshell V5.6  (abort with ^G)
1> c(meta).
{ok,meta}
2> meta:eval("20 + 30.", []).
50
3> meta:eval("A + B.", [{'A', 15}, {'B', 60}]).
75
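For comparison, the Ruby side of the same idea can be sketched with the standard Binding class. The helper name eval_with is hypothetical; Binding#local_variable_set and Binding#eval are part of core Ruby (2.1 and later).

```ruby
# Evaluate a string of Ruby code with the given variables bound.
# `eval_with` is a hypothetical helper name, not part of any library.
def eval_with(code, vars = {})
  scope = binding  # a fresh binding local to this method call
  vars.each { |name, value| scope.local_variable_set(name, value) }
  scope.eval(code)
end

puts eval_with("20 + 30")                   # 50
puts eval_with("a + b", { a: 15, b: 60 })   # 75
```

As with meta:eval, the variable bindings are passed alongside the code rather than interpolated into the string.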

We can build on meta:eval to achieve Ruby-like dynamic dispatches.

send(MethodName, Args) ->
  ArgNames = lists:foldl(fun ({K, _}, Acc) -> lists:append([K], Acc) end, [], Args),
  Code = atom_to_list(MethodName) ++ "(" ++ atom_join(ArgNames, $,) ++ ").",
  eval(Code, Args).

The send function takes two arguments, an atom which is the name of the function to be dispatched and a list of tuples with key/value pairs for the arguments to be passed to the function. atom_join joins a list of atoms into one string using the supplied separator.

atom_join([], _Sep) -> [];
atom_join(Items, Sep) -> lists:flatten(atom_join1(Items, Sep, [])).
atom_join1([Head | []], _Sep, Acc) -> [atom_to_list(Head) | Acc];
atom_join1([Head | Tail], Sep, Acc) -> atom_join1(Tail, Sep, [Sep, atom_to_list(Head) | Acc]).

Let’s add a couple of test functions to showcase what has been achieved.

hello() -> "hello, world".
hello(Who) -> "hello, " ++ Who.

Back to Eshell…

Eshell V5.6  (abort with ^G)
1> c(meta).
{ok,meta}
2> meta:send('meta:hello', []).
"hello, world"
3> meta:send('meta:hello', [{'Name', "rock"}]).
"hello, rock"

JSynthesis

Wednesday, May 14th, 2008

A big thank you to Chris Barrett who has been taking the time to port Synthesis to Java.

JSynthesis is registered as a GoogleCode project and it will surely be integral to my toolkit next time I work on a Java project.

Confidence as a test code metric

Saturday, May 10th, 2008

With testing occupying a major part of our development process, we have often attempted to quantify test code quality. As with many things, it is worth considering how test code ultimately manifests itself in terms of added value. This is why Stuart and I have recently tended to conclude that, stripped of technically granular details, test code must fundamentally contribute to building confidence that the system under test is complete: proof that what we’ve built works and will continue to work as intended.

A working system fulfilling its business objectives can be considered complete enough but, if it is not easily extensible and maintainable, it cannot be said to be as good as it can be. Advancements in software development methodologies that assist in delivering working software that is easy to extend and maintain - higher level abstractions, modeling and design - have been driven by the need to reduce technical debt. Technical debt can be viewed as the cost of change.

Test code is code, too. As code bases grow more elaborate, test code also suffers from technical debt, demanding methods to eliminate the factors that hinder its maintainability and extensibility. With present practices, how extensible and maintainable test code is tends to be inversely proportional to the amount of confidence it achieves.

The confidence scale

The different categories on the scale are not mutually exclusive; in fact, they are commonly combined as members of a suite that exercises the system in various degrees of instrumentation. Walking the scale from left (empty) to right (full), we move away from tests that are generally easier to write, understand, run and maintain but are, at the same time, less representative of the real system with all its components integrated.

Dependency neutral tests with all of the tested component’s dependencies stubbed are disconnected, vaguely describe how the component interacts with its environment and offer minimal proof that the component will work as specified once a member of the application ecosystem.

The fundamental difference between interaction based dependency neutral tests and their stubbed counterparts is the accurate interaction specification of collaborating components through the use of mock objects instead of stubs. Here, we concentrate on specifying the contract of communication between two components. Although much closer to how the actual system operates, these tests are still disconnected. Despite the accurate specification of the interaction, we don’t have complete proof that the pieces fit. In particular, interaction based dependency neutral tests do not offer proof that the mocked collaborators have been tested to work.
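The distinction can be made concrete with a small sketch in plain Ruby, without a mocking library. All names here (Brander, StubStorage, MockStorage, verify!) are hypothetical, and the mock is a hand-rolled recording double rather than what a real mocking framework would generate.

```ruby
# Component under test: delegates to a storage collaborator.
class Brander
  def initialize(storage)
    @storage = storage
  end

  def save_branded(data)
    @storage.save("METAL - #{data}")
  end
end

# A stub accepts the call and verifies nothing about it.
class StubStorage
  def save(_value); end
end

# A hand-rolled mock records calls so the test can verify the interaction.
class MockStorage
  attr_reader :calls

  def initialize
    @calls = []
  end

  def save(value)
    @calls << [:save, value]
  end

  def verify!(expected)
    raise "expected #{expected.inspect}, got #{@calls.inspect}" unless @calls == expected
  end
end

# Stubbed test: passes as long as nothing blows up.
Brander.new(StubStorage.new).save_branded("rock")

# Interaction based test: pins down the exact message sent to the collaborator.
mock = MockStorage.new
Brander.new(mock).save_branded("rock")
mock.verify!([[:save, "METAL - rock"]])
puts "interaction verified"
```

Neither test touches a real storage object, which is exactly the disconnect under discussion: the contract is specified, but nothing here shows that a real collaborator honours it.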

It becomes apparent that the major flaw of interaction based dependency neutral tests is their disconnect from their peers.

As we move towards the “full” side of the confidence scale, tests tend to become larger and more complicated. Wired tests draw a picture much closer to that of the system in its entire form but suffer from poor defect localization (test failures are not always directly related to the intent of the specific test) and disrespect encapsulation (setup code often exposes the behavior of components irrelevant to the context of the current test). The dependency wired tests’ contribution to technical debt is much more significant.

Understanding the importance of confidence in our system and aiming to reduce technical debt, Synthesized Testing suggests a solution that attempts to rectify the disconnect of lightweight, interaction based dependency neutral tests and reduce the need for overarching dependency wired tests that are prone to technical debt.

Unambiguous command abbreviation

Wednesday, May 7th, 2008

When using RubyGems from the command line, I almost always type sudo gem i synthesis as opposed to sudo gem install synthesis, the emphasis being on using “i” instead of “install”, of course. The gem executable happily understands what command it is being asked to execute when provided with the first few letters of the command, as long as those letters are not ambiguous, i.e. don’t clash with the names of other commands. So even though sudo gem u foo complains that Ambiguous command u matches [uninstall, unpack, update], sudo gem uni foo will uninstall the specified gem.

Here’s how this is implemented in RubyGems.

def find_command(cmd_name)
  possibilities = find_command_possibilities(cmd_name)
  if possibilities.size > 1
    raise "Ambiguous command #{cmd_name} matches [#{possibilities.join(', ')}]"
  end
  if possibilities.size < 1
    raise "Unknown command #{cmd_name}"
  end

  self[possibilities.first]
end

def find_command_possibilities(cmd_name)
  len = cmd_name.length
  self.command_names.select { |n| cmd_name == n[0,len] }
end
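The two methods above rely on their host object for self[] and command_names. A minimal self-contained sketch of the same idea, assuming a hypothetical CommandTable class and a hand-picked list of command names, might look like this:

```ruby
# Hypothetical stand-in for the object that hosts find_command in RubyGems.
class CommandTable
  def initialize(names)
    @names = names
  end

  # Resolve a possibly abbreviated name to exactly one full command name.
  def find_command(cmd_name)
    matches = @names.select { |name| name.start_with?(cmd_name) }
    raise "Ambiguous command #{cmd_name} matches [#{matches.join(', ')}]" if matches.size > 1
    raise "Unknown command #{cmd_name}" if matches.empty?
    matches.first
  end
end

commands = CommandTable.new(%w[install uninstall unpack update])

puts commands.find_command("i")    # install
puts commands.find_command("uni")  # uninstall

begin
  commands.find_command("u")
rescue RuntimeError => e
  puts e.message  # Ambiguous command u matches [uninstall, unpack, update]
end
```

One design question the sketch leaves open is what to do when a full command name is itself a prefix of another; checking for an exact match first would be one way to handle it.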

In the same vein, although not strictly a command abbreviation, Danilo pointed out git understands abbreviated revision hashes, so it’s possible to use something like git diff d0a..HEAD even with the hash’s complete representation being d0aa7dd4aa9a95090df1e0b9d0f426d5a5bd56ae.

Less typing is almost always a good option to have. The easy to implement unambiguous command abbreviation trick adds a subtle usability improvement to command line interfaces and is a nice treat for the utility’s power users.

Distributed programming with Jabber and EventMachine

Sunday, May 4th, 2008

Jabber and its underlying protocol XMPP are typically associated with instant messaging applications, although the breadth and flexibility of the technology allow for implementations that reach well beyond traditional online chatting.

ejabberd is a fault tolerant and clusterable Jabber/XMPP server written in Erlang and presents an interesting option as a simple, lightweight and scalable message transport for distributed applications.

EventMachine is a simple and fast library for lightweight concurrency in Ruby. Its features include, but are not limited to, spawning lightweight processes whose execution can be programmatically scheduled, easy and fast socket abstractions and an implementation of the Deferrable pattern as introduced by Twisted, the event-driven Python networking engine.

When a Ruby class includes the EventMachine::Deferrable module, it is provided with the ability to accept arbitrary callbacks and errbacks that will get executed when its deferred status changes, in particular when it is set to either :succeeded or :failed. Let’s look at a deferrable Worker class which performs a potentially long running operation.

class Worker
  include EM::Deferrable

  def heavy_lifting
    30.times do |i|
      puts "Lifted #{i}"
      sleep 0.1
    end
    set_deferred_status :succeeded
  end
end
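To make the callback mechanics concrete without installing EventMachine, here is a simplified, single-threaded stand-in for the deferrable idea. It is not EventMachine’s implementation: the SimpleDeferrable module and the Job class are hypothetical, and only the method names callback, errback and set_deferred_status are borrowed from the text above.

```ruby
# A simplified stand-in for the deferrable idea; illustration only.
module SimpleDeferrable
  def callback(&block)
    register(:succeeded, block)
  end

  def errback(&block)
    register(:failed, block)
  end

  # Record the outcome and run the handlers registered for it.
  def set_deferred_status(status, *args)
    @deferred_status = status
    @deferred_args = args
    handlers[status].each { |handler| handler.call(*args) }
    handlers.clear
  end

  private

  def handlers
    @handlers ||= Hash.new { |hash, key| hash[key] = [] }
  end

  # Run immediately if the outcome is already known, otherwise queue.
  def register(status, block)
    if @deferred_status == status
      block.call(*@deferred_args)
    elsif @deferred_status.nil?
      handlers[status] << block
    end
  end
end

# Hypothetical job that reports success or failure through its deferred status.
class Job
  include SimpleDeferrable

  def run(ok)
    ok ? set_deferred_status(:succeeded, "lifted") : set_deferred_status(:failed, "dropped")
  end
end

job = Job.new
job.callback { |result| puts "callback: #{result}" }
job.errback  { |reason| puts "errback: #{reason}" }
job.run(true)  # prints "callback: lifted"
```

In this sketch only the handlers matching the final status run, and a handler attached after the status is set runs straight away.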

Inside an EventMachine loop, we can add callbacks to a Worker instance and dispatch the expensive operation to a separate thread, or an evented process. The program’s execution will continue, with any callbacks attached to Worker executed once its deferred status is set.

EM.run do
  worker = Worker.new
  worker.callback {p "done!"}
  Thread.new {worker.heavy_lifting; EM.stop}
  puts "resuming remaining program operations"
end

Now, let’s look at combining Worker with Jabber to trigger long running jobs. For Jabber server duties, I am using ejabberd on an old laptop running Debian, but there’s no reason why a mass online Jabber service like Google Talk could not be used for playing around with the example. Also, I’m using the xmpp4r-simple Ruby library, which is a wrapper around xmpp4r.

jabber = Jabber::Simple.new("bot@thrash", "password")
at_exit{jabber.status(:away, "jabot down")}

EM.run do
  EM::PeriodicTimer.new(1) do
    jabber.received_messages do |message|
      case message.body
      when "exit" then EM.stop
      when "lift"
        EM.spawn do
          worker = Worker.new
          worker.callback {jabber.deliver(message.from, "Done lifting")}
          worker.heavy_lifting
        end.notify
        jabber.deliver(message.from, "Scheduled heavy job...")
      else jabber.deliver(message.from, "Dunno how to #{message.body}")
      end
    end
  end
end

Inside an EventMachine loop, we check for new messages every second. The program understands two commands, exit and lift. The first quits the EventMachine loop and ultimately terminates the program’s execution. When lift is received, we instantiate a new Worker inside a spawned process and add a callback so that the Worker will notify the command issuer when the job has completed. Worth noting is the use of notify to schedule the spawned process. notify returns immediately making work dispatch non-blocking - upon issuing a lift command twice, a “Scheduled heavy job…” message will be sent to the job issuer twice before the first job completes.

I use Adium to send commands to the program - an interesting way of remote controlling or interacting with applications. Of course, the real interest lies in using the setup under discussion for inter-app communication. With multicast options, presence discovery, node status updates and more, there is a lot to explore in terms of distributed application development, if simple and lightweight are two keywords to be found in the highest ranks of your list.