Jan 12 2008

Synthesized Testing: A primer

... or, reducing the volume and increasing the value of test code by connecting the dots.

Among other objectives, test code aims to provide proof and confidence that the application code under test works as expected and as specified.

Tests are often classified under different categories, namely Unit Tests, Functional Tests, Integration Tests, Acceptance Tests, etc., each of which attempts to verify the system components' intended functionality with a different degree of instrumentation.

Unit tests, for example, are employed for testing individual system components in isolation from their peers or environment. Unit tests rarely connect to the database, touch the filesystem or access environment resources.

A typical test code-base will contain a layer of Functional or Integration tests sitting directly above the Unit Test layer. These still don't test the system in its whole, deployed form. They concentrate on testing the functionality of application components with their system dependencies present and wired.

At the functional testing level, testing a Service which accesses a Repository retrieving records from a database will involve opening an actual database connection, setting up real data and testing against it.

dependency = DependencyRepository.create_dependency('Foo')
record = Repository.create_record('Bar', dependency)
assert_equal record, Service.report(record.id)

Functional tests are often deemed necessary in order to achieve a sense of confidence that the pieces still work when put together. At the same time, because of their relative complexity, functional tests tend to become long, slow to run, and difficult to write and maintain. In essence, the bulk of Functional Tests lacks many of the qualities one might attribute to good test code.

The use of Mock Objects is a technique commonly found in Unit Tests aiming to verify interactions between the objects under test. Using Mocks, we concentrate on validating declared expectations of those interactions without relying on every single component to be loaded in order for the test to run.

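# Expectation in Mocha syntax: Repository.find(1) is called and returns :record
Repository.expects(:find).with(1).returns(:record)
assert_equal :record, Service.report(1)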

Examining the two code examples, it is apparent that there is overlap between what they logically test. Both verify the Service's behavior with regard to its communication with the Repository and its handling of the data involved between the call to Service.report and this interaction.

Behind the scenes, the functional test also ensures that the database connection works, the wiring between the Repository and the database adapter is functional, etc. These verifications are irrelevant in the context of what is being tested here. They also end up duplicated in all Functional tests that involve the Repository.

Tests with functional dependencies are brittle and tend to break for the wrong reasons.

Furthermore, it is common for test code-bases to involve a layer of Acceptance tests which are executed against the entire system in - or close to - its deployed form. As a result there is more testing overlap, this time between what the Functional and Acceptance tests are targeting.

The example unit test proves that a call to Service.report(1) will result in a call to Repository.find(1). It further asserts that the value returned by the call to Repository.find(1) is the one returned by Service.report(1).
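To make those two claims concrete without depending on a mocking library, here is a minimal sketch with a hand-rolled expectation. The bodies of Service and Repository, and the helper with_expected_find, are assumptions made for illustration only; they are not taken from any real code-base.

```ruby
# Stand-in classes: their bodies are assumed, the article does not show them.
class Repository
  def self.find(id)
    raise NotImplementedError, 'the real method would query the database'
  end
end

class Service
  # Assumed implementation: delegate the lookup to the Repository.
  def self.report(id)
    Repository.find(id)
  end
end

# Hypothetical helper: swaps Repository.find for a recording stub while the
# block runs, then restores the original. Returns the recorded arguments.
def with_expected_find(returns:)
  calls = []
  original = Repository.method(:find)
  Repository.define_singleton_method(:find) do |id|
    calls << id
    returns
  end
  yield
  calls
ensure
  Repository.define_singleton_method(:find) { |id| original.call(id) }
end

result = nil
calls = with_expected_find(returns: :record) { result = Service.report(1) }

# calls holds the ids Repository.find received; result is what Service returned.
raise 'Repository.find(1) was not called exactly once' unless calls == [1]
raise 'Service.report did not pass the value through' unless result == :record
```

Note that the real Repository.find never runs here, which is exactly the gap the following paragraphs are concerned with.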

Because the test exercises the concrete implementation of Service.report, it also attests that the actual implementation of this method has been tested.

This test doesn't offer enough proof that the components under test will work as intended as part of the deployed application. In particular, there is no evidence that Repository.find has been tested to work.

A code-base with adequate test coverage must contain tests verifying that the Repository's concrete implementation works.
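Such a test might look roughly like the sketch below. To keep it self-contained, the Repository here is backed by an in-memory hash standing in for the database; that storage, the record layout and the optional dependency argument are all assumptions, not the article's actual implementation.

```ruby
# Assumed Repository: an in-memory hash stands in for the database.
class Repository
  @records = {}
  @next_id = 0

  class << self
    # Stores a record and returns it, assigning the next free id.
    def create_record(name, dependency = nil)
      @next_id += 1
      @records[@next_id] = { id: @next_id, name: name, dependency: dependency }
    end

    # Returns the stored record, or nil when the id is unknown.
    def find(id)
      @records[id]
    end
  end
end

# The test targets the concrete Repository.find, with no mocks involved.
created = Repository.create_record('Bar')
found = Repository.find(created[:id])

raise 'find did not return the stored record' unless found == created
raise 'find should return nil for unknown ids' unless Repository.find(-1).nil?
```

The point is only that Repository.find is executed for real somewhere in the suite, so the expectation declared on it elsewhere has a tested counterpart.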

Consider a programmer who observes that Object A expects to receive a call on its method B when method C is called on Object D. Having proven that Object A's method B works, the programmer mentally combines the two facts to conclude that the two objects will work together as expected under the specified interaction.

If we could correlate the verified interaction expectations with tests against their concrete counterparts, we should be able to provide enough evidence that the dots will indeed work together once connected.
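The correlation itself can be pictured as a set difference. The following is a rough sketch of the idea only, not a description of how any particular tool does it; the function name, the pair representation and the Mailer example are hypothetical.

```ruby
require 'set'

# Returns the expectations that have no tested concrete counterpart.
# Both arguments are sets of [receiver, method] pairs.
def untested_expectations(expected, tested)
  expected - tested
end

# Interactions declared as mock expectations somewhere in the suite.
expected = Set[%w[Repository find], %w[Mailer deliver]]

# Concrete methods that some test actually invoked.
tested = Set[%w[Repository find], %w[Service report]]

untested = untested_expectations(expected, tested)
# Only the Mailer.deliver expectation is left without a concrete test.
```

An empty result would mean every dot that was mocked has also been verified on its own.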

By doing so, we can significantly reduce the volume and complexity of the functional tests, achieving a leaner, more meaningful, more robust test code-base.

As proof of concept, Synthesis is a Ruby implementation of the Synthesized Testing theory. It analyzes test code by collecting Mock Object expectations and verifies that their concrete implementations have been tested.