Elwinar

Minimalist by design

Testing with containers

2018.03.05

In this article, I explain why and how we use containers for testing at Synthesio. For simplicity’s sake, the examples assume a project written in Go that uses a MySQL database, with Docker as the container runtime, but the method can be (and is) used with any language or technology, and with more complex setups.

Dependencies are complex pieces of software, for a good reason. A database, for example, abstracts away a huge amount of complexity so that our code doesn’t have to handle it itself and can stay simple while focusing on business features.

Testing code that uses such a dependency, however, is harder than it seems, and most testing frameworks make it unnecessarily complex. One reason is that when we think, speak, or read about testing, we treat the term almost interchangeably with unit testing. And all in all, unit tests are poorly suited to testing code that directly uses a complex dependency, which isn’t surprising given that the whole point of unit testing is to not use any dependency.

This is not the solution you’re looking for

In the current state of the art, the most commonly accepted solution for unit-testing a piece of code with a dependency is to fake it: to give the unit under test something that mimics the dependency and behaves as we want, so we can verify that the code reacts as expected. This method, called mocking, is actually a bad solution to the problem at hand.

To be perfectly honest, faking the dependency is a good solution for a certain class of dependencies. Unit testing was born during the rise of object-oriented programming, and is unsurprisingly well suited to code whose dependency is an interface of reasonable complexity, the mock generally being a dumb, in-memory version of the real dependency. A database, for example, is generally too complex for this.

Let’s take a concrete example to illustrate the point. Go has a library for mocking a SQL database during tests, sqlmock. Here is an example taken from the library’s GitHub repository.

func TestShouldUpdateStats(t *testing.T) {
    db, mock, err := sqlmock.New()
    if err != nil {
        t.Fatalf("an error '%s' was not expected when opening a stub database connection", err)
    }
    defer db.Close()

    mock.ExpectBegin()
    mock.ExpectExec("UPDATE products").WillReturnResult(sqlmock.NewResult(1, 1))
    mock.ExpectExec("INSERT INTO product_viewers").WillReturnResult(sqlmock.NewResult(1, 1))
    mock.ExpectCommit()

    // now we execute our method
    if err = recordStats(db, 2, 3); err != nil {
        t.Errorf("error was not expected while updating stats: %s", err)
    }

    // we make sure that all expectations were met
    if err := mock.ExpectationsWereMet(); err != nil {
        t.Errorf("there were unfulfilled expectations: %s", err)
    }
}

If we think about it, the real interface between our code and the database isn’t the method being called, it’s the query language itself. And it isn’t the query language that is mocked here, but simply the implementation code that is supposed to communicate with the database, leaving the actual mocking, verification, etc. to the user.

The result is that tests using this kind of library generally do only half of the job, and generally not the half that is actually worth testing. Checking that a query wrapper really calls the expected function is a near-useless job, akin to a test that verifies the value of a constant.

A useful test would check that, after calling the method, the value in the database is the expected one. This would require faking the logic of the database itself and letting the implementation do whatever it wants with it. A useful test wouldn’t need to be completely rewritten when the implementation of the method changes ever so slightly.

The problem is that a library implementing such a feature would be hugely complex. It would probably have to implement complex logic to parse the SQL syntax, understand the queries, etc. If such a library existed, it would probably look more like a full-fledged in-memory SQL database than a mocking library. (I actually did exactly that using in-memory SQLite databases once upon a time.)

The amount of work needed would be huge, to say the least. And even then, it wouldn’t be complete. It would have to understand vendor extensions and bugs, take into account the versions of the target vendor database, etc., to be ultimately useful. Pouring that much time into this is probably not worth it when we have a simpler, ready-made solution: why not just use the real database?

This approach has existed for a long time too, but it suffered from several drawbacks, the most important being the need to install and configure a database for the tests themselves, which generally leads to a project needing a complex setup for something that should be simple and quick. Alternatively, you would need to maintain an active database in your organization for the purpose of running tests, and have one for each version used in production. Clumsy at best, not reusable, and generally too constraining.

Luckily for us, the situation has changed in the last few years, with the arrival of containers as an easy, simple and widely accessible solution. With a little tooling that isn’t even complex, using a real database in a container for testing is surprisingly easy, and leads to clean and concise tests.

Do or do not, there is no try

So, we want to run tests that use a database against an actual database. How do we do that?

The first step is to have the database in a container. Let’s spawn a container for that, and throw in a docker-compose.yml for good measure.

version: "2.1"

services:
  mysql:
    image: mysql:5.7
    ports:
      - "3306:3306"

Now, running our tests is a simple matter of running docker-compose up before the tests, and using localhost:3306 as the database address. Dead simple. A little too naive, however.

For example, publishing ports that way opens the door to a world of conflicts, ad hoc conventions, etc., which would be hard to maintain in the long run. One solution is to run the tests from another container, linked to the mysql one so it can access the database using the container’s network.

For this, we will simply add an app container in our docker-compose.yml.

version: "2.1"

services:
  app:
    image: golang:1.10
    links:
      - mysql
  mysql:
    image: mysql:5.7

Now, the tests must use mysql:3306 as the address, and the command for running the tests becomes docker-compose run --rm app /usr/bin/env bash -c "go test". Better on the operation side, but not something we want to type every time we want to run tests, configure CI, etc.

For simplicity’s sake, let’s put that in a Makefile. And while we’re at it, add a build command too so we can also build in the container.

exec = docker-compose run --rm app /usr/bin/env bash -c

.PHONY: build
build::
    ${exec} "go build"

.PHONY: test
test::
    ${exec} "go test"

Much better. Running the tests is back to a simple make test, the containers are spawned as needed without intervention, multiple projects can coexist without conflict, and using Make (or any other build tool) should integrate with any complex workflow. All is well, our job here is done.

Until we actually run the tests…

# make test
[…]
--- FAIL: TestApp (0.00s)
 <autogenerated>:1: mysqltest: dial tcp 192.168.0.2:3306: getsockopt: connection refused
[…]

What is happening here? When it creates the containers, Docker is smart enough to wait until the mysql container is running before starting the app container, but most databases need a little warmup before being ready, so when the test code tries to connect to MySQL, the database is still initializing and cannot accept the connection.

One solution is to use a tool like https://github.com/jwilder/dockerize as the app container entrypoint. Dockerize will wait until the configured port is ready before running the container command. There is a little issue here: the golang container doesn’t include dockerize.

Containers to the rescue! The simplest solution is to have a custom image that will do.

FROM golang:1.10.0

COPY entrypoint.sh /usr/local/bin/
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
CMD ["/usr/bin/env", "bash"]

RUN curl -sSL "https://github.com/jwilder/dockerize/releases/download/v0.5.0/dockerize-linux-amd64-v0.5.0.tar.gz" | tar -xz -C /usr/local/bin

With entrypoint.sh being a file along these lines. (It can probably be replaced by something shorter in the docker-compose.yml’s entrypoint option.)

#!/bin/bash
exec "$@"

Then, you can change the docker-compose.yml file to use your custom image (we will call it custom-golang) and dockerize.

version: "2.1"

services:
  app:
    image: custom-golang:1.10
    links:
      - mysql
    entrypoint: dockerize -timeout 20s -wait tcp://mysql:3306 entrypoint.sh
  mysql:
    image: mysql:5.7
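
For the curious, the kind of check dockerize performs here can be approximated in a few lines of Go. The sketch below is not dockerize’s actual implementation, and waitForTCP and its parameters are names made up for illustration. It simply dials the address until a connection succeeds or the timeout expires.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitForTCP is a hypothetical helper: it dials addr repeatedly until a TCP
// connection succeeds, or returns the last dial error once timeout has elapsed.
func waitForTCP(addr string, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, interval)
		if err == nil {
			// The port accepts connections: close the probe and report success.
			conn.Close()
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("waiting for %s: %v", addr, err)
		}
		// Give the server a moment before probing again.
		time.Sleep(interval)
	}
}
```

Called as waitForTCP("mysql:3306", 20*time.Second, time.Second) before the tests start, it would play roughly the same role as the entrypoint above. An open port only shows that the server accepts connections, so retrying db.Ping in the same fashion would be a stricter check.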

OK, we have a database container ready for action, it’s time to code! Let’s do something that resembles an actual test.

func TestFoo(t *testing.T) {
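    // Prepare the connection to the database container. Assuming the
    // go-sql-driver/mysql driver: the trailing slash is required by its DSN
    // format, and multiStatements=true allows several statements per Exec.
    db, err := sql.Open("mysql", "root:@tcp(mysql:3306)/?multiStatements=true")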
    if err != nil {
        t.Fatal("connecting to database server:", err)
    }
    defer db.Close()

    // Create the database and tables necessary.
    _, err = db.Exec(`
        CREATE DATABASE app;
        USE app;
        CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY );
    `)
    if err != nil {
        t.Fatal("initializing database:", err)
    }

    // Call the tested function.
    result := foo(db)

    // Check the result.
    var expected = "bar"
    if result != expected {
        t.Errorf("unexpected output: wanted %s, got %s", expected, result)
    }
}

While encouraging, this code has a number of problems that we need to address before it can be considered reliable. The first is that it will only work once: the created database is persistent, and the test will fail if the database already exists. Additionally, we won’t be able to run tests in parallel.

We could add a DROP statement before the database creation, but it would only solve half of the problem. A better solution would be to generate a random name before actually running the query and use this as the database name.
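
The random.Alpha helper used in the snippets of this article is not shown here. As a stand-in, a minimal sketch built only on the standard library could look like the following. The name randomAlpha and the choice of lowercase letters are assumptions made for illustration, not the original helper.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// alphabet lists the characters allowed in a generated name. Plain lowercase
// letters keep the result safe to use as a backtick-quoted MySQL identifier.
const alphabet = "abcdefghijklmnopqrstuvwxyz"

// randomAlpha is a hypothetical replacement for random.Alpha: it returns a
// random string made of n lowercase letters.
func randomAlpha(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("reading random bytes: %v", err)
	}
	for i, b := range buf {
		// The modulo introduces a slight bias, which does not matter for
		// throwaway database names.
		buf[i] = alphabet[int(b)%len(alphabet)]
	}
	return string(buf), nil
}
```

With ten letters, the chance of two tests picking the same database name is negligible, which is what makes parallel runs safe.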

name, _ := random.Alpha(10)

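// The query contains backticks, so it is built from interpreted string
// literals (which cannot span several lines) joined together.
_, err = db.Exec(fmt.Sprintf(
    "CREATE DATABASE `%[1]s`; "+
        "USE `%[1]s`; "+
        "CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY );",
    name,
))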

Good enough. Although a little rough, this solution works and solves all the problems at hand. We can refine it further by moving the related code into a dedicated helper, so that the test itself is cleared of unnecessary code and each test can use it independently.

func Spawn(t *testing.T, address, schema string) *sql.DB {
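    // Prepare the connection to the database container. Assuming the
    // go-sql-driver/mysql driver: the trailing slash is required by its DSN
    // format, and multiStatements=true allows several statements per Exec.
    db, err := sql.Open("mysql", fmt.Sprintf("root:@tcp(%[1]s)/?multiStatements=true", address))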
    if err != nil {
        t.Fatal("connecting to database server:", err)
    }

    // Create the database and tables necessary.
    name, _ := random.Alpha(10)

    _, err = db.Exec(fmt.Sprintf(" CREATE DATABASE `%[1]s`; USE `%[1]s` ", name))
    if err != nil {
        t.Fatal("initializing database:", err)
    }

    // Load the schema.
    _, err = db.Exec(schema)
    if err != nil {
        t.Fatal("loading schema:", err)
    }

    // Return the handle to the freshly created database.
    return db
}

func TestFoo(t *testing.T) {
    // Create a fresh database for use in this test.
    db := Spawn(t, "mysql:3306", "CREATE TABLE foo ( id INTEGER UNSIGNED PRIMARY KEY )")
    defer db.Close()

    // Call the tested function.
    result := foo(db)

    // Check the result.
    var expected = "bar"
    if result != expected {
        t.Errorf("unexpected output: wanted %s, got %s", expected, result)
    }
}

Another big improvement would be to load the database creation queries directly from a file, so the schema and fixtures can be shared between different tests and won’t pollute the test code.

func Spawn(t *testing.T, address string, fixtures ...string) *sql.DB {
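    // Prepare the connection to the database container. Assuming the
    // go-sql-driver/mysql driver: the trailing slash is required by its DSN
    // format, and multiStatements=true allows several statements per Exec.
    db, err := sql.Open("mysql", fmt.Sprintf("root:@tcp(%[1]s)/?multiStatements=true", address))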
    if err != nil {
        t.Fatal("connecting to database server:", err)
    }

    // Create the database and tables necessary.
    name, _ := random.Alpha(10)

    _, err = db.Exec(fmt.Sprintf(" CREATE DATABASE `%[1]s`; USE `%[1]s`; ", name))
    if err != nil {
        t.Fatal("initializing database:", err)
    }

    for _, fixture := range fixtures {
        Load(t, db, fixture)
    }

    // Return the handle to the freshly created database.
    return db
}

func Load(t *testing.T, db *sql.DB, fixture string) {
    raw, err := ioutil.ReadFile(fixture)
    if err != nil {
        t.Fatalf("reading fixture %s: %s", fixture, err.Error())
    }

    // Execute the statements contained in the fixture.
    _, err = db.Exec(string(raw))
    if err != nil {
        t.Fatalf("loading fixture %s: %s", fixture, err)
    }
}

func TestFoo(t *testing.T) {
    db := Spawn(t, "mysql:3306", "testdata/schema.sql")
    defer db.Close()

    // Call the tested function.
    result := foo(db)

    // Check the result.
    var expected = "bar"
    if result != expected {
        t.Errorf("unexpected output: wanted %s, got %s", expected, result)
    }
}
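
Passing a whole fixture file to a single db.Exec call relies on the driver accepting several statements at once. When that is not available, one alternative is to split the fixture and execute the statements one by one. The helper below is only a sketch: splitStatements is a hypothetical name, and the splitting is deliberately naive.

```go
package main

import "strings"

// splitStatements is a hypothetical helper that cuts a fixture into individual
// statements on ";". It is deliberately naive: a semicolon inside a string
// literal or a comment would be treated as a separator too.
func splitStatements(fixture string) []string {
	var statements []string
	for _, part := range strings.Split(fixture, ";") {
		// Drop surrounding whitespace and skip the empty trailing chunk.
		stmt := strings.TrimSpace(part)
		if stmt != "" {
			statements = append(statements, stmt)
		}
	}
	return statements
}
```

Each returned statement could then be executed in turn, ideally on a single connection so that a preceding USE still applies to the statements that follow it.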

Conclusion

This testing method can be refined a bit more by adding a few tricks that won’t be covered here, like cleaning the database after a successful test, or adding templating to the fixtures. It can be adapted to almost any kind of database, or even other types of dependencies like message brokers, other services, etc.
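
To give an idea of what templating the fixtures could look like, here is a minimal sketch based on the standard text/template package. The renderFixture name and the placeholder used in the usage note are assumptions made for illustration, not the tooling we actually use.

```go
package main

import (
	"bytes"
	"text/template"
)

// renderFixture is a hypothetical helper that expands a fixture written as a
// text/template with the given data, and returns the resulting SQL.
func renderFixture(fixture string, data interface{}) (string, error) {
	// missingkey=error makes a typo in a placeholder fail loudly when data
	// is a map, instead of silently rendering "<no value>".
	tmpl, err := template.New("fixture").Option("missingkey=error").Parse(fixture)
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	if err := tmpl.Execute(&out, data); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

A fixture such as INSERT INTO foo (id) VALUES ({{.ID}}); could then be rendered with a map containing an ID entry before being handed to the database. Note that text/template does no SQL escaping, so this only suits values controlled by the tests themselves.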

Among the downsides is the fact that it’s slower than something like a mock. Spinning up the container is cheap but not negligible, and loading huge fixtures can take a while, but all things considered this is often a small price to pay for the correctness and actual usefulness of the resulting tests.

Feel free to reach out if you want more details, have questions, or just want to chat. I would love to hear your opinion on the subject.

Et voilà !