Fast, Parallel Database Tests

Here are some of the different strategies you can use to write tests that read or write data from a database, and the benefits and drawbacks of each. I'm usually writing Go code, so the examples here are for Go, but the notes should generalize to any language that runs tests against Postgres, MySQL, or any other database.

At a high level my goal is for the tests to take a minimum of programmer time. Programmer time is spent on the tests in a few different places:

Writing tests.
Running the tests.
Troubleshooting test flakes.
Troubleshooting written tests that don't behave the way you expect.

Our goal is to minimize the amount of time spent across all four of these areas, from which I have the following principles for a test suite:

Tests should be fast. Each test should do only what it has to do, which for database tests primarily means it should only create the objects it needs to test the behavior it needs to test, and no more.
Tests, and test packages, should run in parallel, which usually helps the test suite complete more quickly.
Tests should be predictable; if tests return unpredictable results, or interact in bad ways, this leads to wasted time tracking down why the tests are failing.

Obviously these are in conflict - parallel tests are less predictable than sequential tests - but there are some things we can do to make tests fast and predictable, and some common techniques that make them slower and unpredictable.

Don't: Reuse the same fixture objects in each test

You might have a set of "standard" test objects that get written to the database — Alice and Bob, and maybe Alice is a power user and Bob just signed up. If you try to create them in a test and they exist in the database already, the tests will upsert them.

This is fine, and you can do it, but it tends to break down in two ways:

"Default" fixtures tend to accumulate new database objects any time you create a new table, like a snowball rolling downhill. This means any new table makes all of your tests slower.
If you run tests in parallel, any test that updates or deletes any fixture that other tests rely on can cause unexpected behavior.

Because we are optimizing for programmer time - fast tests, and no unexpected behavior during tests - I tend to want each test to create the objects it needs, and make them each unique.

Don't: Reuse the same set of ~500 names

When you use a library like "Faker" to create objects that are sort of similar but have a common set of variations, for example, a last name derived from the same set of 500 last names, you are bound to run into trouble. It's way too easy to accidentally create two test objects that have the same birthday or same last name, when you didn't expect that to happen, and as a result, end up with one result when you wanted two, or otherwise get test failures that only occur once every 500 runs.

I've previously discussed why Faker is a bad idea, read the post for more.

Do: Use UUID's for anything that needs a "random" or unique input

Instead of creating "Alice" with 1 of 500 pre-seeded last names, or always with the email address "alice@example.com", use UUID's (or a part of a UUID), anywhere you need unique data. So "Alice Jones-a9706630", with email "alice-7cce0183@example.com". The first 8 characters of a UUID have 4294967296 different values; these objects will not conflict with each other during your lifetime.

Don't: Truncate the database after each test

At the end of each test, you could issue a command to truncate or delete all database tables. So each test is something like:

t.Run("TestAccountCreate", func(t *testing.T) {
    assertEquals(t, user.ID, 7) // or whatever
    cleanup(t)
})

The advantage to doing this is that each test cleans up after itself - you can create the same objects in multiple tests even if they might fail a uniqueness constraint, since it gets destroyed between each test instance. The disadvantage to doing this is that you can't run tests in parallel, because one test might issue a truncate right when the other test is writing to the database.

Further, it's not uncommon for tests to hang, or crash, while they run. If that happens, your truncate-after-test runner will not actually run, and the next test will encounter an inconsistent state. These types of failures can be tough to troubleshoot because the failure was actually in the previous test, which your test output might not make clear.

Don't: Truncate the database before each test

This has most of the benefits and drawbacks of the previous section, except you need to truncate everything in the database, because you don't know what test ran before this one. In theory, if you delete after, you only need to delete from tables that you wrote to.

Don't: Truncate the database after each package

You could enclose all of the tests in a parent test, something like this:

t.Run("Package", func(t *testing.T) {
    setUp(t)
    t.Run("Test1", func(t *testing.T) {
    })
    t.Run("Test2", func(t *testing.T) {
    })
    t.Run("Test3", func(t *testing.T) {
    })
    cleanup(t)
})

This way, you only pay the penalty of database setup and cleanup once per package.

However, the default for Go tests is to run as many package tests in parallel as you have CPU's available. So one package may be just finishing on one CPU (and truncating the database) when another package is starting on a different CPU, which causes failures that are hard to troubleshoot. You can set -p 1 to disable package parallelism, but the test suite gets a lot slower if you can only run one package's tests at a time.

Don't: Truncate the database before each package

This has most of the benefits and drawbacks of the previous section, except you need to truncate everything in the database, because you don't know what test ran before this one.

Maybe: Run all tests in their own transaction, then roll it back

The idea here is that you start a transaction in your test suite, then you run the application code, then you invoke a ROLLBACK in the test suite, restoring the database to its pristine state. The application code needs a pointer to the transaction, so the objects it creates exist inside of that.

Conceptually, there's nothing wrong with this method, but if you have any application code that starts a transaction, you will get in trouble with Postgres and MySQL, and you may not have a great time with other database flavors. In theory, you could make BEGIN and COMMIT a no-op in application code called from tests, but you are adding a lot of complexity there and the cost starts to outweigh the benefit.

Do: Avoid truncating the database at all

If each test is creating objects that it needs, and those objects don't conflict with anything else, you basically don't need to truncate the database ever. Sure you can add a command to clean it out if it gets too unwieldy, but you don't need to automatically do it between test runs.

This method is very consistent - tests behave the same way every time, they write the objects they need to the database. It also allows you to run all of your tests in parallel.

This requires a small change to the way you write tests - you can't make any assertions about the total number of objects in a given table, for example. If you want to test upserts by trying to write two records and then running SELECT COUNT(1) FROM table and ensuring only one record exists, that will fail because all of your tests are going to be writing to that table simultaneously. Instead, check for the number of records scoped to an account that's created for that test, or similar.

Maybe: Run each test in its own schema

Before running each test, copy the schema from a 'default' schema, and then run the test against the new schema. When the test completes, drop the schema. There's an example function to clone schema available on the Postgres wiki.

This is certainly a technique that provides you full isolation for each test, which allows you to run tests in parallel and ensure each test has a clean database. However, you have to pay the schema clone cost for each test, plus you have to create a new connection for each test. In my experiments on a 2.5 GHz core i7 Macbook Pro, cloning the database schema took about 200ms, and cloning the entire database took about 400ms. To me, these are too high of startup costs to pay to run a single test.

You could imagine having a background job with the sole purpose of creating enough schemas that you don't have to incur the 200ms latency when you actually run the tests - you could just find one of the already created schemas, though you'd need to wipe created schemas when you ran a migration, and that's getting pretty complex.

Assorted notes

If each test is creating the objects it needs, and tests don't rely on a clean database state, then you should be able to invoke t.Parallel() at the top of each database test. t.Parallel() will truly try to run each test in parallel, so you may have too much parallelism, but that's a better problem to have than too little. Experiment with the number of connections you have open to the database (or add benchmarks) to find a number that makes the most sense.
Prepared queries run about 30-40% faster than normal (fast) unprepared queries, because you don't have to pay the overhead of parsing the query statement. The Go *sql.DB object is thread safe, so you want to reuse it across all of your tests, instead of creating a new one in each test and paying the overhead to establish a connection to the database.

A library like sqlc will help make it easy for you to get prepared queries.

Thanks to Phips Peter, Kyle Conroy, Alan Shreve, and David Strauß for reviewing drafts of this post.

Liked what you read? I am available for hire.

KEVIN BURKE