I was pairing with a friend the other night and we had an impromptu discussion about how to structure tests. Imagine that we’re building a birthday party hosting application. One of our business requirements is that the invite list can only be seen by the user who’s hosting the party. In addition, the host must have set up their birthday profile before they can access the invite list. Consider this set of tests:
describe('birthdayInvites', () => {
  describe('given a user has not set up their birthday profile', () => {
    it('fails when user attempts to execute operation', () => {})
  })

  describe('given a user has set up their birthday profile', () => {
    it('fails when user is not the host of the party', () => {})

    describe('given the user is the host of the party', () => {
      it('returns a list of birthday invitations for the party when operation is invoked', () => {})
    })
  })
})
These tests feel hard to grok, for a few reasons:
- There are a lot of extraneous words, like ‘when user attempts to execute operation’. In this case we don’t need to say that – executing the operation is a given in the context of a test.
- It mixes situational context into both the describe and it blocks, e.g. ‘given a user has set up their birthday profile’ in the second inner describe block and ‘when user is not the host of the party’ in the it block directly beneath it.
- It nests describe blocks, forcing the reader to maintain an extra layer of context when reading through the tests.
Here’s an edit we might consider, written with an eye towards conciseness and easy communication to the reader:
describe('birthdayInvites', () => {
  describe('when user is missing a birthday profile', () => {
    it('throws an error', () => {})
  })

  describe('when user is not the party host', () => {
    it('throws an error', () => {})
  })

  describe('when user is both the party host and has a birthday profile', () => {
    it('returns a list of invitations for the party', () => {})
  })
})
In this version, we’ve stripped out all of the extra words. In addition, we’ve structured the tests so that the describe blocks define the context each test runs in, and the it blocks define the behavior. Splitting context and behavior in this way reduces cognitive load and makes the tests easier to reason about.
One thing in particular that I like about this change is that it limits the context in the test description to only what triggers the behavior. Before, for the case of a user who was not the party host, we included information indicating they had in fact set up their birthday profile (the ‘given a user has set up their birthday profile’ describe block and the ‘fails when user is not the host of the party’ test inside it, in the first example). However, the fact that they’ve set up their profile is irrelevant, since the error is thrown because they are not the host.
I generally prefer to limit the test description to the absolutely necessary context, and let the reader glean the rest of the context from the test setup inside the it block.
Additionally, we’ve removed the extra nesting so that each describe block stands on its own, directly under the top-level birthdayInvites block. Lastly, we’ve more accurately described the behavior we expect when something goes wrong: rather than just ‘failing’, we explicitly state that we expect an error to be thrown.
In the conversation with my friend, we both agreed that we liked the second version of these tests more. They felt more intuitive and clear, resulting in a better experience for us as we worked on them, and hopefully a clearer understanding for the next engineers who might come along and look at this code.