haletothewood/behavioural-tdd

Execute a strict Red-Green-Refactor TDD cycle — one requirement at a time — in any language or framework.

Good and Bad Tests

Name: haletothewood/behavioural-tdd
Rating: 0.976 (1 reviews)
Author: haletothewood

Anti-Pattern: Stacking Tests

Do not write tests for multiple requirements before implementing any of them. Complete one requirement fully before starting the next.

WRONG — batch all tests, then batch all implementation:
  write test for requirement 1
  write test for requirement 2
  write test for requirement 3
  ...then implement all three

RIGHT — one requirement at a time, with phase gates between each step:
  requirement 1: RED → (confirm) → GREEN → (confirm) → REFACTOR
  requirement 2: RED → (confirm) → GREEN → (confirm) → REFACTOR
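As a rough illustration of the RIGHT sequence, the sketch below shows where a tiny module might stand after two complete cycles. The `cartTotal` function and both check names are hypothetical and not part of this skill; the comments record the order in which each piece would have been written.

```javascript
// Hypothetical example: a cart total built one requirement at a time.

// Cycle 1, RED: this check was written first, before cartTotal existed, and failed.
function emptyCartTotalsZero() {
  return cartTotal([]) === 0;
}
// Cycle 1, GREEN: the simplest passing body was `return 0;`. Nothing to refactor yet.

// Cycle 2, RED: added only once cycle 1 was green; it failed against `return 0;`.
function totalIsSumOfItemPrices() {
  return cartTotal([{ price: 5 }, { price: 7 }]) === 12;
}

// Cycle 2, GREEN: the implementation grew just enough to pass both checks.
function cartTotal(items) {
  return items.reduce((sum, item) => sum + item.price, 0);
}

console.log(emptyCartTotalsZero());    // true
console.log(totalIsSumOfItemPrices()); // true
```

The second check did not exist while the first requirement was in progress, so each failure pointed at exactly one missing behaviour.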

Per-Cycle Checklist

Before moving to the next requirement, verify:

Test describes behaviour, not implementation
Test uses public interface only
Test would survive an internal refactor unchanged
Implementation is minimal — no speculative features added
Behavioural test is still GREEN after any refactor

Core distinction

Good tests verify behaviour through public interfaces. Bad tests are coupled to implementation — they break when you refactor, even when nothing observable changed.

The database example

// BAD: bypasses the interface to verify internal state
test('createUser saves to database', async () => {
  await createUser({ name: 'Alice' });
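  // Assumes db.query resolves to an array of rows; an empty array is still "defined", so check the length.
  const rows = await db.query('SELECT * FROM users WHERE name = ?', ['Alice']);
  expect(rows).toHaveLength(1);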
});

// GOOD: verifies through the public interface
test('createUser makes user retrievable', async () => {
  const user = await createUser({ name: 'Alice' });
  const retrieved = await getUser(user.id);
  expect(retrieved.name).toBe('Alice');
});

The bad test is coupled to the database schema and storage mechanism. Swap the DB engine or change the schema and it breaks, even though the behaviour is identical. The good test knows only about createUser and getUser; the storage is an implementation detail.
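To make the storage-swap argument concrete, here is a minimal sketch in which one behavioural check runs unchanged against two storage back ends. Everything apart from the createUser and getUser names is an assumption made for illustration: `makeUserApi`, `mapStorage`, `arrayStorage`, and the check itself are hypothetical, and the calls are synchronous to keep the sketch short.

```javascript
// Hypothetical storage back end 1: a Map keyed by id.
function mapStorage() {
  const rows = new Map();
  return {
    insert(id, record) { rows.set(id, record); },
    find(id) { return rows.get(id); },
  };
}

// Hypothetical storage back end 2: an array of [id, record] pairs.
function arrayStorage() {
  const rows = [];
  return {
    insert(id, record) { rows.push([id, record]); },
    find(id) {
      const hit = rows.find(([rowId]) => rowId === id);
      return hit ? hit[1] : undefined;
    },
  };
}

// Public interface: createUser and getUser. Storage is injected and hidden from callers.
function makeUserApi(storage) {
  let nextId = 1;
  return {
    createUser({ name }) {
      const user = { id: nextId++, name };
      storage.insert(user.id, user);
      return user;
    },
    getUser(id) { return storage.find(id); },
  };
}

// Behavioural check: touches only createUser and getUser.
function createdUserIsRetrievable(api) {
  const user = api.createUser({ name: 'Alice' });
  return api.getUser(user.id).name === 'Alice';
}

console.log(createdUserIsRetrievable(makeUserApi(mapStorage())));   // true
console.log(createdUserIsRetrievable(makeUserApi(arrayStorage()))); // true
```

A check that inspected the Map or the array directly would have to be rewritten for each back end; this one does not.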

Red flags — your test is testing implementation if:

It mocks internal collaborators (classes/modules you own)
It tests private methods directly
It asserts on call counts or invocation order of internals
It breaks when you refactor without changing observable behaviour
The test name describes HOW, not WHAT ("calls paymentService.process" instead of "confirms order after successful payment")
It queries the database, filesystem, or external state directly instead of going through the interface
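The call-count and naming red flags can be seen side by side in the sketch below. It assumes a hypothetical order module whose internal payment collaborator is refactored from a single `process()` call to `authorize()` followed by `capture()`; the factories, the `calls` log, and both checks are invented for illustration only.

```javascript
// Hypothetical order module, version 1: pays through one internal process() call.
function makeOrdersV1() {
  const calls = [];
  const payment = {
    process(amount) { calls.push('process'); return amount > 0; },
  };
  return {
    calls, // exposed only so the coupled check below can peek at internals
    placeOrder(amount) {
      return { status: payment.process(amount) ? 'confirmed' : 'rejected' };
    },
  };
}

// Version 2, after an internal refactor: authorize() then capture().
// Observable behaviour is unchanged.
function makeOrdersV2() {
  const calls = [];
  const payment = {
    authorize(amount) { calls.push('authorize'); return amount > 0; },
    capture() { calls.push('capture'); return true; },
  };
  return {
    calls,
    placeOrder(amount) {
      const ok = payment.authorize(amount) && payment.capture();
      return { status: ok ? 'confirmed' : 'rejected' };
    },
  };
}

// Implementation-coupled check: "calls paymentService.process".
function callsProcessOnce(orders) {
  orders.placeOrder(20);
  return orders.calls.filter((name) => name === 'process').length === 1;
}

// Behavioural check: "confirms order after successful payment".
function confirmsOrderAfterSuccessfulPayment(orders) {
  return orders.placeOrder(20).status === 'confirmed';
}

console.log(callsProcessOnce(makeOrdersV1()));                    // true
console.log(callsProcessOnce(makeOrdersV2()));                    // false: broken by the refactor
console.log(confirmsOrderAfterSuccessfulPayment(makeOrdersV1())); // true
console.log(confirmsOrderAfterSuccessfulPayment(makeOrdersV2())); // true
```

Only the coupled check fails after the refactor, although a customer placing an order would see no difference between the two versions.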

Good test characteristics

Uses the public API only
Describes what the system does, not how
Survives internal refactors unchanged
One logical assertion per test
Reads like a specification

Install with Tessl CLI

npx tessl i haletothewood/behavioural-tdd
