haletothewood/behavioural-tdd

Execute a strict Red-Green-Refactor TDD cycle — one requirement at a time — in any language or framework.

Good and Bad Tests

Anti-Pattern: Stacking Tests

Do not write tests for multiple requirements before implementing any of them. Complete one requirement fully before starting the next.

WRONG — batch all tests, then batch all implementation:
  write test for requirement 1
  write test for requirement 2
  write test for requirement 3
  ...then implement all three

RIGHT — one requirement at a time, with phase gates between each step:
  requirement 1: RED → (confirm) → GREEN → (confirm) → REFACTOR
  requirement 2: RED → (confirm) → GREEN → (confirm) → REFACTOR
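A minimal sketch of one full cycle for a single requirement. The requirement (a cart total is the sum of its item prices) and every name here are hypothetical. The sketch uses Node's built-in assert module instead of a test runner so that it runs as a plain script, and the test takes the function under test as a parameter only so that all three phases can be shown in one file.

```javascript
const { strictEqual } = require('node:assert');

// The behavioural test for requirement 1, written before any implementation.
function totalIsSumOfItemPrices(cartTotal) {
  strictEqual(cartTotal([{ price: 5 }, { price: 7 }]), 12);
}

// Runs a test against an implementation and reports the phase colour.
function run(test, implementation) {
  try {
    test(implementation);
    return 'GREEN';
  } catch (err) {
    return 'RED';
  }
}

// RED: nothing is implemented yet, so the test must fail.
const notImplemented = () => undefined;

// GREEN: the smallest implementation that satisfies the test.
function cartTotalMinimal(items) {
  let total = 0;
  for (const item of items) total += item.price;
  return total;
}

// REFACTOR: same behaviour, tidier internals. The test is not touched.
const cartTotalRefactored = (items) =>
  items.reduce((sum, item) => sum + item.price, 0);

const phases = [
  run(totalIsSumOfItemPrices, notImplemented),
  run(totalIsSumOfItemPrices, cartTotalMinimal),
  run(totalIsSumOfItemPrices, cartTotalRefactored),
];
console.log(phases.join(' -> ')); // RED -> GREEN -> GREEN
```

Only after the last run is GREEN would the test for requirement 2 be written.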

Per-Cycle Checklist

Before moving to the next requirement, verify:

  • Test describes behaviour, not implementation
  • Test uses public interface only
  • Test would survive an internal refactor unchanged
  • Implementation is minimal — no speculative features added
  • Behavioural test is still GREEN after any refactor

Core distinction

Good tests verify behaviour through public interfaces. Bad tests are coupled to implementation — they break when you refactor, even when nothing observable changed.

The database example

// BAD: bypasses the interface to verify internal state
test('createUser saves to database', async () => {
  await createUser({ name: 'Alice' });
  const row = await db.query('SELECT * FROM users WHERE name = ?', ['Alice']);
  expect(row).toBeDefined();
});

// GOOD: verifies through the public interface
test('createUser makes user retrievable', async () => {
  const user = await createUser({ name: 'Alice' });
  const retrieved = await getUser(user.id);
  expect(retrieved.name).toBe('Alice');
});

The bad test is coupled to the database schema and storage mechanism. Swap the DB engine or change the schema and it breaks, even though the behaviour is identical. The good test knows only about createUser and getUser; storage is an implementation detail.
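To make the point concrete, here is a small sketch in which the storage really is swapped. Everything in it is an assumption for illustration: makeUserService, mapStore, and jsonArrayStore are hypothetical, the calls are synchronous for brevity, and Node's built-in assert stands in for a test runner.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical user service. The store is injected so the backend can be swapped.
function makeUserService(store) {
  let nextId = 1;
  return {
    createUser({ name }) {
      const user = { id: nextId++, name };
      store.save(user);
      return user;
    },
    getUser(id) {
      return store.load(id);
    },
  };
}

// Backend 1: users kept as objects in a Map keyed by id.
function mapStore() {
  const rows = new Map();
  return {
    save: (user) => { rows.set(user.id, user); },
    load: (id) => rows.get(id),
  };
}

// Backend 2: users kept as JSON strings in an array (a different "schema").
function jsonArrayStore() {
  const rows = [];
  return {
    save: (user) => { rows.push(JSON.stringify(user)); },
    load: (id) => rows.map((row) => JSON.parse(row)).find((user) => user.id === id),
  };
}

// Behavioural check: it knows only createUser and getUser.
function createdUserIsRetrievable(service) {
  const user = service.createUser({ name: 'Alice' });
  strictEqual(service.getUser(user.id).name, 'Alice');
}

// The same check passes unchanged against both backends.
createdUserIsRetrievable(makeUserService(mapStore()));
createdUserIsRetrievable(makeUserService(jsonArrayStore()));
```

A check that reached into `rows` directly would have to be rewritten for each backend, which is the coupling the bad test suffers from.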

Red flags — your test is testing implementation if:

  • It mocks internal collaborators (classes/modules you own)
  • It tests private methods directly
  • It asserts on call counts or invocation order of internals
  • It breaks when you refactor without changing observable behaviour
  • The test name describes HOW, not WHAT ("calls paymentService.process" instead of "confirms order after successful payment")
  • It queries the database, filesystem, or external state directly instead of going through the interface
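As a rough illustration of naming and asserting on WHAT rather than HOW, the sketch below checks the order outcome instead of counting calls. The checkout module, the gateway fakes, and the status strings are all hypothetical, and treating the gateway as an external boundary (and therefore acceptable to fake) is an assumption. Node's built-in assert is used so the sketch runs without a test runner.

```javascript
const { strictEqual } = require('node:assert');

// Hypothetical checkout module. `gateway` stands in for an external payment provider.
function makeCheckout(gateway) {
  return {
    placeOrder(order) {
      const approved = gateway.charge(order.total);
      return { ...order, status: approved ? 'confirmed' : 'payment_failed' };
    },
  };
}

// Hand-written fakes for the external boundary. No call counting is involved.
const approvingGateway = { charge: () => true };
const decliningGateway = { charge: () => false };

// WHAT: "confirms order after successful payment"
function confirmsOrderAfterSuccessfulPayment() {
  const checkout = makeCheckout(approvingGateway);
  strictEqual(checkout.placeOrder({ total: 20 }).status, 'confirmed');
}

// WHAT: "leaves order unconfirmed when payment is declined"
function leavesOrderUnconfirmedWhenPaymentIsDeclined() {
  const checkout = makeCheckout(decliningGateway);
  strictEqual(checkout.placeOrder({ total: 20 }).status, 'payment_failed');
}

confirmsOrderAfterSuccessfulPayment();
leavesOrderUnconfirmedWhenPaymentIsDeclined();
```

Neither check cares how many times `charge` is called or in what order, so placeOrder can be restructured internally without touching them.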

Good test characteristics

  • Uses the public API only
  • Describes what the system does, not how
  • Survives internal refactors unchanged
  • One logical assertion per test
  • Reads like a specification
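One way these characteristics can look together, sketched under assumptions: makeUsers is a hypothetical in-memory module, the three behaviours are invented for illustration, and a plain object of named checks with Node's built-in assert stands in for a test runner. Each check makes one logical assertion, and the names alone read as a short specification.

```javascript
const { strictEqual, notStrictEqual, throws } = require('node:assert');

// Hypothetical in-memory user module exposing only createUser and getUser.
function makeUsers() {
  const byId = new Map();
  let nextId = 1;
  return {
    createUser({ name }) {
      if (!name) throw new Error('name is required');
      const user = { id: nextId++, name };
      byId.set(user.id, user);
      return user;
    },
    getUser(id) {
      return byId.get(id);
    },
  };
}

// Each entry states one behaviour and checks it through the public API only.
const specs = {
  'createUser makes the user retrievable by id': () => {
    const users = makeUsers();
    const { id } = users.createUser({ name: 'Alice' });
    strictEqual(users.getUser(id).name, 'Alice');
  },
  'createUser gives each user a distinct id': () => {
    const users = makeUsers();
    const alice = users.createUser({ name: 'Alice' });
    const bob = users.createUser({ name: 'Bob' });
    notStrictEqual(alice.id, bob.id);
  },
  'createUser rejects a missing name': () => {
    const users = makeUsers();
    throws(() => users.createUser({}), /name is required/);
  },
};

// Runs every check and returns the names that passed; a failing check throws.
function runSpecs(specsByName) {
  return Object.entries(specsByName).map(([name, body]) => {
    body();
    return name;
  });
}

console.log(runSpecs(specs).join('\n'));
```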

Install with Tessl CLI

npx tessl i haletothewood/behavioural-tdd