Execute a strict Red-Green-Refactor TDD cycle — one requirement at a time — in any language or framework.
Do not write tests for multiple requirements before implementing any of them. Complete one requirement fully before starting the next.
WRONG — batch all tests, then batch all implementation:
    write test for requirement 1
    write test for requirement 2
    write test for requirement 3
    ...then implement all three
RIGHT — one requirement at a time, with phase gates between each step:
    requirement 1: RED → (confirm) → GREEN → (confirm) → REFACTOR
    requirement 2: RED → (confirm) → GREEN → (confirm) → REFACTOR

Before moving to the next requirement, verify:
Good tests verify behaviour through public interfaces. Bad tests are coupled to implementation — they break when you refactor, even when nothing observable changed.
    // BAD: bypasses the interface to verify internal state
    test('createUser saves to database', async () => {
      await createUser({ name: 'Alice' });
      const row = await db.query('SELECT * FROM users WHERE name = ?', ['Alice']);
      expect(row).toBeDefined();
    });

    // GOOD: verifies through the public interface
    test('createUser makes user retrievable', async () => {
      const user = await createUser({ name: 'Alice' });
      const retrieved = await getUser(user.id);
      expect(retrieved.name).toBe('Alice');
    });

The bad test couples the test to the database schema and storage mechanism.
Swap the DB engine or change the schema and the test breaks — even though the
behaviour is identical. The good test only knows about createUser and
getUser. The storage is an implementation detail.