Design test strategy using Beck's Test Desiderata — which properties matter, which tradeoffs to make. Use when the user asks "how should I test this", "what tests do I need", "review my test strategy", "is this well-tested", or when planning tests for a new feature or refactor.
Test strategy, not test generation. Treat test design as an act of specification — articulate the contract, find the boundaries, surface hidden assumptions. Use Beck's Test Desiderata to make testing tradeoffs deliberate instead of accidental.
Every test balances these properties. No test maximizes all twelve. The skill is knowing which to prioritize.
| Property | Definition | Tension |
|---|---|---|
| Isolated | Same results regardless of run order | vs. speed (shared setup) |
| Composable | Test dimensions of variability separately | vs. writability (more tests) |
| Deterministic | Same results if nothing changes | vs. realism (real services) |
| Fast | Run quickly | vs. predictiveness (integration) |
| Writable | Cheap to write relative to code cost | vs. thoroughness |
| Readable | Comprehensible, invokes motivation | vs. conciseness |
| Behavioral | Sensitive to behavior changes | vs. structure-insensitivity |
| Structure-insensitive | Unaffected by refactoring | vs. behavioral sensitivity |
| Automated | No human intervention needed | vs. exploratory testing |
| Specific | Failure cause is obvious | vs. breadth of coverage |
| Predictive | Passing means production-ready | vs. speed and isolation |
| Inspiring | Passing builds confidence to deploy | vs. all other properties |
See references/desiderata.md for application guidance.
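The deterministic-vs-realism tradeoff often comes down to injecting the nondeterministic dependency. A minimal sketch, assuming a hypothetical `Session` class with a 30-minute TTL (the class and its clock parameter are illustrative, not from the source):

```python
import datetime as dt

class Session:
    """Hypothetical class under test: expires 30 minutes after creation."""
    TTL = dt.timedelta(minutes=30)

    def __init__(self, created_at, clock=dt.datetime.utcnow):
        self.created_at = created_at
        self._clock = clock  # injected so tests control "now"

    def is_expired(self):
        return self._clock() - self.created_at >= self.TTL

# Deterministic test: fixed fake clocks replace the real one,
# so the result never depends on when the test runs.
created = dt.datetime(2024, 1, 1, 12, 0)
fresh = Session(created, clock=lambda: created + dt.timedelta(minutes=29))
stale = Session(created, clock=lambda: created + dt.timedelta(minutes=31))
assert not fresh.is_expired()
assert stale.is_expired()
```

The cost is realism: the test never exercises a real clock, which is exactly the tradeoff the table above asks you to accept deliberately.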
Before writing any test, answer:

- What does this code promise? (outputs, effects, postconditions)
- What does it require? (valid inputs, preconditions)
- What happens when a requirement is violated?

If you can't answer these, the code's contract is unclear. Fix that first.
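Writing the contract down before the tests can be as simple as a docstring that states promises and requirements. A sketch, using a hypothetical `paginate` function invented for illustration:

```python
def paginate(items, page_size):
    """Split items into pages.

    Promises: every item appears exactly once, in order; each page has
    at most page_size items; only the last page may be shorter.
    Requires: page_size >= 1 (raises ValueError otherwise).
    """
    if page_size < 1:
        raise ValueError("page_size must be >= 1")
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

assert paginate([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
assert paginate([], 3) == []
```

Each line of the docstring maps directly to a test case, which is the point: the contract generates the test list.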
Every contract has edges. Test them: empty and zero inputs, a single item, many items, minimum and maximum values, off-by-one boundaries, and invalid inputs that should fail loudly.
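Boundary analysis in miniature, assuming a hypothetical `clamp` function (invented here to keep the sketch self-contained):

```python
def clamp(x, lo, hi):
    """Hypothetical function under test: constrain x to [lo, hi]."""
    return max(lo, min(hi, x))

# Exercise the edges of the contract, not just the middle.
cases = [
    (clamp(5, 0, 10), 5),    # interior value passes through
    (clamp(0, 0, 10), 0),    # exactly at the lower bound
    (clamp(10, 0, 10), 10),  # exactly at the upper bound
    (clamp(-1, 0, 10), 0),   # just below the lower bound
    (clamp(11, 0, 10), 10),  # just above the upper bound
    (clamp(7, 7, 7), 7),     # degenerate range: lo == hi
]
for got, want in cases:
    assert got == want, (got, want)
```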
Match the approach to what you're testing:
Example-based tests — specific inputs and expected outputs. Best for known contracts with clear boundaries.
Property-based tests — invariants that hold for all inputs. Best for algorithms, parsers, serialization (encode/decode roundtrip), and sorting.
Integration tests — multiple components together. Best for verifying wiring, data flow, and contracts between modules.
Snapshot tests — output matches recorded baseline. Best for rendering, serialization, and configuration.
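A property-based roundtrip sketch for the serialization case above. The generator is hand-rolled with only the standard library; in practice a library like Hypothesis generates the inputs and shrinks failing cases to a minimal counterexample. The `encode`/`decode` pair is a stand-in invented for illustration:

```python
import json
import random
import string

def encode(record):
    """Hypothetical serializer under test."""
    return json.dumps(record, sort_keys=True)

def decode(blob):
    return json.loads(blob)

# Property: decode(encode(x)) == x for *all* inputs, not a handful of examples.
rng = random.Random(42)  # seeded so the test stays deterministic
for _ in range(200):
    record = {
        "".join(rng.choices(string.ascii_letters, k=rng.randint(1, 8))):
            rng.randint(-1000, 1000)
        for _ in range(rng.randint(0, 5))
    }
    assert decode(encode(record)) == record
```

Note the seeded RNG: it keeps the property test deterministic, trading some input diversity for reproducibility, the same tradeoff the Desiderata table names.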
Kent C. Dodds' priority order:
   ┌─────────┐
   │   E2E   │     Few, slow, high confidence
  ┌┴─────────┴┐
  │Integration│    Most tests here
 ┌┴───────────┴┐
 │    Unit     │   Many, fast, specific
┌┴─────────────┴┐
│    Static     │  Types, linters, formatters
└───────────────┘

"The more your tests resemble the way your software is used, the more confidence they can give you."
Ask of each test:
| Smell | Symptom | Fix |
|---|---|---|
| Testing implementation | Breaks on refactor, behavior unchanged | Test outputs, not internals |
| Tautological test | Repeats production logic in assertions | Test observable behavior |
| Happy path only | No error/boundary cases | Add boundary analysis |
| Flaky | Passes sometimes, fails sometimes | Fix nondeterminism or mark explicitly |
| Giant arrange | 30 lines of setup for 1 assertion | Simplify the interface or use builders |
| Invisible assertion | expect(result).toBeTruthy() | Assert specific values |
| Test per method | One test per function, misses integration | Test use cases, not methods |
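The "invisible assertion" smell and its fix, side by side. The `parse_duration` function is hypothetical, invented to keep the sketch self-contained:

```python
def parse_duration(text):
    """Hypothetical function under test: '1h30m' -> total minutes."""
    hours, _, rest = text.partition("h")
    minutes = rest.rstrip("m") or "0"
    return int(hours) * 60 + int(minutes)

result = parse_duration("1h30m")

# Smell: passes for 1, 90, "90", [0] -- almost anything non-falsy.
assert result

# Fix: assert the specific value the contract promises.
assert result == 90
```

The vague assertion would keep passing through a bug that returns 91; the specific one fails with an obvious cause, which is exactly the "Specific" property from the Desiderata.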
## Contract
[function name]: [input types] → [output type]
- Promises: [what it guarantees]
- Requires: [what inputs must satisfy]
## Test Cases
- [ ] Empty/zero input
- [ ] Single valid input
- [ ] Multiple valid inputs
- [ ] Boundary values
- [ ] Invalid inputs (error cases)
- [ ] Properties that hold for all inputs

## Contract
[METHOD /path]: [request] → [response]
- Auth: [required/optional/none]
- Idempotent: [yes/no]
## Test Cases
- [ ] Happy path (valid request → expected response)
- [ ] Validation failures (400)
- [ ] Auth failures (401/403)
- [ ] Not found (404)
- [ ] Concurrent requests
- [ ] Rate limiting

## Contract
[Component]: [props] → [rendered output + interactions]
## Test Cases
- [ ] Renders with required props
- [ ] Renders with all optional props
- [ ] User interactions trigger callbacks
- [ ] Loading/error/empty states
- [ ] Accessibility (keyboard nav, screen reader)

When designing test strategy:
## Test Strategy for [feature/module]
### Contract
[What this code promises and requires]
### Priority Properties
[Which Desiderata properties matter most and why]
### Test Plan
1. [Test case] — [what it verifies] — [approach]
2. [Test case] — [what it verifies] — [approach]
### Tradeoffs Accepted
- [Property sacrificed] because [reason]
### Not Testing
- [What's deliberately excluded and why]

After designing the test suite, ask: "If all these tests pass, would you deploy with confidence?" If no, identify what's missing. If yes, stop — more tests beyond confidence are waste.
- /debugging — Test failures trigger debugging; debugging reveals missing tests
- /review — Reviews assess test coverage alongside code quality
- skills/FRAMEWORKS.md — Full framework index
- RECIPE.md — Agent recipe for parallel decomposition (2 workers)