Use before implementing or refactoring software. Contains two skills: (1) Modular Software Design — for designing module boundaries, APIs, layers, abstractions, services, repositories, adapters, or architecture, helping reduce total system complexity by creating deep modules, hiding implementation knowledge, avoiding leakage and pass-through APIs, comparing alternative designs, documenting interfaces before coding, and critiquing existing architecture; and (2) Software Testing — for writing unit tests, integration tests, or end-to-end tests, creating mocks/stubs/fakes, designing a testing strategy, doing TDD, reviewing test quality, fixing flaky tests, or refactoring test suites, generating risk-focused test plans, picking appropriate test levels, choosing between mocks/fakes/real dependencies, and applying Arrange-Act-Assert patterns with concrete examples.
88
88%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Use this skill whenever you are designing, writing, extending, or reviewing tests. The goal is to produce the smallest suite that gives high confidence about the most likely and most costly failures, not to maximize coverage numbers.
Design tests that:
Start from behaviour and risk, not from code shape. Derive tests from requirements, input space, expected outputs, and likely failure modes before looking at coverage reports.
Treat coverage as feedback, not as the goal. Coverage reveals blind spots but does not prove tests are meaningful. Use it to ask "what behaviour did we forget?" not "how do we force this number up?"
Prefer observable behaviour over implementation detail. Verify returned values, externally visible state, emitted events, persisted effects, or calls to out-of-process dependencies. Do not assert private methods or internal choreography.
Put most tests at the cheapest level that can expose the risk. Use low-level tests for combinatorial behaviour and boundaries. Use higher-level tests to validate wiring, contracts, and critical journeys.
Use doubles to control boundaries, not to duplicate object graphs. Prefer fakes over heavy mocks. Use mocks only when the outgoing interaction itself is the behaviour you care about.
Make behaviour easier to observe instead of giving tests special privileges. If tests are hard to write because outcomes are hidden, redesign the code to expose results through stable outputs, domain objects, events, or adapters.
Let tests support refactoring by testing seams and contracts. Small units, explicit dependencies, stable public interfaces, and clear boundaries make tests less fragile.
Stop adding tests when the next test adds little new information. Once major paths, boundaries, invariants, and interfaces are covered, marginal confidence gain drops sharply.
def test_discount_switches_at_loyalty_threshold():
# Arrange
customer_999 = Customer(points=999)
customer_1000 = Customer(points=1000)
basket = Basket(total=Decimal("100.00"))
# Act & Assert
assert price(customer_999, basket) == Decimal("100.00")
assert price(customer_1000, basket) == Decimal("95.00")Why this works: Tests the business rule (threshold boundary) through public outputs. No mocks of internal collaborators. Would fail if the discount logic regresses.
# Using a fake (preferred for behaviour tests)
class FakePaymentGateway:
def __init__(self):
self.captured = []
def capture(self, amount, currency):
self.captured.append((amount, currency))
return PaymentResult(success=True, transaction_id="fake-123")
def test_order_completion_charges_correct_amount():
gateway = FakePaymentGateway()
service = OrderService(gateway)
service.complete_order(order_id="ord-1")
assert gateway.captured == [(Decimal("49.99"), "USD")]# Using a mock (only when the interaction itself is the behaviour)
class MockEmailGateway:
def __init__(self):
self.sent = []
def send_welcome(self, email):
self.sent.append(email)
def test_registration_sends_welcome_email():
gateway = MockEmailGateway()
service = UserService(gateway)
service.register("alice@example.com")
assert "alice@example.com" in gateway.sentRule of thumb: Use fakes when you care about outcomes; use mocks only when you care that a specific side effect happened.
def test_repository_persists_order_with_line_items(db_session):
# Arrange
repo = OrderRepository(db_session)
order = Order(customer_id="cust-1", items=[
LineItem(sku="SKU-123", qty=2, price=Decimal("10.00"))
])
# Act
repo.save(order)
# Assert
loaded = repo.get_by_id(order.id)
assert len(loaded.items) == 1
assert loaded.items[0].sku == "SKU-123"Why this works: Tests the real persistence boundary (ORM mapping, query, serialization) rather than mocking the database.
| Level | Use When |
|---|---|
| Unit-style | Domain rules, calculations, branching logic, state transitions, parsing, input combinations |
| Narrow integration | Translation across boundaries: repository queries, ORM mappings, serialization, framework wiring, adapters |
| Contract | Two deployable components evolve independently; risk is interface mismatch |
| End-to-end | Critical user journeys, cross-system smoke checks; keep sparse and fast |
Do mock at true boundaries: out-of-process, slow, nondeterministic, shared, costly, or unsafe dependencies.
Do not mock internal collaborators that are merely part of the unit's implementation. If several small classes together realise one cohesive behaviour, test them together.
Preferences:
Architecture rule: Mock or fake types you own. Wrap third-party libraries behind your own adapter before substituting.
Use when example lists become unwieldy and behaviour is better expressed as an invariant: ordering preservation, idempotence, reversibility, monotonicity, conservation, parser round-trips, commutativity, state machine properties.
Avoid when the property is too weak, too broad, or secretly re-implements the algorithm under test.
Default to public APIs or public seams. Tests are just another client.
Testing internal behaviour is justified only after extracting a meaningful abstraction that now has its own public contract. Do not make private members public solely for tests.
Stop when:
Continue when:
See references/test-smells.md for detailed symptoms and fixes for:
See references/ for deeper material:
See templates/ for reusable formats: