Use before implementing or refactoring software. Contains two skills: (1) Modular Software Design — for designing module boundaries, APIs, layers, abstractions, services, repositories, adapters, or architecture, helping reduce total system complexity by creating deep modules, hiding implementation knowledge, avoiding leakage and pass-through APIs, comparing alternative designs, documenting interfaces before coding, and critiquing existing architecture; and (2) Software Testing — for writing unit tests, integration tests, or end-to-end tests, creating mocks/stubs/fakes, designing a testing strategy, doing TDD, reviewing test quality, fixing flaky tests, or refactoring test suites, generating risk-focused test plans, picking appropriate test levels, choosing between mocks/fakes/real dependencies, and applying Arrange-Act-Assert patterns with concrete examples.
93
94%
Does it follow best practices?
Impact
92%
1.12xAverage score across 5 eval scenarios
Passed
No known issues
Use this skill whenever you are designing, writing, extending, or reviewing tests. The goal is to produce the smallest suite that gives high confidence about the most likely and most costly failures, not to maximize coverage numbers.
Design tests that:
def test_discount_switches_at_loyalty_threshold():
# Arrange
customer_999 = Customer(points=999)
customer_1000 = Customer(points=1000)
basket = Basket(total=Decimal("100.00"))
# Act & Assert
assert price(customer_999, basket) == Decimal("100.00")
assert price(customer_1000, basket) == Decimal("95.00")Why this works: Tests the business rule (threshold boundary) through public outputs. No mocks of internal collaborators. Would fail if the discount logic regresses.
Use references/test-doubles.md for fake/mock examples and selection rules.
def test_repository_persists_order_with_line_items(db_session):
# Arrange
repo = OrderRepository(db_session)
order = Order(customer_id="cust-1", items=[
LineItem(sku="SKU-123", qty=2, price=Decimal("10.00"))
])
# Act
repo.save(order)
# Assert
loaded = repo.get_by_id(order.id)
assert len(loaded.items) == 1
assert loaded.items[0].sku == "SKU-123"Why this works: Tests the real persistence boundary (ORM mapping, query, serialization) rather than mocking the database.
| Level | Use When |
|---|---|
| Unit-style | Domain rules, calculations, branching logic, state transitions, parsing, input combinations |
| Narrow integration | Translation across boundaries: repository queries, ORM mappings, serialization, framework wiring, adapters |
| Contract | Two deployable components evolve independently; risk is interface mismatch |
| End-to-end | Critical user journeys, cross-system smoke checks; keep sparse and fast |
Use references/test-doubles.md for double selection rules. Short version: substitute true boundaries and owned adapters; avoid mocking internal choreography.
Use references/property-based-testing.md when example lists are unwieldy and behaviour is better expressed as an invariant.
Default to public APIs or public seams. Tests are just another client.
Testing internal behaviour is justified only after extracting a meaningful abstraction that now has its own public contract. Do not make private members public solely for tests.
Stop when:
Continue when:
See references/test-smells.md for detailed symptoms and fixes for:
See references/ for deeper material:
See reusable formats: