Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
Works with or without the aoc CLI installed. Issue tracking can be done via direct file editing.
Provide structured guidance for test design, execution, and analysis that goes beyond baseline capture. This skill covers test strategy during planning, incremental testing during implementation, and coverage analysis.
```shell
# Python (uv/pytest)
uv run pytest                               # Run all tests
uv run pytest tests/ -v                     # Verbose output
uv run pytest tests/ -m "not slow"          # Skip slow tests
uv run pytest tests/ --tb=short -q          # Quick summary
uv run pytest --cov=src --cov-report=html   # Coverage report

# TypeScript/Node (vitest/jest)
npm run test                                # Run all tests
npm run test -- --coverage                  # With coverage

# .NET (dotnet test)
dotnet test                                 # Run all tests
dotnet test --collect:"XPlat Code Coverage" # With coverage
```

| Operation | How to Do It |
|---|---|
| Create test issue | Append to .agent/issues/medium.md with type TEST |
| Create bug from failure | Append to .agent/issues/high.md with type BUG |
| Log test results | Edit issue's ### Log section in priority file |
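As a sketch of the direct-file workflow above, the helper below appends an issue entry and bumps the counter file. The entry format and the plain-integer `.counter` layout are assumptions for illustration, not a documented format:

```python
from pathlib import Path
import tempfile

def append_issue(root: Path, priority: str, type_: str, title: str) -> str:
    """Append an issue entry to .agent/issues/<priority>.md and bump .counter."""
    issues = root / ".agent" / "issues"
    issues.mkdir(parents=True, exist_ok=True)
    counter = issues / ".counter"
    n = (int(counter.read_text()) if counter.exists() else 0) + 1
    counter.write_text(str(n))
    issue_id = f"{type_}-{n:04d}"
    entry = f"\n## {issue_id}: {title}\nType: {type_}\n\n### Log\n"
    with (issues / f"{priority}.md").open("a", encoding="utf-8") as f:
        f.write(entry)
    return issue_id

# Demo in a throwaway directory (never the real project folder)
root = Path(tempfile.mkdtemp())
print(append_issue(root, "medium", "TEST", "Add tests for parser"))  # TEST-0001
```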
Issue counter: `.agent/issues/.counter`

When the aoc CLI is detected in `.agent/tools.json`, the following commands provide convenience shortcuts:
| Operation | Command |
|---|---|
| Create test issue | aoc issues create --type TEST --title "Add tests for..." |
| Create bug from failure | aoc issues create --type BUG --priority high --title "Test failure: ..." |
| Log test results | aoc issues update <ID> --log "Tests: 45 pass, 2 fail" |
Tests must NEVER create, modify, or delete files in the project folder.
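One way to honor this rule is to mock the write path entirely. A minimal sketch; `save_report` is a hypothetical function standing in for production code that writes inside the project tree:

```python
from pathlib import Path
from unittest import mock

def save_report(text: str) -> None:
    # Hypothetical production code that writes inside the project folder
    Path(".agent/report.md").write_text(text)

def test_save_report_is_isolated():
    # Patch Path.write_text so nothing touches the real filesystem
    with mock.patch.object(Path, "write_text") as write_text:
        save_report("summary")
        write_text.assert_called_once_with("summary")

test_save_report_is_isolated()
print("ok")
```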
For tests that touch the filesystem, use `unittest.mock.patch` for `Path`, `open()`, and other file operations, or the pytest `tmp_path` fixture (auto-cleaned).

```python
# ❌ NEVER do this - pollutes the project folder
Path(".agent/test.md").write_text("test")
Path("src/data/fixture.json").write_text("{}")
open("tests/output.log", "w").write("log")

# ✅ Always use tmp_path
def test_example(tmp_path):
    test_file = tmp_path / "test.md"
    test_file.write_text("test")  # Auto-cleaned
```

All test file operations must go through `tmp_path` or mocks.

Prerequisites:
- `.agent/constitution.md` exists with a confirmed test command
- `.agent/baseline.md` exists (for comparison)

Identify test levels needed:
Define test cases from requirements:
Document in task/plan:
```markdown
## Test Strategy
- Unit: [list of unit test cases]
- Integration: [list of integration scenarios]
- Edge cases: [specific edge cases to cover]
- Not testing: [explicitly excluded with rationale]
```

After each implementation step:
```shell
# Run specific test file
<test-runner> path/to/test_file.py

# Run tests matching pattern
<test-runner> -k "test_feature_name"

# Run with coverage
<test-runner> --coverage

# Run failed tests only (re-run)
<test-runner> --failed
```

Actual commands must come from the constitution.
Coverage requirements scale with confidence level:
| Confidence | Line Coverage | Branch Coverage | Enforcement |
|---|---|---|---|
| LOW | ≥90% on changed code | ≥85% on changed code | HARD — blocks completion |
| NORMAL | ≥80% on changed code | ≥70% on changed code | SOFT — warning if missed |
| HIGH | Tests pass | N/A | MINIMAL — existing tests only |
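The gate in the table can be expressed as a small function. The name `coverage_gate` and the PASS/WARN/FAIL strings are illustrative; the thresholds are taken from the table above:

```python
def coverage_gate(confidence: str, line: float, branch: float) -> str:
    """Evaluate changed-code coverage against the per-confidence thresholds."""
    thresholds = {
        "LOW":    (90.0, 85.0, "HARD"),  # blocks completion
        "NORMAL": (80.0, 70.0, "SOFT"),  # warning only
    }
    if confidence == "HIGH":
        return "PASS"  # minimal enforcement: existing tests only
    line_min, branch_min, mode = thresholds[confidence]
    if line >= line_min and branch >= branch_min:
        return "PASS"
    return "FAIL" if mode == "HARD" else "WARN"

print(coverage_gate("LOW", 92.0, 86.0))     # PASS
print(coverage_gate("NORMAL", 75.0, 60.0))  # WARN
```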
Rationale:
Enforcement:
```
🎯 COVERAGE CHECK — {CONFIDENCE} Confidence
Required: ≥{line_threshold}% line, ≥{branch_threshold}% branch
Actual: {actual_line}% line, {actual_branch}% branch

[PASS] Coverage meets threshold
— OR —
[FAIL] Coverage below threshold — must add tests before completion
```

For LOW confidence failures:
| Metric | Target | Notes |
|---|---|---|
| Line coverage | ≥80% for new code | Not a hard rule; quality over quantity |
| Branch coverage | Critical paths covered | Focus on decision points |
| Uncovered lines | Document rationale | Some code legitimately untestable |
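To document uncovered-line rationale systematically, one option is to flag files below target from a coverage JSON report. The nested keys here mirror coverage.py's `json` report shape, but treat them as an assumption and verify against your tool's actual output:

```python
def files_below_target(report: dict, target: float = 80.0) -> list[tuple[str, float]]:
    """Return (path, percent) pairs for files under the line-coverage target."""
    out = []
    for path, data in report.get("files", {}).items():
        pct = data["summary"]["percent_covered"]
        if pct < target:
            out.append((path, pct))
    return sorted(out, key=lambda item: item[1])  # worst-covered first

# Synthetic report in the assumed shape of coverage.py's JSON output
report = {
    "files": {
        "src/parser.py": {"summary": {"percent_covered": 91.2}},
        "src/cli.py": {"summary": {"percent_covered": 64.5}},
    }
}
print(files_below_target(report))  # [('src/cli.py', 64.5)]
```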
When tests fail unexpectedly, invoke agent-ops-debugging:
Apply systematic debugging process:
Categorize the failure:
| Category | Evidence | Action |
|---|---|---|
| Agent's change | Test passed in baseline | Fix the change |
| Pre-existing | Test failed in baseline | Document, create issue |
| Flaky | Intermittent, no code change | Fix test or document |
| Environment | Works elsewhere | Check constitution assumptions |
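The categorization rules in the table can be sketched as a small function. The evidence flags and the order of checks are one interpretation, not a prescribed algorithm:

```python
def categorize_failure(
    failed_in_baseline: bool,
    intermittent: bool,
    works_elsewhere: bool,
) -> str:
    """Map the evidence from the failure table to a category."""
    if intermittent:
        return "flaky"          # intermittent, no code change
    if failed_in_baseline:
        return "pre_existing"   # document, create issue
    if works_elsewhere:
        return "environment"    # check constitution assumptions
    return "agent_change"       # passed in baseline, so fix the change

print(categorize_failure(False, False, False))  # agent_change
print(categorize_failure(True, False, False))   # pre_existing
```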
Handoff decision:
```
🔍 Test failure analysis:
- Test: {test_name}
- Category: {agent_change | pre_existing | flaky | environment}
- Root cause: {diagnosis}

Next steps:
1. Fix and re-run (if agent's change)
2. Create issue and continue (if pre-existing)
3. Deep dive with /agent-debug (if unclear)
```

After test activities, update:
- `.agent/focus.md`: test results summary
- `.agent/baseline.md`: if establishing a new baseline

After test analysis, invoke the agent-ops-tasks discovery procedure:
Collect test-related findings:
- BUG (high)
- TEST (medium)
- CHORE (medium)
- REFAC (low)

Present to user:
```
📋 Test analysis found {N} items:
High:
- [BUG] Flaky test: PaymentService.processAsync (failed 2/10 runs)
Medium:
- [TEST] Missing coverage for error handling in UserController
- [TEST] No edge case tests for empty input scenarios
Low:
- [REFAC] Tests have excessive mocking in OrderService.test.ts
Create issues for these? [A]ll / [S]elect / [N]one
```

After creating issues:
```
Created {N} test-related issues. What's next?
1. Start fixing highest priority (BUG-0024@abc123 - flaky test)
2. Continue with current work
3. Review test coverage report
```