Enable Supervisor API background mode for long-running agent tasks. Use when: (1) Agent needs to run tasks longer than HTTP timeout limits, (2) User says 'background mode', 'long-running', 'supervisor api', (3) Converting from streaming to background polling pattern, (4) Agent needs resilience to connection drops during execution.
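The background polling pattern named in this description can be sketched as follows. This is a minimal illustration only: `StubSupervisorClient`, its method names, and its status values are assumptions standing in for the skill's actual Supervisor API client, which is not shown in this review.

```python
import time

class StubSupervisorClient:
    """Hypothetical stand-in for a Supervisor API client; the real skill's
    client, endpoints, and status strings may differ."""

    def __init__(self):
        self._polls = 0

    def create_task(self, prompt):
        # Background mode: start the task and return immediately,
        # instead of holding a streaming HTTP connection open.
        return {"id": "task_1", "status": "queued"}

    def retrieve_task(self, task_id):
        # Simulate a task that completes after three polls.
        self._polls += 1
        if self._polls < 3:
            return {"id": task_id, "status": "in_progress"}
        return {"id": task_id, "status": "completed", "output": "done"}

def run_in_background(client, prompt, interval=0.01, timeout=5.0, max_retries=3):
    """Create a task, then poll until it reaches a terminal status.

    Retrying on ConnectionError is what gives the pattern its resilience
    to connection drops during execution."""
    task = client.create_task(prompt)
    deadline = time.monotonic() + timeout
    retries = 0
    while time.monotonic() < deadline:
        try:
            task = client.retrieve_task(task["id"])
            retries = 0  # reset the budget after a successful poll
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            time.sleep(interval)
            continue
        if task["status"] in ("completed", "failed", "cancelled"):
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task['id']} did not finish within {timeout}s")

result = run_in_background(StubSupervisorClient(), "summarize the logs")
print(result["status"])  # completed
```

The key design point the skill trades on: the client's only long-lived state is the task id, so a dropped connection costs one failed poll rather than the whole run.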
Overall score: 83
Evals: Pending (no eval scenarios have been run). Issues: Passed, no known issues.
Optimize this skill with Tessl:

```shell
npx tessl skill review --optimize ./agent-openai-agents-sdk-multiagent/.claude/skills/supervisor-api-background-mode/SKILL.md
```

## Quality
### Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured description with excellent completeness and distinctiveness. The explicit 'Use when' clause with four numbered scenarios provides clear trigger guidance. The main weakness is that the 'what' portion could list more concrete actions beyond just 'Enable' to better convey the full scope of the skill's capabilities.
**Suggestions**

- Expand the capability description to list more specific actions, e.g., 'Enable Supervisor API background mode, configure polling intervals, handle task status checks, and manage reconnection for long-running agent tasks.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Supervisor API background mode) and a key action ('Enable'), but it doesn't list multiple concrete actions beyond enabling. The 'Use when' clauses describe scenarios rather than additional specific capabilities like configuring polling, handling reconnection, or setting timeouts. | 2 / 3 |
| Completeness | Clearly answers both 'what' (enable Supervisor API background mode for long-running agent tasks) and 'when' with an explicit numbered list of four trigger scenarios. The 'Use when' clause is explicit and well-structured. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would actually say: 'background mode', 'long-running', 'supervisor api', 'connection drops', 'streaming', 'background polling pattern', and 'HTTP timeout'. These cover multiple natural variations of how a user might describe this need. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche around Supervisor API background mode specifically. The combination of 'supervisor api', 'background mode', 'HTTP timeout limits', and 'background polling pattern' creates a very specific trigger profile unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **11 / 12 (Passed)** |
### Implementation: 70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill excels at actionability with fully executable code and clear workflow sequencing including error handling and validation. However, it suffers significantly from being a monolithic document — two complete Python modules, an architecture diagram, a comparison table, gotchas, and testing instructions are all inline, making it very token-heavy. The content would benefit greatly from splitting code into bundle files and keeping SKILL.md as a concise overview with references.
**Suggestions**

- Move the complete Python code for utils.py and agent.py into bundle files and reference them from SKILL.md (e.g., 'See [agent_server/utils.py](agent_server/utils.py) for the polling implementation')
- Trim the comparison table and architecture diagram to essential differences only; Claude can infer most of the base vs. background mode distinctions from the code
- Consider moving the testing section (curl commands and expected output) into a separate TESTING.md file referenced from the main skill
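To make the suggested split concrete, a refactored SKILL.md section might reference bundle files like this. The `agent_server/` paths and the TESTING.md name are assumptions drawn from the suggestions above, not the skill's actual layout:

```markdown
## Polling implementation

See [agent_server/utils.py](agent_server/utils.py) for the polling loop
and [agent_server/agent.py](agent_server/agent.py) for the server wiring.

## Testing

See [TESTING.md](TESTING.md) for the curl commands and expected log output.
```

Keeping SKILL.md to short sections with links like these lets an agent load only the file it needs at each step, which is the progressive-disclosure behavior the 1/3 score above is pointing at.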
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long with extensive inline code that could potentially be referenced from separate files. The architecture diagram, comparison table, and logging details add useful context but contribute to a large token footprint. Some explanatory comments within the code are helpful but could be trimmed. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready Python code for both utils.py and agent.py, complete curl commands for testing, and concrete expected log output. Every step has specific, runnable code with real library imports and API calls. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (Step 1: utils.py, Step 2: agent.py), with explicit prerequisites, a 'Before Starting' user interaction step, detailed gotchas covering error scenarios (incomplete items, MCP approval flow), and testing/validation steps with expected output. The polling loop itself has built-in error recovery (retry on retrieve failure). | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of text with ~300+ lines of inline code that could be split into separate referenced files. There are no bundle files provided, yet the content includes two complete Python modules inline. The architecture diagram, comparison table, gotchas, and testing sections are all crammed into a single file with no references to supporting documents. | 1 / 3 |
| **Total** | | **9 / 12 (Passed)** |
### Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

**Validation for skill structure: 10 / 11 passed**
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (505 lines); consider splitting into references/ and linking | Warning |
| **Total** | | **10 / 11 (Passed)** |
Reviewed commit: `dfeb4ac`