Enable Supervisor API background mode for long-running agent tasks. Use when: (1) Agent needs to run tasks longer than HTTP timeout limits, (2) User says 'background mode', 'long-running', 'supervisor api', (3) Converting from streaming to background polling pattern, (4) Agent needs resilience to connection drops during execution.
62
73%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./agent-openai-agents-sdk-multiagent/.claude/skills/supervisor-api-background-mode/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured description with excellent trigger coverage and completeness, featuring an explicit 'Use when' clause with four distinct trigger scenarios. Its main weakness is that the 'what' portion could be more specific about the concrete actions performed (e.g., configuring polling, handling reconnection logic, setting up background job submission). Overall it's a strong description that would perform well in skill selection.
Suggestions
Expand the 'what' portion with more concrete actions, e.g., 'Configures background job submission, implements polling loops, handles reconnection logic' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Supervisor API background mode) and the general action (enable background mode for long-running tasks), but doesn't list multiple concrete actions like 'configure polling intervals, handle reconnection, set timeout parameters'. The specific actions are more about when to use it than what it concretely does. | 2 / 3 |
| Completeness | Clearly answers both 'what' (enable Supervisor API background mode for long-running agent tasks) and 'when' with an explicit numbered list of four trigger conditions. The 'Use when' clause is explicit and well-structured. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'background mode', 'long-running', 'supervisor api', 'streaming', 'background polling pattern', 'connection drops', 'HTTP timeout'. These cover a good range of terms a user would naturally say when needing this skill. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche around Supervisor API background mode specifically. The triggers are specific enough (background polling, supervisor API, HTTP timeout limits) that this is unlikely to conflict with general API or task management skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability with complete, executable code and thorough testing instructions. However, it suffers from being monolithic — the full implementations of two Python files are inlined, making it very long and token-heavy. Adding explicit validation checkpoints between steps and splitting the code into referenced bundle files would significantly improve both workflow clarity and progressive disclosure.
Suggestions
Move the full utils.py and agent.py code into bundle files and reference them from SKILL.md, keeping only key snippets inline to illustrate the critical differences from the base supervisor-api pattern.
Add explicit validation checkpoints between steps, e.g., 'After creating utils.py, verify the import works: `python -c "from agent_server.utils import create_supervisor_client"`' before proceeding to Step 2.
Restructure the skill with a concise overview section showing the key API changes (background=True, polling pattern) and move the detailed gotchas and testing sections to a separate reference file.
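To illustrate the kind of short snippet the last suggestion has in mind, here is a minimal sketch of a submit-then-poll loop. It is not the skill's actual utils.py or agent.py, and it does not call the real Supervisor API: `FakeBackend`, `poll_until_done`, and the status names are hypothetical stand-ins chosen only to show the shape of the pattern.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal status names; the real API may use different ones.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


class FakeBackend:
    """Hypothetical in-memory stand-in for a background-capable API."""

    def __init__(self, polls_needed: int = 2) -> None:
        self.polls_needed = polls_needed
        self.calls = 0

    def submit(self, prompt: str, background: bool = True) -> str:
        # A real client would start the task and return its id immediately.
        return "task-1"

    def retrieve(self, task_id: str) -> Dict[str, Any]:
        # Report "in_progress" until enough polls have happened.
        self.calls += 1
        done = self.calls > self.polls_needed
        return {"id": task_id, "status": "completed" if done else "in_progress"}


def poll_until_done(
    retrieve: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `retrieve` until the task reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = retrieve(task_id)
        if task["status"] in TERMINAL_STATES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {task['status']}")
        sleep(interval_s)
```

Because each poll is an independent short request, a dropped connection costs only one retry rather than the whole task, which is the resilience the skill description refers to.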
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long with extensive inline code that could potentially be referenced from separate files. The comparison table and architecture diagram add value, but the verbose logging throughout the code examples and some explanatory text (e.g., explaining what background mode does when the description already covers it) add unnecessary tokens. However, it mostly avoids explaining concepts Claude already knows. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready Python code for both utils.py and agent.py, complete curl commands for testing, and specific configuration values. The code is complete with imports, type hints, and error handling, not pseudocode. | 3 / 3 |
| Workflow Clarity | The steps are clearly sequenced (Step 1: utils.py, Step 2: agent.py, then testing), and the gotchas section addresses important edge cases. However, there are no explicit validation checkpoints between steps: no 'verify the polling works before proceeding' or 'validate your utils.py imports correctly before updating agent.py' steps. For a multi-step process involving async polling and streaming conversion, validation/verification steps between stages would be important. | 2 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of content with ~300+ lines of inline code. The full implementations of utils.py and agent.py are embedded directly rather than being referenced as separate files. There are references to the 'supervisor-api' skill for prerequisites, but the skill itself has no internal progressive disclosure; everything is dumped inline with no separation of overview from detailed implementation. | 1 / 3 |
| Total | | 8 / 12 Passed |
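The validation checkpoints that the Workflow Clarity row asks for could be as small as an import check run between steps. The sketch below is one possible shape, not something taken from the skill: `check_import` is a hypothetical helper, and the module and attribute names passed to it would come from the skill's own files (for example `agent_server.utils` and `create_supervisor_client`, as named in the suggestion above).

```python
import importlib
from typing import Tuple


def check_import(module_name: str, attr_name: str) -> Tuple[bool, str]:
    """Return (ok, message) for whether module_name.attr_name can be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, f"cannot import {module_name}: {exc}"
    if not hasattr(module, attr_name):
        return False, f"{module_name} has no attribute {attr_name}"
    return True, f"{module_name}.{attr_name} is importable"
```

Running such a check after Step 1 and stopping on a failed result would give an agent an explicit signal before it starts editing agent.py.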
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (505 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
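A check like skill_md_line_count can be approximated locally before submitting a skill for review. The sketch below is an assumption-laden illustration, not Tessl's validator: the function name is hypothetical and the 500-line threshold is a guess, since the report only states that 505 lines counts as long.

```python
from typing import Optional


def line_count_warning(text: str, max_lines: int = 500) -> Optional[str]:
    """Return a warning string if text exceeds max_lines, otherwise None.

    max_lines=500 is an assumed threshold, not a documented one.
    """
    count = len(text.splitlines())
    if count > max_lines:
        return f"SKILL.md is long ({count} lines); consider splitting into references/"
    return None
```

Passing the contents of SKILL.md to this function after moving the utils.py and agent.py listings into separate files would show whether the split brought the file under the assumed limit.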
1c88215