Enable Supervisor API background mode for long-running agent tasks. Use when: (1) Agent needs to run tasks longer than HTTP timeout limits, (2) User says 'background mode', 'long-running', 'supervisor api', (3) Converting from streaming to background polling pattern, (4) Agent needs resilience to connection drops during execution.
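The background polling pattern named in this description can be sketched as follows. This is a minimal illustration only: `StubSupervisorClient`, its method names, and its status values are assumptions standing in for the skill's actual Supervisor API client, which is not shown in this review.

```python
import time

class StubSupervisorClient:
    """Hypothetical stand-in for a Supervisor API client; the real skill's
    client, endpoints, and status strings may differ."""

    def __init__(self):
        self._polls = 0

    def create_task(self, prompt):
        # Background mode: start the task and return immediately,
        # instead of holding a streaming HTTP connection open.
        return {"id": "task_1", "status": "queued"}

    def retrieve_task(self, task_id):
        # Simulate a task that completes after three polls.
        self._polls += 1
        if self._polls < 3:
            return {"id": task_id, "status": "in_progress"}
        return {"id": task_id, "status": "completed", "output": "done"}

def run_in_background(client, prompt, interval=0.01, timeout=5.0, max_retries=3):
    """Create a task, then poll until it reaches a terminal status.

    Retrying on ConnectionError is what gives the pattern its resilience
    to connection drops during execution."""
    task = client.create_task(prompt)
    deadline = time.monotonic() + timeout
    retries = 0
    while time.monotonic() < deadline:
        try:
            task = client.retrieve_task(task["id"])
            retries = 0  # reset the budget after a successful poll
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise
            time.sleep(interval)
            continue
        if task["status"] in ("completed", "failed", "cancelled"):
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task['id']} did not finish within {timeout}s")

result = run_in_background(StubSupervisorClient(), "summarize the logs")
print(result["status"])  # completed
```

The key design point the skill trades on: the client's only long-lived state is the task id, so a dropped connection costs one failed poll rather than the whole run.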
Overall score: 83
Evals: Pending (no eval scenarios have been run). Issues: Passed, no known issues.
Optimize this skill with Tessl:

```shell
npx tessl skill review --optimize ./agent-openai-agents-sdk-multiagent/.claude/skills/supervisor-api-background-mode/SKILL.md
```

## Quality
### Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured description with excellent completeness and distinctiveness. The explicit 'Use when' clause with four numbered scenarios provides clear trigger guidance. The main weakness is that the 'what' portion could list more concrete actions beyond just 'Enable' to better convey the full scope of the skill's capabilities.
**Suggestions**

- Expand the capability description to list more specific actions, e.g., 'Enable Supervisor API background mode, configure polling intervals, handle task status checks, and manage reconnection for long-running agent tasks.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (Supervisor API background mode) and a key action ('Enable'), but it doesn't list multiple concrete actions beyond enabling. The 'Use when' clauses describe scenarios rather than additional specific capabilities like configuring polling, handling reconnection, or setting timeouts. | 2 / 3 |
| Completeness | Clearly answers both 'what' (enable Supervisor API background mode for long-running agent tasks) and 'when' with an explicit numbered list of four trigger scenarios. The 'Use when' clause is explicit and well-structured. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would actually say: 'background mode', 'long-running', 'supervisor api', 'connection drops', 'streaming', 'background polling pattern', and 'HTTP timeout'. These cover multiple natural variations of how a user might describe this need. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with a clear niche around Supervisor API background mode specifically. The combination of 'supervisor api', 'background mode', 'HTTP timeout limits', and 'background polling pattern' creates a very specific trigger profile unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **11 / 12 (Passed)** |
### Implementation: 70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill excels at actionability with fully executable code and clear workflow sequencing including error handling and validation. However, it suffers significantly from being a monolithic document — two complete Python modules, an architecture diagram, a comparison table, gotchas, and testing instructions are all inline, making it very token-heavy. The content would benefit greatly from splitting code into bundle files and keeping SKILL.md as a concise overview with references.
**Suggestions**

- Move the complete Python code for utils.py and agent.py into bundle files and reference them from SKILL.md (e.g., 'See [agent_server/utils.py](agent_server/utils.py) for the polling implementation')
- Trim the comparison table and architecture diagram to essential differences only; Claude can infer most of the base vs. background mode distinctions from the code
- Consider moving the testing section (curl commands and expected output) into a separate TESTING.md file referenced from the main skill
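To make the suggested split concrete, a refactored SKILL.md section might reference bundle files like this. The `agent_server/` paths and the TESTING.md name are assumptions drawn from the suggestions above, not the skill's actual layout:

```markdown
## Polling implementation

See [agent_server/utils.py](agent_server/utils.py) for the polling loop
and [agent_server/agent.py](agent_server/agent.py) for the server wiring.

## Testing

See [TESTING.md](TESTING.md) for the curl commands and expected log output.
```

Keeping SKILL.md to short sections with links like these lets an agent load only the file it needs at each step, which is the progressive-disclosure behavior the 1/3 score above is pointing at.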
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long with extensive inline code that could potentially be referenced from separate files. The architecture diagram, comparison table, and logging details add useful context but contribute to a large token footprint. Some explanatory comments within the code are helpful but could be trimmed. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste ready Python code for both utils.py and agent.py, complete curl commands for testing, and concrete expected log output. Every step has specific, runnable code with real library imports and API calls. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (Step 1: utils.py, Step 2: agent.py), with explicit prerequisites, a 'Before Starting' user interaction step, detailed gotchas covering error scenarios (incomplete items, MCP approval flow), and testing/validation steps with expected output. The polling loop itself has built-in error recovery (retry on retrieve failure). | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of text with ~300+ lines of inline code that could be split into separate referenced files. There are no bundle files provided, yet the content includes two complete Python modules inline. The architecture diagram, comparison table, gotchas, and testing sections are all crammed into a single file with no references to supporting documents. | 1 / 3 |
| **Total** | | **9 / 12 (Passed)** |
### Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

**Validation for skill structure: 10 / 11 passed**
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (505 lines); consider splitting into references/ and linking | Warning |
| **Total** | | **10 / 11 (Passed)** |
Reviewed commit: `dfeb4ac`