supervisor-api-background-mode

Enable Supervisor API background mode for long-running agent tasks. Use when: (1) Agent needs to run tasks longer than HTTP timeout limits, (2) User says 'background mode', 'long-running', 'supervisor api', (3) Converting from streaming to background polling pattern, (4) Agent needs resilience to connection drops during execution.
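The conversion the description refers to, from a single long-lived streaming request to a submit-then-poll pattern, can be sketched as follows. This is an illustrative, self-contained mock, not the Supervisor API's actual client: submit_task, poll_until_done, and the task fields are hypothetical stand-ins for the real endpoints.

```python
import threading
import time

# In-memory stand-in for a background-mode task API. The real Supervisor API
# endpoints, field names, and client calls may differ; this only illustrates
# the submit-then-poll pattern that replaces one long streaming request.
_tasks: dict[str, dict] = {}

def submit_task(prompt: str) -> str:
    """Return immediately with a task id; the work continues in the background."""
    task_id = f"task-{len(_tasks)}"
    _tasks[task_id] = {"state": "running", "result": None}

    def worker() -> None:
        time.sleep(0.1)  # stand-in for an agent run longer than an HTTP timeout
        _tasks[task_id] = {"state": "completed", "result": prompt.upper()}

    threading.Thread(target=worker, daemon=True).start()
    return task_id

def poll_until_done(task_id: str, interval: float = 0.05) -> dict:
    """Check status on a short interval instead of holding one connection open.

    A dropped connection between polls costs nothing: the next poll resumes.
    """
    while True:
        status = _tasks[task_id]
        if status["state"] in ("completed", "failed"):
            return status
        time.sleep(interval)

result = poll_until_done(submit_task("hello"))
```

Because submission returns immediately, execution time is decoupled from any HTTP timeout, which is the resilience property the trigger scenarios describe.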

Overall score: 83

Quality: 80% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security (by Snyk): Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./agent-openai-agents-sdk-multiagent/.claude/skills/supervisor-api-background-mode/SKILL.md
Quality

Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured description with excellent completeness and distinctiveness. The explicit 'Use when' clause with four numbered scenarios provides clear trigger guidance. The main weakness is that the 'what' portion could list more concrete actions beyond just 'Enable' to better convey the full scope of the skill's capabilities.

Suggestions

Expand the capability description to list more specific actions, e.g., 'Enable Supervisor API background mode, configure polling intervals, handle task status checks, and manage reconnection for long-running agent tasks.'

Dimension scores:

Specificity: 2 / 3

The description names the domain (Supervisor API background mode) and a key action ('Enable'), but it doesn't list multiple concrete actions beyond enabling. The 'Use when' clauses describe scenarios rather than additional specific capabilities like configuring polling, handling reconnection, or setting timeouts.

Completeness: 3 / 3

Clearly answers both 'what' (enable Supervisor API background mode for long-running agent tasks) and 'when' with an explicit numbered list of four trigger scenarios. The 'Use when' clause is explicit and well-structured.

Trigger Term Quality: 3 / 3

Includes strong natural trigger terms users would actually say: 'background mode', 'long-running', 'supervisor api', 'connection drops', 'streaming', 'background polling pattern', and 'HTTP timeout'. These cover multiple natural variations of how a user might describe this need.

Distinctiveness / Conflict Risk: 3 / 3

Highly distinctive with a clear niche around Supervisor API background mode specifically. The combination of 'supervisor api', 'background mode', 'HTTP timeout limits', and 'background polling pattern' creates a very specific trigger profile unlikely to conflict with other skills.

Total: 11 / 12 (Passed)

Implementation: 70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill excels at actionability with fully executable code and clear workflow sequencing including error handling and validation. However, it suffers significantly from being a monolithic document — two complete Python modules, an architecture diagram, a comparison table, gotchas, and testing instructions are all inline, making it very token-heavy. The content would benefit greatly from splitting code into bundle files and keeping SKILL.md as a concise overview with references.

Suggestions

Move the complete Python code for utils.py and agent.py into bundle files and reference them from SKILL.md (e.g., 'See [agent_server/utils.py](agent_server/utils.py) for the polling implementation')

Trim the comparison table and architecture diagram to essential differences only — Claude can infer most of the base vs background mode distinctions from the code

Consider moving the testing section (curl commands and expected output) into a separate TESTING.md file referenced from the main skill

Dimension scores:

Conciseness: 2 / 3

The skill is fairly long with extensive inline code that could be referenced from separate files. The architecture diagram, comparison table, and logging details add useful context but contribute to a large token footprint. Some explanatory comments within the code are helpful but could be trimmed.

Actionability: 3 / 3

The skill provides fully executable, copy-paste ready Python code for both utils.py and agent.py, complete curl commands for testing, and concrete expected log output. Every step has specific, runnable code with real library imports and API calls.

Workflow Clarity: 3 / 3

The workflow is clearly sequenced (Step 1: utils.py, Step 2: agent.py), with explicit prerequisites, a 'Before Starting' user interaction step, detailed gotchas covering error scenarios (incomplete items, MCP approval flow), and testing/validation steps with expected output. The polling loop itself has built-in error recovery (retry on retrieve failure).
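The retry-on-retrieve-failure behavior noted in the Workflow Clarity reasoning can be sketched like this. The retrieve callable, error type, and status fields here are hypothetical stand-ins, not the SDK's actual API:

```python
import time

def poll_with_retry(retrieve, task_id: str, max_retries: int = 3,
                    interval: float = 0.01) -> dict:
    """Poll a task, tolerating transient retrieve failures with a bounded retry budget."""
    failures = 0
    while True:
        try:
            status = retrieve(task_id)  # caller-supplied retrieve function
            failures = 0  # any successful poll resets the retry budget
        except Exception:
            failures += 1
            if failures > max_retries:
                raise  # persistent failure: surface the error to the caller
            time.sleep(interval)
            continue
        if status["state"] in ("completed", "failed"):
            return status
        time.sleep(interval)

# Simulated retrieve: fails twice with a transient error, then reports completion.
calls = {"n": 0}
def flaky_retrieve(task_id: str) -> dict:
    calls["n"] += 1
    if calls["n"] <= 2:
        raise ConnectionError("transient network drop")
    return {"state": "completed"}

result = poll_with_retry(flaky_retrieve, "task-1")
```

Resetting the failure counter on each success means the budget bounds consecutive failures rather than total failures, which is the usual choice for tolerating intermittent connection drops.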

Progressive Disclosure: 1 / 3

The skill is a monolithic wall of text with ~300+ lines of inline code that could be split into separate referenced files. There are no bundle files provided, yet the content includes two complete Python modules inline. The architecture diagram, comparison table, gotchas, and testing sections are all crammed into a single file with no references to supporting documents.

Total: 9 / 12 (Passed)

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Checks:

skill_md_line_count: Warning. SKILL.md is long (505 lines); consider splitting into references/ and linking.

Total: 10 / 11 (Passed)

Repository: databricks/app-templates (Reviewed)
