Push decisions to Arbiter Zebu for async human review. Use when you need human input on plans, architectural choices, or approval before proceeding.
59
70%
Does it follow best practices?
Impact
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./public/skills/5hanth/arbiter/SKILL.md
Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is functional with a clear 'what' and 'when' structure, and its reference to a specific system (Arbiter Zebu) makes it distinctive. However, it could benefit from more specific concrete actions and broader trigger term coverage to help Claude match it in more varied user requests.
Suggestions
Add more specific concrete actions, e.g., 'Creates review requests, tracks pending decisions, formats proposals for human reviewers'
Expand trigger terms to include natural variations like 'need feedback', 'waiting for sign-off', 'decision review', 'human approval', or 'blocked on decision'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (async human review) and some actions (push decisions, get human input on plans/architectural choices/approval), but doesn't list multiple concrete actions like creating review requests, tracking decisions, or formatting proposals. | 2 / 3 |
| Completeness | Clearly answers both what ('Push decisions to Arbiter Zebu for async human review') and when ('Use when you need human input on plans, architectural choices, or approval before proceeding') with an explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'human input', 'plans', 'architectural choices', 'approval', but misses natural variations users might say such as 'review', 'decision', 'sign-off', 'feedback', 'block on human', or 'wait for approval'. The term 'Arbiter Zebu' is a proper noun that users would need to know. | 2 / 3 |
| Distinctiveness Conflict Risk | The specific mention of 'Arbiter Zebu' as a named system and the focus on async human review/approval creates a clear niche that is unlikely to conflict with other skills. The combination of decision-making, human review, and async workflow is quite distinctive. | 3 / 3 |
| Total | 10 / 12 Passed | |
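The trigger-term gap noted in the table can be checked mechanically. The following is a minimal sketch, assuming plain case-insensitive substring matching as a rough proxy for coverage; it is not how Tessl or any agent actually scores or selects a skill. The function name `missing_trigger_terms` and the candidate list are illustrative, with the terms taken from the suggestions above.

```python
from typing import Iterable, List


def missing_trigger_terms(description: str, terms: Iterable[str]) -> List[str]:
    """Return the terms that do not appear (case-insensitively) in the description."""
    text = description.lower()
    return [term for term in terms if term.lower() not in text]


# The skill description as quoted at the top of this review.
DESCRIPTION = (
    "Push decisions to Arbiter Zebu for async human review. "
    "Use when you need human input on plans, architectural choices, "
    "or approval before proceeding."
)

# Two terms the review says are present, plus the suggested additions.
CANDIDATE_TERMS = [
    "human input",
    "approval",
    "need feedback",
    "waiting for sign-off",
    "decision review",
    "human approval",
    "blocked on decision",
]

if __name__ == "__main__":
    for term in missing_trigger_terms(DESCRIPTION, CANDIDATE_TERMS):
        print(f"not covered: {term}")
```

Substring matching will miss paraphrases and inflections, so a term reported as missing is only a prompt to reread the description, not a defect in itself.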
Implementation
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with excellent concrete examples and CLI commands that are copy-paste ready. Its main weaknesses are moderate verbosity (installation options, usage guidance sections) and a lack of explicit workflow sequencing with validation steps. The content would benefit from being split into a concise overview and a detailed reference file.
Suggestions
Add an explicit workflow sequence (e.g., '1. Push plan → 2. Poll status → 3. Get answers → 4. Process notifications') with validation checkpoints at each step.
Move the detailed field reference tables and return value schemas into a separate REFERENCE.md to keep SKILL.md as a concise overview.
Remove or significantly trim the 'When to Use' / 'Do NOT use for' section — Claude can infer appropriate usage from the description and examples.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly well-structured but includes some unnecessary content like the 'When to Use' / 'Do NOT use for' sections that explain judgment calls Claude can make itself, and the installation section is verbose with three different methods. The tables and examples are useful but could be tighter. | 2 / 3 |
| Actionability | Excellent actionability: every tool has a concrete, copy-paste-ready CLI command with full JSON examples, clearly documented fields with required/optional markers, and explicit return value JSON. The usage examples are complete and executable bash scripts. | 3 / 3 |
| Workflow Clarity | The push → status → get workflow is implicitly clear from the examples, but there's no explicit sequenced workflow with validation checkpoints. Example 2 shows a check-and-proceed pattern, but there's no explicit error recovery loop for common failure modes (e.g., what to do if push fails, if the queue directory doesn't exist). | 2 / 3 |
| Progressive Disclosure | The content is well-organized with clear sections and tables, but it's a long monolithic document (~200 lines) that could benefit from splitting detailed API reference (field tables, return schemas) into a separate REFERENCE.md. The 'See Also' links point to external repos but no bundle files support the skill. | 2 / 3 |
| Total | 9 / 12 Passed | |
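The explicit sequence the review asks for (push, then poll status, then get answers, with a checkpoint after each step) might look roughly like the sketch below. The skill's actual CLI commands are not reproduced in this review, so `push`, `status`, and `get_answers` are hypothetical callables supplied by the caller, and the status strings `"pending"`, `"in_progress"`, and `"completed"` are assumptions rather than the skill's documented values.

```python
import time
from typing import Any, Callable, Dict


class WorkflowError(RuntimeError):
    """Raised when a checkpoint in the push -> status -> get sequence fails."""


def run_review(
    push: Callable[[], str],                          # hypothetical: returns a plan id
    status: Callable[[str], str],                     # hypothetical: returns a status string
    get_answers: Callable[[str], Dict[str, Any]],     # hypothetical: returns the answers
    poll_interval: float = 5.0,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    # Step 1: push the plan. Checkpoint: a non-empty id must come back.
    plan_id = push()
    if not plan_id:
        raise WorkflowError("push failed: no plan id returned")

    # Step 2: poll status. Checkpoint: only known states are accepted.
    for _ in range(max_polls):
        state = status(plan_id)
        if state == "completed":
            break
        if state not in ("pending", "in_progress"):
            raise WorkflowError(f"unexpected status {state!r} for {plan_id}")
        sleep(poll_interval)
    else:
        # The loop ran out of polls without seeing "completed".
        raise WorkflowError(f"gave up after {max_polls} polls for {plan_id}")

    # Step 3: fetch answers. Checkpoint: an empty result is treated as a failure.
    answers = get_answers(plan_id)
    if not answers:
        raise WorkflowError(f"no answers returned for {plan_id}")
    return answers
```

Passing the three steps in as callables keeps the sequencing and the failure handling visible without committing to any particular command syntax; each `WorkflowError` marks a point where the skill could document a recovery action.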
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
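Both warnings concern the `metadata` block of the skill's frontmatter. As a minimal sketch of what the two checks amount to, assuming the frontmatter has already been parsed into a Python dict: the function below is illustrative only and is not Tessl's validator. The name `check_metadata` is hypothetical; the warning strings are copied from the table.

```python
from typing import Any, Dict, List


def check_metadata(frontmatter: Dict[str, Any]) -> List[str]:
    """Return warnings mirroring the two failed checks in the table above."""
    warnings: List[str] = []
    metadata = frontmatter.get("metadata")

    # Treat a missing or non-mapping 'metadata' as having no entries.
    entries = metadata if isinstance(metadata, dict) else {}

    # Check 1: a 'version' key must be present under 'metadata'.
    if "version" not in entries:
        warnings.append("'metadata.version' is missing")

    # Check 2: 'metadata' must be a mapping whose keys and values are all strings.
    is_string_map = isinstance(metadata, dict) and all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    )
    if not is_string_map:
        warnings.append("'metadata' should map string keys to string values")

    return warnings
```

Under this reading, a frontmatter whose `metadata` is `{"version": "1.0.0"}` produces no warnings, while an unquoted numeric version or a nested value would trip the second check.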
f45fcb5