arbiter

Push decisions to Arbiter Zebu for async human review. Use when you need human input on plans, architectural choices, or approval before proceeding.

Quality

70%

Does it follow best practices?

Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./public/skills/5hanth/arbiter/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent concrete examples and CLI commands that are copy-paste ready. Its main weaknesses are moderate verbosity (installation options, usage guidance sections) and a lack of explicit workflow sequencing with validation steps. The content would benefit from being split into a concise overview and a detailed reference file.

Suggestions

Add an explicit workflow sequence (e.g., '1. Push plan → 2. Poll status → 3. Get answers → 4. Process notifications') with validation checkpoints at each step.

Move the detailed field reference tables and return value schemas into a separate REFERENCE.md to keep SKILL.md as a concise overview.

Remove or significantly trim the 'When to Use' / 'Do NOT use for' section — Claude can infer appropriate usage from the description and examples.
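To make the first suggestion concrete, here is a minimal sketch of a push → poll → get sequence with a checkpoint after each step. It does not use the real Arbiter CLI: `push`, `poll_status`, and `get_answers` are hypothetical callables the caller would wire to whatever commands the skill documents, and the `"completed"` status string is likewise an assumption. The fourth step, processing notifications, is left out.

```python
import time
from typing import Callable, Optional


def run_review_workflow(
    push: Callable[[], Optional[str]],      # hypothetical: returns a plan id, or None on failure
    poll_status: Callable[[str], str],      # hypothetical: returns a status string for a plan id
    get_answers: Callable[[str], dict],     # hypothetical: returns the reviewer's answers
    max_polls: int = 5,
    delay_seconds: float = 0.0,
) -> dict:
    """Run push -> poll -> get, validating the result of each step."""
    # Step 1: push the plan. Checkpoint: an identifier must come back.
    plan_id = push()
    if not plan_id:
        raise RuntimeError("push failed: no plan id returned")

    # Step 2: poll until the review is done, giving up after max_polls attempts.
    for _ in range(max_polls):
        if poll_status(plan_id) == "completed":  # assumed status value
            break
        time.sleep(delay_seconds)
    else:
        raise TimeoutError(f"plan {plan_id} still pending after {max_polls} polls")

    # Step 3: fetch the answers. Checkpoint: an empty result is treated as an error.
    answers = get_answers(plan_id)
    if not answers:
        raise RuntimeError(f"plan {plan_id} completed but returned no answers")
    return answers
```

Passing the three steps in as callables keeps the sequencing and the checkpoints separate from the actual commands, so each failure mode (failed push, stalled review, empty answers) surfaces as a distinct exception the agent can react to.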

Dimension	Reasoning	Score
Conciseness	The skill is fairly well-structured but includes some unnecessary content like the 'When to Use' / 'Do NOT use for' sections that explain judgment calls Claude can make itself, and the installation section is verbose with three different methods. The tables and examples are useful but could be tighter.	2 / 3
Actionability	Excellent actionability — every tool has a concrete, copy-paste-ready CLI command with full JSON examples, clearly documented fields with required/optional markers, and explicit return value JSON. The usage examples are complete and executable bash scripts.	3 / 3
Workflow Clarity	The push → status → get workflow is implicitly clear from the examples, but there's no explicit sequenced workflow with validation checkpoints. Example 2 shows a check-and-proceed pattern, but there's no explicit error recovery loop for common failure modes (e.g., what to do if push fails, if the queue directory doesn't exist).	2 / 3
Progressive Disclosure	The content is well-organized with clear sections and tables, but it's a long monolithic document (~200 lines) that could benefit from splitting detailed API reference (field tables, return schemas) into a separate REFERENCE.md. The 'See Also' links point to external repos but no bundle files support the skill.	2 / 3
	Total	9 / 12 Passed

Description

75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is functional with a clear 'what' and 'when' structure, and its reference to a specific system (Arbiter Zebu) makes it distinctive. However, it could benefit from more specific concrete actions and broader trigger term coverage to help Claude match it in more varied user requests.

Suggestions

Add more specific concrete actions, e.g., 'Creates review requests, tracks pending decisions, formats proposals for human reviewers'

Expand trigger terms to include natural variations like 'need feedback', 'waiting for sign-off', 'decision review', 'human approval', or 'blocked on decision'

Dimension	Reasoning	Score
Specificity	Names the domain (async human review) and some actions (push decisions, get human input on plans/architectural choices/approval), but doesn't list multiple concrete actions like creating review requests, tracking decisions, or formatting proposals.	2 / 3
Completeness	Clearly answers both what ('Push decisions to Arbiter Zebu for async human review') and when ('Use when you need human input on plans, architectural choices, or approval before proceeding') with an explicit 'Use when' clause.	3 / 3
Trigger Term Quality	Includes some relevant terms like 'human input', 'plans', 'architectural choices', 'approval', but misses natural variations users might say such as 'review', 'decision', 'sign-off', 'feedback', 'block on human', or 'wait for approval'. The term 'Arbiter Zebu' is a proper noun that users would need to know.	2 / 3
Distinctiveness / Conflict Risk	The specific mention of 'Arbiter Zebu' as a named system and the focus on async human review/approval create a clear niche that is unlikely to conflict with other skills. The combination of decision-making, human review, and async workflow is quite distinctive.	3 / 3
	Total	10 / 12 Passed

Validation

81%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
metadata_version	'metadata.version' is missing	Warning
metadata_field	'metadata' should map string keys to string values	Warning

	Total	9 / 11 Passed
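Both warnings concern the `metadata` block of the skill's frontmatter. The sketch below shows one way such checks could be expressed, assuming the frontmatter has already been parsed into a Python dict. `check_metadata` is a hypothetical helper, not part of the Tessl validator, and the rules are inferred only from the two warning messages above.

```python
from typing import List


def check_metadata(metadata: object) -> List[str]:
    """Return warnings modeled on the two messages in the validation table."""
    type_warning = "'metadata' should map string keys to string values"
    if not isinstance(metadata, dict):
        return [type_warning]

    warnings = []
    # Assumed rule 1: a 'version' key must be present.
    if "version" not in metadata:
        warnings.append("'metadata.version' is missing")
    # Assumed rule 2: every key and every value must be a string.
    if not all(isinstance(k, str) and isinstance(v, str) for k, v in metadata.items()):
        warnings.append(type_warning)
    return warnings


print(check_metadata({"version": "1.0.0", "author": "example"}))
print(check_metadata({"tags": ["review", "approval"]}))
```

Under these assumed rules, a version written as a bare number in YAML (for example `version: 1.0`) would parse as a float and trip the second check, so quoting the value is the safer choice.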

Repository: Demerzels-lab/elsamultiskillagent
Path: public/skills/5hanth/arbiter/SKILL.md
Commit: f45fcb5

Reviewed: about 11 hours ago
