autolab-hermes-delegation

Use Hermes delegate_task cleanly in this repo for planner, reviewer, researcher, reporter, experiment-worker, and memory-keeper roles.

Quality

56%

Does it follow best practices?

Impact

100%

5.00x

Average score across 3 eval scenarios

Security

Advisory

Suggest reviewing before use

Fix and improve this skill with Tessl

tessl review fix ./projects/pre-training/.agents/skills/autolab-hermes-delegation/SKILL.md

Quality

Content

79%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, actionable skill that provides concrete delegate_task templates for six distinct roles with appropriate toolset configurations and context strings. Its main weakness is the lack of explicit validation/error-recovery steps in the experiment worker flow, which involves destructive operations (editing train.py, submitting patches). The content is well-organized but could benefit from splitting role templates into a reference file to keep the main skill leaner.

Suggestions

Add explicit validation checkpoints to the Experiment Worker Flow (e.g., 'Verify worktree is clean before editing', 'Validate metric output before running submit_patch.py', 'If submission fails: check error and retry').

Consider splitting the full delegate_task templates into a companion TEMPLATES.md file, keeping only the role defaults table and the experiment worker flow in the main SKILL.md.
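The validation checkpoints suggested above could look roughly like the following sketch. It is illustrative only: the helper names (`worktree_is_clean`, `metric_is_valid`, `submit_with_retry`) are hypothetical, the metric is assumed to be a single number printed as text, and the actual arguments of `submit_patch.py` are not shown in this review, so submission is passed in as a callable.

```python
import math
import subprocess
from typing import Callable


def worktree_is_clean(run: Callable = subprocess.run) -> bool:
    """Gate 1: return True when `git status --porcelain` prints nothing."""
    result = run(
        ["git", "status", "--porcelain"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == ""


def metric_is_valid(raw: str) -> bool:
    """Gate 2: accept only a finite number (the metric format is an assumption)."""
    try:
        value = float(raw)
    except ValueError:
        return False
    return math.isfinite(value)


def submit_with_retry(submit: Callable[[], int], attempts: int = 2) -> bool:
    """Gate 3: call `submit` until it returns exit code 0 or attempts run out."""
    for _ in range(attempts):
        if submit() == 0:
            return True
    return False
```

In a worker flow, each gate would run before the next step: check the worktree before editing `train.py`, check the metric before submitting, and stop with an error report if `submit_with_retry` returns False.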

Dimension	Reasoning	Score
Conciseness	The skill is lean and efficient. It assumes Claude understands Hermes delegation, avoids explaining what delegation is, and every section delivers actionable configuration or copy-paste templates. No wasted tokens on concepts Claude already knows.	3 / 3
Actionability	Every role has a concrete, copy-paste-ready `delegate_task(...)` Python call with exact parameters including goal, context, toolsets, and max_iterations. The experiment worker flow provides specific CLI commands. This is fully executable guidance.	3 / 3
Workflow Clarity	The experiment worker flow has a clear 4-step sequence but lacks explicit validation checkpoints and error-recovery steps. For a destructive, batch-style operation such as running experiments and submitting patches, there is no validate-then-proceed feedback loop. The other role templates are single-shot delegations and are clear as written; only the worker flow needs validation gates.	2 / 3
Progressive Disclosure	The content is well-structured with clear sections per role, but it's a moderately long monolithic file (~120 lines of substantive content). The role defaults table and the full templates could potentially be split, and there are references to external files (AGENTS.md, various research/ paths) but no linked companion skill files for deeper details. For its length, inline organization is decent but not optimally layered.	2 / 3
	Total	10 / 12 Passed

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a specific tool (Hermes delegate_task) and enumerates roles, which provides some specificity, but it lacks concrete action descriptions beyond 'use cleanly' and entirely omits a 'Use when...' clause. The language is somewhat jargon-heavy and wouldn't match natural user queries well, and the phrase 'in this repo' is vague and non-informative.

Suggestions

Add an explicit 'Use when...' clause describing trigger scenarios, e.g., 'Use when the user wants to delegate subtasks to specialized agents, coordinate multi-agent workflows, or mentions Hermes task delegation.'

Replace 'Use Hermes delegate_task cleanly' with concrete actions, e.g., 'Delegates subtasks to specialized agent roles (planner, reviewer, researcher, reporter, experiment-worker, memory-keeper) using Hermes delegate_task, managing task handoffs and result aggregation.'

Add natural language trigger terms users might say, such as 'delegate work', 'assign tasks', 'multi-agent coordination', or 'split work across agents'.
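As a rough way to check a revised description against the suggestions above, a small lint along these lines may help. This is a hypothetical heuristic, not Tessl's scoring rubric: the function name `description_gaps` and the `TRIGGER_TERMS` list are assumptions, with the terms taken from the examples in this review.

```python
from typing import List, Sequence

# Example trigger phrases taken from the suggestions in this review.
TRIGGER_TERMS = (
    "delegate work",
    "assign tasks",
    "multi-agent",
    "split work across agents",
)


def description_gaps(
    description: str, trigger_terms: Sequence[str] = TRIGGER_TERMS
) -> List[str]:
    """Return a list of gaps found in a skill description (empty if none)."""
    text = description.lower()
    gaps = []
    # The review asks for an explicit "Use when..." clause.
    if "use when" not in text:
        gaps.append("missing 'Use when...' clause")
    # The review asks for at least some natural-language trigger terms.
    if not any(term in text for term in trigger_terms):
        gaps.append("no natural-language trigger terms")
    return gaps
```

Run against the current description, this sketch reports both gaps; a description that adds a "Use when..." clause and a phrase such as "multi-agent" passes both checks.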

Dimension	Reasoning	Score
Specificity	It names a specific tool ('Hermes delegate_task') and lists specific roles (planner, reviewer, researcher, reporter, experiment-worker, memory-keeper), but doesn't describe what concrete actions are performed with those roles beyond 'use cleanly'.	2 / 3
Completeness	It partially addresses 'what' (use Hermes delegate_task with specific roles) but has no explicit 'when' clause or trigger guidance. The description lacks a 'Use when...' statement, which per the rubric should cap completeness at 2, and the 'what' is also weak enough to warrant a 1.	1 / 3
Trigger Term Quality	Includes some relevant keywords like 'Hermes', 'delegate_task', and the role names, but these are fairly technical/tool-specific terms. Missing natural language triggers a user might say like 'delegate work', 'assign tasks', 'multi-agent', or 'orchestrate'.	2 / 3
Distinctiveness Conflict Risk	The mention of 'Hermes delegate_task' and the specific role names provides some distinctiveness, but 'in this repo' is vague and the description could overlap with general task management or orchestration skills.	2 / 3
	Total	7 / 12 Passed

Validation

72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 8 / 11 Passed

Validation for skill structure

Criteria	Description	Result
metadata_version	'metadata.version' is missing	Warning
metadata_field	'metadata' should map string keys to string values	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	8 / 11 Passed
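The two metadata warnings in the table above can be reproduced locally with a check like the one below. It is a minimal sketch that assumes the frontmatter has already been parsed into a dictionary; `metadata_warnings` is a hypothetical helper, not the Tessl validator. The `frontmatter_unknown_keys` check is left out because the list of allowed keys is not given in this review.

```python
from collections.abc import Mapping
from typing import List


def metadata_warnings(frontmatter: Mapping) -> List[str]:
    """Return the names of metadata checks that would warn for this frontmatter."""
    metadata = frontmatter.get("metadata")
    # A missing or non-mapping `metadata` fails both checks.
    if not isinstance(metadata, Mapping):
        return ["metadata_field", "metadata_version"]
    warnings = []
    # metadata_field: every key and every value must be a string.
    if not all(
        isinstance(key, str) and isinstance(value, str)
        for key, value in metadata.items()
    ):
        warnings.append("metadata_field")
    # metadata_version: a `version` entry must be present.
    if "version" not in metadata:
        warnings.append("metadata_version")
    return warnings
```

Note that under this check a numeric version such as `1` still triggers `metadata_field`; quoting it as the string `"1"` would clear the warning.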

Repository: huggingface/context-course
Commit: 0448a7c

Reviewed: 26 days ago
