Use Hermes delegate_task cleanly in this repo for planner, reviewer, researcher, reporter, experiment-worker, and memory-keeper roles.
69
56%
Does it follow best practices?
Impact
100%
5.00xAverage score across 3 eval scenarios
Advisory
Suggest reviewing before use
Fix and improve this skill with Tessl
tessl review fix ./projects/pre-training/.agents/skills/autolab-hermes-delegation/SKILL.mdUse this when Hermes is the parent control plane for this repo.
Hermes children get a fresh context, cannot ask the user for clarification, cannot delegate again, and only return their final summary to the parent. Pass the full role contract in every delegate_task(...) call.
AGENTS.md as the only checked-in rulebook. Do not add .hermes.md.terminal,file,web,skills,delegation,clarifymemory out of the default toolsets so repo markdown stays the durable record.min(gpu_slots, 3) per parent session.planner
["file"]reviewer
["file"]researcher
["web", "file", "skills", "terminal"]reporter
["terminal", "file", "skills"]memory-keeper
["file"]experiment-worker
["terminal", "file", "skills"]delegate_task(
goal="Propose up to 3 fresh Autolab experiments against the current local promoted master.",
context="""Read AGENTS.md, README.md, research/notes.md, research/do-not-repeat.md,
research/campaigns/, research/experiments/, research/results.tsv, research/live/master.json,
and research/live/dag.json.
Return a ranked queue of 1-3 fresh experiments. Each must include:
- short title
- one-sentence hypothesis
- parent master hash
- exact single variable being changed
- expected upside
- reason it is not a duplicate
Do not run commands that mutate the repo. Do not propose multi-change ideas.""",
toolsets=["file"],
max_iterations=20,
)delegate_task(
goal="Review this Autolab plan or result for rule violations and comparability risk.",
context="""Read AGENTS.md and the provided experiment details.
Prioritize:
- hard-rule violations
- stale-master risk
- duplicate experiments
- multi-change patches
- missing benchmark evidence
- incorrect submit or no-submit decisions
Return concise findings with exact file or evidence references.""",
toolsets=["file"],
max_iterations=20,
)delegate_task(
goal="Find up to 3 paper-derived single-change Autolab ideas that map cleanly to train.py.",
context="""Read AGENTS.md, research/notes.md, research/do-not-repeat.md,
research/paper-ideas.md, research/results.tsv, research/live/master.json, and research/live/dag.json.
Use the repo's Hugging Face skills when useful. Reject ideas already present in code or already ruled out.
Return the smallest credible change to test for each idea and the main risk if it fails.""",
toolsets=["web", "file", "skills", "terminal"],
max_iterations=30,
)delegate_task(
goal="Summarize current Autolab fleet status and call out duplicate or stale active jobs.",
context="""Use the repo reporter workflow:
- . ~/.autolab/credentials
- uv run scripts/trackio_reporter.py summary --max-jobs 25
- uv run scripts/trackio_reporter.py sync --project ${AUTOLAB_TRACKIO_PROJECT:-autolab} when needed
Treat Trackio plus HF Jobs metadata as the source of truth.
Do not edit repo markdown or code.""",
toolsets=["terminal", "file", "skills"],
max_iterations=25,
)uv run scripts/hermes_worker.py create <experiment-id> --campaign ... --hypothesis ...uv run scripts/hermes_worker.py delegate <experiment-id>delegate_task(...) block into the parent session.cd into the reserved worktreeAUTOLAB_CAMPAIGN, AUTOLAB_EXPERIMENT_ID, AUTOLAB_WORKER_ID, AUTOLAB_HYPOTHESIS, AUTOLAB_LOG_PATH, and AUTOLAB_EXPERIMENT_NOTEtrain.py onlysubmit_patch.pydelegate_task(
goal="Update the durable Autolab markdown after a worker completed.",
context="""Read AGENTS.md plus research/notes.md, research/do-not-repeat.md,
research/campaigns/, research/experiments/, and the worker's final summary.
Update only the durable markdown in the main checkout.
Preserve:
- hypothesis tested
- parent master hash
- local val_bpb or failure state
- submit decision
- one short interpretation""",
toolsets=["file"],
max_iterations=25,
)0448a7c
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.