experiment-queue

SSH job queue for multi-seed/multi-config ML experiments with OOM-aware retry, stale-screen cleanup, and wave-transition race prevention. Use when user says "batch experiments", "队列实验", "run grid", "multi-seed sweep", "auto-chain experiments", or when /run-experiment is insufficient for 10+ jobs that need orchestration.

Quality

88%

Does it follow best practices?

Security

Passed

No findings from the security scan

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is highly actionable and well-sequenced with strong validation feedback loops, but trades conciseness for inline migration commentary and keeps detail sections that could be split into references. Scripts are real and correctly referenced.

Suggestions

Move the Layer 0-4 resolver-chain commentary and Phase 3.3/#174/#366 migration notes into a short 'resolution.md' reference; keep only the executable fallback snippet in the body.

Extract the Grid Spec Syntax, Wave Chaining, OOM Handling, and Stale Screen Detection sections into separate reference files (e.g. GRID.md, WAVES.md) and link them from SKILL.md, so the body stays a lean overview.

Trim the legacy os.execv shim explanations in 'See Also' to a single line pointing at the canonical scripts/ paths, since the resolution chain is already covered in Step 3a.

Dimension	Reasoning	Score
Conciseness	Mostly efficient with executable bash/yaml throughout, but verbose passages — the Layer 0-4 resolver-chain prose and Phase 3.3 / '#174' / '#366' migration commentary — could be tightened or moved to a reference.	2 / 3
Actionability	Provides fully executable, copy-paste-ready bash, yaml, and regex blocks (launch, scp, monitoring jq query, OOM regex), matching the top anchor for concrete guidance.	3 / 3
Workflow Clarity	Five-step workflow is clearly sequenced with explicit validation checkpoints (pre-flight precondition checks) and feedback loops for OOM retry, stale-screen detection, and resume-on-restart.	3 / 3
Progressive Disclosure	Real bundle scripts exist under scripts/ and are referenced, but the ~415-line body keeps Grid Spec, Wave Chaining, OOM Handling, and Stale Screen sections inline that would be better split into one-level reference files.	2 / 3
	Total	10 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is specific, trigger-rich, complete, and clearly differentiated from sibling skills. It names concrete capabilities, includes bilingual natural-language triggers, and supplies an explicit 'Use when' clause.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions — 'OOM-aware retry', 'stale-screen cleanup', and 'wave-transition race prevention' — rather than vague language, matching the top anchor.	3 / 3
Completeness	Explicitly answers both what (SSH job queue for multi-seed/multi-config ML experiments) and when ('Use when user says... /run-experiment is insufficient for 10+ jobs'), satisfying the explicit-trigger bar.	3 / 3
Trigger Term Quality	Covers natural terms a user would say, including bilingual variants: 'batch experiments', '队列实验', 'run grid', 'multi-seed sweep', 'auto-chain experiments', with good coverage of common phrasings.	3 / 3
Distinctiveness / Conflict Risk	Clear niche (SSH multi-GPU batch orchestration) with distinct triggers and an explicit contrast against '/run-experiment', making wrong-skill triggering unlikely.	3 / 3
	Total	12 / 12 Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 13 / 16 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
relative_links	Relative link issues: 1 suspicious	Warning
	Total	13 / 16 Passed

Repository: wanshuiyin/Auto-claude-code-research-in-sleep
Path: skills/experiment-queue/SKILL.md
Commit: c5f3d5b

Reviewed: about 1 hour ago
