
experiment-bridge

Workflow 1.5: Bridge between idea discovery and auto review. Reads EXPERIMENT_PLAN.md, implements experiment code, deploys to GPU, collects initial results. Use when user says "实现实验", "implement experiments", "bridge", "从计划到跑实验", "deploy the plan", or has an experiment plan ready to execute.

Overall score: 88

Quality: 85%. Does it follow best practices?

Impact: Pending. No eval scenarios have been run.

Security (by Snyk): Advisory. Suggest reviewing before use.


Quality

Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly defines a specific workflow step in an experiment pipeline. It excels at providing concrete actions, explicit trigger guidance in multiple languages, and a well-defined niche that distinguishes it from other skills. The description is concise yet comprehensive, covering all essential information without unnecessary verbosity.

Specificity: 3 / 3. Lists multiple specific, concrete actions: reads EXPERIMENT_PLAN.md, implements experiment code, deploys to GPU, collects initial results. These are clear, actionable steps in a defined workflow.

Completeness: 3 / 3. Clearly answers both 'what' (reads experiment plan, implements code, deploys to GPU, collects results) and 'when' (explicit 'Use when' clause with specific trigger phrases and a situational trigger). Both dimensions are well covered.

Trigger Term Quality: 3 / 3. Includes strong natural trigger terms in both English and Chinese: '实现实验', 'implement experiments', 'bridge', '从计划到跑实验', 'deploy the plan', plus the contextual trigger 'experiment plan ready to execute'. Good coverage of how users would naturally phrase this request.

Distinctiveness / Conflict Risk: 3 / 3. Highly distinctive with a clear niche: it's specifically a bridge workflow (1.5) between idea discovery and auto review, targeting GPU deployment of experiment plans. The workflow numbering, specific file reference (EXPERIMENT_PLAN.md), and bilingual triggers make it very unlikely to conflict with other skills.

Total: 12 / 12. Passed.
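The trigger-term coverage credited above can be illustrated with a toy sketch. This is not Tessl's actual discovery mechanism, just a minimal substring matcher showing how the bilingual trigger phrases from the description map user requests to the skill:

```python
# Toy illustration of trigger-term discovery (not Tessl's real matcher).
# Trigger phrases are taken verbatim from the skill description.
TRIGGERS = [
    "实现实验", "implement experiments", "bridge",
    "从计划到跑实验", "deploy the plan",
]

def matches_skill(user_request: str) -> bool:
    """Return True if any trigger phrase appears in the request."""
    text = user_request.lower()
    return any(trigger.lower() in text for trigger in TRIGGERS)

print(matches_skill("Please implement experiments from the plan"))  # True
print(matches_skill("帮我从计划到跑实验"))                           # True
print(matches_skill("Summarize this paper for me"))                 # False
```

Both the English and Chinese phrases hit the same skill, which is why the review treats bilingual triggers as a discovery strength.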

Implementation: 70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured orchestration skill with excellent workflow clarity and progressive disclosure—the phased approach with validation gates is a strength. Its main weaknesses are moderate verbosity (some display templates and backend rules could be more concise) and limited executable code examples, relying heavily on placeholder commands and checklists rather than copy-paste-ready scripts.

Suggestions:

- Replace placeholder commands like `/run-experiment [experiment commands]` with at least one concrete, fully executable example showing actual arguments and expected output.
- Tighten the backend lifecycle rules section; consider moving Vast.ai/Modal/SSH specifics to a referenced file, since they add significant length for conditional content.
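As a hedged sketch of what the first suggestion asks for: the skill could pair the actual command with a check of its expected output. The command string, log format, and loss threshold below are all invented for illustration, not taken from the skill:

```python
import re

# Hypothetical concrete example of the kind the suggestion asks for:
# instead of "/run-experiment [experiment commands]", show the actual
# invocation plus a check against its expected output.
COMMAND = "python train.py --config configs/baseline.yaml --max-steps 100"

# Invented sample of what a passing sanity-run log might contain.
SAMPLE_LOG = """\
step 90/100 loss=2.41
step 100/100 loss=2.31
final loss: 2.31
checkpoint saved to runs/baseline/ckpt-100.pt
"""

def sanity_passed(log: str, max_loss: float = 10.0) -> bool:
    """Success criterion: the log reports a final loss below the threshold."""
    match = re.search(r"final loss: ([\d.]+)", log)
    return match is not None and float(match.group(1)) < max_loss

print(sanity_passed(SAMPLE_LOG))                 # True
print(sanity_passed("crashed with CUDA OOM"))    # False
```

A concrete pair like this (command, expected-output check) is what would make the placeholder copy-paste-ready.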

Conciseness: 2 / 3. The skill is quite long (~300 lines), with some sections that could be tightened (e.g. the backend lifecycle rules, the detailed checkpoint display templates, and the composing section). However, most content is genuinely novel configuration and workflow logic that Claude wouldn't know, so it's not padding with known concepts. It sits between lean and verbose.

Actionability: 2 / 3. The skill provides concrete workflow steps, specific file paths, and example output templates, but the actual executable code is minimal: most commands are placeholders like `/run-experiment [experiment commands]` and `spawn_agent:` pseudo-YAML. The implementation guidance in Phase 2 is a checklist rather than executable code. It's actionable at a process level but lacks copy-paste-ready code.

Workflow Clarity: 3 / 3. The multi-phase workflow is clearly sequenced (Parse → Implement → Code Review → Sanity → Deploy → Collect → Handoff) with explicit validation checkpoints: a sanity check before full deployment, a code review gate, success criteria checks, an AUTO_DEPLOY checkpoint, and rescue-on-failure loops. Feedback loops for error recovery are well defined (fix and re-run sanity; re-run code review once for blocking issues).

Progressive Disclosure: 3 / 3. The skill is well structured as an overview with clear references to external files (EXPERIMENT_PLAN.md, FINAL_PROPOSAL.md, shared protocols via links). It delegates to other skills (/run-experiment, /experiment-queue, /monitor-experiment, /ablation-planner, /training-check) without inlining their content. References are one level deep and clearly signaled.

Total: 10 / 12. Passed.
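The gated phase sequence praised under Workflow Clarity can be sketched as a small driver. This is an assumed illustration of the control flow, not the skill's actual implementation: each phase is a gate, and only the sanity gate gets a bounded fix-and-rerun loop:

```python
# Minimal sketch of the reviewed phase sequence (Parse -> Implement ->
# Code Review -> Sanity -> Deploy -> Collect -> Handoff) with a bounded
# rescue-on-failure loop at the sanity gate. Names are illustrative.
from typing import Callable, List, Tuple

def run_pipeline(phases: List[Tuple[str, Callable[[], bool]]],
                 max_sanity_retries: int = 1) -> str:
    for name, step in phases:
        ok = step()
        retries = 0
        # Rescue loop: only a failed sanity check triggers a re-run.
        while not ok and name == "sanity" and retries < max_sanity_retries:
            retries += 1
            ok = step()
        if not ok:
            return f"halted at {name}"  # validation gate blocks progression
    return "handoff complete"

calls = {"sanity": 0}

def flaky_sanity() -> bool:
    calls["sanity"] += 1
    return calls["sanity"] >= 2  # fails once, passes after a "fix"

phases = [
    ("parse", lambda: True),
    ("implement", lambda: True),
    ("code_review", lambda: True),
    ("sanity", flaky_sanity),
    ("deploy", lambda: True),
    ("collect", lambda: True),
]
print(run_pipeline(phases))  # handoff complete
```

The point of the gate structure is that a failure downstream of sanity halts the run instead of deploying broken code, which is exactly the behavior the 3/3 score rewards.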

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 checks passed.

Validation for skill structure: no warnings or errors.
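For intuition, a structural check of this kind can be sketched as a front-matter validator. This is a toy illustration, not Tessl's validator; the required field set and front-matter convention are assumptions:

```python
import re

# Toy sketch of a SKILL.md structural check: confirm YAML front matter
# exists and carries the required fields. Field names are assumed.
REQUIRED_FIELDS = {"name", "description"}

def validate_skill_md(text: str) -> list:
    """Return a list of validation errors; an empty list means passed."""
    errors = []
    match = re.match(r"^---\n(.*?)\n---\n", text, re.DOTALL)
    if not match:
        return ["missing YAML front matter"]
    fields = {line.split(":", 1)[0].strip()
              for line in match.group(1).splitlines() if ":" in line}
    for field in sorted(REQUIRED_FIELDS - fields):
        errors.append(f"missing required field: {field}")
    return errors

good = "---\nname: experiment-bridge\ndescription: Bridge workflow\n---\n# Skill\n"
print(validate_skill_md(good))                 # []
print(validate_skill_md("# no front matter"))  # ['missing YAML front matter']
```

A real validator would cover more checks (the page reports 11), but the pattern is the same: structure must pass before discovery and implementation are scored.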

Repository: wanshuiyin/Auto-claude-code-research-in-sleep (Reviewed)

