Name: roboflow-training-and-evaluation
Rating: 72.8 (1 review)
Author: roboflow

roboflow-training-and-evaluation

Use when training Roboflow models, improving accuracy, or setting up a production feedback loop — covers architecture selection, model IDs, checkpoints, evaluation metrics, the iterative improvement playbook, and active learning via the Dataset Upload workflow block.

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A highly actionable reference skill with concrete model IDs, a clear decision-tree workflow, and feedback loops. Its main weakness is progressive disclosure: the large reference tables live inline rather than in dedicated reference files, and the body could be tightened in places.

Suggestions

Move the exhaustive model ID and architecture tables into a references/ file (e.g. MODEL_IDS.md) and keep SKILL.md as an overview that links one level deep, improving progressive disclosure.

Move the full COCO-80 class list into the model-ID reference rather than inline in the body, or summarize it as 'the standard COCO 80 classes' with a pointer.

Consolidate the redundant YOLO version rows (v8/v11/v12/v26 share size/resolution patterns) to shorten the tables without losing the exact model_id values.

Dimension	Reasoning	Score
Conciseness	The body is dense, factual reference material with no conceptual padding, but the inline COCO-80 class list and exhaustive multi-version YOLO/VLM tables could be tightened — 'mostly efficient but could be tightened' rather than 'every token earns its place'.	2 / 3
Actionability	Exact model_id tables prefixed with 'Do not guess — wrong IDs cause training failures', named MCP tools, and concrete paths like `/{workspace}/{project}/nas-runs/{versionId}` and `?engine=nas` give copy-paste-ready guidance.	3 / 3
Workflow Clarity	The Training Flow plus a numbered 14-step decision tree includes explicit trial→confirm→fallback loops ('User confirms works → Done. Poor results → Step 13'), providing a clear sequence with feedback loops for error recovery.	3 / 3
Progressive Disclosure	Sections are well-organized and the 'Related Pages' are one-level-deep and clearly signaled, but the bulk reference material (model ID/architecture tables, COCO list) is inline with no bundle files to offload it, so content is not appropriately split.	2 / 3
	Total	10 / 12 Passed
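
The Total row above is the plain sum of the four dimension scores, each out of 3. A minimal sketch of that tally follows; the dictionary and the `tally` helper are hypothetical names introduced for illustration, not part of the review tooling. It reproduces only the Total row and makes no assumption about how the percentage headline for this section is computed.

```python
# Dimension scores copied from the table above; each is scored out of 3.
scores = {
    "Conciseness": 2,
    "Actionability": 3,
    "Workflow Clarity": 3,
    "Progressive Disclosure": 2,
}
MAX_PER_DIMENSION = 3  # assumption: every dimension shares the same 0-3 scale


def tally(scores, max_per_dimension=MAX_PER_DIMENSION):
    """Return (points earned, points available) for a set of dimension scores."""
    return sum(scores.values()), max_per_dimension * len(scores)


earned, available = tally(scores)
print(f"Total {earned} / {available}")  # Total 10 / 12
```

The two dimensions scored 2 / 3 (Conciseness and Progressive Disclosure) account for the two missing points, which matches the suggestions listed earlier.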

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description with an explicit 'Use when' trigger and a concrete, comprehensive scope list. It clearly distinguishes itself as Roboflow-specific with natural trigger phrasing.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete scope items — 'architecture selection, model IDs, checkpoints, evaluation metrics, the iterative improvement playbook, and active learning' — with no vague fluff, matching the 'lists multiple specific concrete actions' anchor.	3 / 3
Completeness	An explicit 'Use when...' clause answers when, and the enumerated scope list answers what, satisfying the 'clearly answers both what AND when' anchor; not capped at 2 because the trigger is explicit, not implied.	3 / 3
Trigger Term Quality	The trigger clause 'training Roboflow models, improving accuracy, or setting up a production feedback loop' uses natural phrases a Roboflow user would actually say, giving good coverage rather than just jargon.	3 / 3
Distinctiveness Conflict Risk	Roboflow-specific terms ('Roboflow models', 'Dataset Upload workflow block') carve a clear niche unlikely to trigger for unrelated skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: roboflow/computer-vision-skills
Commit: bf3aefb

Reviewed: about 12 hours ago

