Use when training Roboflow models or improving accuracy - covers architecture selection, model IDs, checkpoints, evaluation metrics, and the iterative improvement playbook.
60
70%
Does it follow best practices?
Impact: not available (no eval scenarios have been run)
Advisory: suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/training-and-evaluation/SKILL.md
Quality
Discovery
75%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has good completeness with an explicit 'Use when' clause and clear distinctiveness tied to Roboflow model training. Its main weaknesses are that it lists topic areas rather than concrete actions (e.g., 'covers architecture selection' rather than 'selects architectures' or 'recommends architectures') and could include more natural trigger term variations that users might use when seeking help with model training.
Suggestions
Rephrase topic areas as concrete actions, e.g., 'Guides architecture selection, manages model IDs and checkpoints, interprets evaluation metrics, and applies an iterative improvement playbook for Roboflow models.'
Add more natural trigger term variations such as 'fine-tune', 'model performance', 'mAP', 'precision/recall', 'training configuration', or 'retrain' to improve keyword coverage.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Roboflow model training) and lists several relevant concepts (architecture selection, model IDs, checkpoints, evaluation metrics, iterative improvement playbook), but these are more like topic areas than concrete actions. It doesn't use action verbs like 'select architectures', 'evaluate metrics', or 'configure checkpoints'. | 2 / 3 |
| Completeness | Explicitly answers both 'what' (covers architecture selection, model IDs, checkpoints, evaluation metrics, and iterative improvement playbook) and 'when' (Use when training Roboflow models or improving accuracy). The 'Use when...' clause is present and clear. | 3 / 3 |
| Trigger Term Quality | Includes some natural keywords like 'training', 'model', 'accuracy', 'Roboflow', 'checkpoints', and 'evaluation metrics'. However, it misses common user variations like 'fine-tune', 'train a model', 'mAP', 'precision', 'recall', 'overfitting', 'underfitting', or 'model performance'. | 2 / 3 |
| Distinctiveness Conflict Risk | The description is clearly scoped to Roboflow model training and accuracy improvement, which is a distinct niche. The combination of 'Roboflow', 'model IDs', 'checkpoints', and 'architecture selection' makes it unlikely to conflict with other skills. | 3 / 3 |
| Total | | 10 / 12 Passed |
Implementation
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a thorough, highly actionable reference skill for Roboflow training and evaluation. Its greatest strength is the precise model ID tables and decision tree, which eliminate guesswork. Its main weaknesses are length (the document tries to be both a quick guide and an exhaustive reference in one file) and the lack of explicit validation/error-recovery steps in the training workflow.
Suggestions
Add explicit validation checkpoints to the training workflow — e.g., 'After `versions_generate`, verify version status before proceeding to train' and 'If `models_get_training_status` shows failure, check X before retrying'.
Move the exhaustive model_id tables and COCO 80 class list into separate reference files (e.g., MODEL_IDS.md, COCO_CLASSES.md) and link to them from the main skill to improve progressive disclosure and reduce token cost.
Trim the NAS section to essential decision-making info (when to use, model_id, plan requirements) and move detailed explanations of phases and references to a separate file.
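The first suggestion, adding a validation checkpoint after training starts, could be sketched roughly as follows. This is a minimal illustration, not the skill's or Roboflow's actual API: `get_status` is a hypothetical wrapper around the `models_get_training_status` tool, and the status strings `"finished"` and `"failed"` are assumed for the example.

```python
import time
from typing import Callable


def wait_for_training(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a training job until it reaches a terminal state or times out.

    get_status is a hypothetical callable wrapping models_get_training_status;
    the status names below are assumptions made for this sketch.
    """
    terminal = {"finished", "failed"}  # assumed terminal status names
    for _ in range(max_polls):
        status = get_status()
        if status in terminal:
            # Caller decides what to do next: evaluate on success,
            # inspect the dataset version and settings on failure.
            return status
        sleep(poll_seconds)
    # No terminal state was seen within max_polls checks.
    return "timeout"
```

A workflow would call this once after starting training and branch on the result, so that a failed or stalled job is caught before any evaluation step runs.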
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is comprehensive and mostly efficient for a reference document, but includes some content Claude already knows (e.g., listing all COCO 80 classes, explaining what PDF-like concepts are in the CV domain). The model ID tables are necessary reference material, but the document is quite long and some sections like the full COCO class list and verbose NAS explanation could be trimmed. | 2 / 3 |
| Actionability | Highly actionable: provides exact model_id strings that must be used verbatim, a concrete decision tree for model selection, specific MCP tool names for each action, and clear exclusion criteria for Rapid. The content is directly executable with no ambiguity about what values to use. | 3 / 3 |
| Workflow Clarity | The training flow is clearly sequenced at the top, and the model selection decision tree provides a good step-by-step process. However, there are no explicit validation checkpoints or feedback loops, e.g., no guidance on what to check after training starts, how to verify a version was generated correctly, or what to do if training fails. The iterative improvement loop is deferred to a related page. | 2 / 3 |
| Progressive Disclosure | The document references one related skill page for model improvement and mentions external links for NAS. However, the document itself is very long (~300+ lines) with extensive inline tables that could be split into separate reference files (e.g., model IDs, COCO classes). Without bundle files to offload reference material, the main skill becomes a monolithic reference document rather than a concise overview with pointers. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
02936d5