Computer vision engineering skill for object detection, image segmentation, and visual AI systems. Covers CNN and Vision Transformer architectures, YOLO/Faster R-CNN/DETR detection, Mask R-CNN/SAM segmentation, and production deployment with ONNX/TensorRT. Includes PyTorch, torchvision, Ultralytics, Detectron2, and MMDetection frameworks. Use when building detection pipelines, training custom models, optimizing inference, or deploying vision systems.
87
78%
Does it follow best practices?
Impact
92%
1.37x
Average score across 6 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./engineering-team/senior-computer-vision/SKILL.md

Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that is comprehensive, specific, and well-structured. It clearly delineates the domain (computer vision), lists concrete capabilities and tools, and provides explicit trigger conditions. The description uses proper third-person voice throughout and covers both breadth (architectures, frameworks, deployment) and depth (specific model names) without being unnecessarily verbose.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions and technologies: object detection, image segmentation, CNN/Vision Transformer architectures, YOLO/Faster R-CNN/DETR detection, Mask R-CNN/SAM segmentation, deployment with ONNX/TensorRT, and specific frameworks such as PyTorch, torchvision, Ultralytics, Detectron2, and MMDetection. | 3 / 3 |
| Completeness | Clearly answers both 'what' (object detection, segmentation, specific architectures and frameworks) and 'when', with an explicit 'Use when building detection pipelines, training custom models, optimizing inference, or deploying vision systems' clause. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'object detection', 'image segmentation', 'YOLO', 'Faster R-CNN', 'DETR', 'SAM', 'computer vision', 'PyTorch', 'ONNX', 'TensorRT', 'detection pipelines', 'vision systems'. These are all terms a user working in CV would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear niche in computer vision engineering. The specific mention of detection/segmentation models (YOLO, Faster R-CNN, DETR, Mask R-CNN, SAM) and CV-specific frameworks (Detectron2, MMDetection, Ultralytics) makes it very unlikely to conflict with general ML, NLP, or other AI skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill with good progressive disclosure and comprehensive coverage of computer vision workflows. However, it suffers from verbosity (many reference tables that Claude already knows) and relies heavily on fictional custom scripts that reduce real-world actionability. Adding validation/error-recovery steps and replacing fictional scripts with actual executable code would significantly improve quality.
Suggestions
- Replace fictional script commands (scripts/vision_model_trainer.py, etc.) with actual executable code using real libraries (e.g., the Ultralytics Python API or the Detectron2 API) to make workflows truly actionable.
- Remove or significantly trim the 'Core Expertise', 'Tech Stack', and architecture comparison tables — Claude already knows these; keep only the decision-guiding tables that map requirements to specific choices.
- Add explicit validation checkpoints with error recovery in workflows, e.g., 'If mAP < 0.3 after 20 epochs, check: (1) learning rate, (2) dataset quality, (3) annotation correctness' and 'If ONNX export fails, verify input/output shapes match'.
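The suggested checkpoint could be wired into the training loop as a small helper. A minimal sketch: the 0.3 mAP / 20-epoch thresholds come from the suggestion above, while the function name and diagnostic strings are illustrative, not part of the skill under review:

```python
def training_health_check(map_history, threshold=0.3, patience=20):
    """Return a checklist of things to inspect when training stalls.

    map_history: per-epoch mAP@0.5 values, oldest first.
    Returns an empty list while it is too early to judge, or when the
    model has cleared the threshold by the patience epoch.
    """
    if len(map_history) < patience:
        return []  # too few epochs to draw a conclusion
    if map_history[patience - 1] >= threshold:
        return []  # training is on track
    return [
        "learning rate (try lowering by 10x)",
        "dataset quality (class balance, image resolution)",
        "annotation correctness (spot-check boxes/masks)",
    ]
```

Calling this after each validation pass turns the skill's illustrative 'expected output' blocks into an actual gate with a concrete recovery path.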
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~350 lines), with several tables and sections that are informational rather than instructional. The 'Core Expertise' and 'Tech Stack' sections list things Claude already knows, and the architecture selection tables, while useful, are verbose. The CNN vs ViT trade-offs table and performance targets add bulk without being directly actionable. | 2 / 3 |
| Actionability | Commands reference custom scripts (e.g., scripts/vision_model_trainer.py, scripts/inference_optimizer.py, scripts/dataset_pipeline_builder.py) that presumably don't exist, making them not truly executable. Some commands, like the ONNX verification and yolo CLI commands, are real and copy-paste ready, but the core workflow steps depend on fictional tooling, reducing practical actionability. | 2 / 3 |
| Workflow Clarity | The three workflows have clear step sequences and are well structured with numbered steps. However, validation checkpoints are weak — Step 3 of Workflow 2 has an ONNX verification check, but there are no explicit feedback loops for error recovery (e.g., what to do if training diverges, if export fails, or if dataset cleaning removes too many images). The 'expected output' blocks are helpful but are illustrative rather than actual validation gates. | 2 / 3 |
| Progressive Disclosure | The skill has a clear table of contents, well-organized sections progressing from quick start to detailed workflows, and appropriately references external files (references/reference-docs-and-commands.md, references/computer_vision_architectures.md, etc.) for deeper content. Navigation is straightforward, with one-level-deep references. | 3 / 3 |
| Total | | 9 / 12 Passed |
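The 'verify input/output shapes match' recovery step for ONNX export can likewise be made concrete without depending on a particular exporter. A minimal sketch, assuming shapes are read from the exported graph (e.g. via onnxruntime's get_inputs()/get_outputs()); the YOLO-style shapes below are hypothetical, not taken from the skill under review:

```python
def shapes_match(expected, actual):
    """True if an exported tensor shape matches the expected one.

    None marks a dynamic axis (e.g. batch size) that matches any value.
    """
    if len(expected) != len(actual):
        return False
    return all(e is None or e == a for e, a in zip(expected, actual))

# A YOLO-style detector typically takes NCHW image tensors; the shapes
# here are illustrative placeholders.
print(shapes_match((None, 3, 640, 640), (8, 3, 640, 640)))  # True
print(shapes_match((None, 3, 640, 640), (8, 3, 320, 320)))  # False
```

A failed check then points directly at the mismatched axis, instead of leaving the agent with an opaque export error.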
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.