Use when training Roboflow models or improving accuracy - covers architecture selection, model IDs, checkpoints, evaluation metrics, and the iterative improvement playbook.
For agents (source of truth): This skill is authored in `roboflow/computer-vision-skills` and shipped with the Roboflow plugin. If your client has loaded the plugin (you'll see `roboflow:<name>` skills in your available skills list), use those local skills; they're read fresh from disk every session. The same content served as MCP resources at `roboflow://skills/<name>/...` is a fallback for clients without the plugin and may lag this repo. Don't call `ReadMcpResourceTool` for `roboflow://skills/...` URIs when a local `roboflow:<name>` skill is available.
Upload/Annotate Images
→ Generate Dataset Version (preprocessing + augmentation + train/val/test split)
→ Pick Model Architecture + Size
→ Pick Checkpoint (COCO, Universe model, or previous version)
→ Train
→ Evaluate (auto-runs for paid users)

Version = frozen snapshot. Changes to the project after version creation do not affect it. Configure preprocessing (resize, contrast, etc.) and augmentation (flip, rotate, mosaic, etc.) during version generation.

Object detection architectures:

| Architecture | Sizes | Default Resolution | Notes |
|---|---|---|---|
| RF-DETR | Pico, Nano, Small, Base, Medium, Large, XL, 2XL | 384-880 (varies by size) | Best accuracy, recommended default |
| Roboflow 3.0 | Fast, Accurate, Medium, Large, XL | 640x640 | YOLOv8-based. Medium+ require paid plan |
| YOLO26 | n/s/m/l/x | 640x640 | Also supports seg + pose |
| YOLOv12 | n/s/m/l/x | 640x640 | OD only |
| YOLOv11 | n/s/m/l/x | 640x640 | Also supports seg + pose |
| YOLOv8 | n/s/m/l/x | 640x640 | Also supports seg + pose |
| YOLO-NAS | Small, Medium | 640x640 | |
| YOLOLite CPU | n/s/m/l/x | 640x640 | Edge-optimized, beta |
| YOLOLite GPU | n/s/m/l/x | 640x640 | Edge-optimized, beta |
| Roboflow Instant | single | N/A (no resize) | Few-shot, free, OD only |

Instance segmentation architectures:

| Architecture | Sizes | Default Resolution |
|---|---|---|
| RF-DETR Seg | Nano, Small, Medium, Large, XL, 2XL | 312-768 (varies) |
| Roboflow 3.0 Seg | Fast, Accurate, Medium, Large, XL | 640x640 |
| YOLO-seg | v8/v11/v26 (n/s/m/l/x each) | 640x640 |
| SAM 3 (Segment Anything 3) | Large | 1008x1008 |

Semantic segmentation architectures:

| Architecture | Sizes | Default Resolution |
|---|---|---|
| DeepLabV3+ | Base | >=512x512 |

Classification architectures:

| Architecture | Sizes | Default Resolution |
|---|---|---|
| ViT | Base | 224x224 |
| ResNet | 18/34/50/101 | 224x224 |
| DINOv3 | Base, Small | 224x224 |

Keypoint detection architectures:

| Architecture | Sizes | Default Resolution |
|---|---|---|
| YOLO-pose | v8/v11/v26 (n/s/m/l/x each) | 640x640 |

Multimodal architectures:

| Architecture | Sizes | Default Resolution |
|---|---|---|
| Qwen3.5 | 0.8B, 2B | 448x448 |
| Qwen3 VL | 2B | 448x448 |
| SmolVLM | 256M, 2B | 384x384 |
| Florence 2 | Base, Large | 768x768 |
| PaliGemma 2 | 3B | 448x448 |
| Qwen2.5 VL | 7B | 448x448 |
Follow this flowchart to pick the right model. Start at Step 1.
`sam3/sam3_final`, set `class_names`). Rapid does not support segmentation → Step 10.

Use these exact `model_id` values. Do not guess: wrong IDs cause training failures.

Object detection model IDs:

| Family | model_id values |
|---|---|
| RF-DETR (recommended) | rfdetr-pico, rfdetr-nano, rfdetr-small, rfdetr-base, rfdetr-medium, rfdetr-large, rfdetr-xlarge, rfdetr-2xlarge |
| YOLO26 | yolo26n, yolo26s, yolo26m, yolo26l, yolo26x |
| YOLOv12 | yolov12n, yolov12s, yolov12m, yolov12l, yolov12x |
| YOLOv11 | yolov11n, yolov11s, yolov11m, yolov11l, yolov11x |
| YOLOv8 | yolov8n, yolov8s, yolov8m, yolov8l, yolov8x |
| YOLO-NAS | yolo_nas_s, yolo_nas_m, yolo_nas_l |
| YOLOLite CPU | yololite-edge-n, yololite-edge-s, yololite-edge-m, yololite-edge-l, yololite-edge-xl |
| YOLOLite GPU | yololite-n, yololite-s, yololite-m, yololite-l, yololite-xl |

Instance segmentation model IDs:

| Family | model_id values |
|---|---|
| RF-DETR Seg (recommended) | rfdetr-seg-nano, rfdetr-seg-small, rfdetr-seg-medium, rfdetr-seg-large, rfdetr-seg-xlarge, rfdetr-seg-2xlarge |
| YOLO26 Seg | yolo26n-seg, yolo26s-seg, yolo26m-seg, yolo26l-seg, yolo26x-seg |
| YOLOv11 Seg | yolov11n-seg, yolov11s-seg, yolov11m-seg, yolov11l-seg, yolov11x-seg |
| YOLOv8 Seg | yolov8n-seg, yolov8s-seg, yolov8m-seg, yolov8l-seg, yolov8x-seg |
| SAM3 | sam3-large |

Keypoint detection model IDs:

| Family | model_id values |
|---|---|
| YOLO26 Pose | yolo26n-pose, yolo26s-pose, yolo26m-pose, yolo26l-pose, yolo26x-pose |
| YOLOv11 Pose | yolov11n-pose, yolov11s-pose, yolov11m-pose, yolov11l-pose, yolov11x-pose |
| YOLOv8 Pose | yolov8n-pose, yolov8s-pose, yolov8m-pose, yolov8l-pose, yolov8x-pose |

Classification model IDs:

| Family | model_id values |
|---|---|
| ViT | vit-base-patch16-224-in21k |
| ResNet | resnet18, resnet34, resnet50, resnet101 |
| DINOv3 | vit_base_patch16_dinov3.lvd1689m, vit_small_patch16_dinov3.lvd1689m |

Semantic segmentation model IDs:

| Family | model_id values |
|---|---|
| DeepLabV3+ | deeplabv3plus |

Multimodal model IDs:

| Family | model_id values |
|---|---|
| Qwen3.5 VL | qwen3_5-2b-peft, qwen3_5-0.8b-peft |
| Qwen3 VL | qwen3vl-2b-instruct, qwen3vl-2b-instruct-peft |
| SmolVLM | smolvlm2-peft, smolvlm-256m-peft |
| Florence 2 | florence-2-base, florence-2-large, florence-2-base-peft, florence-2-large-peft |
| PaliGemma 2 | paligemma2-3b-pt-224, paligemma2-3b-pt-448, paligemma2-3b-pt-896, paligemma2-3b-pt-224-peft |
| Qwen2.5 VL | qwen25-vl-7b, qwen25-vl-7b-peft |

Other model IDs:

| Model | model_id | Notes |
|---|---|---|
| SAM3 (zero-shot, workflows) | sam3/sam3_final | Always set class_names; no other props unless user asks |
| Custom / workspace | workspace/model or dataset/version | e.g., construction-safety/2 |
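
Since wrong IDs cause training failures, it can help to check an ID against the tables before submitting a job. The following is a minimal sketch, not part of any Roboflow SDK: `KNOWN_MODEL_IDS` and `check_model_id` are hypothetical names, and the set covers only a subset of the families listed above (RF-DETR, the YOLO26/v12/v11/v8 families, ResNet, DeepLabV3+, and SAM3).

```python
from difflib import get_close_matches

# Subset of the training model_id values from the tables above.
# Hypothetical helper data; extend it with the other families as needed.
KNOWN_MODEL_IDS = (
    {"rfdetr-" + s for s in ("pico", "nano", "small", "base",
                             "medium", "large", "xlarge", "2xlarge")}
    | {"rfdetr-seg-" + s for s in ("nano", "small", "medium",
                                   "large", "xlarge", "2xlarge")}
    # YOLO26, YOLOv11, and YOLOv8 each have detection, -seg, and -pose IDs.
    | {f"{family}{size}{task}"
       for family in ("yolo26", "yolov11", "yolov8")
       for size in "nsmlx"
       for task in ("", "-seg", "-pose")}
    # YOLOv12 is object detection only.
    | {f"yolov12{size}" for size in "nsmlx"}
    | {"resnet18", "resnet34", "resnet50", "resnet101",
       "deeplabv3plus", "sam3-large"}
)


def check_model_id(model_id: str) -> str:
    """Return model_id unchanged if it is known, else raise with suggestions."""
    if model_id in KNOWN_MODEL_IDS:
        return model_id
    hints = get_close_matches(model_id, sorted(KNOWN_MODEL_IDS), n=3)
    raise ValueError(f"Unknown model_id {model_id!r}; close matches: {hints}")
```

For example, `check_model_id("yolov12n-seg")` raises, because the tables list no segmentation IDs for YOLOv12.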

COCO classes (80): person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe, backpack, umbrella, handbag, tie, suitcase, frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket, bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake, chair, couch, potted plant, bed, dining table, toilet, tv, laptop, mouse, remote, keyboard, cell phone, microwave, oven, toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, hair drier, toothbrush.

Recommendations by goal:

| Goal | Recommended |
|---|---|
| Best accuracy, object detection | RF-DETR (Large or XL) |
| Fast inference, object detection | RF-DETR Nano or YOLOv11n |
| Best speed/accuracy tradeoff for specific hardware | RF-DETR NAS (see section below) |
| Best accuracy, instance segmentation | RF-DETR Seg |
| Quick proof-of-concept (<1000 images) | Roboflow Instant |
| Classification | ViT or DINOv3 |
| Multimodal / text prompts | Qwen3.5 or SmolVLM |
Instead of picking a single RF-DETR size manually, NAS trains many variants and reports the speed/accuracy frontier so you can pick the one that fits your hardware budget.
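
To make the idea of a speed/accuracy frontier concrete, here is a small sketch of how one might filter a set of trained variants down to the non-dominated ones and pick the most accurate variant within a latency budget. This is an illustration of the general Pareto-frontier technique, not a description of how Roboflow computes or reports NAS results. The `Variant` type, the function names, and all numbers are made up.

```python
from typing import List, NamedTuple, Optional


class Variant(NamedTuple):
    name: str
    latency_ms: float  # lower is better
    map50: float       # higher is better


def pareto_frontier(variants: List[Variant]) -> List[Variant]:
    """Keep variants that no other variant beats on both latency and accuracy."""
    frontier = []
    best_map = float("-inf")
    # Walk from fastest to slowest; a slower variant only earns its place
    # if it is more accurate than everything faster than it.
    for v in sorted(variants, key=lambda v: (v.latency_ms, -v.map50)):
        if v.map50 > best_map:
            frontier.append(v)
            best_map = v.map50
    return frontier


def pick_for_budget(variants: List[Variant],
                    max_latency_ms: float) -> Optional[Variant]:
    """Most accurate frontier variant that fits the latency budget, if any."""
    fitting = [v for v in pareto_frontier(variants)
               if v.latency_ms <= max_latency_ms]
    return max(fitting, key=lambda v: v.map50) if fitting else None


# Made-up numbers for illustration only; these are not real NAS results.
runs = [
    Variant("a", 4.0, 0.71),
    Variant("b", 6.0, 0.69),   # dominated by "a": slower and less accurate
    Variant("c", 9.0, 0.78),
    Variant("d", 15.0, 0.80),
]
```

With these sample values, a 10 ms budget selects variant `c`, and a 3 ms budget selects nothing.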
- `rfdetr-nas`) and Instance Segmentation (`rfdetr-nas-seg`).
- `canTrainNas` workspace feature flag. Self-serve plans (basic/starter/sandbox/research/trial) need to upgrade; enterprise/legacy plans need to contact sales.
- `?engine=nas` (UI: pick Neural Architecture Search as the training engine). Results land at `/{workspace}/{project}/nas-runs/{versionId}`.
- `rfdetr-nas` / `rfdetr-nas-seg`, but it's served through the standard inference paths.

Do NOT use Rapid when:
| Exclusion | Why |
|---|---|
| OCR / text detection (characters, serial numbers, labels, receipts, license plates) | SAM3 cannot reliably segment individual characters |
| Blueprints, floor plans, schematics, technical drawings | Abstract symbols and line-based elements not handled by SAM3 text prompting |
| More than 5 target classes | SAM3 text prompting accuracy degrades significantly with many classes |
| Fine-grained visual distinctions (correct vs incorrect orientation, pass/fail, subtle defects) | SAM3 cannot differentiate nearly identical objects; fine-tuned model needed |
| High-precision measurement / metrology (distances, dimensions, tolerances) | SAM3 auto-labeling annotation precision insufficient for calibrated measurement |
When Rapid is excluded → recommend custom training with RF-DETR fine-tuning.
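
The exclusion table can be read as a simple checklist. The sketch below encodes it that way; the `UseCase` fields and the `rapid_exclusions` function are hypothetical names chosen for illustration, and the thresholds come directly from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UseCase:
    num_classes: int
    needs_text_reading: bool = False          # OCR, serial numbers, plates
    is_technical_drawing: bool = False        # blueprints, floor plans, schematics
    needs_fine_grained_distinction: bool = False  # pass/fail, subtle defects
    needs_precise_measurement: bool = False   # metrology, tolerances


def rapid_exclusions(case: UseCase) -> List[str]:
    """Return the rows of the exclusion table that apply to this use case."""
    reasons = []
    if case.needs_text_reading:
        reasons.append("OCR / text detection")
    if case.is_technical_drawing:
        reasons.append("blueprints, floor plans, schematics, technical drawings")
    if case.num_classes > 5:
        reasons.append("more than 5 target classes")
    if case.needs_fine_grained_distinction:
        reasons.append("fine-grained visual distinctions")
    if case.needs_precise_measurement:
        reasons.append("high-precision measurement / metrology")
    return reasons


def recommend(case: UseCase) -> str:
    """Apply the rule above: any exclusion means custom RF-DETR training."""
    if rapid_exclusions(case):
        return "custom training with RF-DETR fine-tuning"
    return "Rapid is not excluded by this table"
```

An empty list from `rapid_exclusions` only means none of the listed exclusions apply; it does not by itself confirm that Rapid suits the task.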

Checkpoint options:

| Option | When to use |
|---|---|
| Public Checkpoint (COCO) | First model version, default recommended |
| Universe Checkpoint | Star a Universe project first, then it appears as checkpoint option. Good for domain-specific transfer learning |
| Previous Version | Already have a good model, want to improve with more data (all types except classification and SAM3) |
| Random Initialization | Advanced users only, usually worse results |
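
The checkpoint table can be sketched as a small decision function. The order of preference below (previous version first, then a starred Universe project, then COCO) is an assumption made for this example; the table does not rank the options. The function name, parameter names, and return labels are all hypothetical.

```python
from typing import Optional


def choose_checkpoint(project_type: str,
                      model_id: str,
                      has_good_previous_version: bool = False,
                      starred_universe_project: Optional[str] = None) -> str:
    """Pick a checkpoint option following the table above.

    Assumed preference order: previous version, Universe, public COCO.
    Random initialization is never chosen automatically.
    """
    # Per the table, "Previous Version" is unavailable for classification and SAM3.
    previous_allowed = (project_type != "classification"
                        and not model_id.startswith("sam3"))
    if has_good_previous_version and previous_allowed:
        return "previous-version"
    if starred_universe_project:
        return f"universe:{starred_universe_project}"
    return "public-coco"
```

A first model version with no starred Universe project falls through to the public COCO checkpoint, which matches the default the table recommends.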
Metrics vary by project type:
| Project Type | Metrics Shown |
|---|---|
| Object Detection | mAP@50, Precision, Recall, F1 |
| Classification | Accuracy |
| Instance Segmentation / Keypoint | mAP@50, Precision, Recall |
| Semantic Segmentation | mIoU |
| Multimodal | Perplexity |
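
For reference, precision, recall, and F1 follow the standard definitions in terms of true positives, false positives, and false negatives. The sketch below computes them from raw counts; the function name is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard detection metrics from match counts.

    precision = TP / (TP + FP): how many predictions were correct
    recall    = TP / (TP + FN): how many ground-truth objects were found
    f1        = harmonic mean of precision and recall
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For instance, 8 true positives with 2 false positives and 2 false negatives give precision, recall, and F1 of 0.8 each.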
Model evaluation runs automatically after training (for paid users, as noted in the pipeline). To open it: Models > click model version > View Evaluation.
| Feature | What it shows |
|---|---|
| Production Metrics Explorer | Precision/Recall/F1 at all confidence thresholds; recommends optimal confidence |
| Model Improvement Recommendations | Actionable suggestions (false negatives, false positives, confused classes, insufficient data) |
| Performance by Class | Correct predictions, misclassifications, false negatives, false positives per class; filterable |
| Confusion Matrix | Ground truth vs predictions grid; click cells to see specific images; adjustable confidence threshold |
| Vector Explorer | Interactive embedding clusters showing where model succeeds/fails |
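
To illustrate what a confidence-threshold sweep looks like, the sketch below recomputes precision, recall, and F1 at several thresholds and picks the one with the highest F1. Choosing by maximum F1 is an assumption for this example; the source does not say which criterion the Production Metrics Explorer uses. The input format (confidence plus a flag saying whether the prediction matched a ground-truth object) and the function names are hypothetical, and the sample data is made up.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, float, float, float]  # (threshold, precision, recall, f1)


def sweep_thresholds(predictions: Sequence[Tuple[float, bool]],
                     num_ground_truth: int,
                     thresholds: Sequence[float]) -> List[Row]:
    """Compute precision/recall/F1 at each confidence threshold.

    predictions: (confidence, is_true_positive) pairs, already matched
    against ground truth by some earlier step not shown here.
    """
    rows = []
    for t in thresholds:
        kept = [is_tp for conf, is_tp in predictions if conf >= t]
        tp = sum(kept)
        precision = tp / len(kept) if kept else 0.0
        recall = tp / num_ground_truth if num_ground_truth else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, f1))
    return rows


def best_threshold(rows: Sequence[Row]) -> float:
    """Threshold with the highest F1 (the first one wins on ties)."""
    return max(rows, key=lambda row: row[3])[0]


# Made-up predictions for illustration: 4 ground-truth objects in total.
sample = [(0.95, True), (0.90, True), (0.80, False),
          (0.60, True), (0.40, False), (0.30, False)]
rows = sweep_thresholds(sample, num_ground_truth=4,
                        thresholds=[0.25, 0.5, 0.75])
```

On this sample, raising the threshold from 0.25 to 0.5 removes two false positives without losing any true positives, so 0.5 gives the best F1 of the three.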

Tools by action:

| Action | Tool |
|---|---|
| Generate version | versions_generate |
| Start training | models_train |
| Check training status | models_get_training_status |
| Get model info | models_get |
| List models | models_list |
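
After starting training, an agent typically checks status repeatedly until the job ends. The sketch below shows one way to structure that loop. It does not call the real tools: `get_status` is a stand-in callable for whatever wraps `models_get_training_status`, and the status strings (`"finished"`, `"failed"`) are hypothetical placeholders, since the source does not list the actual status values or tool arguments.

```python
import time
from typing import Callable, Sequence


def wait_for_training(get_status: Callable[[], str],
                      poll_seconds: float = 30.0,
                      timeout_seconds: float = 3600.0,
                      done_states: Sequence[str] = ("finished", "failed"),
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns a terminal state or time runs out.

    get_status is a hypothetical wrapper around the status tool; the
    terminal state names are assumptions and should be adjusted to match
    the real responses.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in done_states:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.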

`roboflow://skills/roboflow-model-improvement/SKILL`: diagnostic decision tree, confusion matrix guide, per-class metrics, architecture switching, iterative improvement checklist