Use this skill when users want to train or fine-tune language models with TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. It covers the SFT, DPO, GRPO, and reward modeling training methods, plus GGUF conversion for local deployment. It includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Invoke it for tasks involving cloud GPU training or GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.
94
92%: Does it follow best practices?
Impact: 99%, 1.65x. Average score across 3 eval scenarios.
Advisory: Suggest reviewing before use
SFT job submission with monitoring and persistence
Uses hf_jobs() tool: 0% / 100%
Inline script (not file path): 100% / 100%
PEP 723 header present: 0% / 100%
Trackio in dependencies: 0% / 100%
report_to trackio: 0% / 100%
Trackio space_id default: 0% / 50%
Descriptive run name: 0% / 100%
Timeout exceeds 30 minutes: 100% / 100%
push_to_hub enabled: 100% / 100%
HF_TOKEN in secrets: 0% / 100%
Uses max_length not max_seq_length: 0% / 100%
Correct hardware flavor: 100% / 100%
DPO dataset validation and training configuration
Dataset validation step: 85% / 100%
Validation before training: 100% / 100%
Column mapping for DPO: 100% / 100%
Uses instruct model: 100% / 100%
eval_dataset consistency: 100% / 100%
Uses max_length not max_seq_length: 100% / 100%
HF_TOKEN in secrets: 0% / 100%
Timeout adequate for DPO: 0% / 100%
PEP 723 header in training script: 0% / 100%
Trackio included: 0% / 100%
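The "Dataset validation step" and "Column mapping for DPO" checks can be pictured with a small sketch like the one below. It assumes the explicit-prompt preference format with `prompt`, `chosen`, and `rejected` columns; the helper `check_dpo_columns` and the sample column names are hypothetical and are not taken from the skill itself.

```python
# Columns assumed for an explicit-prompt preference dataset.
REQUIRED_DPO_COLUMNS = ("prompt", "chosen", "rejected")


def check_dpo_columns(columns, mapping=None):
    """Return (ok, missing) after applying an optional rename mapping.

    `mapping` maps existing column names to the names expected for DPO,
    e.g. {"question": "prompt"}.
    """
    mapping = mapping or {}
    renamed = {mapping.get(name, name) for name in columns}
    missing = [name for name in REQUIRED_DPO_COLUMNS if name not in renamed]
    return (not missing, missing)


# Hypothetical dataset whose prompt column is called "question".
raw_columns = ["question", "chosen", "rejected"]

before = check_dpo_columns(raw_columns)
after = check_dpo_columns(raw_columns, mapping={"question": "prompt"})
```

Running a check of this kind before submitting a job surfaces a missing or misnamed column locally instead of after GPU time has been spent.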
GGUF conversion job setup
Verifies repos before submitting: 100% / 100%
Build tools installed first: 100% / 100%
Uses CMake not make: 100% / 100%
Correct binary path: 100% / 100%
sentencepiece in dependencies: 62% / 100%
protobuf in dependencies: 62% / 100%
Multiple quantization levels: 100% / 100%
HF_TOKEN in secrets: 100% / 100%
Appropriate timeout: 100% / 100%
Uses hf_jobs() not local files: 37% / 100%
A10G GPU selected: 0% / 100%
73246ad