hugging-face-model-trainer

Use this skill when users want to train or fine-tune language models with TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. It covers the SFT, DPO, GRPO, and reward modeling training methods, plus GGUF conversion for local deployment. It includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Invoke it for tasks involving cloud GPU training or GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.

Quality: 92% (Does it follow best practices?)
Impact: 99%, 1.65x (Average score across 3 eval scenarios)
Security: Advisory (Suggest reviewing before use)

Evaluation results

97%

62%

Fine-Tune a Customer Support Model on Hugging Face Jobs

SFT job submission with monitoring and persistence

Criteria (Without context / With context)

Uses hf_jobs() tool: 100%
Inline script (not file path): 100%
PEP 723 header present: 100%
Trackio in dependencies: 100%
report_to trackio: 100%
Trackio space_id default: 50%
Descriptive run name: 100%
Timeout exceeds 30 minutes: 100%
push_to_hub enabled: 100%
HF_TOKEN in secrets: 100%
Uses max_length not max_seq_length: 100%
Correct hardware flavor: 100%

38%

Align a Model Using Preference Data

DPO dataset validation and training configuration

Criteria (Without context / With context)

Dataset validation step: 85% / 100%
Validation before training: 100%
Column mapping for DPO: 100%
Uses instruct model: 100%
eval_dataset consistency: 100%
Uses max_length not max_seq_length: 100%
HF_TOKEN in secrets: 100%
Timeout adequate for DPO: 100%
PEP 723 header in training script: 100%
Trackio included: 100%

17%

Convert a Fine-Tuned Model to GGUF for Local Deployment

GGUF conversion job setup

Criteria (Without context / With context)

Verifies repos before submitting: 100%
Build tools installed first: 100%
Uses CMake not make: 100%
Correct binary path: 100%
sentencepiece in dependencies: 62% / 100%
protobuf in dependencies: 62% / 100%
Multiple quantization levels: 100%
HF_TOKEN in secrets: 100%
Appropriate timeout: 100%
Uses hf_jobs() not local files: 37% / 100%
A10G GPU selected: 100%

Repository: huggingface/skills
Commit: 73246ad

Evaluated: about 2 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

