CtrlK
BlogDocsLog inGet started
Tessl Logo

huggingface-local-models

Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.

90

1.25x
Quality

93%

Does it follow best practices?

Impact

73%

1.25x

Average score across 3 eval scenarios

SecuritybySnyk

Advisory

Suggest reviewing before use

SKILL.md
Quality
Evals
Security

Evaluation results

42%

5%

Local Model Setup for a Code Review Assistant

Hub-first quant selection and server setup for code workload

Criteria
Without context
With context

Search with apps=llama.cpp

0%

0%

Local-app page consulted

0%

0%

Tree API used

0%

0%

Code-workload quant

0%

33%

Repo-native labels preserved

80%

80%

llama-server -hf command

100%

100%

OpenAI-compatible curl smoke test

50%

60%

No conversion for GGUF repos

100%

100%

CPU thread flag

50%

37%

HF snippet as source of truth

0%

0%

87%

42%

Choosing the Right Model for a Memory-Constrained Edge Device

Memory-constrained model selection with size filter and quant choice

Criteria
Without context
With context

apps=llama.cpp search filter

0%

100%

Parameter size filter

25%

33%

Local-app page opened

0%

100%

Tree API for exact file

0%

100%

Memory-constrained quant selected

66%

66%

Repo-native labels not normalized

100%

100%

llama-cli -hf command

0%

100%

No conversion suggested

100%

100%

URL-first approach

50%

100%

mmproj excluded from main model

100%

100%

90%

-4%

Migrating a Research Team's Model to Local Inference

Transformers-to-GGUF conversion workflow for repo without GGUF files

Criteria
Without context
With context

hf download command

100%

100%

convert_hf_to_gguf.py used

100%

100%

f16 outtype for initial conversion

100%

100%

llama-quantize for quantization

100%

100%

Appropriate NVIDIA quant

100%

100%

llama-server launch command

100%

100%

GPU layers flag

100%

100%

Smoke test curl

100%

100%

Conversion justified by no GGUF

40%

30%

hf auth login mentioned

100%

62%

Repository
huggingface/context-course
Evaluated
Agent
Claude Code
Model
Claude Sonnet 4.6

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.