Name: huggingface-local-models
Rating: 90.6 (1 reviews)
Author: huggingface

huggingface-local-models

Use to select models to run locally with llama.cpp and GGUF on CPU, Mac Metal, CUDA, or ROCm. Covers finding GGUFs, quant selection, running servers, exact GGUF file lookup, conversion, and OpenAI-compatible local serving.

1.25x

Quality

93%

Does it follow best practices?

Impact

73%

1.25x

Average score across 3 eval scenarios

Securityby

Advisory

Suggest reviewing before use

Evaluation results

42%

Local Model Setup for a Code Review Assistant

Hub-first quant selection and server setup for code workload

Criteria

Without context

With context

Search with apps=llama.cpp

Local-app page consulted

Tree API used

Code-workload quant

33%

Repo-native labels preserved

80%

llama-server -hf command

100%

OpenAI-compatible curl smoke test

50%

60%

No conversion for GGUF repos

100%

CPU thread flag

50%

37%

HF snippet as source of truth

87%

42%

Choosing the Right Model for a Memory-Constrained Edge Device

Memory-constrained model selection with size filter and quant choice

Criteria

Without context

With context

apps=llama.cpp search filter

100%

Parameter size filter

25%

33%

Local-app page opened

100%

Tree API for exact file

100%

Memory-constrained quant selected

66%

Repo-native labels not normalized

100%

llama-cli -hf command

100%

No conversion suggested

100%

URL-first approach

50%

100%

mmproj excluded from main model

100%

90%

-4%

Migrating a Research Team's Model to Local Inference

Transformers-to-GGUF conversion workflow for repo without GGUF files

Criteria

Without context

With context

hf download command

100%

convert_hf_to_gguf.py used

100%

f16 outtype for initial conversion

100%

llama-quantize for quantization

100%

Appropriate NVIDIA quant

100%

llama-server launch command

100%

GPU layers flag

100%

Smoke test curl

100%

Conversion justified by no GGUF

40%

30%

hf auth login mentioned

100%

62%

Repository: huggingface/context-course
Commit: 0448a7c

Evaluated: 26 days ago
Agent: Claude Code
Model: Claude Sonnet 4.6

Table of Contents

Local Model Setup for a Code Review Assistant Choosing the Right Model for a Memory-Constrained Edge Device Migrating a Research Team's Model to Local Inference

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.