promptfoo-evaluation

Configures and runs LLM evaluations using the Promptfoo framework. Use when setting up prompt testing, creating evaluation configs (promptfooconfig.yaml), writing Python custom assertions, implementing llm-rubric for LLM-as-judge grading, or managing few-shot examples in prompts. Triggers on keywords like "promptfoo", "eval", "LLM evaluation", "prompt testing", or "model comparison".
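
To make "Python custom assertions" concrete, here is a minimal sketch of what such an assertion file might look like. It assumes Promptfoo's convention of calling a function named `get_assert(output, context)` and accepting a dict with `pass`, `score`, and `reason` keys as the result; the keyword list and the scoring rule are hypothetical and are not taken from the skill itself.

```python
from typing import Any, Dict

# Hypothetical keywords, chosen only for illustration.
REQUIRED_KEYWORDS = ["refund", "30 days"]


def get_assert(output: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Grade one model output by checking for required keywords.

    Assumption: Promptfoo looks this function up by name and passes the
    model output plus a context dict (test variables, prompt, etc.).
    """
    text = output.lower()
    # Case-insensitive search for each required keyword.
    missing = [kw for kw in REQUIRED_KEYWORDS if kw.lower() not in text]
    # Score is the fraction of required keywords that were found.
    score = 1.0 - len(missing) / len(REQUIRED_KEYWORDS)
    if missing:
        reason = "missing: " + ", ".join(missing)
    else:
        reason = "all keywords present"
    return {"pass": not missing, "score": score, "reason": reason}
```

A file like this would typically be referenced from promptfooconfig.yaml through an assertion of type `python` whose value points at the file (for example `file://` followed by a path of your choosing); check the Promptfoo documentation for the exact form supported by your version.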

Quality: 82% (Does it follow best practices?)

Impact: 97%, 1.59x (average score across 3 eval scenarios)

Security

Advisory

Suggest reviewing before use

1 medium severity finding. This skill can be installed but you should review these findings before use.

Medium

W012: Unverifiable external dependency detected (runtime URL that controls agent)

What this means

The skill fetches instructions or code from an external URL at runtime, and the fetched content directly controls the agent’s prompts or executes code. This dynamic dependency allows the external source to modify the agent’s behavior without any changes to the skill itself.

Why it was flagged

Potentially malicious external URL detected (high risk: 0.80). The skill's Quick Start and run instructions require running "npx promptfoo@latest …", which fetches and executes the Promptfoo package from the npm registry (e.g., https://registry.npmjs.org/promptfoo) at runtime, meaning remote code is downloaded and executed as a required dependency.

Repository: daymade/claude-code-skills
Commit: bbf87f3

Audited: 3 months ago