Use when the user asks to "improve a metric", "run labs", "leave feedback on a metric", "add to labs", "fix metric accuracy", "review metric results", "find misaligned metrics", or "iterate on metric quality". Covers the metric improvement cycle, the feedback workflow, and the labs pipeline used to refine metric accuracy over time.
68
83%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Security
1 medium severity finding. This skill can be installed but you should review these findings before use.
The skill fetches instructions or code from an external URL at runtime, and the fetched content directly controls the agent’s prompts or executes code. This dynamic dependency allows the external source to modify the agent’s behavior without any changes to the skill itself.
Potentially malicious external URL detected (high risk: 0.90). The skill invokes the labs endpoint POST /test_framework/metric-reviews/process_feedbacks/ at runtime and that endpoint returns an improved "description" (prompt) which would directly modify agent prompts, so it is a runtime dependency that can control instructions.
24ad1d0
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.