Name: mtthwmllr/skill-safety-auditor
Rating: 97.8 (1 review)
Author: mtthwmllr

Audits a Claude Code skill for security risks in three modes: before download (from a URL or install command), after download but before install (from a .skill file), or after install (from a local skills directory). Use this skill whenever a user is about to install a skill from any source — including GitHub URLs, git clone commands, npx/npm commands, curl/wget downloads, pip installs, marketplace links, or raw SKILL.md URLs. Also trigger when a user asks "is this skill safe?", "should I trust this skill?", "can you check this before I install it?", "audit this skill", or pastes any link to a skill repository or .skill file. If a user mentions installing ANY skill, proactively offer to audit it first — do not wait for them to ask.
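The three modes in the description can be told apart from the form of the input alone. The following is a minimal sketch of that decision, not the skill's actual logic: the function name `pick_audit_mode`, the prefix list, and the fallback to Mode 3 are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical list of install-command prefixes, taken from the sources
# named in the description (git clone, npx/npm, curl/wget, pip).
INSTALL_PREFIXES = ("git clone", "npx ", "npm ", "curl ", "wget ", "pip ")


def pick_audit_mode(source: str) -> int:
    """Return 1 (before download), 2 (downloaded .skill file), or 3 (installed directory)."""
    text = source.strip()
    # A plain http(s) URL means nothing has been downloaded yet.
    if urlparse(text).scheme in ("http", "https"):
        return 1
    # An install command also means the skill is not on disk yet.
    if text.lower().startswith(INSTALL_PREFIXES):
        return 1
    # A local .skill file has been downloaded but not installed.
    if text.lower().endswith(".skill"):
        return 2
    # Assumption: anything else is treated as a local skills directory.
    return 3
```

Checking for a URL first means a link that happens to end in `.skill` is still audited in Mode 1, since the file has not been downloaded at that point.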

Quality: 97% (Does it follow best practices?)

Impact: 99%, 1.28x (average score across 5 eval scenarios)

Security: Advisory (suggest reviewing before use)

{
  "context": "Tests whether the agent correctly handles Mode 1 (pre-download) auditing: resolving a GitHub URL to a raw SKILL.md, fetching it via WebFetch, producing a report with the correct transparency notice, and using the standard report template with all required sections.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Mode 1 transparency notice",
      "description": "The report contains a transparency notice matching Mode 1: it states that the content was 'fetched from the URL' (or equivalent) and treated as data only, and it advises the user to verify that the URL is trusted",
      "max_score": 12
    },
    {
      "name": "GitHub URL resolved",
      "description": "The audit log documents that a raw.githubusercontent.com URL was used (not the github.com/blob URL), indicating correct URL conversion",
      "max_score": 12
    },
    {
      "name": "Overall verdict present",
      "description": "The report contains a clearly stated overall verdict using one of the three verdict labels: DO NOT INSTALL, PROCEED WITH CAUTION, or APPEARS SAFE",
      "max_score": 10
    },
    {
      "name": "What Was Reviewed section",
      "description": "The report includes a section listing what was reviewed (SKILL.md frontmatter, SKILL.md body, any scripts found)",
      "max_score": 10
    },
    {
      "name": "What Was Not Reviewed section",
      "description": "The report includes a section listing what could not be audited (e.g. runtime behaviour, binary assets, or any unverifiable resources)",
      "max_score": 8
    },
    {
      "name": "Static audit reminder",
      "description": "The report includes a reminder that a clean audit is not a guarantee of safety, referencing supply chain attacks, runtime behaviour, or post-install updates",
      "max_score": 8
    },
    {
      "name": "Security checks applied",
      "description": "The audit log or report documents that frontmatter checks (A-series) and content checks (C-series) were applied, not just a surface read",
      "max_score": 10
    },
    {
      "name": "Frontmatter validated",
      "description": "The report documents whether allowed-tools, name, and description were present and valid in the fetched SKILL.md",
      "max_score": 10
    },
    {
      "name": "Mode documented",
      "description": "The audit log explicitly states which mode was used (Mode 1 / pre-download) and provides a reason",
      "max_score": 10
    },
    {
      "name": "Two output files produced",
      "description": "Both audit-report.md and audit-log.md are present in the output",
      "max_score": 10
    }
  ]
}
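The "GitHub URL resolved" criterion above expects a `github.com/.../blob/...` link to be converted to its `raw.githubusercontent.com` equivalent before fetching. A minimal sketch of that conversion follows; the function name `to_raw_github_url` and the owner and repository in the usage note are hypothetical, and the skill itself may resolve URLs differently.

```python
from urllib.parse import urlparse


def to_raw_github_url(url: str) -> str:
    """Convert github.com/<owner>/<repo>/blob/<ref>/<path> to its raw form.

    URLs that do not match that shape are returned unchanged.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expect at least: owner, repo, "blob", ref, and one path segment.
    if parsed.netloc != "github.com" or len(parts) < 5 or parts[2] != "blob":
        return url
    owner, repo, _, ref, *path = parts
    # The raw URL is the same path with the "blob" segment removed.
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{ref}/{'/'.join(path)}"
```

For example, `to_raw_github_url("https://github.com/example-owner/example-skill/blob/main/SKILL.md")` yields `https://raw.githubusercontent.com/example-owner/example-skill/main/SKILL.md`.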

evals/scenario-1/criteria.json
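The criteria file above has type `weighted_checklist`, and its ten `max_score` values sum to 100. The listing does not say how a grader combines them, so the sketch below assumes the simplest rule: points earned divided by points available. The function name `score_checklist` and the shape of the `awarded` mapping are hypothetical.

```python
def score_checklist(criteria: dict, awarded: dict) -> float:
    """Return the percentage of available points that were awarded.

    `awarded` maps checklist item names to points. Missing items count
    as 0, and each award is clamped to the range [0, max_score].
    """
    items = criteria["checklist"]
    total = sum(item["max_score"] for item in items)
    earned = sum(
        min(max(awarded.get(item["name"], 0), 0), item["max_score"])
        for item in items
    )
    # Guard against an empty checklist rather than dividing by zero.
    return 100.0 * earned / total if total else 0.0
```

With two items from the file, "Mode 1 transparency notice" (12 points) awarded in full and "What Was Not Reviewed section" (8 points) awarded 4, this rule gives 16 of 20 points, or 80.0.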