pantheon-ai/dockerfile-toolkit

Complete dockerfile toolkit with generation and validation capabilities

validator/evals/scenario-2/criteria.json

{
  "context": "Agent performs Checkov-equivalent security analysis on a Dockerfile with hardcoded API keys, database credentials, missing USER, missing HEALTHCHECK, and an exposed SSH port.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Hardcoded secrets identified as Critical",
      "description": "Agent identifies ENV API_KEY and ENV DATABASE_URL as hardcoded secrets, classifies them as Critical severity, and explains they will persist in all image layers.",
      "max_score": 25
    },
    {
      "name": "BuildKit secret mount alternative described",
      "description": "Agent describes using --mount=type=secret in a RUN instruction or runtime environment injection as the correct alternative to ENV for secrets.",
      "max_score": 20
    },
    {
      "name": "Missing USER directive identified",
      "description": "Agent identifies the absence of a USER instruction as a High severity finding (container runs as root), and proposes adding a non-root user.",
      "max_score": 20
    },
    {
      "name": "EXPOSE 22 flagged",
      "description": "Agent flags EXPOSE 22 as a security risk (SSH port exposure) and recommends removing it unless SSH access is explicitly required.",
      "max_score": 15
    },
    {
      "name": "Missing HEALTHCHECK flagged",
      "description": "Agent identifies the absence of a HEALTHCHECK directive as a finding and proposes an appropriate healthcheck for a Node.js HTTP service.",
      "max_score": 10
    },
    {
      "name": "Severity categorisation correct",
      "description": "Agent correctly assigns Critical to hardcoded secrets, High to missing USER and EXPOSE 22, and Medium or lower to missing HEALTHCHECK.",
      "max_score": 10
    }
  ]
}
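
The checklist weights above sum to 100. The file does not say how a grader combines them, so the following is only a minimal sketch under one assumption: each criterion is awarded between 0 and its `max_score` points, and the total is a plain sum. The function names `total_max_score` and `weighted_score` are hypothetical and not part of the toolkit.

```python
# Names and weights copied from the checklist above; descriptions omitted.
CRITERIA = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Hardcoded secrets identified as Critical", "max_score": 25},
        {"name": "BuildKit secret mount alternative described", "max_score": 20},
        {"name": "Missing USER directive identified", "max_score": 20},
        {"name": "EXPOSE 22 flagged", "max_score": 15},
        {"name": "Missing HEALTHCHECK flagged", "max_score": 10},
        {"name": "Severity categorisation correct", "max_score": 10},
    ],
}


def total_max_score(criteria):
    """Return the highest total the checklist can award."""
    return sum(item["max_score"] for item in criteria["checklist"])


def weighted_score(criteria, awarded):
    """Sum awarded points per criterion (assumed scoring rule).

    `awarded` maps a criterion name to the points a grader gave it.
    Missing criteria count as 0, and each value is clamped to the
    range [0, max_score] so one item cannot exceed its weight.
    """
    total = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        total += max(0, min(points, item["max_score"]))
    return total
```

For instance, an agent that only catches the hardcoded secrets and the exposed SSH port would receive 25 + 15 points out of the possible 100 under this assumed rule.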