pantheon-ai/github-actions-toolkit

Complete GitHub Actions toolkit with generation and validation capabilities for workflows, custom actions, and CI/CD configurations

validator/evals/scenario-2/criteria.json

{
  "context": "Tests that the agent identifies and categorizes multiple distinct error types (CRON format, runner label typo, job dependency typo, script injection, outdated action), provides a per-error explanation and fix, and delivers a corrected workflow.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "CRON error identified",
      "description": "audit-report.md identifies the invalid cron value '0 0 * * 8' (weekday out of range 0-6) as an error",
      "max_score": 10
    },
    {
      "name": "CRON error fixed",
      "description": "In release.yml, the cron expression is corrected to a valid value (weekday field is 0-6)",
      "max_score": 8
    },
    {
      "name": "Runner typo identified",
      "description": "audit-report.md identifies 'ubuntu-lastest' as an invalid runner label",
      "max_score": 8
    },
    {
      "name": "Runner typo fixed",
      "description": "In release.yml, 'ubuntu-lastest' is corrected to 'ubuntu-latest'",
      "max_score": 8
    },
    {
      "name": "Job dependency typo identified",
      "description": "audit-report.md identifies 'needs: tset' as referencing a non-existent job",
      "max_score": 8
    },
    {
      "name": "Job dependency typo fixed",
      "description": "In release.yml, 'needs: tset' is corrected to 'needs: test'",
      "max_score": 8
    },
    {
      "name": "Script injection identified",
      "description": "audit-report.md identifies the ${{ github.event.pull_request.title }} interpolation in run: as a script injection risk",
      "max_score": 10
    },
    {
      "name": "Script injection fixed",
      "description": "In release.yml, the untrusted value is routed through an env: variable instead of direct interpolation",
      "max_score": 10
    },
    {
      "name": "Error categories present",
      "description": "audit-report.md categorizes each error by type (e.g., CRON/Schedule, Runner, Job Dependency, Security/Injection, or similar labels)",
      "max_score": 10
    },
    {
      "name": "Outdated action noted",
      "description": "audit-report.md notes that actions/checkout@v3 could be updated to a newer major version (e.g., actions/checkout@v4)",
      "max_score": 10
    },
    {
      "name": "Fix code quoted",
      "description": "audit-report.md includes at least one corrected code snippet per error (not just a description)",
      "max_score": 10
    }
  ]
}
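
The checklist above assigns each item a `max_score`, and the eleven weights sum to 100. How the evaluation harness aggregates them is not stated in this file, so the following is only a minimal sketch, assuming the final result is the sum of awarded points divided by the sum of `max_score` values. The function name `score_checklist` and the `awarded` mapping are hypothetical and not part of the toolkit.

```python
import json


def score_checklist(criteria: dict, awarded: dict) -> float:
    """Return the weighted score as a fraction in [0, 1].

    Assumption: the score is earned points / total max_score.
    `awarded` maps a checklist item's name to the points a grader gave it.
    """
    total = sum(item["max_score"] for item in criteria["checklist"])
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)  # ungraded items earn nothing
        # Clamp so an item never contributes more than its weight or less than 0.
        earned += max(0, min(points, item["max_score"]))
    return earned / total if total else 0.0


# A shortened copy of the checklist, using three items from the file above.
criteria = json.loads("""
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "CRON error identified", "max_score": 10},
    {"name": "CRON error fixed", "max_score": 8},
    {"name": "Runner typo identified", "max_score": 8}
  ]
}
""")

# Hypothetical grader output: full credit, partial credit, and one item missed.
awarded = {"CRON error identified": 10, "CRON error fixed": 4}
print(score_checklist(criteria, awarded))
```

Clamping each item to its own `max_score` keeps one over-scored item from hiding a missed one, which matters here because the "identified" and "fixed" checks for each error are weighted separately.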