Generate eval scenarios from repo commits, configure multi-agent runs, execute baseline + with-context evals, and compare results — the full setup pipeline before improvement begins
90
90%
Does it follow best practices?
Impact
91%
3.37x
Average score across 2 eval scenarios
Advisory
Suggest reviewing before use
Security
1 medium severity finding. This skill can be installed but you should review these findings before use.
The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.
Third-party content exposure detected (high risk: 0.90). The skill explicitly fetches and inspects repository commits (via `git log` or `gh api`) and downloads/reads scenario files (`task.md`, `criteria.json`) sourced from the specified org/repo, which are untrusted, user-generated web content that the agent reads and uses to choose commits, generate scenarios, and drive eval decisions — enabling indirect prompt injection.