Formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles
39%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Advisory
Suggest reviewing before use
Security
1 medium severity finding. This skill can be installed but you should review these findings before use.
The skill prompts the agent to compromise the security or integrity of the user’s machine by modifying system-level services or configurations, such as obtaining elevated privileges, altering startup scripts, or changing system-wide settings.
Attempt to modify system services detected in skill instructions (medium risk: 0.60). The prompt explicitly includes a capability eval, "Can create new user account," which could direct an agent to create system-level users or otherwise modify machine state. This poses a non-trivial risk, although most of the other instructions are non-privileged, project-level eval mechanics.