eval-harness

Formal evaluation framework for Claude Code sessions that implements eval-driven development (EDD) principles

Quality: 24%
Does it follow best practices?

Impact: 100% (2.08x)
Average score across 6 eval scenarios

Security: Advisory
Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./docs/zh-TW/skills/eval-harness/SKILL.md

1 medium-severity finding. This skill can be installed, but you should review this finding before use.

Medium

W013: Attempt to modify system services in skill instructions

What this means

The skill prompts the agent to compromise the security or integrity of the user’s machine by modifying system-level services or configurations, such as obtaining elevated privileges, altering startup scripts, or changing system-wide settings.

Why it was flagged

An attempt to modify system services was detected in the skill instructions (high risk: 1.00). The prompt explicitly includes evals and examples that test "create new user accounts" (e.g., "可以建立新使用者帳戶", meaning "can create new user accounts", and "create-user"), which would instruct the agent to modify system user accounts and thus alter the machine state.

Repository: haniakrim21/everything-claude-code
Commit: ae2cadd

Audited: about 1 month ago
