Complete bash-script toolkit with generation and validation capabilities
Does it follow best practices? 97%
Impact: Pending (no eval scenarios have been run)
Risky: do not use without reviewing
{
"context": "Evaluate the agent's ability to identify shell security vulnerabilities and produce a hardened rewrite",
"type": "weighted_checklist",
"checklist": [
{
"name": "eval $PLUGIN_CMD identified as command injection",
"description": "Agent identifies that eval with an unvalidated variable allows arbitrary command execution; describes a concrete example attack (e.g., PLUGIN_CMD='rm -rf /')",
"max_score": 25
},
{
"name": "rm -rf with unquoted variable identified",
"description": "Agent identifies that rm -rf /tmp/deploy_$DEPLOY_ENV is dangerous when DEPLOY_ENV is empty or contains spaces, potentially deleting /tmp/deploy_ or unintended paths; explains the data loss risk",
"max_score": 20
},
{
"name": "|| true error suppression anti-pattern identified",
"description": "Agent identifies that check_prereqs || true in the validate function always returns success, hiding failures, and that return 0 compounds this by preventing callers from detecting errors",
"max_score": 15
},
{
"name": "cd without error handling identified",
"description": "Agent identifies that cd $WORK_DIR without || exit (SC2164) means subsequent commands run in the wrong directory if cd fails",
"max_score": 10
},
{
"name": "Hardened script produced",
"description": "Agent produces a corrected script that: removes eval and replaces with a safe alternative (or strict allowlist), quotes the rm -rf variable, removes || true suppression, and adds cd || exit or cd || { ...; exit 1; }",
"max_score": 20
},
{
"name": "Attack vectors explained",
"description": "For each security issue, agent describes a concrete attack scenario showing how the vulnerability could be exploited",
"max_score": 10
}
]
}
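
The four fixes the "Hardened script produced" item asks for can be sketched roughly as below. This is only an illustration, not the reference answer for the scenario: the script under evaluation is not shown here, so the function names `run_plugin`, `clean_deploy_dir`, and `enter_work_dir`, the `DEPLOY_ROOT` override, the allowlisted values (`lint`, `build`, `dev`, `staging`, `prod`), and the body of `check_prereqs` are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical hardened sketch; every helper name below is illustrative.
set -eu   # bash scripts often add `-o pipefail` as well

# 1. Replace `eval $PLUGIN_CMD` with a strict allowlist (values are assumed).
run_plugin() {
  case "${1:-}" in
    lint)  echo "running lint" ;;
    build) echo "running build" ;;
    *)     echo "refusing unknown plugin command: '${1:-}'" >&2; return 1 ;;
  esac
}

# 2. Validate DEPLOY_ENV and quote it before it reaches rm -rf.
#    DEPLOY_ROOT is an assumed override so the function can be tried safely.
clean_deploy_dir() {
  local env="${1:-}"
  local root="${DEPLOY_ROOT:-/tmp}"
  case "$env" in
    dev|staging|prod) ;;
    *) echo "invalid DEPLOY_ENV: '$env'" >&2; return 1 ;;
  esac
  rm -rf -- "${root}/deploy_${env}"
}

# 3. Let failures propagate: no `|| true`, no unconditional `return 0`.
check_prereqs() {
  command -v rm >/dev/null   # placeholder check; the real prerequisites are unknown
}
validate() {
  check_prereqs
}

# 4. Never continue after a failed cd (SC2164).
enter_work_dir() {
  local dir="${1:-}"
  [ -n "$dir" ] || { echo "WORK_DIR is empty" >&2; return 1; }
  cd -- "$dir" || { echo "cannot cd to '$dir'" >&2; return 1; }
}
```

With these definitions, `run_plugin 'rm -rf /'` is rejected rather than executed, and `clean_deploy_dir ""` returns non-zero instead of running `rm -rf` on `/tmp/deploy_`. An allowlist is one of the two options the checklist accepts in place of `eval`; a script that needs arbitrary plugins would need a different design.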