security-benchmark-runner

Security Benchmark Runner - Auto-activating skill for Security Advanced. Triggers on: security benchmark runner. Part of the Security Advanced skill category.

38

1.02x

Quality: 7% (Does it follow best practices?)

Impact: 94%, 1.02x (Average score across 3 eval scenarios)
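The Impact figure is described as an average over the three scenario scores listed under "Evaluation results" (99%, 100%, 83%). As a quick sanity check, the sketch below assumes that average is an unweighted arithmetic mean; the page does not say how it is actually computed, and the helper name `impact_score` is hypothetical.

```python
from statistics import mean

# Scenario scores as listed under "Evaluation results" on this page.
scenario_scores = {
    "SOC2 Compliance Readiness Assessment Tool": 99,
    "Threat Model for a Healthcare Patient Portal": 100,
    "External Security Assessment Automation for Pre-Launch Checklist": 83,
}


def impact_score(scores: dict[str, int]) -> float:
    """Unweighted mean of per-scenario scores (an assumption, not a documented formula)."""
    return mean(scores.values())


print(f"{impact_score(scenario_scores):.0f}%")  # prints "94%"
```

Under that assumption the result matches the 94% shown for Impact.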

Security by Snyk: Passed (No known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./planned-skills/generated/04-security-advanced/security-benchmark-runner/SKILL.md

Evaluation results

99%

SOC2 Compliance Readiness Assessment Tool

SOC2 compliance benchmark automation

Criteria                          Without context   With context
Step-by-step structure            100%              100%
Production-ready script           100%              100%
SOC2 trust criteria coverage      100%              100%
Industry standard alignment       100%              100%
Validation against standards      100%              100%
Configuration checks included     100%              100%
Output report generation          100%              100%
Access control checks             100%              100%
Logging and monitoring checks     100%              100%
Error handling                    90%               90%

Without context: $0.5579 · 2m 26s · 23 turns · 22 in / 10,713 out tokens

With context: $0.7092 · 2m 58s · 34 turns · 34 in / 9,957 out tokens

100%

Threat Model for a Healthcare Patient Portal

Threat modeling documentation and assessment

Criteria                               Without context   With context
Structured methodology                 100%              100%
Step-by-step process                   100%              100%
Enterprise security domains covered    100%              100%
Threat enumeration                     100%              100%
Risk validation                        100%              100%
Mitigation recommendations             100%              100%
Industry standard reference            100%              100%
Data flow or trust boundary analysis   100%              100%
Compliance context                     100%              100%
Actionable output format               100%              100%

Without context: $0.4627 · 3m 12s · 11 turns · 53 in / 11,352 out tokens

With context: $0.7151 · 4m 11s · 24 turns · 208 in / 13,858 out tokens

83%

4%

External Security Assessment Automation for Pre-Launch Checklist

Penetration testing benchmark runner script

Criteria                      Without context   With context
Runnable script produced      100%              100%
Step-by-step organization     100%              100%
Multiple pentesting domains   100%              100%
Industry tool usage           50%               70%
Standards-based validation    20%               30%
Structured results output     100%              100%
Scope safety controls         100%              100%
Compliance tagging            0%                12%
Production-ready quality      100%              100%
No destructive actions        100%              100%

Without context: $0.7116 · 3m 31s · 23 turns · 24 in / 14,552 out tokens

With context: $0.6892 · 3m 14s · 28 turns · 29 in / 12,107 out tokens

Repository: jeremylongshore/claude-code-plugins-plus-skills

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
