Agent skill for benchmark-suite - invoke with $agent-benchmark-suite
33
0%
Does it follow best practices?
Impact: 89% (2.17x), average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./.agents/skills/agent-benchmark-suite/SKILL.md
Benchmark suite configuration and execution structure
Warmup duration default | 0% | 100%
Cooldown duration default | 0% | 100%
Default benchmark duration | 0% | 100%
Default iterations | 100% | 100%
Core benchmark categories | 50% | 100%
Extended benchmark categories | 0% | 100%
Parallel execution mode | 0% | 100%
Sequential execution mode | 100% | 100%
Inter-benchmark pause | 0% | 0%
Warmup phase invoked | 62% | 100%
Cooldown phase invoked | 0% | 100%
Baseline comparison support | 0% | 100%
Performance regression detection with CUSUM and multi-detector aggregation
CUSUM algorithm | 0% | 100%
Multiple detector types | 100% | 100%
Parallel detector execution | 0% | 100%
Aggregate confidence score | 100% | 100%
Severity classification | 100% | 100%
Default sensitivity 0.95 | 0% | 100%
Change point timestamp | 0% | 100%
Change point magnitude | 100% | 100%
Change point direction | 28% | 100%
Change point significance | 100% | 100%
detection-report.json produced | 100% | 100%
Regression correctly identified | 100% | 100%
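The criteria above name CUSUM and a change point described by position, magnitude, and direction, but this page does not show how the skill computes them. As a rough illustration only, here is a minimal sketch of a standard two-sided tabular CUSUM. The function name `cusum_change_point`, the parameters `baseline_n`, `k`, and `h`, and the returned field names are all assumptions made for this example, not the skill's actual interface; how the listed default sensitivity of 0.95 would map onto these parameters is not specified here.

```python
from statistics import mean, stdev


def cusum_change_point(samples, baseline_n=10, k=0.5, h=5.0):
    """Two-sided tabular CUSUM on standardized samples (illustrative sketch).

    samples    -- sequence of measurements, e.g. latencies in ms
    baseline_n -- number of leading samples used to estimate the baseline (assumed)
    k          -- slack per sample, in baseline standard deviations (assumed)
    h          -- decision threshold, in baseline standard deviations (assumed)

    Returns a dict for the first detected change point, or None.
    """
    if len(samples) <= baseline_n:
        raise ValueError("need more samples than the baseline window")

    baseline = samples[:baseline_n]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; cannot standardize")

    s_hi = 0.0  # accumulates evidence of an upward shift
    s_lo = 0.0  # accumulates evidence of a downward shift
    for i in range(baseline_n, len(samples)):
        z = (samples[i] - mu) / sigma
        s_hi = max(0.0, s_hi + z - k)
        s_lo = max(0.0, s_lo - z - k)
        if s_hi > h or s_lo > h:
            return {
                "index": i,  # position in the series where the alarm fired
                "direction": "increase" if s_hi > h else "decrease",
                # shift of the post-alarm mean relative to the baseline mean
                "magnitude": mean(samples[i:]) - mu,
                "statistic": max(s_hi, s_lo),
            }
    return None


if __name__ == "__main__":
    # Hypothetical latency series: stable around 100 ms, then a jump to 120 ms.
    latencies = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100] + [120] * 5
    print(cusum_change_point(latencies))
```

A real detector would likely report a timestamp rather than a sample index and would combine this result with other detectors, as the criteria suggest; neither is attempted in this sketch.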
Load and stress test design with SLA thresholds and scalability targets
Load test ramp-up phase | 100% | 100%
Load test sustained phase | 71% | 100%
Load test ramp-down phase | 0% | 100%
Stress test breaking point | 100% | 100%
Stress test degradation curve | 100% | 100%
Latency p50 threshold | 0% | 0%
Latency p90 threshold | 0% | 0%
Latency p95 threshold | 0% | 100%
Latency p99 threshold | 0% | 100%
Scalability load points | 0% | 0%
Linear coefficient target | 0% | 100%
Efficiency retention target | 100% | 100%
test-plan.json produced | 100% | 100%
f547cec