run-e2e

Run E2E tests locally using the new-e2e framework with Pulumi-based infrastructure

52

Quality

58%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/run-e2e/SKILL.md

Quality

Discovery

40%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a specific technology stack (new-e2e framework, Pulumi) which makes it distinctive, but it lacks explicit trigger guidance ('Use when...') and doesn't enumerate the concrete actions or scenarios it covers beyond simply running tests. It would benefit from expanded action descriptions and explicit usage triggers.

Suggestions

Add a 'Use when...' clause specifying trigger scenarios, e.g., 'Use when the user wants to run end-to-end tests locally, debug e2e test failures, or set up Pulumi-based test infrastructure.'

Expand the capability list with more concrete actions, e.g., 'Run, debug, and configure E2E tests locally using the new-e2e framework; manage Pulumi-based test infrastructure; troubleshoot test environment setup.'

Include natural keyword variations like 'end-to-end tests', 'e2e testing', 'local test execution', and 'test infrastructure' to improve trigger term coverage.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (E2E tests) and a specific action (run tests locally), and mentions specific technologies (new-e2e framework, Pulumi-based infrastructure), but doesn't list multiple concrete actions beyond 'run tests'. | 2 / 3 |
| Completeness | Describes what it does (run E2E tests locally using new-e2e framework with Pulumi) but has no explicit 'Use when...' clause or equivalent trigger guidance, which per the rubric should cap completeness at 2, and the 'what' is also fairly thin, placing this at 1. | 1 / 3 |
| Trigger Term Quality | Includes relevant keywords like 'E2E tests', 'Pulumi', and 'new-e2e framework' that a user familiar with this stack might use, but misses common variations like 'end-to-end', 'integration tests', 'test locally', or 'infrastructure testing'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The combination of 'new-e2e framework' and 'Pulumi-based infrastructure' creates a very specific niche that is unlikely to conflict with other testing or infrastructure skills. | 3 / 3 |
| Total | | 8 / 12 (Passed) |

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill that provides clear step-by-step instructions for running E2E tests with concrete examples and good workflow sequencing. Its main weaknesses are minor redundancy in path resolution explanations and the fact that all content is monolithic rather than leveraging any bundle files. The troubleshooting guidance for stuck Pulumi stacks is a valuable addition that demonstrates awareness of real failure modes.

Suggestions

Remove the redundant emphasis on relative paths—state the rule once clearly in step 2 and reference it briefly in step 3 rather than restating it

Fix the unclosed code fence in the Examples section (missing closing ``` before the Usage section)
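
The unclosed-fence issue in the second suggestion can be checked mechanically. The following is a minimal sketch, not part of Tessl or the skill: the function name `has_unclosed_fence` is hypothetical, and it assumes fences are opened and closed only by lines that start with three backticks (tilde fences and longer nested fences are ignored).

```python
def has_unclosed_fence(markdown_text: str) -> bool:
    """Return True if the text ends while a backtick code fence is still open.

    Assumption: every line starting with three backticks toggles the fence state.
    """
    in_fence = False
    for line in markdown_text.splitlines():
        # A fence line either opens a block or closes the one currently open.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
    return in_fence


if __name__ == "__main__":
    # Hypothetical excerpt: an Examples block that never closes before Usage.
    sample = "## Examples\n```bash\nls\n## Usage\n"
    print(has_unclosed_fence(sample))
```

Running it on the full text of a SKILL.md and getting `True` would suggest an odd number of fence lines, which matches the formatting problem described above.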

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient but has some redundancy: the target path resolution rules are repeated (step 2 and step 3 both emphasize the relative path convention), and the 'Available Test Suites' section adds little value beyond 'run ls'. The prerequisites and troubleshooting sections are appropriately brief. | 2 / 3 |
| Actionability | The skill provides fully concrete, executable commands with specific flags, clear examples covering multiple scenarios, and explicit instructions for resolving test targets from user input. The examples are copy-paste ready and cover common use cases. | 3 / 3 |
| Workflow Clarity | The 7-step workflow is clearly sequenced with explicit validation checkpoints: confirm the command before running (step 5), use background execution with timeout (step 6), summarize results after completion (step 7). The troubleshooting section for stuck Pulumi stacks includes a feedback loop (detect issue in logs → stop early → retry with suffix). | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections (Instructions, Prerequisites, Examples, Usage, Output), but it's all in a single file with no references to external documentation. The flags list and examples could potentially be split out, though for this skill size it's borderline acceptable. The missing closing code fence for the examples section is a minor formatting issue. | 2 / 3 |
| Total | | 10 / 12 (Passed) |

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |

Repository
DataDog/datadog-agent
Reviewed
