
temporal-python-testing

Test Temporal workflows with pytest, time-skipping, and mocking strategies. Covers unit testing, integration testing, replay testing, and local development setup. Use when implementing Temporal workflow tests or debugging test failures.

86

1.12x
Quality

Does it follow best practices?

Impact

87%

1.12x

Average score across 3 eval scenarios

Security by Snyk

Advisory

Suggest reviewing before use


Quality

Content

65%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is actionable with solid executable code examples, but it is weakened by redundancy across sections and by progressive-disclosure references that point to resource files which do not exist in the bundle. Tightening repeated sections and either providing or removing the missing resource files would raise quality.

Suggestions

Remove the duplicate material: 'Key Testing Principles' largely restates 'Testing Philosophy', and 'How to Use Resources' repeats the 'Available Resources' listings — keep each idea in one place.

Create the referenced resource files (resources/unit-testing.md, integration-testing.md, replay-testing.md, local-setup.md) or drop the references, since progressive disclosure only works when the linked files actually exist.

Tighten 'Coverage Targets' and 'When to Use This Skill', which overlap with content already covered in the testing philosophy and resource sections.
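The missing-file problem flagged in the suggestions above can be caught before publishing with a small check. The following is a minimal sketch, not part of the reviewed skill or of any registry tooling: the `missing_resources` helper, the `RESOURCE_REF` pattern, and the assumption that references are written as `resources/<name>.md` relative to the directory holding SKILL.md are all hypothetical and would need adjusting to the real bundle layout.

```python
import re
from pathlib import Path

# Assumed reference style: relative paths such as "resources/unit-testing.md".
# Bare filenames (e.g. "local-setup.md") are not matched by this pattern.
RESOURCE_REF = re.compile(r"\bresources/[\w.-]+\.md\b")


def missing_resources(skill_dir: Path) -> list[str]:
    """Return resource paths referenced in SKILL.md that are absent on disk."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    # De-duplicate and sort so the report is stable across runs.
    referenced = sorted(set(RESOURCE_REF.findall(text)))
    return [ref for ref in referenced if not (skill_dir / ref).is_file()]


if __name__ == "__main__":
    # Run from the skill's root directory; prints one line per missing file.
    for ref in missing_resources(Path(".")):
        print(f"missing: {ref}")
```

An empty result means every matched reference resolves to a file; any path that is printed should either be created or have its reference removed from SKILL.md.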

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Mostly efficient with concrete code, but 'Testing Philosophy' and 'Key Testing Principles' restate the same points (time-skipping, mock activities, replay), and 'How to Use Resources' repeats the resource listings already given in 'Available Resources'. Not score 3 due to this redundancy; not score 1 because it does not explain concepts Claude already knows and the code is lean. | 2 / 3 |
| Actionability | Provides two complete, copy-paste-ready code examples (a workflow test fixture with Worker + execute_workflow, and an ActivityEnvironment activity test) plus concrete coverage targets. Not score 2 because the code is executable rather than pseudocode and includes the key imports and assertions. | 3 / 3 |
| Workflow Clarity | This is an overview/reference skill with no multi-step process to sequence, and there are no validation checkpoints or feedback loops. Per the rubric's simple-skill note, a single clear action can score 3, but the content presents several loosely connected sections without a clear testing sequence, so it stays at 2. Not score 1 because the Quick Start gives a runnable starting point. | 2 / 3 |
| Progressive Disclosure | References to 'resources/unit-testing.md', 'integration-testing.md', 'replay-testing.md', and 'local-setup.md' are clearly signaled with 'When to load' and 'Contains' blocks, a well-structured one-level-deep intent. However, per the judging guideline to score against the actual bundle, none of these resource files exist in references/, scripts/, or assets/, so the progressive disclosure is promised but not delivered. Not score 1 because the in-skill organization and signaling are genuinely good; not score 3 because the referenced files are missing. | 2 / 3 |
| Total | | 9 / 12 |

Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, concise description that states concrete capabilities and an explicit 'Use when' trigger with natural keywords. It clearly occupies a distinct niche and would reliably surface for the right requests.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple concrete actions ('unit testing, integration testing, replay testing, and local development setup' plus 'pytest, time-skipping, and mocking strategies') rather than vague language. Not score 2 because it goes beyond naming a domain to enumerating several specific capabilities. | 3 / 3 |
| Completeness | Explicitly answers both what ('Test Temporal workflows with pytest, time-skipping, and mocking strategies') and when ('Use when implementing Temporal workflow tests or debugging test failures'). Not score 2 because the 'when' is explicit, not merely implied. | 3 / 3 |
| Trigger Term Quality | Includes natural terms a user would say ('pytest', 'Temporal workflows', 'mocking', 'debugging test failures') alongside a clear 'Use when' trigger. Good coverage of common phrasings; not score 2 because it captures multiple natural variations rather than a single technical term. | 3 / 3 |
| Distinctiveness Conflict Risk | Targets a clear niche (Temporal Python workflow testing) with triggers unlikely to fire for unrelated skills. Not score 2 because the scope (Temporal + pytest + time-skipping) is specific enough to avoid overlap with generic testing skills. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository
Dicklesworthstone/pi_agent_rust
Reviewed
