pytest-conversational

Test chat bots, voice assistants, and IVR menus with pytest using a small Conversation object and a callable bot adapter. Use when the user wants to write rule-based assertions over multi-turn dialogue without bringing in an LLM dependency, when they have a chatbot reachable as a Python callable or HTTP webhook, when they need to keep per-conversation state across turns and assert on slot filling, when they want pytest-native fixtures and a printable transcript on failure, or when they mention voice-assistant testing, IVR menu testing, conversational AI testing, LLM bot testing (used as the target under test, not as the matcher), expect matchers for bot replies, or multi-turn dialogue tests.
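
The description above mentions a Conversation object, a callable bot adapter, per-conversation state, and a printable transcript on failure. The following is a minimal sketch of that general pattern only. The `Conversation` class, its `say` and `format_transcript` methods, the `BotAdapter` signature, and the toy `pizza_bot` are all hypothetical stand-ins written for illustration; they are not taken from the pytest-conversational package, whose real API may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical adapter type (an assumption, not the package's contract):
# takes the user's message and a mutable per-conversation state dict,
# and returns the bot's reply text.
BotAdapter = Callable[[str, Dict[str, str]], str]


@dataclass
class Conversation:
    """Illustrative stand-in, NOT the pytest-conversational class."""

    bot: BotAdapter
    state: Dict[str, str] = field(default_factory=dict)
    transcript: List[Tuple[str, str]] = field(default_factory=list)

    def say(self, text: str) -> str:
        # Send one user turn to the bot and record both sides of the exchange.
        reply = self.bot(text, self.state)
        self.transcript.append(("user", text))
        self.transcript.append(("bot", reply))
        return reply

    def format_transcript(self) -> str:
        # One "speaker: line" entry per turn, suitable for an assert message.
        return "\n".join(f"{who}: {line}" for who, line in self.transcript)


def pizza_bot(message: str, state: Dict[str, str]) -> str:
    """Toy rule-based bot that fills a single 'size' slot."""
    for size in ("small", "medium", "large"):
        if size in message.lower():
            state["size"] = size
            return f"Got it, one {size} pizza."
    return "What size would you like?"


def test_slot_filling_across_turns():
    convo = Conversation(bot=pizza_bot)

    # Turn 1: no size given, so the bot should ask for it.
    first = convo.say("I'd like a pizza")
    assert "size" in first.lower(), convo.format_transcript()

    # Turn 2: the size is supplied; the reply and the stored slot should agree.
    second = convo.say("Large, please")
    assert "large" in second, convo.format_transcript()
    assert convo.state["size"] == "large", convo.format_transcript()
```

Passing the transcript as the assertion message means a failing turn is reported together with the dialogue that led up to it, which is the kind of rule-based, LLM-free check the description refers to.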

97

1.56x
Quality

Does it follow best practices?

Impact

97% (1.56x), average score across 3 eval scenarios

Security by Snyk

Passed. No known issues.

Quality

Content

92%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A tight, actionable skill body with executable examples and a clear quick-start sequence; its only material weakness is progressive disclosure — the body leans hard on a REFERENCE.md that is referenced in four places but is not bundled here, leaving those references dangling.

Suggestions

Bundle REFERENCE.md alongside SKILL.md so the four references to it (full Public API, matcher signatures, adapter contract/error reference, security note + override patterns) resolve to real content rather than dangling.

If REFERENCE.md is intentionally omitted from this distribution, inline the minimum essentials (e.g., the metadata-driven matcher signatures and the full http_webhook request/response contract) so the skill is self-contained, or state explicitly that REFERENCE.md ships with the installed package.

Make the References section link the bundle file as a path (e.g., [REFERENCE.md](REFERENCE.md)) consistent with the inline mentions, so the one-level-deep structure is unambiguous when the file is present.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The body is lean and assumes Claude's competence: it never explains what pytest or a chatbot is, and every section pairs a sentence with executable code, so it earns its tokens rather than padding concepts. It does not fall to the level below because it contains no generic concept exposition of the kind the rubric's PDF 'Portable Document Format' anchor describes. | 3 / 3 |
| Actionability | Fully executable, copy-paste-ready examples throughout (install command, bot adapter, fixture-driven test, slot-filling flow, HTTP webhook adapter with security options, and matcher calls), each concrete and complete rather than pseudocode. | 3 / 3 |
| Workflow Clarity | The Quick start is an unambiguous numbered sequence (install → write adapter → use fixture → run pytest) for a single-purpose skill, and no destructive/batch operation is involved that would require validation checkpoints and cap the score at 2. | 3 / 3 |
| Progressive Disclosure | The overview is well-organized and clearly signals one-level-deep references to 'REFERENCE.md' for the full API, matchers, adapter contract, and security note, but REFERENCE.md is not present in the bundle (no references/ dir, no REFERENCE.md file), so the signaled navigation is broken and does not earn the 'easy navigation' anchor at 3. | 2 / 3 |
| Total | | 11 / 12 |

Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A high-quality description: third-person voice, concrete actions, an explicit 'Use when' trigger clause, and broad natural trigger-term coverage that clearly distinguishes the skill. No first/second-person voice to penalize and no missing-trigger cap applies.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple concrete actions ('write rule-based assertions over multi-turn dialogue', 'keep per-conversation state across turns and assert on slot filling', 'pytest-native fixtures and a printable transcript on failure', 'expect matchers for bot replies') rather than vague language. | 3 / 3 |
| Completeness | Explicitly answers both 'what' (test chat bots/voice assistants/IVR menus via a Conversation object and callable bot adapter) and 'when' via an explicit 'Use when the user wants...' clause with several triggers, so it is not capped at 2. | 3 / 3 |
| Trigger Term Quality | Covers natural user phrasing well: 'chat bots', 'voice assistants', 'IVR menus', 'multi-turn dialogue', 'slot filling', 'voice-assistant testing', 'IVR menu testing', 'conversational AI testing', 'multi-turn dialogue tests', all terms a user would actually say. | 3 / 3 |
| Distinctiveness / Conflict Risk | A clear niche (pytest-based, rule-based assertions, no LLM dependency on the test side, callable/HTTP webhook bots) with distinct triggers unlikely to collide with other skills. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: golikovichev/pytest-conversational (Reviewed)
