
golikovichev/pytest-conversational

Test chat bots, voice assistants, and IVR menus with pytest using a small Conversation object and a callable bot adapter. Use when the user wants to write rule-based assertions over multi-turn dialogue without bringing in an LLM dependency, when they have a chatbot reachable as a Python callable or HTTP webhook, when they need to keep per-conversation state across turns and assert on slot filling, when they want pytest-native fixtures and a printable transcript on failure, or when they mention voice-assistant testing, IVR menu testing, conversational AI testing, LLM bot testing (used as the target under test, not as the matcher), expect matchers for bot replies, or multi-turn dialogue tests.

99

1.56x
Quality: 100% (Does it follow best practices?)

Impact: 97%, 1.56x (average score across 3 eval scenarios)

Security by Snyk: Passed (no known issues)


evals/scenario-2/criteria.json

{
  "context": "Tests whether the agent uses the correct pytest-conversational expect matchers (contains, not_contains, regex, one_of) for a customer support bot test suite, writes a valid BotAdapter callable, and reads the bot reply via convo.last.bot.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "BotAdapter signature",
      "description": "The bot stub is a Python callable with signature (text: str, convo: Conversation) -> str, accepting exactly those two parameters",
      "max_score": 10
    },
    {
      "name": "conversation_factory fixture",
      "description": "Tests use the conversation_factory fixture and pass bot= to create a Conversation, rather than constructing Conversation directly",
      "max_score": 8
    },
    {
      "name": "convo.last.bot usage",
      "description": "At least one assertion reads the bot reply via convo.last.bot (not turn.bot from a stored reference or other attribute)",
      "max_score": 8
    },
    {
      "name": "expect.contains used",
      "description": "expect.contains() is called at least once to assert that a bot reply contains an expected substring",
      "max_score": 12
    },
    {
      "name": "contains case-insensitive default",
      "description": "expect.contains() is called without case_sensitive=True in at least one test (relying on the default case-insensitive behaviour)",
      "max_score": 8
    },
    {
      "name": "expect.not_contains used",
      "description": "expect.not_contains() is called at least once as a leak guard on an error or unexpected reply",
      "max_score": 12
    },
    {
      "name": "not_contains guards internal details",
      "description": "The not_contains call checks that the reply does NOT contain at least one of: 'traceback', 'exception', 'stack', 'error:' (case-insensitive match acceptable), guarding against leaking internal details",
      "max_score": 8
    },
    {
      "name": "expect.regex used",
      "description": "expect.regex() is called at least once to assert a formatted pattern (e.g. phone number, date, or order number) in a bot reply",
      "max_score": 12
    },
    {
      "name": "regex match object used",
      "description": "The return value of expect.regex() is captured and at least one captured group or match attribute is inspected (e.g. m.group(0) or m.group(1))",
      "max_score": 8
    },
    {
      "name": "expect.one_of used",
      "description": "expect.one_of() is called at least once with a list of acceptable reply variants to handle a bot that returns one of several valid phrases",
      "max_score": 12
    },
    {
      "name": "results.txt present",
      "description": "A results.txt file exists containing pytest -v output showing test results",
      "max_score": 2
    }
  ]
}
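The checklist above can be made concrete with a small sketch. It deliberately does not import pytest-conversational, because only part of its API is visible in this file. The `Conversation` class below is a hypothetical placeholder used only for the type hint, `support_bot` and its replies are invented for illustration, and the assertions are plain-Python analogues of what the `contains`, `not_contains`, `regex`, and `one_of` checks are described as verifying, not the library's own matchers.

```python
import re


class Conversation:
    """Hypothetical placeholder; the real class comes from the library."""


def support_bot(text: str, convo: Conversation) -> str:
    """Bot stub with the two-parameter (text, convo) -> str shape."""
    lowered = text.lower()
    if "order" in lowered:
        return "Your order number is ORD-12345."
    if "hello" in lowered:
        return "Hi there! How can I help?"
    return "Sorry, I didn't understand that."


reply = support_bot("Where is my ORDER?", Conversation())

# Analogue of a case-insensitive "contains" check.
assert "order number" in reply.lower()

# Analogue of a "not_contains" leak guard on an unexpected input.
fallback = support_bot("???", Conversation())
for leaked in ("traceback", "exception", "stack", "error:"):
    assert leaked not in fallback.lower()

# Analogue of a "regex" check that also inspects a captured group.
m = re.search(r"ORD-(\d{5})", reply)
assert m is not None and m.group(1) == "12345"

# Analogue of a "one_of" check against several acceptable variants.
greeting = support_bot("hello", Conversation())
assert greeting in ["Hi there! How can I help?", "Hello! How can I help?"]
```

In a real test suite, per the checklist, the stub would be passed as `bot=` to the `conversation_factory` fixture and the reply read via `convo.last.bot`, with the library's `expect` matchers replacing the bare assertions.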

CHANGELOG.md

CONTRIBUTING.md

README.md

REFERENCE.md

SECURITY.md

SKILL.md

tessl.json

tile.json