Name: golikovichev/pytest-conversational
Rating: 99.4 (1 review)
Author: golikovichev

Test chat bots, voice assistants, and IVR menus with pytest using a small Conversation object and a callable bot adapter. Use when the user wants to write rule-based assertions over multi-turn dialogue without bringing in an LLM dependency, when they have a chatbot reachable as a Python callable or HTTP webhook, when they need to keep per-conversation state across turns and assert on slot filling, when they want pytest-native fixtures and a printable transcript on failure, or when they mention voice-assistant testing, IVR menu testing, conversational AI testing, LLM bot testing (used as the target under test, not as the matcher), expect matchers for bot replies, or multi-turn dialogue tests.

Quality

100%

Does it follow best practices?

Impact

97%

1.56x

Average score across 3 eval scenarios

Security

Passed

No known issues

{
  "context": "Tests whether the agent uses pytest-conversational correctly when testing a context-aware bot with error handling. The focus is on: seeding conversation history with add_user() before calling say(), verifying that adapter exceptions propagate unchanged through say(), inspecting the partial turn after failure, using convo.history as a read-only tuple view, using convo.transcript() for diagnostics, and using expect.contains for bot reply assertions.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "add_user seeding",
      "description": "Uses convo.add_user() to pre-populate conversation history before calling convo.say() — not by driving extra say() turns to set up history",
      "max_score": 18
    },
    {
      "name": "Exception propagation",
      "description": "Uses pytest.raises() to catch the custom exception raised by the adapter when it propagates through convo.say()",
      "max_score": 15
    },
    {
      "name": "Partial turn in turns",
      "description": "After the adapter raises an exception, asserts that convo.turns still contains the failed turn (e.g. checks len(convo.turns) > 0 or accesses the last turn)",
      "max_score": 15
    },
    {
      "name": "Partial turn bot empty",
      "description": "Asserts that the failed turn's bot attribute equals the empty string (turn.bot == '')",
      "max_score": 12
    },
    {
      "name": "convo.history used",
      "description": "Test or adapter logic references convo.history (the list of (user, bot) tuples) — not just convo.turns",
      "max_score": 10
    },
    {
      "name": "history is tuples",
      "description": "Code that reads convo.history treats entries as (user, bot) tuples (unpacking or indexing with [0]/[1])",
      "max_score": 8
    },
    {
      "name": "convo.transcript() called",
      "description": "At least one test calls convo.transcript() and uses or prints the result (e.g. for a failure message or diagnostic assert)",
      "max_score": 10
    },
    {
      "name": "expect.contains used",
      "description": "At least one assertion uses expect.contains() rather than a plain == or 'in' check on the bot reply",
      "max_score": 7
    },
    {
      "name": "conversation_factory fixture",
      "description": "Tests use the conversation_factory fixture to create Conversation objects, passing bot=<adapter>",
      "max_score": 5
    }
  ]
}
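
To make the checklist concrete, here is a minimal sketch of the behaviour it describes: seeding with add_user(), exception propagation through say(), the partial turn left behind with an empty bot reply, history as (user, bot) tuples, and transcript(). The `Conversation` and `Turn` classes below are hypothetical stand-ins written only so the example runs on its own; they are not the library's implementation, and the adapter signature `(text, convo)`, the `BackendDown` exception, and the transcript format are assumptions. The library's `conversation_factory` fixture and `expect.contains` matcher are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Turn:
    user: str
    bot: str = ""  # stays empty if the adapter raises before replying


class Conversation:
    """Hypothetical stand-in, NOT the real pytest-conversational class."""

    def __init__(self, bot: Callable[[str, "Conversation"], str]) -> None:
        self._bot = bot
        self.turns: List[Turn] = []

    @property
    def history(self) -> Tuple[Tuple[str, str], ...]:
        # Read-only view: a tuple of (user, bot) pairs.
        return tuple((t.user, t.bot) for t in self.turns)

    def add_user(self, text: str) -> None:
        # Seed history without calling the adapter.
        self.turns.append(Turn(user=text))

    def say(self, text: str) -> str:
        turn = Turn(user=text)
        self.turns.append(turn)           # recorded before the adapter runs
        turn.bot = self._bot(text, self)  # adapter exceptions propagate unchanged
        return turn.bot

    def transcript(self) -> str:
        # Assumed format, for diagnostics only.
        return "\n".join(f"user: {u}\nbot: {b}" for u, b in self.history)


class BackendDown(Exception):
    """Hypothetical custom exception raised by the adapter."""


def context_bot(text: str, convo: Conversation) -> str:
    if text == "crash":
        raise BackendDown("backend unavailable")
    # Entries in history are (user, bot) tuples; skip the turn in progress.
    earlier = [user for user, _bot in convo.history[:-1]]
    return f"You said {len(earlier)} thing(s) before: {text}"


def run_demo():
    convo = Conversation(bot=context_bot)
    convo.add_user("my name is Ada")  # seed, rather than driving an extra say()
    reply = convo.say("hello")
    try:
        convo.say("crash")
    except BackendDown:
        failed = convo.turns[-1]      # the partial turn is still recorded
    return reply, failed.user, failed.bot, len(convo.history)
```

In an actual pytest test, the try/except would normally be written as `with pytest.raises(BackendDown):`, and `convo.transcript()` could be passed as the assertion message so the dialogue is printed on failure.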

evals/scenario-3/criteria.json