Test chat bots, voice assistants, and IVR menus with pytest using a small Conversation object and a callable bot adapter. Use when the user wants to write rule-based assertions over multi-turn dialogue without bringing in an LLM dependency, when they have a chatbot reachable as a Python callable or HTTP webhook, when they need to keep per-conversation state across turns and assert on slot filling, when they want pytest-native fixtures and a printable transcript on failure, or when they mention voice-assistant testing, IVR menu testing, conversational AI testing, LLM bot testing (used as the target under test, not as the matcher), expect matchers for bot replies, or multi-turn dialogue tests.
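A minimal, self-contained sketch of the pattern described above may help make it concrete. The `Conversation`, `Turn`, and matcher functions here are hypothetical stand-ins written only for illustration; they borrow the names used in this listing (`convo.say()`, `convo.state`, `convo.last.bot`, `turn.metadata`, `has_intent`, `has_slot`, `has_state`, `responds_within`) but are not the library's actual implementation. In particular, it is an assumption that the adapter reaches the current turn through `convo.last`. With the real package, the conversation would presumably come from the `conversation_factory` fixture with `bot=` rather than from a hand-built class.

```python
import re
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

_ANY = object()  # sentinel meaning "do not check the value"


@dataclass
class Turn:
    """One user utterance, the bot reply, and per-turn metadata."""
    user: str
    bot: str = ""
    metadata: Dict[str, Any] = field(default_factory=dict)


class Conversation:
    """Hypothetical stand-in: holds the turns and the cross-turn state."""

    def __init__(self, bot: Callable[[str, "Conversation"], str]):
        self.bot = bot
        self.turns: List[Turn] = []
        self.state: Dict[str, Any] = {}

    @property
    def last(self) -> Turn:
        return self.turns[-1]

    def say(self, text: str) -> Turn:
        turn = Turn(user=text)
        self.turns.append(turn)  # appended first so the adapter can use convo.last
        turn.bot = self.bot(text, self)
        return turn

    def transcript(self) -> str:
        return "\n".join(f"user: {t.user}\nbot:  {t.bot}" for t in self.turns)


SIZES = ("small", "medium", "large")
TOPPINGS = ("mushroom", "pepperoni", "olive")


def pizza_bot(text: str, convo: Conversation) -> str:
    """Rule-based adapter: takes (user text, Conversation), returns the reply."""
    start = time.perf_counter()
    turn = convo.last
    words = re.findall(r"[a-z]+", text.lower())
    slots = {}
    for word in words:
        if word in SIZES:
            slots["size"] = word
        if word in TOPPINGS:
            slots["topping"] = word
    turn.metadata["intent"] = "order_pizza" if slots or "pizza" in words else "unknown"
    turn.metadata["slots"] = slots
    convo.state.update(slots)  # accumulated slots survive across turns
    missing = [name for name in ("size", "topping") if name not in convo.state]
    if missing:
        reply = f"What {missing[0]} would you like?"
    else:
        reply = f"One {convo.state['size']} {convo.state['topping']} pizza, coming up."
    turn.metadata["latency_ms"] = (time.perf_counter() - start) * 1000.0
    return reply


def has_intent(turn: Turn, name: str) -> bool:
    return turn.metadata.get("intent") == name


def has_slot(turn: Turn, name: str, value: Any = _ANY) -> bool:
    slots = turn.metadata.get("slots", {})
    return name in slots and (value is _ANY or slots[name] == value)


def has_state(convo: Conversation, name: str, value: Any = _ANY) -> bool:
    return name in convo.state and (value is _ANY or convo.state[name] == value)


def responds_within(turn: Turn, seconds: float) -> bool:
    # Budget is given in seconds; the adapter records milliseconds.
    return turn.metadata["latency_ms"] <= seconds * 1000.0


def test_pizza_order_fills_slots_across_turns():
    convo = Conversation(bot=pizza_bot)
    first = convo.say("I'd like a large pizza")
    assert has_intent(first, "order_pizza"), convo.transcript()
    assert has_slot(first, "size", value="large"), convo.transcript()
    second = convo.say("pepperoni please")
    assert has_slot(second, "topping"), convo.transcript()
    assert has_state(convo, "size", value="large"), convo.transcript()
    assert responds_within(second, 1.0), convo.transcript()
    assert "large pepperoni" in convo.last.bot, convo.transcript()
```

Passing `convo.transcript()` as the assertion message is one simple way to get a printable transcript when a test fails; the matchers return plain booleans so they work with bare `assert` under pytest.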
99
100%
Does it follow best practices?
Impact
97%
1.56x
Average score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent correctly uses pytest-conversational to test a multi-turn slot-filling pizza bot. The agent must write a bot adapter as a Python callable with the correct signature, use conversation_factory to wire it in, drive turns with convo.say(), record per-turn metadata and cross-turn state, and apply the library's metadata and latency matchers.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Bot adapter signature",
"description": "Bot adapter is a Python callable with exactly two parameters: a string (user text) and a Conversation object, returning a string reply",
"max_score": 10
},
{
"name": "conversation_factory usage",
"description": "Uses the conversation_factory pytest fixture to create a Conversation, passing the bot adapter as the bot= keyword argument",
"max_score": 10
},
{
"name": "convo.say() for turns",
"description": "Drives conversation turns by calling convo.say(text) rather than constructing Turn objects or calling the adapter directly",
"max_score": 10
},
{
"name": "Turn.metadata intent recording",
"description": "Bot adapter writes the intent label into turn.metadata[\"intent\"] on each turn",
"max_score": 8
},
{
"name": "Turn.metadata slot recording",
"description": "Bot adapter writes extracted slot data into turn.metadata[\"slots\"] (or turn.metadata with a slot key) on each turn",
"max_score": 8
},
{
"name": "Turn.metadata latency recording",
"description": "Bot adapter records elapsed response time in turn.metadata[\"latency_ms\"] in milliseconds",
"max_score": 8
},
{
"name": "convo.state for cross-turn slots",
"description": "Bot adapter reads and writes accumulated slot data to convo.state (not to a local variable or global) so state persists across turns",
"max_score": 8
},
{
"name": "has_intent matcher",
"description": "Test(s) assert on the intent label of a turn using has_intent(turn, intent_name) from pytest-conversational",
"max_score": 8
},
{
"name": "has_slot matcher",
"description": "Test(s) assert on a per-turn extracted slot using has_slot(turn, slot_name) or has_slot(turn, slot_name, value=...) from pytest-conversational",
"max_score": 8
},
{
"name": "has_state matcher",
"description": "Test(s) assert on the conversation-level accumulated state using has_state(convo, state_name) or has_state(convo, state_name, value=...) from pytest-conversational",
"max_score": 8
},
{
"name": "responds_within matcher",
"description": "At least one test asserts latency using expect.responds_within(turn, seconds); the budget is expressed in seconds while latency is recorded in milliseconds",
"max_score": 8
},
{
"name": "convo.last.bot usage",
"description": "At least one test reads the most recent bot reply via convo.last.bot",
"max_score": 6
}
]
}
.tessl-plugin
evals
src
pytest_conversational
tests