Designing conversational flows for website chatbots and AI agents. Intent recognition architecture, branching logic, fallback handling, escalation to human, conversation analytics. Honest about scripted-bot (rigid trees, fail edge cases), hallucinating-bot (LLM without structure, makes things up), and structured-guided-conversation (LLM-powered with intent architecture and fallback discipline) patterns. Distinguishes chatbot DESIGN (this skill) from chatbot IMPLEMENTATION (engineering and platform work). Triggers on chatbot, conversational AI, AI agent, chat widget, intent design, conversational flow, bot escalation, LLM grounding. Also triggers when a chatbot is hallucinating, when a scripted bot is failing edge cases, or when a chatbot is being scoped for the first time.
A senior growth practitioner's playbook for designing conversational flows for website chatbots and AI agents. Intent recognition architecture, branching logic, fallback handling, escalation to human, conversation analytics. The discipline of building a bot that knows what it knows and routes appropriately when it does not.
Most chatbots on the web fail in one of two ways. Scripted bots break the moment a user phrases something the script did not anticipate; the user gets pushed through a decision tree that does not fit their situation. LLM-powered bots without structure hallucinate; they confidently answer questions about pricing, policy, or capabilities and frequently make up answers, creating support burden and trust damage.
The chatbots that work do something different. They have an intent architecture that defines what the bot can and cannot handle. They ground their responses in a knowledge base so they do not invent facts. They have explicit fallback paths for unclear or out-of-scope intents. They escalate to humans cleanly when the bot's job is done. The audience trusts the bot because the bot is honest about its scope.
The voice is the senior growth practitioner who has watched chatbots become trusted brand surfaces and watched them become liability risks. Practical, opinionated about the architecture that distinguishes the two outcomes, willing to call out when a chatbot is the wrong investment or when an existing chatbot needs to be redesigned rather than tuned.
When to use this skill: scoping a chatbot for the first time, auditing a chatbot that hallucinates or fails edge cases, designing the intent architecture and fallback patterns, or deciding when to escalate to humans.
This skill spans chatbot design as conversational flow architecture, not chatbot implementation. The growth-tooling distinctions:
ai-content-collaboration covers AI in content workflows. This skill covers AI in customer-facing conversations.
integration-orchestrator covers cross-team coordination for chatbot deployment. This skill is the conversational design itself.
pm-spec-writing covers the spec for engineers building the bot. This skill is about WHAT the conversation should be; pm-spec-writing is about communicating it.
discovery-research-synthesis covers customer research that informs intent architecture. Input to this skill, not part of it.
chatbot-flow-design (this skill) is intent architecture, knowledge-base grounding, fallback patterns, and escalation discipline.
The audience: growth marketers and product marketers shipping chatbot growth tooling, in-house teams designing conversational flows for marketing or support contexts, agencies running chatbot work for clients.
Out of scope: AI in content workflows (covered by ai-content-collaboration); the engineering implementation of chatbots (handed off via pm-spec-writing); platform-specific bot configurations (those stay implementation-side); voice agents and IVR flows (different methodology though related principles apply).
Before designing the chatbot, decide whether a chatbot is the right tool.
Chatbots earn deployment when:
Chatbots do NOT earn deployment when:
The decision is not "should we have a chatbot"; it is "is the chatbot the right tool for this specific audience and conversation."
Detail in references/chatbot-decision-criteria.md.
The keystone framing.
Scripted-bot. Rigid decision tree. "Press 1 for X, 2 for Y." Fails the moment a user phrases something the script did not anticipate. The chatbot equivalent of an automated phone tree. Cost: the user's actual question goes unanswered; the bot pushes the user through paths that do not fit; the audience leaves with a worse experience than no bot.
Hallucinating-bot. LLM-powered with no structure. It will confidently answer questions about pricing, policy, and capabilities, and it frequently makes up the answers. Liability risk; trust-eroding; support burden when wrong answers reach customers. Cost: the bot's confident wrong answers damage the brand more than no bot would; the team learns about the hallucinations through customer complaints.
Structured-guided-conversation. LLM-powered with intent architecture, knowledge-base grounding, defined fallback paths, and explicit escalation to humans. The bot knows what it knows, knows what it does not, and routes appropriately. Cost: the upfront design effort is significant and the maintenance is real. Payoff: the audience trusts the bot because the bot is honest about its scope.
The litmus test. Ask the bot a question outside its intended scope. Does it confidently make up an answer (hallucinating), refuse rigidly (scripted), or honestly route the user to a human or alternative resource (structured-guided)? The third response is the goal.
Defining what the bot can and cannot handle.
The principle. The bot has a defined set of intents it can handle. Each intent maps to a conversation pattern (questions to ask, knowledge to ground in, response to provide). Anything outside the intent set falls to fallback.
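The principle above can be sketched as a small intent registry. This is a minimal illustration, not a prescribed implementation: the intent names, keywords, questions, and knowledge-source labels are all hypothetical, and the keyword match stands in for whatever intent classifier the real bot uses.

```python
import re
from dataclasses import dataclass

FALLBACK = "fallback"  # label for anything outside the intent set


@dataclass(frozen=True)
class Intent:
    name: str
    keywords: tuple          # stand-in for a real intent classifier
    questions: tuple         # questions the bot asks for this intent
    knowledge_source: str    # where the response is grounded


# Hypothetical intent set; a real one comes out of the design work.
INTENTS = (
    Intent("pricing_question", ("price", "cost", "plan"),
           ("Which plan are you asking about?",), "pricing-page"),
    Intent("order_status", ("order", "shipping", "delivery"),
           ("What is your order number?",), "order-system"),
)


def route(message: str) -> str:
    """Return the matching intent name, or FALLBACK when nothing matches."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for intent in INTENTS:
        if words & set(intent.keywords):
            return intent.name
    return FALLBACK
```

The design point is that fallback is an explicit return value of routing rather than an error case, so every message ends up on a defined path.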
Intent design patterns.
Intent coverage. The bot's intents should cover 70-90 percent of expected conversations. The remaining 10-30 percent falls to fallback. Trying to cover 100 percent often produces bloated intent sets that the bot cannot handle reliably.
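One way to check coverage against that range is to measure the share of logged conversations that were routed to a named intent rather than to fallback. The sketch below assumes a list of routed intent labels is available from conversation logs; the function names and the verdict labels are illustrative.

```python
def intent_coverage(routed_intents, fallback_label="fallback"):
    """Share of conversations routed to a named intent (0.0 to 1.0)."""
    if not routed_intents:
        raise ValueError("need at least one routed conversation")
    handled = sum(1 for name in routed_intents if name != fallback_label)
    return handled / len(routed_intents)


def coverage_verdict(coverage, low=0.70, high=0.90):
    """Compare coverage with the 70-90 percent range from the text."""
    if coverage < low:
        return "under-covered"
    if coverage > high:
        return "possibly bloated"
    return "in range"


# Hypothetical sample: 8 of 10 conversations matched a named intent.
sample = ["pricing_question"] * 5 + ["order_status"] * 3 + ["fallback"] * 2
verdict = coverage_verdict(intent_coverage(sample))
```

A reading above the range is not automatically a problem; it is a prompt to check whether the intent set has grown past what the bot handles reliably.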
Intent maintenance. Intents drift as products evolve, audiences shift, and conversations change. Periodic review surfaces which intents are useful and which need refining.
Detail in references/intent-architecture-patterns.md.
The bot's responses must come from real knowledge, not made-up confidence.
The principle. The bot's response generation should reference a structured knowledge base (documentation, product specs, pricing pages, support articles). The bot does not invent answers; it retrieves and presents.
Grounding patterns.
The hallucinating-bot failure. No grounding. The LLM generates confident-sounding answers from nothing. The team discovers wrong answers through customer complaints.
The structured-guided win. Grounded answers. The bot's responses match the source-of-truth. Customer-facing accuracy is maintained.
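The difference between the two outcomes can be sketched as a retrieve-and-present step: the bot answers only when the knowledge base contains a matching entry, and otherwise says so. This is a toy illustration under stated assumptions. The knowledge-base entries are invented examples, the keyword-overlap retrieval stands in for a real retrieval system, and the `min_overlap` threshold is arbitrary.

```python
import re

# Hypothetical source-of-truth entries; real ones come from docs and pricing pages.
KNOWLEDGE_BASE = [
    {"source": "pricing-page", "text": "The Starter plan costs $10 per month."},
    {"source": "refund-policy", "text": "Refunds are available within 30 days of purchase."},
]

STOPWORDS = {"the", "a", "is", "of", "do", "does", "you", "how", "what", "are", "for", "to", "i"}


def _tokens(text):
    """Lowercased word set with common stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(question, min_overlap=2):
    """Return the best-matching entry, or None if nothing matches well enough."""
    q = _tokens(question)
    best, best_score = None, 0
    for entry in KNOWLEDGE_BASE:
        score = len(q & _tokens(entry["text"]))
        if score > best_score:
            best, best_score = entry, score
    return best if best_score >= min_overlap else None


def grounded_answer(question):
    """Present retrieved text with its source; never generate an unsupported answer."""
    entry = retrieve(question)
    if entry is None:
        return "I do not know. Let me connect you with a human."
    return f'{entry["text"]} (source: {entry["source"]})'
```

Returning the source alongside the text keeps every customer-facing answer traceable to the entry it came from.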
Detail in references/knowledge-base-grounding-patterns.md.
How the bot adapts the conversation based on user input.
The principle. The bot's conversation can branch based on user inputs (intent recognized, prior answers, user attributes). Branching makes the conversation feel adaptive.
Branching patterns.
Branching discipline. Each branch should add value. Decorative branching (asking for confirmation when none is needed) adds friction.
Branching limits. Bots that branch too deeply lose users. 3-5 turns is often the practical limit before the user wants resolution.
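Branching on recognized intent and prior answers, with a cap on turns, might look like the sketch below. The pricing branch, the `team_size` attribute, the threshold of 50, and the step names are all hypothetical; the cap of 5 turns is taken from the upper end of the range above.

```python
MAX_TURNS = 5  # upper end of the 3-5 turn range mentioned above


def next_step(intent, answers, turn):
    """Pick the next conversation step from intent, prior answers, and turn count."""
    if turn >= MAX_TURNS:
        # Stop branching; the user wants resolution by now.
        return "resolve_or_escalate"
    if intent == "pricing_question":
        if "team_size" not in answers:
            # This branch adds value: the recommendation depends on the answer.
            return "ask_team_size"
        if answers["team_size"] > 50:
            return "recommend_enterprise_plan"
        return "recommend_standard_plan"
    return "fallback"
```

Each branch here changes what the bot does next. A confirmation step that led to the same outcome either way would be the decorative branching the text warns against.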
Detail in references/branching-and-conditional-logic.md.
What happens when intent is unclear or out-of-scope.
The principle. Every conversation has fallback paths. The bot has rehearsed responses for "I do not know," "I am not sure I can help with that," "Let me connect you with a human."
Fallback patterns.
Fallback discipline. Multiple fallback layers. First, try clarification. If unclear after one round, suggest alternatives or escalate. Do not loop the user through 5 clarification attempts.
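The layering described above can be sketched as a small decision function: one clarification round, then alternatives if any exist, then escalation. The function name, the action labels, and the exact wording of the rehearsed messages are illustrative assumptions.

```python
MAX_CLARIFICATION_ROUNDS = 1  # "if unclear after one round", per the text

# Rehearsed responses; wording is illustrative.
REHEARSED = {
    "clarify": "I am not sure I understood. Could you rephrase that?",
    "suggest_alternatives": "I am not sure I can help with that. Here is what I can help with.",
    "escalate": "I do not know. Let me connect you with a human.",
}


def fallback_step(clarification_rounds, has_alternatives):
    """Return (action, message) for the current fallback layer."""
    if clarification_rounds < MAX_CLARIFICATION_ROUNDS:
        action = "clarify"
    elif has_alternatives:
        action = "suggest_alternatives"
    else:
        action = "escalate"
    return action, REHEARSED[action]
```

Because the clarification count is an input, the function cannot loop the user: once the single round is used, the only remaining outcomes are alternatives or a human.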
The fallback-as-honesty principle. A bot that admits it does not know earns more trust than a bot that fakes confidence. Audiences forgive limitations they were told about; audiences punish wrong answers they were given confidently.
Detail in references/fallback-pattern-design.md.
When to escalate, how, and with what context handoff.
The principle. Some conversations need a human. The bot escalates when its scope is exceeded, when the user requests it, or when the conversation pattern indicates the user is frustrated.
Escalation triggers.
Escalation context handoff. When escalating, the bot passes the conversation history and recognized intent to the human. The human does not start from scratch; they pick up where the bot left off.
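The triggers and the handoff can be sketched together. This assumes a simple in-memory conversation record; the field names, the phrase list for detecting a human request, and the use of consecutive fallbacks as a proxy for frustration are all assumptions, not part of the skill itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional

HUMAN_REQUEST_PHRASES = ("human", "agent", "real person")  # illustrative list
FRUSTRATION_FALLBACK_LIMIT = 2  # assumed proxy for a frustrated user


@dataclass
class Conversation:
    history: List[str] = field(default_factory=list)
    recognized_intent: Optional[str] = None
    out_of_scope: bool = False
    consecutive_fallbacks: int = 0


def escalation_reason(conv, last_message):
    """Return why the bot should escalate, or None to keep going."""
    text = last_message.lower()
    if any(phrase in text for phrase in HUMAN_REQUEST_PHRASES):
        return "user_requested"
    if conv.out_of_scope:
        return "scope_exceeded"
    if conv.consecutive_fallbacks >= FRUSTRATION_FALLBACK_LIMIT:
        return "user_frustrated"
    return None


def build_handoff(conv, reason):
    """Package what the human needs so the user does not repeat themselves."""
    return {
        "reason": reason,
        "recognized_intent": conv.recognized_intent,
        "history": list(conv.history),  # copy, so later bot turns do not alter it
    }
```

The handoff payload carries exactly the two items the text names, conversation history and recognized intent, plus the trigger that caused the escalation.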
The escalation-quality test. Does the human pick up the context smoothly, or do they have to ask the user to repeat everything? The latter signals broken handoff.
Detail in references/escalation-to-human-patterns.md.
Measuring what the bot is and is not doing well.
The principle. Track the bot's performance per intent, per fallback, per escalation. The data informs maintenance and design improvements.
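Per-intent tracking can be as simple as aggregating fallback and escalation rates from conversation logs. The sketch below assumes each logged conversation is a dict with `intent`, `hit_fallback`, and `escalated` keys; that log shape is hypothetical.

```python
from collections import defaultdict


def per_intent_rates(conversations):
    """Aggregate count, fallback rate, and escalation rate for each intent."""
    totals = defaultdict(lambda: {"count": 0, "fallbacks": 0, "escalations": 0})
    for conv in conversations:
        row = totals[conv["intent"]]
        row["count"] += 1
        row["fallbacks"] += int(conv["hit_fallback"])
        row["escalations"] += int(conv["escalated"])
    return {
        intent: {
            "count": row["count"],
            "fallback_rate": row["fallbacks"] / row["count"],
            "escalation_rate": row["escalations"] / row["count"],
        }
        for intent, row in totals.items()
    }


# Hypothetical log entries.
log = [
    {"intent": "pricing_question", "hit_fallback": True, "escalated": False},
    {"intent": "pricing_question", "hit_fallback": False, "escalated": False},
    {"intent": "order_status", "hit_fallback": False, "escalated": True},
    {"intent": "order_status", "hit_fallback": False, "escalated": False},
]
rates = per_intent_rates(log)
```

Breaking the rates out per intent is what makes them actionable: a high fallback rate on one intent points at that intent's design rather than at the bot as a whole.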
Conversation metrics.
Diagnostic uses.
Detail in references/conversation-analytics-patterns.md.
Rapid-fire. Diagnoses in references/common-chatbot-failures.md.
When designing or auditing a chatbot, walk these 12 considerations.
The output of the framework is a chatbot that knows what it knows, grounds its answers in real knowledge, escalates appropriately, and earns trust by being honest about its scope.
references/chatbot-decision-criteria.md - When chatbots earn deployment and when they do not. The conditions that warrant the build.
references/intent-architecture-patterns.md - Defining what the bot can and cannot handle. Named intents, hierarchies, boundaries, coverage.
references/knowledge-base-grounding-patterns.md - Retrieval-augmented generation, source-of-truth design, citation discipline, knowledge-base maintenance.
references/branching-and-conditional-logic.md - How the bot adapts the conversation. Intent-driven, context-driven, user-attribute, multi-turn branching.
references/fallback-pattern-design.md - What happens when intent is unclear or out-of-scope. Multi-layered fallback patterns.
references/escalation-to-human-patterns.md - When, how, with what context. Escalation triggers and handoff quality.
references/conversation-analytics-patterns.md - Per-intent metrics. Diagnostic uses. The data that informs maintenance.
references/chatbot-anti-patterns.md - The patterns that look like chatbots but degrade trust.
references/common-chatbot-failures.md - 10+ failure patterns with diagnoses and cures.
The chatbots that work as compounding assets are the ones the audience trusts. Not because they answer every question. Not because they are infinitely capable. Because they are honest about their scope, ground their answers in real knowledge, and escalate to humans when the bot's job is done.
That is the bar. Below the bar are scripted-bots (rigid trees that fail edge cases) and hallucinating-bots (LLMs without structure that make things up). Above the bar are structured-guided-conversations where the bot's intent architecture, knowledge-base grounding, fallback discipline, and escalation patterns combine into a tool the audience can rely on.
The discipline is in the design choices. The intents that define what the bot can do. The knowledge-base grounding that prevents hallucination. The fallback patterns that handle the unknown gracefully. The escalation logic that knows when to step aside. The analytics that surface what is working and what is not. The maintenance discipline that keeps the bot in sync with the brand it represents.