Terms to know
The delivery mechanism for the conversation. Sometimes referred to as a virtual assistant or conversational AI.
An experience that can consist of one or multiple responses that answer a customer’s intent. Responses can vary in number, wording, or turn.
A process in which the system clarifies or resolves ambiguity in language. For example, if you say "I need to book a ticket," the bot might ask "Do you mean a movie ticket or a flight ticket?" to disambiguate the meaning of "book" and "ticket" in this context.
A conversation with a series of responses. Generally uses buttons to go to the next response of the conversation. Flow examples include onboarding, collecting data from the customer to do something for them in the system, and contacting an agent.
What the customer wants to achieve. It's also how the conversation is stored in a curated response. For example, “change my address” would be the customer’s intent, and the conversation that would answer it might be stored in an intent called profile.help.change_address.
large language model (LLM)
A model that uses vast amounts of data and finds patterns to predict the best answer to your request. It isn’t trained on any specific utterances. This is often called generative AI (genAI). For guidance on usage, see the word list entry.
Specific GenAI tools, such as GenStudio or ChatGPT 3.5 Turbo
The mechanism that manages a certain group of conversations. For example, the reporting plug-in might handle all reporting creation conversations.
The reply generated for a customer.
The creativity level that the LLM uses when generating responses. The higher the temperature, the more variety in responses.
The conversational exchange between the bot and the customer. One customer utterance + one bot response = one turn.
This is an utterance that is used to train natural language understanding (NLU) models with curated intents to identify when a customer’s utterance should go to a specific intent.
What a user says to the bot to express their intent.
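To make the temperature entry above concrete, here is a minimal sketch of one common way language models apply temperature: the raw scores for candidate words are divided by the temperature before being turned into probabilities. The scores and the function name are hypothetical and chosen only for illustration. This doesn't describe how any particular product's LLM is configured.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]
low = softmax_with_temperature(scores, 0.5)   # favors the top candidate
high = softmax_with_temperature(scores, 2.0)  # spreads probability more evenly
```

With the lower temperature, most of the probability goes to the top-scoring word. With the higher temperature, the other candidates become more likely, which is where the extra variety in responses comes from.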
Types of bots
We design conversations for two channel types: voice-based and text-based. Here are some special considerations and watchouts for each.
- Humans can speak and respond quickly, but need to listen slowly
- Content that takes too long (like a long list of instructions) may cause a timeout
- Longer numbers, abbreviations, and instructions aren't always clear to the customer
- Attention spans are short and the customer may have external distractions
- Explicit confirmation is required before making any changes to data or routing the customer. For example, “You are calling about payroll, is that right?”
- Bot inflection and tone aren't always clear and understandable
- Human accents, inflection, and mumbling can confuse the model
- Voice shouldn't be used for experiences involving personal data (unless there is a way to authenticate the caller)
- Reading information is quick, but typing out a response can be slow
- Bots are expected to understand the customer perfectly the first time
- Text can stay on the screen for a longer time for the human to re-read
- Numbers, abbreviations, and instructions can be acknowledged quickly and usually implicitly
- Has lots of context about the human
Be brief and clear
Text-based bots are usually in space-constrained chat panels with limited formatting. Long bot responses force the customer to scroll through the message, making it harder to read.
Voice-based bots put the burden on customers to remember what it said, and people can only remember so much.
For both bot types, cut messages down to their most essential. A customer can always ask the bot for more info. For more help, see our tips for writing small.
For example, instead of telling the customer each step to fill out a new invoice, the bot might simply say, “Create a new invoice, fill in your details, and Save.”
Understand the customer’s true goal
A customer’s goal could be a single intent or a series of intents. For example, if a customer wants to change their email address on an invoice, they might want to update that email address everywhere. A response that includes the option to make a change everywhere anticipates the customer's intent—and potentially saves them time.
Context and listening
Context is the info the bot should carry forward or refer back to in a conversation. It considers where the customer is coming from, their product, their history with the product, and their emotional state. Context makes conversations flow more naturally.
When first responding to a customer, use active listening to acknowledge you’ve heard the customer and demonstrate that the bot understands their request. Repeating what customers say as an explicit confirmation is the simplest (but also the most heavy-handed) way to show the bot understands.
Whenever possible, have the responses flex based on the customer. For example, if a customer is migrating from desktop to online products, the bot might say “Items are now called invoices in the online version. To personalize your invoices, go to Sales, open the invoice, and select Preview.”
Listening isn’t just understanding the words a customer enters. Use data to personalize responses.
The bot should recognize the topic being discussed throughout a conversation. If a customer initially asks about bank errors, the bot shouldn’t constantly reconfirm that.
There are 3 basic parts to bot responses:
- Signal understanding (acknowledgement)
- Deliver a response (body)
- Offer an action or next step (pill or question)
Follow this structure to clearly communicate when the bot is finished and it’s time for the customer to act.
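The three-part structure above could be modeled as a small data type, as in the sketch below. The class name, field names, and sample wording are assumptions made for illustration and aren't part of any existing bot framework.

```python
from dataclasses import dataclass


@dataclass
class BotResponse:
    acknowledgement: str  # signals understanding
    body: str             # delivers the response
    next_step: str        # offers an action or next step (pill or question)

    def render(self) -> str:
        """Join the three parts in the order the customer reads them."""
        return f"{self.acknowledgement} {self.body} {self.next_step}"


response = BotResponse(
    acknowledgement="Got it.",
    body="To personalize your invoices, go to Sales, open the invoice, and select Preview.",
    next_step="Would you like help with anything else?",
)
print(response.render())
```

Keeping the next step as its own field makes it harder to ship a response that ends without telling the customer it's their turn.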
Use components like answer pills, cards, and vertically stacked buttons to guide the conversation, but don’t use too many as it may overcomplicate things. See IDS AI patterns for more conversation design guidance.
A bot's response starts with acknowledgements. They show the customer the bot heard them. They also smooth the transition from the customer’s input to the meat of the bot’s response—keeping the bot from feeling too abrupt. You can use acknowledgements to show surprise, concern, or understanding.
got it, thanks, OK, alright, sure, no problem
great, good, wonderful, nice, super, perfect, yep
sorry, hmm, I see, I understand, I hear you
Take advantage of tags like “Sure” or “Alright” to keep things conversational. Only use "OK" for automations when the bot finishes a task. For example, if the customer wants to report fraud and the bot says "OK," the customer might assume it’s been reported.
Cues tell customers what to do next, inviting them to respond to the bot in a way it'll understand. Use them to ask for info, present choices, and confirm actions.
Once the bot has given its response, make it clear when it’s the customer’s turn to do something. There should be no ambiguity about whose turn is next, especially in voice-based bots.
- Make the customer’s turn to speak obvious, generally by presenting it as a question. If you want the user to do something, ask them explicitly: “Would you like to change your payment now?”
- Only use language that you want to get in return. Customers will reuse words they see from the bot in their responses.
- Keep conversations simple by asking one question at a time. Combined questions or ambiguous yes/no questions like “Do you mind…?” confuse both the bot and the customer.
- Offer the highest-trafficked or most intuitive instructions, even when there are multiple ways to do the same thing.
- Be clear about the impact or consequences of each option you present—users should know exactly where they will go next.
- Boil info down in a quick recap or re-prompt at the end of a longer sequence. This makes it easier for users to make a choice because they don’t have to rely so much on working memory.
- Upsell only when a customer asks for it (“Is there a way for someone to look at my taxes?”), they need it to complete a task (“I’m having trouble reconciling and need help”), or a model has suggested the customer is ready to upgrade or attach.
- Explicitly resolve one subject before starting another or stopping.
A cue’s scope affects the answer you’ll receive. Use a narrow question to get singular answers and open questions to get many.
- Make the bot listen for many possible answers
- Can freeze the user, but can feel more natural and conversational
- Great for front door responses
- Only listen for a few answers
- Are easier to answer, but aren’t as natural and can feel overly limiting
- Great for workflows and slot-filling responses
- Should only offer a maximum of three choices (“Do you want your 2020 or 2019 tax return?”)
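The limit on choices for a narrow cue can be enforced where the cue is built. The sketch below is one hypothetical way to do that. The function name, the constant, and the dictionary shape are assumptions for illustration, not an existing API.

```python
MAX_CHOICES = 3  # narrow cues offer at most three choices


def build_narrow_cue(question, choices):
    """Return a cue with answer pills, rejecting empty or overlong choice lists."""
    if not 1 <= len(choices) <= MAX_CHOICES:
        raise ValueError(f"a narrow cue needs 1 to {MAX_CHOICES} choices")
    return {"question": question, "pills": list(choices)}


cue = build_narrow_cue(
    "Do you want your 2020 or 2019 tax return?",
    ["2020", "2019"],
)
```

A check like this catches overloaded cues during authoring instead of leaving the customer to sort through too many options.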
Discourse markers, such as so, first, finally, and so on, are similar to acknowledgements, but they happen throughout a conversation instead of at the beginning of a response. They’re less about signaling understanding and more about shaping meaning.
Discourse markers are a great tool for shaping your conversations to make them more comprehensible. This is especially true for voice-based bots.
First you have to select what to upgrade to. Then you have to update your billing info.
Let’s get started!
Types of discourse markers
first, second, finally, now, then, before, after
so, now, let’s start with, by the way, right
because, so, as a result
but, however, except
well, actually, anyway, of course, wow, exactly, you know
Be clear when ending bot responses or conversational threads. At the end of responses:
- Confirm a problem is resolved (or the customer is satisfied with the result)
- Give the customer closure
- Ask if there’s more the bot can do or help with
- Offer next steps or actions, such as pills to other conversations or links to other resources