  1. Local Lingo

For both voice and text, you must account for the vocabulary your intended users actually use. For example, Australian users may greet the bot with “mate,” while North American users will either not address the bot at all or say “dude” in a casual setting.


  1. Pronunciations

With voice, you also have to account for different pronunciations that Amazon Alexa or Google Home may or may not pick up. We ran into this issue while building a bank bot for customers of Mexican banks.

Alexa had trouble picking up terms like CEDES, a common acronym in Mexico for certificates of deposit, even with an American pronunciation. Spelling out C-E-D-E-S would often fail as well. This is the reality: while Amazon works on making Alexa accessible to Spanish speakers, we have to design around this limitation to provide the most natural interaction.

After bouncing around a couple of ideas, we had Alexa say “certificates of deposit” when offering an investment update on them, so that the user would also ask for them using this term.
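One way to back this up on the text side is to map every phrasing a user might try onto the single term the bot itself speaks. The snippet below is a minimal sketch, not the implementation used in the project: the `SYNONYMS` table and the `resolve_product` helper are hypothetical names, and the approach only helps once the speech recognizer has returned some text. It does not fix recognition itself.

```python
import re
from typing import Optional

# Hypothetical synonym table: the term the bot speaks -> phrasings users may try.
SYNONYMS = {
    "certificates of deposit": [
        "certificates of deposit",
        "certificate of deposit",
        "cedes",
        "c e d e s",  # a spelled-out "C-E-D-E-S" after punctuation is stripped
    ],
}


def normalize(utterance: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", utterance.lower())
    return " ".join(cleaned.split())


def resolve_product(utterance: str) -> Optional[str]:
    """Return the canonical product name mentioned in the utterance, if any."""
    # Pad with spaces so variants only match on whole-word boundaries.
    text = f" {normalize(utterance)} "
    for canonical, variants in SYNONYMS.items():
        for variant in variants:
            if f" {variant} " in text:
                return canonical
    return None
```

For example, `resolve_product("How are my CEDES doing?")` resolves to the same canonical term as a request that uses the full phrase, so the rest of the bot only has to handle one name.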

2. Visual Aids (Pictures, Videos, GIFs)

You don’t realize how much of a luxury it is to be able to use pictures and videos until you start designing voice bots.

Imagine this: someone is in Barcelona on a business trip, and they ask the Alexa in their office for the best cafe to meet a client for coffee. To work out how long it would take to get to their next meeting across town, they then ask how close the cafe is to that meeting spot.

If they were using a web chatbot, it could show them directions from point A to B in Google Maps.

If they ask Alexa, the best solution may be to tell them the estimated travel time and offer to send directions to their phone. Having Alexa recite turn-by-turn directions would be useless, as they wouldn’t retain any of that information.

Here, the designer’s job is to determine the most pertinent information to provide through speech, and the best way to communicate other necessary information.
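The split between what is spoken and what is handed off to another device can be sketched in code. This is an illustrative sketch only: the `Route` type, the `build_reply` function, the channel names, and the `follow_up` key are all assumptions made for the example, not part of any Alexa or Google Maps API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    """Hypothetical container for a route lookup result."""
    destination: str
    minutes: int
    steps: List[str]


def build_reply(route: Route, channel: str) -> dict:
    """Pick the most pertinent information for the given channel."""
    if channel == "voice":
        # Speech: only the travel time, plus an offer to hand off the details.
        return {
            "speech": (
                f"{route.destination} is about {route.minutes} minutes away. "
                "Would you like me to send directions to your phone?"
            ),
            "follow_up": "send_directions_to_phone",
        }
    # Text or web chat: the screen can carry the full set of directions.
    return {
        "text": f"{route.destination} is about {route.minutes} minutes away.",
        "steps": route.steps,
    }
```

The voice branch deliberately leaves out the step list, since a listener is unlikely to retain it, while the text branch keeps it because a screen can display it.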

