Intelligent assistants are a new, increasingly popular way of interacting with technology — available both on smartphones and on smart speakers such as Amazon’s Echo and Google Home. This trend seems poised to grow even more, as companies such as Bank of America introduce assistants tailored to a specific domain.

Elsewhere we discussed 6 characteristics of intelligent agents that hold promise for this new interaction style, but in usability testing we found that today’s assistants are pretty far away from fulfilling this promise.

6 Traits Necessary for Successful Intelligent Agents

  1. Voice input
  2. Natural language processing
  3. Voice output
  4. Intelligent interpretation
  5. Agency (ability to initiate actions)
  6. Integration of the previous 5 technologies

On the other hand, in a separate critical-incident study, we found that, in spite of the assistants’ limited capabilities, people reported repeatedly engaging with these systems to complete a relatively small number of simple activities — getting answers to trivia questions, checking the weather forecast, or navigating to a destination. Are these functionalities good enough? Do we need more?

In general, before embarking on adding new fancy features to a product, one fundamental question should be: do these features address a real user need? To understand whether there is truly a need for more advanced functionality, we assessed and compared the following:

  1. What do people want a perfect intelligent assistant to do?
  2. What do users currently do with today’s intelligent assistants?
  3. How many of the users’ ideal needs are addressable with today’s assistants (whether people know it or not)?

Research

In order to answer those questions, we ran two separate studies:

  1. A diary study of assistant-related user needs. We recruited 12 participants and asked them to pretend that they had the most intelligent assistant that could ever be built (a perfect version of Siri or Google Assistant, not the current product); this assistant would be available anywhere, at any time, and could help them with anything. For one week, participants logged all their assistant-related needs; for each need, participants filled in a questionnaire about the need and about how they expected the assistant to help them with it; they also recorded if and how they ended up addressing that need. (This part of our study was inspired by work done by Sohn and colleagues, who, well before smartphones became ubiquitous, ran a similar diary study of mobile information needs to understand how these devices might be used.)

    To further determine how far today’s intelligent assistants are from people’s needs, we fed each need logged by our participants to one of three existing intelligent assistants: Siri, Google Assistant, or Alexa; we then recorded whether the query could be completed by the assistant. (If one assistant could not do it, we tried a different one.) If part of the need could be addressed by the assistant, we rated that need as partially addressed. We decided to be as favorable to the assistants as possible, and we occasionally changed query formulations to make them acceptable to the existing assistants.

  2. A critical-incident study in which 211 daily users of Alexa, Siri, or Google Assistant reported how they last used their assistant. The results from that study are described in a separate article, but we refer to them here to interpret the results from the diary study.

In the diary study, participants logged 636 needs; out of those, 14 were ambiguous and were removed from our analysis. From the remaining 621 needs, 193 were “repeat” needs — that is, needs that were logged more than once because the participants had them multiple times during the week. In what follows, we will focus the analysis on the 428 unique needs.

Can Users’ Needs Be Addressed with Today’s Assistants?

We found that existing assistants could have addressed 41% (177) of the ‘ideal’ needs logged in the diary study, and another 21% of these needs could be partially addressed by the existing assistants.

Pie chart: Can Users' Needs Be Addressed with Today's Assistants? Yes = 41%; No = 38%; Partially = 21%
Of the unique needs logged in the diary study, 41% could have been fully addressed and 21% could have been at least partially addressed by at least one of the currently available intelligent assistants.

The ability of current assistants to address many of users’ ‘ideal’ needs seems to be good news for creators of intelligent-assistant interfaces. However, when we looked at how our study participants actually solved these needs, we found that only 7% of the needs were actually addressed using one of these assistants. (Instead of using an assistant, 46% of the needs were addressed through a computer or smartphone, 20% of the needs were addressed through physical means, 4% through a phone call, and 25% were not addressed at all.)

Considering that 62% of the needs could be fully or partially solved by today’s intelligent assistants, users employed their current assistants in only about 1 out of 9 instances in which they could have used them with some success. Not using an assistant was 8 times more common than using one for those needs that could have been fully or partially solved with the assistant. Even if we don’t consider partial help satisfactory and only compare the 41% of fully addressable needs with the 7% of actual use, it was still 5 times more common not to use an assistant than to use one.
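For readers who want to trace the arithmetic, the sketch below recomputes those ratios from the rounded percentages reported above; the raw counts would shift the results slightly.

    # Recompute the usage ratios from the study's rounded percentages.
    fully = 41        # % of needs fully addressable by today's assistants
    partially = 21    # % of needs partially addressable
    used = 7          # % of needs actually attempted with an assistant

    addressable = fully + partially       # 62% fully or partially addressable
    print(used / addressable)             # ~0.11: roughly 1 use per 9 opportunities
    print((addressable - used) / used)    # ~7.9: not using was ~8x more common
    print((fully - used) / used)          # ~4.9: ~5x, counting only fully addressable needs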

This low rate of usage for intelligent assistants speaks to the low expectations that people have about these assistants and to the difficulties that they may have previously encountered when using an assistant. Also, remember that not all assistants are equally good at every task, and, for a given task, not all command formulations work equally well. For a need to be successfully addressed by a current assistant, two preconditions have to be met: (1) the participant must have the right assistant, and (2) the participant must formulate the “right” command for that need, so that the assistant can answer it. These two requirements help explain why the difference between what’s possible and what’s actually accomplished in real life is so big.

Generally, the needs that people addressed via an agent were simple: “Set an alarm for 8pm tomorrow,” “How late is Palm Bluff course open till,” “Play some morning wake up music,” “Remind me to wash Olivia’s hair tonight about 8pm,” “How is the weather this afternoon,” “Set a timer for 15 minutes,” “Turn on lights,” “What does ‘plunder’ mean?”

Bar chart: Current Method for Addressing the Needs
Diary-study participants attempted to use an existing digital assistant for only 7% of the unique needs they would want a ‘perfect’ assistant to help with. Most needs (46%) were addressed using a device such as a computer or smartphone, while 25% of the needs remained unaddressed.

Initiating Interactions: Voice-Input Commands vs. Agency

We asked participants how they expected to trigger the assistant’s help. A spoken command was the most commonly mentioned trigger (selected for 84% of the needs). Thus, good comprehension of free-form voice input was definitely a highly important assistant characteristic for our participants.

Bar chart: What Would Trigger the Assistant's Help? Spoken command = 84%; nonverbal command = 4%; no command = 12%
Spoken commands were the preferred method for interacting with the intelligent assistants. For 12% of the needs, participants expected the assistant to initiate the interaction without receiving any explicit command.

For 4% of the needs, participants said that they would issue an explicit nonverbal command (such as pressing a button or making a specific gesture). For example, one participant would rub her stomach to signal to the assistant that she was hungry. Another participant said that locking the door to his house should prompt the assistant to turn off the lights. Yet another expected the assistant to automatically know to ask her when she would like to wake up as soon as she lay down for a nap.

Some participants also said that they would prefer giving nonverbal commands when the information requested was too complex. For example, a participant reported that she would rather type the name of the restaurant where she wanted to make a reservation, to make sure that no mistake was introduced while dictating the command.

However, in 12% of cases, participants thought that the assistant should know to help without receiving any type of command, based on the participants’ context. (These types of needs are related to the agency component of our list of assistant characteristics.) Some expectations were fairly reasonable and based on explicit data that the assistant was supposed to have — either from prior interactions or from access to calendars, location, or other personal information. Other expectations were based on nuanced, fairly subtle cues that the assistant was supposed to pay attention to, behaving almost like an observant human who followed the user around and was proactive.

The table below illustrates examples for both these need types.

Situations where the assistant should infer users’ goals with no instruction, based on context:

Needs based on explicit data
  • Notify people of delays for flights in the calendar
  • Reminders to exercise, clean, or do laundry regularly, without prior setup
  • Automatically check in for a flight in the calendar, 24 hours before the flight
  • Turn on security alarms upon leaving the house
  • Warn people to leave items such as knives at home when departing for a destination that requires a security check (airport, museums, etc.)

Needs based on implicit, subtle cues
  • Check with the pharmacy for prescription status immediately after a person left the doctor’s office
  • Look up a restaurant on Yelp when the name of that restaurant was casually mentioned in a conversation
  • Monitor health signs for early headache symptoms, and alert the participant to take action
  • Automatically set up a flight-price tracker if the user searched for a flight fare on the internet
  • Detect odors of clothing articles and prompt washing
  • Email other affected parties when someone was likely to miss an appointment (e.g., if time was getting close to the appointment and the person was too far away)
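One way to see the difference between the two columns is to model each proactive behavior as a trigger-condition-action rule. The sketch below is purely illustrative; the ContextEvent and ProactiveRule types, event kinds, and data fields are hypothetical, not drawn from any shipping assistant.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class ContextEvent:
        """Something the assistant observed: a calendar change, a location
        update, an overheard phrase, a sensor reading, etc. (hypothetical)."""
        kind: str
        data: dict

    @dataclass
    class ProactiveRule:
        """Fire an action when an observation matches, with no user command."""
        condition: Callable[[ContextEvent], bool]
        action: Callable[[ContextEvent], None]

    # Explicit-data rule: the calendar lists a flight and the airline feed
    # reports a delay; both are concrete data the assistant already holds.
    flight_delay = ProactiveRule(
        condition=lambda e: e.kind == "flight_status" and e.data["status"] == "delayed",
        action=lambda e: print(f"Your flight {e.data['flight']} is delayed."),
    )

    # Implicit-cue rule: a restaurant name surfaced casually in conversation;
    # reliably producing this event requires human-like perception.
    restaurant_mention = ProactiveRule(
        condition=lambda e: e.kind == "speech" and e.data.get("entity_type") == "restaurant",
        action=lambda e: print(f"Looking up {e.data['name']} on Yelp..."),
    )

    def dispatch(event: ContextEvent, rules: list[ProactiveRule]) -> None:
        for rule in rules:
            if rule.condition(event):
                rule.action(event)

    dispatch(ContextEvent("flight_status", {"flight": "UA 123", "status": "delayed"}),
             [flight_delay, restaurant_mention])

The rule machinery itself is trivial; what separates the two columns is how hard it is to generate the triggering events. Calendar and location data are readily available, whereas detecting a casual mention or a smell requires perception that no current assistant has.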

Some of people’s expectations were quite far-fetched: for example, they wanted the assistant to access other people’s behaviors or data and alert them. One user expected the assistant to figure out when his boss came near his office and alert him; another wanted the assistant to figure out where the car in front of him was going and notify him so that, if that car was about to turn left, he wouldn’t get stuck behind it. (Both would be technically possible, but likely to be considered privacy violations by many people.) Another person wanted the assistant to detect that someone else had filed taxes in his wife’s name.

Complexity of Needs

Participants logged a variety of needs, ranging in complexity from simple, one-step actions to complicated flows that required assembling information from different sources:

  • Simple actions usually required only one step to complete.
  • Multistep needs were similar to an interaction flow on a website or in an app; they required going through several stages to complete a process.
  • Multitask needs involved the use of multiple activities and applications to achieve a goal.
  • Research needs required putting together multiple sources of information and analyzing options.

The table below shows examples of needs in each category.

Simple-action needs
  • When is my first meeting tomorrow?
  • How many calories in a serving of chili?
  • Turn on the shower at 8:05 to 80 degrees.
  • Remind me to get mom a birthday card.
  • Where is the nearest Starbucks?
  • Weather for today
  • Set morning alarm.

Multistep needs
  • Find a coffee shop nearest to the gym.
  • Order coffee from Starbucks.
  • Start directions to Essex Restaurant at 10am.
  • Transfer $100 from checking to savings account.
  • Save the pictures my husband sent.
  • Create a checklist.

Multitask needs
  • Find me the best route using carpool lanes.
  • Dial into the Webex meeting from my calendar, mute myself, and set volume to medium.
  • Who do I need to prioritize to meet with for the day [based on time since last meeting]? Then put my files in that order.
  • Back up my photos from Friday to now to Google Drive and send my parents a link to the folder.
  • Track my produce delivery for tomorrow and send me updates at each step.
  • Email my next meeting to let them know I am running 10 mins late.

Research needs
  • Send me some juice recipes that I can make with what’s in my fridge.
  • I have a runny nose, sore throat, and achy back. What could I have?
  • Why are there helicopters in SF right now?
  • What is the best place to stay in Miami taking into consideration everything the hotel has to offer and the price?
  • Is it healthier to drink smoothies or fresh-pressed juices?
  • Find me a sweet pie recipe that is highly rated and very unique.
  • Order me an umbrella.

Examples of needs at various levels of complexity, from simple actions to complex research tasks

Although some of these needs may seem similar, participants often gave additional details that helped us classify them. For example, for the “Order me an umbrella” need, the user wanted the assistant to find some well-rated umbrellas on Amazon, and place an order for one. Because this need involved a research component (finding a good umbrella, as opposed to any umbrella), it was assigned to research. In contrast, the participant who simply wanted to order coffee from Starbucks had a very precise item in mind, so the need was classified as multistep only.

Multitask needs required the assistant to either perform multiple related tasks (“Dial into the Webex meeting from my calendar, mute myself, and set volume to medium”) or to get information from one source, and use it with a different app or in a different context (e.g., “Email my next meeting to let them know I am running 10 mins late” would involve identifying the next meeting from the calendar, then emailing the attendees).
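To make the chaining concrete, here is a rough sketch of how that last example might decompose into steps. The calendar structure and the helper functions (next_meeting, compose_late_notice, send_email) are hypothetical stand-ins, not any real assistant's API.

    from datetime import datetime, timedelta

    def next_meeting(calendar: list[dict], now: datetime) -> dict:
        # Step 1: find the next upcoming event on the user's calendar.
        upcoming = [e for e in calendar if e["start"] > now]
        return min(upcoming, key=lambda e: e["start"])

    def compose_late_notice(meeting: dict, minutes: int) -> dict:
        # Step 2: turn the meeting's attendee list into an email draft.
        return {
            "to": meeting["attendees"],
            "subject": f"Running {minutes} min late to {meeting['title']}",
            "body": f"Apologies - I'll be about {minutes} minutes late.",
        }

    def send_email(draft: dict) -> None:
        # Step 3: hand the draft to a mail service (stubbed out here).
        print(f"To {', '.join(draft['to'])}: {draft['subject']}")

    now = datetime(2018, 5, 4, 9, 50)
    calendar = [{"title": "Design review",
                 "start": now + timedelta(minutes=10),
                 "attendees": ["kim@example.com", "jeff@example.com"]}]
    send_email(compose_late_notice(next_meeting(calendar, now), minutes=10))

Each step consumes the output of the previous one, so a single misrecognized field derails the whole chain; this is part of why multitask needs are much harder for assistants than one-step commands.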

Some needs also required the assistant to program an action into the future — either at a given time (“Start directions to Essex Restaurant at 10am”) or at a time that the assistant needed to determine (“Remind me to call the doctor’s office when it opens”).

The majority of the needs that our participants recorded were simple actions (58% of the unique needs); multistep and research needs were fairly popular (17% each); the least common needs were multitask needs (9%). Thus, a total of 42% of the needs were more complex than just simple, one-step commands.

Bar chart: Percentage of Needs by Complexity
58% of the needs logged by our diary participants were one-step tasks; 42% were more complex.

Interestingly, simple actions are the dominant category of tasks that people do with existing assistants such as Siri, Google Now, and Alexa. Only 26% of the frequent users mentioned that they complete tasks more complex than one step with today’s assistants. Thus, there is definitely a gap between the complexity of the needs that people have and the complexity of the tasks they actually perform with current assistants (42% complex tasks needed vs. 26% done).

Bar chart: Task Complexity for a Perfect Assistant vs. a Real Assistant
This chart compares the complexity of the needs from our perfect assistant diary study with the complexity of the activities reported in our real-assistant critical-incident study. Simple actions were overrepresented in the critical-incident study, while more complex actions were rarely mentioned. However, activities requiring more than one step (labeled as Research, Multitask, and Multistep) represented 42% of the needs tracked by our diary users. (The yellow bars represent the percentage of needs from the diary study that fell in that complexity category, and the green bars represent the percentage of users in the critical-incident study who reported doing tasks of that complexity with today’s agents.)

Note that it’s possible that the mental models that people already have about today’s intelligent assistants (based on their current experiences with Alexa, Siri, or Google Assistant) informed their use of the current assistants. Indeed, currently, people have fairly limited expectations about what these agents can accomplish. So, in theory, assistants today may be actually capable of accomplishing more complex needs, yet people may be unaware of their capabilities.

The complexity of an activity is a major factor in whether today’s assistants can successfully complete it. When we revisited the issue of whether today’s agents can address a need based on the need’s complexity, we found that about half of the simple-action needs could be accomplished with the current agents. Even in that area, there is still a lot of room for improvement. The percentages for the other types of tasks were much lower — about 30% of the multistep and multitask needs, respectively, and 16% of the research needs could be completed today. (These numbers represent upper bounds — they assume that the query was sent to the best agent in the best possible formulation; in real life, even fewer will actually be successful.)

Bar Chart: Percentage of Needs Addressable with Today's Assistants
The chart shows the percentage of logged user needs that could be addressed with today’s assistants: about half of the simple-action needs could be completed with existing assistants. Higher-complexity needs were less likely to be addressable with Google Assistant, Siri, or Alexa. (These statistics do not include needs partially addressable by the current assistants.)

Users have great difficulty accomplishing advanced tasks with traditional computer systems: only 31% of the adult population in rich countries is capable of performing tasks similar to the multitask and research needs in our table when using traditional user interfaces. Since more than two thirds of the population lack the computer skills needed for anything advanced on current computers, there is great potential for helping these many people if intelligent assistants do become good enough to take over such tasks.

What’s Needed vs. What’s Being Done

We identified 12 different types of tasks that users logged, as defined in the table below.

Communication: Communicate explicitly with someone else through text or call

Transaction: Place an order or perform another type of financial operation

Reminder: Trigger a notification from the assistant or phone

Scheduling: Coordinate with other parties to create a calendar appointment

List: Create a list (of items to purchase, ideas, etc.)

Creation: Create a new document, image, or other virtual artifact

Local info: Traffic, weather, directions

Info retrieval: Answer a question that is not about local info

Idea: Suggest information broader than local info or info retrieval, that does not satisfy a specific criterion and cannot be objectively right or wrong

IOT control: Interact with an IOT device other than a phone, computer, or smart speaker, and give it a command (e.g., alarm clock, fridge)

Real-world control: Interact with a physical-world object that is currently not smart (e.g., socks, ibuprofen)

Phone control: Start a certain activity on the phone, laptop, smart speaker, or car interface (e.g., play music or a podcast, turn on the phone flashlight)

Taxonomy of needs logged by the participants in the intelligent-agent diary study

The most common tasks were reminders (26% of all unique needs): our participants needed simple reminders (to pay bills, do laundry, take a break, place an order, pick up a child), as well as more sophisticated reminders based on location (e.g., buy a card when close to a drug store) or on an external event (e.g., “A notification when my favorite pizza shop brings out a fresh pie”).

Local information was the next most common type of task (21% of needs): people routinely wanted to know the weather for the day, traffic, or directions to a particular destination.

Information-retrieval tasks (e.g., “What time is my son’s doctor appointment today?,” “Figure out that movie that starred Jim Carrey and Cameron Diaz”), transactions (e.g., “Call an Uber”, “Order Laughing Planet food”), ideas (e.g., “Decide where to get dinner tonight”, “Choose what to wear today”, “What can I cook with the ingredients in my fridge?”), and communication needs (e.g., “Text Kim and tell her I’ll be 10 minutes late”, “Ask Jeff if he wants to go to pho with me and Janet”, “Send this video to mom”) were also fairly popular.

Some needs were rated in multiple categories. For example, “Look up a recipe for zoodles and print it” was rated as both information retrieval (find a recipe) and IOT control (sending it to the printer). “Order my regular order from Tsing Tao for pickup at 5pm” involved calling the restaurant (communication) and placing an order (transaction); “I need to plan what’s for dinner [based on the content of my refrigerator]” involved information retrieval (accessing the items in the fridge) and an idea (for a recipe with those ingredients).

Bar chart: Percentage of Needs by Task Type
Reminders and accessing local information were the most common tasks logged in the perfect-assistant diary study. (Numbers add to more than 100% because some needs were classified under multiple types.)

In terms of task types, the local needs and reminders were most likely to be addressable with an existing assistant. 64% of the local-info needs in our study could be addressed by Siri, Alexa, or Google Assistant, and over 40% of the information-retrieval, list, or reminder needs could also be fulfilled today by one of these assistants. Perhaps not surprisingly, no creation needs and under 10% of IOT-control needs or real-world–control needs could be addressed. But, interestingly, the next least addressed categories were communication and transactions — with less than 20% of those needs being successfully addressable by today’s assistants (these numbers do not include the partially addressable needs).

Bar chart: Percentage of Needs Addressable with Today's Assistants by Task Type
This chart shows, for each task type, the percentage of logged needs that can be addressed by today’s assistants.

How do these types of user needs compare with the actual activities that people report using their current assistants for? In general, the ‘ideal’ needs logged by diary participants were more diverse than the actual usage reported by users of today’s assistants. Users currently perform a narrower range of tasks (getting local info such as weather, traffic, and directions, fact retrieval, controlling the phone by turning on music or setting an alarm), but they do need help with other types of activities.

Bar chart: Percentage of Reported Activities by Type: What Users Need to Do vs. What They Do with Intelligent Assistants
This chart compares the types of tasks people need to do with a perfect assistant with the types of tasks they report doing with today’s assistants. The yellow bars represent the percentage of needs of that type (out of the total number of needs logged in the perfect-assistant diary). The green bars represent the percentage of frequent users who report engaging in that activity with today’s assistants, as determined by our critical-incident study.

What Do Assistants Need to Know to Satisfy Users’ Needs?

Intelligent interpretation and agency are two assistant characteristics that require a combination of real-world knowledge, personal information about the user, and contextual information about the here and now. Which of these are essential? And what types of knowledge are used by today’s assistants?

To answer those questions, we categorized each need in our diary study according to the type of information (“knowledge”) involved in addressing them:

  • Personal information: Information about the asker that could include:
    • Personal electronic data such as phone, address, current location, contacts, calendar
    • Personal physical information such as content of the asker’s refrigerator or health signs
    • Past history such as prior orders or interactions with various applications, businesses, or people
  • Web: information that could be found through a web search
  • Third-party information: private information about other people or organizations who are not the user (such as another person’s location)
  • No information (e.g., for tasks such as setting an alarm, or other tasks in which all the needed information was contained in the command)

Most needs (65%) required some form of personal information (usually personal electronic data) to be completed, and 44% of the needs required general information available on the web. 22% of all needs were self-contained and could be completed with no additional information.

Bar chart: Percentage of Needs by Knowledge Type
This chart shows the different types of knowledge that were required in order to address the needs logged in our perfect-assistant diary. Most needs required information available on the web or personal information such as phone, address, or current location. 22% of the needs did not require any information in order to be completed. (Numbers add to more than 100% because some needs required multiple types of knowledge.)

We also wanted to understand how knowledge requirements affect the ability of today’s agents to complete the need. Not surprisingly, needs requiring third-party information or physical personal information were unlikely to be addressable with what we have today. Perhaps more interesting is that the needs involving knowledge of the users’ prior interactions were also unlikely to be addressed today — a possible indication of the limited learning abilities of today’s agents. (However, note that the overall number of needs involving past history was small to start with.)

Bar chart: Percentage of Needs Addressable with Today's Assistants by Knowledge Type
This chart shows, for each knowledge type, how many needs could be addressed with today’s assistants. The web-based needs and those that required no information were most likely to be addressable today.

Last but not least, we looked at the knowledge required to complete the tasks that people reported in our critical-incident study. Compared to the information required by users’ ideal needs, most of the tasks actually performed relied heavily on the web and on personal electronic information (particularly the user’s location and contacts). As in the previous section, this polarization of today’s tasks around specific types of information indicates the lack of diversity in today’s assistant-related activities.

Bar chart: Info Required: User Needs vs. Users' Actual Activities
This chart compares the knowledge required to address the users’ needs in the perfect-assistant diary with the knowledge required to complete the activities reported by the participants in our real-assistant study. The yellow bars represent the percentage of needs that required each knowledge type out of all needs logged in our diary study; the green bars represent the percentage of users who reported activities involving the same type of knowledge in the critical-incident study.

Conclusion

Our study attempted to understand what user needs could be satisfied by a perfect version of an intelligent assistant, and how far away current assistants are from fulfilling those needs.

Hand-drawn chart that shows two axes: task complexity is the x axis (from simple to complex), task frequency is the y axis (from rarely to often).
The gaps between what’s done today with current intelligent assistants, what’s feasible, and what is needed

We know that usefulness = utility + usability. The above chart shows that the realized usefulness of current intelligent assistants (the green area) is fairly low, especially in the range of more complex tasks. The potential usefulness is much higher, as indicated by the full set of needs mentioned by our users: potential usefulness is represented by the full area below the top line in the chart. However, the usability gap (blue area) and the utility gap (orange area) eat up most of that potential usefulness. The usability gap is caused by features that exist but are too difficult to use, whereas the utility gap is caused by missing features. Both gaps must be closed (or at least narrowed substantially) for intelligent assistants to be truly useful.

We found that even when imagining a perfect assistant who could do anything, people tended to have fairly simple, one-step requests and expected to present them to the assistants mostly verbally, in unrestricted natural language. However, many of the users’ needs required that the assistant have implicit contextual knowledge about the user, and use that knowledge to interpret users’ actions and infer their goals. Some of the needs required the assistant to be proactive and take initiative before the user had given any command.

Although about 41% of the needs could be addressed by one of today’s assistants (if counting generously), only in 7% of the cases did users actually attempt to address a need using Alexa, Google Assistant, or Siri. This difference indicates a gap between users’ expectations of these assistants and what they can actually do. (It also reflects the usability of these systems — people won’t bother trying to use the assistants if it’s easier to address the need by some other means.)

Moreover, the study shows the discrepancy between people’s needs and how they use their assistants today. Frequent users of Alexa, Google Assistant, or Siri tend to focus on a few tasks of limited complexity, requiring very specific types of knowledge. Yet the universe of needs is much broader; to address all of them, assistants will have to expand their abilities to more sophisticated, complex tasks and take advantage of knowledge beyond the user’s current location and list of contacts.

Reference

Timothy Sohn, Kevin A. Li, William G. Griswold, and James D. Hollan. 2008. A diary study of mobile information needs. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’08).



Source link https://www.nngroup.com/articles/intelligent-assistant-user-needs/
