Twenty years ago, Jakob Nielsen posited that personalization was overrated, primarily because technology was not sufficiently advanced to make good predictions of what users cared about. Fast forward to today, and (as he predicted) personalization is a growing trend on the web. Individualized personalization, as opposed to role-based personalization, refers to the practice of tailoring content and functionality to a specific user, based on data gathered about that user’s preferences and behaviors.
While personalized user interfaces can include anything from command shortcuts to color schemes, a particularly useful form of individualized personalization is the use of recommendations on websites and mobile apps. Recommended products or content items are particularly common on ecommerce sites, social media, news sites, and streaming video or music services, but can also be found on other genres of sites. So, while a service like YouTube may offer millions of videos, any given user will only be shown a handful of recommended videos and channels when visiting the site’s homepage or after watching a video, based on gathered data about that user.
Individualized recommendations can be based on machine learning or other artificial-intelligence techniques, explicit customization instructions from the user, or some combination of both.
To gain insight into users’ expectations and mental models around the many types of individualized recommendations offered on sites, we ran a remote moderated usability study with 8 participants located across the United States. In each session, participants completed facilitator-assigned tasks on 2–3 websites on which they had accounts and also answered recommendation-related questions in an interview.
Our study participants were highly attuned to the fact that sites commonly track their browsing patterns, purchase histories, and other sources of data to present individually personalized suggestions. Overall, these recommendations were appreciated and seen as instrumental for narrowing down the options available on a site. To reap this benefit, users were willing to sacrifice some privacy; they expected many of their actions to be tracked and analyzed.
Curation via Recommendations
Reactions to recommendations were overwhelmingly positive across all our study participants. Suggested items curated to their interests helped them avoid choice overload and “sift through the muck” to find items of interest faster.
“[Recommendations are nice because they don’t] make you go load many, many products before actually finding the thing you want. They show things you might actually be interested in and then you just have to choose among those. I think it’s pretty good.”
“I’m on Sephora all the time. I use the app, the mobile website, and the computer website. I feel like, like Google looks at everything I look at and then customizes ads for me, and I think it would be really cool if Sephora did that. … That way it would be kind of personalized to me, that would be really awesome. It would save a lot of scrolling.”
“I really appreciate the way it allows me to discover other things that are adjacent to things that I like, that I might not discover on my own. … I don’t always have the patience to listen to a whole bunch of things that I’m not sure I’m going to enjoy, to sift through and find the one thing that I will enjoy. And so, if Spotify can do that for me that would be great. That’s why I tried Stitch Fix the first time too. I sort of like shopping, but I don’t love, ‘oh I need a pair of black pants so I’m going to sort through tons and tons of black pants to try and find the one that’s going to work for me.’ When I could answer some questions and then have Stitch Fix be like, ‘these are nice pants, you might enjoy them, and they’ll look good on you.’ That takes some of the work of looking away, and just provides you with ‘here’s something you might like, give it a shot.’”
Like a personal recommendation from a friend who knows you well, these personalized suggestions were weighed more heavily and considered more relevant than generic promoted content. When given an open-ended task to browse for something that would interest them, participants gravitated toward these recommended items (when they could find them).
“Top picks for me, I would assume, would have a little bit more of a correlation to the types of things I prefer, where popular is just what’s popular for everyone else. … What might be popular might not be very appealing to me. That’s why I normally start with the top picks for me.”
Expectations Around Data Collection and Privacy
When asked what information is used for generating recommendations, people had fairly well-formed ideas. Here are the data types they mentioned as probable sources, ordered from the ones participants were most confident about to the ones they were least confident about:
- History: either past purchases (on ecommerce sites) or content consumed (such as videos watched or songs played on content services)
- User-entered profile data, including demographics such as age, gender, and location, as well as categories of interests or other information specific to the site’s context
- Saved or “favorited” items
- Browsing behavior
- Search history
“I really have no idea how some of this is chosen. Most of it seems to be pretty on point, so I would guess it would be based on my shopping habits: things that I click on, things that I favorite, and things that I buy.”
While a couple of users appreciated that they didn’t need to make a purchase to benefit from personalized recommendations, most wanted their direct activity (such as past purchases, saved items, or entered profile information) to be weighted more heavily than simple browsing. They felt that not all the content they clicked on was relevant to them and did not want that browsing activity to skew future recommendations.
“Sometimes I click on something because I think it’s going to be great, and then I read the reviews or … like that glitter mask I looked at, that or anything like it I would have no interest in seeing in my Recommended For You. But the things I’ve added to my Favorites list, the brands I’ve purchased from in the past, I would absolutely want to see more of those brands and those types of products in my Recommended For You.”
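The preference participants described, with deliberate actions counting for more than casual clicks, can be sketched as a weighted sum over activity signals. This is only an illustration: the weight values, the signal names, the brand names, and the `score_brands` helper are all hypothetical and are not taken from any site in the study.

```python
from collections import defaultdict

# Hypothetical weights: explicit actions count far more than passive browsing.
SIGNAL_WEIGHTS = {
    "purchase": 5.0,
    "favorite": 4.0,
    "profile_interest": 3.0,
    "search": 1.5,
    "browse": 0.5,
}

def score_brands(events):
    """Sum the signal weights per brand.

    `events` is a list of (brand, signal) tuples; unknown signals add nothing.
    """
    scores = defaultdict(float)
    for brand, signal in events:
        scores[brand] += SIGNAL_WEIGHTS.get(signal, 0.0)
    return dict(scores)

# Made-up activity log: one curious click, one purchased and favorited brand,
# and one brand that was only browsed twice.
events = [
    ("GlitterCo", "browse"),
    ("BrandA", "purchase"),
    ("BrandA", "favorite"),
    ("BrandB", "browse"),
    ("BrandB", "browse"),
]

scores = score_brands(events)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these assumed weights, the brand the user bought and favorited ranks first, while the item that was clicked once out of curiosity ranks last, which matches the behavior the participant asked for.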
Users assumed that some recommendations are based not only on their direct or indirect activity, but also on algorithms that identify similar users or items resembling the ones they’ve shown interest in.
“They maybe have some sort of algorithm for ‘if you like this black-and-white plaid button-up shirt then you’d like other black-and-white plaid things or other button-up shirts,’ something like that.”
“Obviously they have some sort of algorithm they’re using based on lots of different data. I’m assuming it’s based on my own personal use of Netflix and what I’ve been watching. And also, probably nonpersonal, meaning based on my age, gender, things like that.”
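The kind of "similar items" algorithm participants imagined can be sketched with a simple attribute-overlap measure. The example below uses Jaccard similarity over hand-written attribute sets. It is an assumption chosen for illustration, not a description of how any of the services mentioned actually work, and the catalog items and the `jaccard` helper are hypothetical.

```python
def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# The shirt from the participant's example, described by made-up attributes.
liked_item = {"black-and-white", "plaid", "button-up", "shirt"}

# Hypothetical catalog of candidate items.
catalog = {
    "plaid scarf": {"black-and-white", "plaid", "scarf"},
    "blue button-up shirt": {"blue", "solid", "button-up", "shirt"},
    "red sneakers": {"red", "sneaker"},
}

# Rank candidates from most to least similar to the liked item.
similar = sorted(
    catalog,
    key=lambda name: jaccard(liked_item, catalog[name]),
    reverse=True,
)
print(similar)
```

In this toy catalog, other plaid items and other button-up shirts rank above unrelated products, which mirrors the participant's guess about how such suggestions are produced.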
None of the users in our study were overly concerned about privacy (perhaps partially due to the innocuous nature of the sites we tested), although several assumed that “other people” might be worried. For these participants, having their site usage tracked was accepted as the status quo, and simply a byproduct of using the Internet and living in the modern world. The benefits of getting tailored content outweighed any privacy concerns.
“I know a lot of people are against [tracking], but it’s a beauty product so I don’t care.”
“I’ve resigned myself to it, because I don’t foresee that I’m going to stop using the Internet any time soon. I’m just like, ehhh.”
“It bothers me a little bit, but I know that that’s how things are going to go in this world, they’re not going to change. … It’s just something you expect with the world’s technology and everything else. But it doesn’t bother me like it does some people.”
“I think it’s pretty cool. I mean, some people might be scared if companies or something are trying to catch what your next step would be. But I mean, they’ve always done that with Pew research and stuff like that. I think it’s pretty good, it shows that the company is trying to offer the best service possible.”
Judging Generic vs. Personalized Content
With personalized content becoming increasingly common, how do people determine whether a piece of content is tailored to them? Not surprisingly, our study participants relied primarily on explicit headings (e.g., Recommended For You, Because You Watched). Besides these obvious indicators, they also used indirect clues, such as the overall popularity of the suggested item or the likely business interest in promoting it, to determine whether an item was a personalized recommendation.
The structure of the service plays a large role in whether users expect content to be tailored to them. A user of the Hello Fresh meal service didn’t believe that the meal options for each shipment were specific to him, even if he marked certain recipes as interesting, because the company would have to do too much work to deliver recipes personalized to each customer.
A user browsing Netflix questioned whether the show promoted in the top banner area on the homepage was individualized to him. After some deliberation, he determined that Netflix had a clear business reason to promote the show, and thus the content was likely to be presented to all users.
“First thing that comes up when I go to my home screen is Orange is the New Black. I’m not sure if it’s the same for everyone or not. … Maybe a new season just came out and that’s why it’s here and everyone sees this? Or maybe because I’ve seen it before they’re trying to say, ‘hey why don’t you watch more of this?’ … I’m guessing everyone probably sees it, because it’s probably a new season release. Watch Season 6 Now it says. So, I’m sure Netflix is trying to push their own original shows.”
The perceived popularity of a piece of content was also used as a clue for judging whether that content was personalized. Our participants tended to assume that trendy or talked-about items were shown to everybody; in contrast, niche or less mainstream content was more likely to be judged as specific to the individual, as was content clearly related to the user’s past behaviors. For instance, an Eventbrite user was unsure whether the Food and Drink events displayed on the site were personalized: although she often went to such events, she felt that everyone else did too.
“Maybe, again, based on my searching and stuff this is what they think I’d find the most interesting, or, ‘we can tell based on your history you’re the most interested in this.’ But I also think that that’s a common thing that people find interesting.”
Similarly, a Hulu user was not sure whether the medical dramas that he saw on the homepage were popular shows enjoyed by a lot of people or personalized content based on past behavior in his account (he mentioned that his mother sometimes watched those shows on his account when she came to visit). In contrast, he thought that the displayed fantasy dramas were likely personalized, because he believed that those shows are less popular.
“I’m sure these categories are somewhat customized. I don’t think TV Fantasy Dramas are very common for most people to watch, and none of these shows are really the shows I hear people talking about right now. So I’m sure these have something to do with what I’ve been watching.”
Accuracy Dependent on Level of Activity, Limited by Technology
Users in our study were very forgiving of recommendations that missed the mark. Many new or infrequent users of a site commented that, in the absence of sufficient data, they did not expect the site to make accurate predictions of their interests. Our participants acknowledged that recommender systems were often easily skewed when multiple people shared a single account, because there would be multiple competing sources of data the system would need to accommodate.
“I’m also conscious of the fact that if I register [for an event] and don’t go, or if I don’t use the site frequently, then they don’t have a lot of data to go off.”
“Once you start watching more movies I think the recommended suggestions will be better for you. This is the type of thing that gets better once you use it more and more. It learns from what you do and then shows things you might be more interested in.”
“This Hulu account me and my spouse both use so some of the shows, Top Picks for Me, would probably be a little less specific than just for me. … A lot of these shows, I wouldn’t be quick to click on.”
Additionally, users expected recommender systems to be imperfect and to make errors occasionally. Computers aren’t mind readers, and they aren’t expected to be any time soon.
“I also know that it’s not a perfect system, and so my expectations of it are kind of tempered. Even if my Spotify account were only mine, I wouldn’t expect to love every single song it picked out for me. Sometimes it doesn’t know why, sometimes I like a song because of personal memories associated with it and Spotify is never going to understand that.”
Poor Recommendations Are Easy to Ignore
Users didn’t mind uninteresting suggestions and simply continued browsing for items they cared about. Even when prompted to investigate whether there was a method to hide or dismiss poor recommendations, many stated they wouldn’t have thought about doing that. For the majority of the services we tested, the interaction cost to give feedback on less-than-ideal recommendations was too high and thus not worthwhile — it’s less work to ignore a bad suggestion and continue scrolling than to hunt for how to mark that suggestion as irrelevant.
“It’s not the type of thing that really offends me or feels like, ‘oh, Eventbrite doesn’t really know me.’ … It’s just the type of thing that I would be scrolling past.”
Frequency of use did not affect users’ tolerance for uninteresting suggestions. While people expected that individualized recommendations would improve over time as they continued to feed data into the recommender system, they remained very tolerant of bad suggestions.
However, for those services that primarily delivered personalized content or physical goods, people were willing and even driven to interact with the system to help the service get to know them better. That’s because the cost of a bad recommendation (e.g., shipping back a box of unwanted items) was a lot higher than the interaction cost of training the recommender system.
“In the context of the homepage [on Amazon] I care less [about good recommendations] than [about quality recommendations from] something like Spotify or Stitch Fix. Because in a way, with Spotify or Stitch Fix, I’m getting those things. And how much effort I put into managing whether or not I’m going to like the things I’m getting determines how good the stuff I’m getting is going to be. Whereas with Amazon, it’s not going to automatically send me the things on the first page, it’s just a starting point to look for something that I want.”
This finding has implications for the choice of recommender engine. Let’s say you’re running a streaming video site with a page layout that allows you to show only two recommended movies to each user. You have the choice between two recommendation algorithms. Algorithm A selects two movies that (averaged across the users you tested) both score 80 on a 0–100 scale of viewer satisfaction after watching a film. In contrast, Algorithm B selects two movies that on average score 90 and 50. So with Algorithm A, average user satisfaction with the recommended movies is 80, whereas with Algorithm B, it is only 70. However, Algorithm B is the better choice, since users won’t mind some bad recommendations (here, a movie they score only 50) but will be happier with the one recommendation they actually end up watching (90 instead of 80).
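The arithmetic in this example can be checked with a short sketch. The scores come from the text; the function names are hypothetical, and the assumption that a user ignores the weaker suggestion and watches only the best one is a simplification of the behavior observed in the study.

```python
from statistics import mean

# Satisfaction scores (0-100) for the two recommended movies, from the text.
slates = {
    "A": [80, 80],
    "B": [90, 50],
}

def average_satisfaction(scores):
    """Mean score across everything that is recommended."""
    return mean(scores)

def watched_satisfaction(scores):
    """Score of the movie actually watched.

    Assumption: the user ignores the weaker suggestion and watches the best one.
    """
    return max(scores)

for name, scores in slates.items():
    print(name, average_satisfaction(scores), watched_satisfaction(scores))
```

Judged by the average, Algorithm A wins (80 versus 70). Judged by the movie the user ends up watching, Algorithm B wins (90 versus 80), which is the measure that matters when poor suggestions are simply scrolled past.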
As with most other aspects of user experience, users’ expectations for your site are driven by their experience with other sites. Thus, if or when major sites improve the precision of their recommendations, it is likely that users’ tolerance of poor recommendations will decline.
Tracking site usage in order to present personalized content isn’t considered an invasion of privacy by many users, at least for services where users have chosen to create and maintain personal accounts.
On the contrary, personalization such as individualized recommendations is viewed as a feature: a sign that a site is trying to serve its users better by helping them narrow down the overall number of options to consider. Poor suggestions are easily ignored, and when the benefit of getting good recommendations is strong enough, users may even be willing to engage further with the site to fine-tune future recommendations.
It’s obvious that personalization is no longer overrated.
Source link https://www.nngroup.com/articles/recommendation-expectations/