I dislike persona projects
Weird, the post was scheduled to send out at 8:05 am and it somehow didn't this morning =\
When you do a bunch of interviews, you're bound to get asked a few common questions. You know the ones, like "describe how you work with other teams", or "how do you explain statistical significance to an executive who doesn't know anything about stats?" Another common one is describing times when you failed at something.
Have a long enough career and there are always things that don't work out, so I've got plenty of stories of the sort. This "job hunting season" my favorite thing to publicly fail about is "persona projects". I'm sure at the mere mention of this, a number of readers will instantly have PTSD flashbacks, because these projects are fertile ground for failures to bloom. The reasons are somewhat twofold.
But before I go on a mini rant, let's get some basic definitions in.
By "persona project" today, I'm talking about a family of data requests that aim to create "personas", caricatures of users that are then used by various people in a business to do various things like design marketing campaigns, train salespeople, design interfaces and build products. These personas can be things like "the stay-at-home mom with 2 kids that has $40k of disposable income to spend taking care of the family" or "the busy teacher in a high school that's teaching 5x 30+ student classes every day and has to track all their work and progress". They can be broad, they can be specific, it varies based on the use case.
The assumption is that these different users have different needs and reasons for adopting any kind of product, so an effective company needs to create products that are suited to the most important types of users, and at least consider building different, more suitable products for other personas.
First, ground-up persona building is a mess
Since time immemorial, executives have gone to data teams and asked "can you take all this data we have about our customers and find patterns of behavior we can call personas?" The base assumption is that "different kinds of users" will emerge as distinct patterns of behavior if you just squint at the data from the correct angle.
If you've ever done this before you'll know that there are mind-blowing degrees of freedom in such a request. It's a complete fishing expedition. Out of the dozens, hundreds, millions of potential data points, you need to find some functional subset that somehow separates/clusters users in a meaningful way. As with a traditional machine learning problem, you can do something like throw all the data into k-means and "find" as many clusters as you please. Take a few weeks to wrangle your data into such a model, run it full of hope and expectations, and you'll be faced with a bunch of clusters that are completely uninterpretable mysteries. Why is it that group 2 looks identical to group 5 except for some weird minor difference that seems irrelevant? What does the activity list of group 8 even mean?!?! We picked k=10 for groups but should it be 5, 15, or 100?
On occasion, I've even had qualitative researchers look at the output groups and talk to users that fit certain clusters and even they can't figure out what the differences are.
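To make the "find as many clusters as you please" point concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the `kmeans` helper is a bare-bones Lloyd's algorithm written for this example, and the 300 "users" are random points with two made-up features and no real groups at all.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Bare-bones Lloyd's algorithm. Returns (labels, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centers[c])))
            for pt in points
        ]
        # Move each center to the mean of its members; leave it alone if empty.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    # Inertia: total squared distance from each point to its assigned center.
    inertia = sum(
        sum((p - q) ** 2 for p, q in zip(pt, centers[lab]))
        for pt, lab in zip(points, labels)
    )
    return labels, inertia


# Hypothetical data: 300 users, two behavioral features, NO underlying groups.
data_rng = random.Random(42)
users = [(data_rng.random(), data_rng.random()) for _ in range(300)]

# k-means returns a partition for whatever k you hand it.
results = {k: kmeans(users, k)[1] for k in (2, 5, 10, 15)}
for k, inertia in results.items():
    print(f"k={k:>2}  inertia={inertia:.2f}")
```

The algorithm never refuses: it hands back exactly the number of groups requested, and the fit keeps "improving" as k grows even though there is nothing to find. Nothing in the output says which k, if any, corresponds to real kinds of users.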
Second, then why not go top down
Obviously if the bottom-up approach of starting from data is so messy, why can't we define the personas first, maybe with the help of a bunch of qualitative research and input from sales teams?
Well, if you have an app that is used by many people for the same purpose, their activity is mainly going to look the same. Imagine your banking app: the most common functions that everyone uses will be the most typical banking operations, like checking the balance and moving money around via deposits and payments. Other features like signing up for a credit card, sending an international wire transfer and creating new accounts are rare events. At this point, the original labels put on the accounts at creation (business checking versus personal accounts, savings versus CDs) are much more informative. Moreover, you don't even have to guess at what they are because they've got explicit labels.
But maybe you have a particularly observant qualitative research team and they come saying "there's a group of users who have this special use case and use your product in this very unique way". Surely, quantitative data can find that since we know what to look for? Well, sometimes. But the reality is that unless you have lots of unique features and paths that different groups of people use, activity tends to follow the same paths because products were literally designed to favor particular paths. Take the banking app again: the vast majority of personal and even small business users will make use of the check balance/deposit/transfer features, so it's much easier to tell them apart with things like transaction volume than by what buttons they click in a UI. So, you can definitely mine some of these things out in the data.
But at the same time you're by definition not going to discover any behavior that you don't know about already – because all your ideas had to have come from somewhere else.
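As a toy illustration of the banking example, the sketch below uses a handful of invented user summaries. The field names, the numbers, and the `txn_threshold` cutoff are all hypothetical, chosen only to show the shape of the argument: the mix of UI actions looks the same across account types, while transaction volume does not.

```python
# Hypothetical per-user summaries: UI action counts plus monthly transaction count.
users = [
    {"type": "personal", "actions": {"check_balance": 40, "deposit": 4, "transfer": 6}, "monthly_txns": 10},
    {"type": "personal", "actions": {"check_balance": 80, "deposit": 8, "transfer": 12}, "monthly_txns": 22},
    {"type": "business", "actions": {"check_balance": 400, "deposit": 40, "transfer": 60}, "monthly_txns": 900},
    {"type": "business", "actions": {"check_balance": 200, "deposit": 20, "transfer": 30}, "monthly_txns": 450},
]


def action_mix(user):
    """Share of each UI action in the user's total activity."""
    total = sum(user["actions"].values())
    return {name: round(count / total, 2) for name, count in user["actions"].items()}


def guess_type(user, txn_threshold=100):
    """Guess the account type from volume alone (the threshold is an assumption)."""
    return "business" if user["monthly_txns"] >= txn_threshold else "personal"


for user in users:
    print(user["type"], action_mix(user), "->", guess_type(user))
```

In this made-up data every user has the same action mix, so no amount of clustering on clicks would separate them, while a single volume cutoff does. And the volume cutoff only works because we already knew to look for a business versus personal split.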
But what about LLMs?
Ahahaha. Okay it's not completely out of the question.
LLMs have shown some ability to do few-shot and zero-shot classification on blobs of text matched to labels, and persona projects can be framed as a similar kind of problem. But this is an open area of research, and now you've introduced yet another parameter into the meta modeling process: is the LLM doing what you want? Can you fix/improve/tweak it by adjusting the prompt? Or should you spend more energy on the dataset? And there are still no guarantees that whatever labels you're generating are actually real either. So it's definitely not better than the above two methods, it's just adding new complexity in the middle.
Either way, I've personally tried all of these methods over the course of my career. All of them have resulted in various degrees of failure, from spectacular to completely unimpressive. I don't think the problem lies within whatever method I used at the time.
Personas as wishful thinking
The number one time I hear that an organization wants personas immediately is from teams that want an answer to the question of "what the hell do I build now?" Maybe the past couple of feature launches were met with lukewarm reception or didn't hit the inflated goals that the product team sold everyone on with rosy projections. Maybe teams just didn't do the necessary user research to figure out what customers actually wanted or needed. Whatever the reason, personas are a simplistic solution: give me, the decision-maker, a catalog of personas to flip through and I can use the power of my imagination to determine features that will cater to their needs. We'll be sure to have product-market fit then!
Other times, I've heard personas being brought up because "the UI is too confusing for X, Y, and Z users!" If only the designers had a rolodex of personas to flip through, memorize, and keep in mind while they design our interface, they could magically get it right the first time! We're going to ignore that our product has only one interface and we can't build a separate "lawyer mode!" into our banking app alongside "caterer mode!" and "weird crypto bro in their mom's basement mode!" We most definitely shouldn't be considering redesigning the UI, or removing irrelevant features and clutter, because all these personas need those buttons and checkboxes.
And even other times, I've seen personas put up as attempts to get product managers to stop talking to customers that they have on speed-dial and "build for a broader audience". Except that the guy on speed-dial is the customer who is literally signing a $50 million contract specifically because of a weird niche enterprise feature that matters to exactly 18 users in the world (who all would be willing to sign similar contracts). Obviously those customers are important, but we should totally be building for these cardboard cutouts that have been invented with data!
Yeah, this train wreck goes on and on. Easy sounding solutions to very tough, sometimes near intractable problems tend to end up this way. It's not even unique to "personas". I've seen it happen with various frameworks du jour, or that whole "we must do agile!" thing. It's the organizational equivalent of binge drinking after a big disappointment/breakup.
Look, they're not completely useless
I know I've spent all this time ranting about personas but I do want to put it on the record that I don't think they're completely useless. They're great for things like creating a targeted marketing campaign after the persona segments are validated as actually existing in the real world in significant numbers. I'm also a fan of using them as a checklist to make sure that when doing major work, you haven't forgotten to consider use cases of some important segment. And I guess they're ok for brainstorming starting points? Maybe?
Okay, so they're not all that useful in practice, but they're not useless. I just hate having to re-fail in "detecting" these things every couple of years because someone important thinks it's going to make everything better. Because it won't.
Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.
Guest posts: If you’re interested in writing something, a data-related post to either show off work, share an experience, or want help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
"Data People Writing Stuff" webring: Welcomes anyone with a personal site/blog/newsletter/book/etc that is relevant to the data community.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
- randyau.com — homepage, contact info, etc.