The importance of looking outside of "the data"
Two weeks ago, a post from a former Nike marketing director made the rounds on LinkedIn and floated to my attention via UX folk like Pavel's take on it. It's a pretty long read (that's totally worth it), but the short summary is that Nike had a huge drop in share price after a very disappointing 2024 Q2 earnings report.
The author of the post places the blame on decisions made by a new CEO in 2020 that involved a big shift in how Nike did business. Essentially, the company was to become data driven, focusing much of its energy on direct-to-consumer (DTC) sales via the nike.com site. In service of this goal, there were massive restructurings that, among other things, demolished the concept of product categories within the company (?!), terminated multiple wholesale distributor relationships, and shifted where and how the company spent its marketing dollars.
Once the dust settled four years later, the company had wound up not doing a bunch of things that had helped maintain their dominance in the marketplace. They didn't put as much product onto retailer shelves because they were focused on DTC efforts, so those retail stores had room to put other brands into those shelf spots. Normal customers of athletic shoes weren't as brand loyal as the executives expected, and weren't as willing to buy shoes from a website – they'd just buy a nice pair of whatever brand was at the stores they were used to going to. Inventory built up because the data wasn't giving very strong signals on what would sell well. Margins tanked because price and discounts are a huge factor in the online marketplace, while they're significantly less so in a retail setting where the retailer may run their own sales (and cut their own margin) independent of what corporate is doing. Wall Street was quite displeased, and the stock dropped almost 25% overnight after that 2024 Q2 earnings report.
But one callout caught my attention out of the many things going on in the post: they pointed out that the CMO at the time made a big leap in focus from "demand generation" to "demand retention". While I don't think the executives at the time sold the change exactly as such, that's what happened in practice. At the heart of it, becoming "data driven" and focusing on increasing DTC sales meant you were going to leverage all the data you collect about your direct customers in an effort to understand them and sell those very same people more stuff.
In the end, things fell apart because the strategy actually did succeed in its goal of encouraging a specific segment of existing customers to spend more and more. These were usually people who were buying a constant stream of increasingly niche "limited edition" shoes, and a fair chunk of those repeat buyers were likely scalpers of said shoes too. While that increased the general lifetime value (LTV) of such customers and made certain metrics look good, the rest of the strategy wound up screwing up a lot of squishy, hard-to-measure things that turned out to be critically important to the business, like Nike's placement in stores and awareness of the brand among people who didn't collect shoes. It even messed up their read on what mass consumers wanted to buy, because they had burned a bunch of wholesale relationships that used to order designs by the container-load according to customer demand, while removing the categories of experts who knew the trends and needs of their particular product lines.
Data-driven death spirals
Being "Data-driven" has become this overused buzzword in the business world. Who would object to making decisions based off data instead of unspecified vibes? It's like saying you only eat food – what alternatives would you eat? Saying you WON'T be data driven is usually going to be more of a problem than not.
As the people who stand at one of the major intersections of business functions and data, we data scientists are often placed at the center of such efforts. Between us, data analysts, and business intelligence folk, there are whole armies of people with data skills who'd love more attention and funding for their projects. Thanks to that, we are all highly incentivized to encourage people to use as much data as they can handle, because it increases our own visibility as well as our influence. Most people working with data will consider a CEO declaring that the company is going to be 100% data driven a positive direction to cheer for. It's a grand validation of why we even have jobs to begin with.
But as seen here, such a strategy can have very long-term consequences. I've seen whole niche gaming industry segments essentially paint themselves into a corner by catering to their "core users" for a decade, only to realize that they hadn't really attracted many new users in the same period. Now the whole genre is a shadow of its former self, with most studios bought up or shut down. Nike, being much bigger and seemingly more self-aware, has plenty of time to make course corrections and fix things. But it's going to take work.
If you follow the gaming industry, you'd recognize that it went through a ridiculous boom/bust cycle around the very similar concepts of loot boxes and microtransactions. There's still a giant market for free-to-play "gacha" games for mobile devices that derive most of their revenue from "whales" who spend more than enough to pay for all the costs of the rest of the player base. A few of those games are successful, but there's a huge bloody churn of similar games that don't make it past even a year of service. A small (or even not so small) game studio can try to hunt whales amongst a sea of affluent gamers and maybe eke out a profitable existence. I'm much more skeptical of a multi-billion-dollar physical goods operation doing the same thing.
So why do I, and others, point at a focus on "being data driven" as a major factor in this episode? It's because being data driven is primarily a culture thing, and only cultural changes have this kind of company-eroding potential. You cannot force an organization to be data driven without driving deep cultural change across many systems at once. You also can't fake such a transformation by making people check boxes and run a couple of A/B experiments on autopilot. This change was both pushed top down and adopted bottom up. What gets valued and funded changes. How things are funded changes. Whole teams get spun up, processes get revamped, other teams get destroyed, technology gets procured, and much more.
What happened in the case of Nike is that performance-based marketing and sales on the website are very easy to quantify, analyze, and translate into demonstrable changes to the bottom line. It's an everyday occurrence in our field to run such an experiment and proudly declare that "if this experimental lift remains steady, we'll see an $XYZ million lift in annual revenue!" Maybe some of us are self-aware enough to know those claims are mostly lies that will likely never materialize, but we use similar phrasing because it catches people's attention.
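To make that sleight of hand concrete, here's a minimal sketch of the back-of-envelope extrapolation behind those headlines. Every number in it is hypothetical, and the comments flag the assumptions the headline figure quietly bakes in.

```python
# Hypothetical back-of-envelope math behind "$XYZ million annual lift" claims.
# All numbers below are made up for illustration.

baseline_weekly_revenue = 5_000_000   # revenue flowing through the tested funnel per week
observed_lift = 0.021                 # +2.1% revenue lift measured in the experiment
weeks_per_year = 52

# The headline number assumes the lift holds forever, applies to all traffic,
# and nothing else changes -- assumptions that rarely survive contact with
# reality (novelty effects, seasonality, other launches interacting, etc.).
claimed_annual_lift = baseline_weekly_revenue * observed_lift * weeks_per_year
print(f"Projected annual lift: ${claimed_annual_lift:,.0f}")
# Projected annual lift: $5,460,000
```

The arithmetic is trivially correct, which is exactly why the claim is so persuasive in a meeting; it's the unstated assumptions that do all the lying.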
You can do similar analyses for internal operations like customer support, production, and logistics, and claim to have saved millions of dollars there. Optimizing stuff like this is exactly where data scientists can go ham and point out how much money we're saving or making while everyone cheers. The tools for this are well known and easily applied – I'm positive you know at least some of the toolkit. Most importantly, even people who don't work with data have a working understanding of this stuff, and results with big dollar amounts speak for themselves. So if there's a push to fund this sort of work, it's very easy to see a parade of small and big wins for months or years.
As others have highlighted before, the biggest problem is that the vast, vast, vast majority of the data you get is from existing customers. The more they buy from you, the more you know about them and the better you can squeeze an upsell out of them. It creates a feedback loop: the organization has more data about a group, so it's better able to cater to them, which creates more data. The obvious problem is that you increasingly pay attention to an ever smaller group of high-spending "core customers" to the exclusion of everyone else.
Think about how much data Nike would have gotten about the purchasing habits and preferences of the weird group of customers who spend thousands of dollars a month on limited edition shoes compared to someone like me who buys a pair of shoes maybe once a year at some random retail store. Then imagine hanging your entire business strategy on any insights gleaned from that dataset. When a company goes all-in on such a strategy to the exclusion of other endeavors, it essentially means they've decided they can't grow any more from finding new customers and so must grow by squeezing more money out of the ones they have. Except, even if you had 100% market penetration and massive brand recognition, you always need to find new customers because your existing ones will, if nothing else, succumb to old age eventually.
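To put rough numbers on that skew, here's a tiny sketch with entirely made-up figures: even when the heavy buyers are a sliver of the customer base, they can dominate the purchase records your analyses are built on.

```python
# Toy illustration of how purchase data skews toward a small group of heavy
# buyers. All segment sizes and purchase rates are hypothetical.

segments = {
    # segment name: (number of customers, purchases per customer per year)
    "collectors": (50_000, 40),     # niche, limited-edition repeat buyers
    "occasional": (5_000_000, 1),   # people who buy one pair a year at retail
}

total_customers = sum(n for n, _ in segments.values())
total_purchases = sum(n * freq for n, freq in segments.values())

for name, (n, freq) in segments.items():
    print(f"{name}: {n / total_customers:.1%} of customers, "
          f"{n * freq / total_purchases:.1%} of purchase records")

# collectors: 1.0% of customers, 28.6% of purchase records
# occasional: 99.0% of customers, 71.4% of purchase records
```

Any model or "insight" trained on those records will, by construction, mostly describe the collectors – and that's before counting all the people who never show up in the data at all because they bought a competitor's shoes at a retail store.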
So the key point in all this is that we metaphorically have two arms – one for finding meaning in existing data, and another for finding meaning by collecting new data. We keep going to the gym, exercising and flexing the first arm, while half-forgetting we even have the second. We rarely think to use that weak arm because it feels somewhat out of our wheelhouse. If we're charged with optimizing a process and are sitting on tons of data the process has already generated, it's usually way outside our scope to ask whether that process should exist to begin with, unless our analysis somehow leads us to that conclusion.
Looking outside much more
We, as some of the primary stewards of data at an organization, have a duty to make sure that everyone around us relies on data when it makes sense to. We also need to make sure that everyone is aware that the very data-driven process they tout as objective and neutral is actually quite biased and fundamentally unable to see the whole picture. Essentially, they cannot rely on us alone for the whole story that's needed to operate a successful business. The gains in operational efficiency and the insights into the pain points of existing customers are extremely valuable work that quantitative methods can consistently deliver using nothing but internal customer data. But to continuously focus inward is to lose sight of where the money really comes from – outside.
When I worked as a data analyst/scientist earlier in my career, looking "outside" was not something I really did. I had my hands full supporting a gazillion teams and juggling endless ad-hoc requests. The same went for any peer data folk I worked with. We were plenty happy with whatever optimization gains we could get because there were so many places to find improvements. We had years' worth of potential work lined up already without volunteering to take on more.
At most we'd occasionally touch upon projects that intersected with the external world – for example, when we needed to segment traffic from an advertising campaign, or when someone ran a marketing survey and we were asked to analyze the responses. But we almost never got asked to actively go out and solicit or collect data from external sources. It's not even that we don't have methods for these questions; it's more of a structural blind spot in how our jobs tend to be set up.
But working closely with people from a UX background was a big change in those habits. Since UX teams generally have a much stronger history of figuring out what people who aren't current customers think and want from products, they had well-developed frameworks for taking a step back and asking what those non-customers wanted and why they chose not to buy our stuff. The same is true of marketing teams, because much of marketing's job is convincing new people to spend money on our products. Both of these groups are more likely to consider the existing data we have to be just a starting point, a single piece of a much bigger puzzle.
For me, simply acknowledging on a regular basis that "the data as seen is not the full story" was a big improvement. Once you accept that, a lot of the other questions come naturally – what makes our customers special? Why don't people with very similar needs and problems find our product? Answering them is a rather painful, uncomfortable process of needing to communicate with strangers (gasp!), but that's the nature of the research.
But there are lots of data science teams who never have to cross paths with these more outward-facing roles. That's the dangerous thing about our jobs. If we're to bear any responsibility for our influence on how our companies make use of data, we really need to make other people aware that the data bubble exists. We can't just put a little asterisk in an appendix that notes "*based on our existing data" and absolve ourselves of asking the very hard question of whether our data has any value at all toward growing the business. Most people haven't fully thought through the implications of sourcing all our data from a very biased sample of "people who are willing to pay us money".
Since there aren't that many people around who can teach that lesson, we're gonna have to teach it ourselves.
Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.
Guest posts: If you’re interested in writing a data-related post to show off work or share an experience, or if you need help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
- randyau.com — Curated archive of evergreen posts. Under re-construction thanks to *waves at everything
Supporting the newsletter
All Tuesday posts to Counting Stuff are always free. The newsletter is self hosted, so support from subscribers is what makes everything possible. If you love the content, consider any of the following ways to support the newsletter:
- Consider a paid subscription – the self-hosted server/email infra is 100% funded via subscriptions
- Share posts you like with other people!
- Join the Approaching Significance Discord — where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the discord. We keep a chill vibe.
- Get merch! If shirts and stickers are more your style — There’s a survivorship bias shirt!