Castles, even ones on Long Island, are presumably not built in a day

Bootstrapping early (organizational) interest in data

Jul 15, 2025

Oh hey, if you haven't submitted your DataBS Conf talk yet, do so before July 31! Any help spreading news of the form will be appreciated because the more submissions we get, the better our chances of getting really interesting talks on conference day!

For various reasons, I wind up talking to a fair number of people who are "the first data person" in their workplace or team. Maybe they were specifically hired to become the first data person, or maybe they were hired for some other job but they've always been interested in data and found an opportunity to grow into that space.

One very common question I get from people in those positions is how they get teams to start adopting data practices, start investing resources into data infrastructure, and eventually start adopting a more data-driven culture. These folks see all the stories from "more advanced" organizations that have full data teams with funding and maybe even a Chief Data Officer of some sort, and naturally wonder if there will ever be a day where their own organization could have something even slightly similar.

I've got a decent amount of experience advocating for data adoption from the ground up, much less so for advocating for data adoption from the top down, so I can only speak to half the picture, but here we go.

Pushing for data use from the bottom up

The most common situation I see is someone with some data skills who spots an opportunity to get their org to make better use of its data. Maybe they see all this data being collected but not being put to any use. Maybe they see that data isn't being collected but would likely help. Either way, the person asking the question is usually the person who is the most interested in making use of the data.

Everyone else in the organization is presumably very busy with their normal duties already and hasn't had the mental bandwidth to implement any use of data. It's important to note that those same people can have varying opinions as to how data should or shouldn't be used to help their work. Some may be against the idea (rare but they exist), while many would love to have the opportunity. They simply don't have the resources or skills to integrate data into their work.

My advice for getting people to use data from the ground up is to start small and snowball the process up. Instead of trying to change the minds of multiple people at once, find someone to partner up with that already sees the value of using data in their work. Very often there's some kind of engineer, product lead, or executive that wants to run some experiments or do some analysis but has never had the resources to do so. Maybe they've tried to do something in Excel but couldn't access the numbers to make it work. They've likely put a bit of thought into how data could help them, but couldn't make the case to hire a data person to prove it. As a roving person with data skills, this is your opportunity. Go help that person out.

Once you find someone to partner with, the next question is figuring out what to work on. Figuring out the right project is the most important part of this because the difficulty of questions can range from "10 minutes of Excel work" to "boil the ocean and solve an unsolved academic problem". When starting out, you want the former, not the latter. The easier and faster you can turn around data results, the better.

In this situation, solving things in a spreadsheet is actually a good thing because you ideally do not want to use any fancy tools at all. Fancy tools cost money to acquire and time to set up, and you likely have neither to spare if you're just doing data work "on the side". If your organization was writing $10,000/month checks for data analytics infrastructure already, you wouldn't be trying to convince them to use data. So if you can do everything you need straight off your laptop, then you bypass all the headache of having to advocate for and justify any additional expense.

The reason to use bare-bones tooling is that you're trying to show as quickly as possible what data can do for teams. There's really no predicting what kind of effect any piece of analysis will have. Like random social media posts, you can't predict what result will "go viral" and transform a decision. Most analyses will have barely any effect at all, and the one you least expect will grow legs and spread. Plus, since there's supposed to be very little use of data right now, there should be lots of easy projects with lots of potential impact available to work on. Volume is a perfectly valid strategy here, so pick as many easy ones as you can.

The goal for this early work is to show an example of "what working with data to improve things can accomplish". The story you want to tell is "this team had this problem, we found some data that gave us guidance to consider a solution, and it made things so much (measurably) better". Notice how none of that story has anything to do with any specific method. It doesn't even have to involve "data science". I've solved scheduling/queuing problems using my old operations management classwork and support case data. Use whatever works. The point is you want to then take this success story and share it as much as you can so that other teams see this example and say "I want that for my team". You want executives to say "I want all my leads to do something similar". You're building demand because demand means getting support and resources to do more data work.

This ad-hoc demand building exercise will take a lot of time and a bit of luck, but if all goes well, as teams start attempting to use data and banging into stumbling blocks, there'll be discussions at some level to start "building infrastructure to support teams using data". That's the next big step up.

Building out your first data infrastructure

The thing about building infrastructure is that it gets better when its requirements are well defined. It's much easier to build "a system that runs SQL queries in an overnight batch over our sales data, powers dashboards, and lags production data by at most 24 hours" than "a system that lets everyone use the data they need". The first description can be fulfilled using an old laptop or a giant datacenter, but you can visualize what it will look like. The latter one is useless as design guidance.
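
One way to see why the first description is the useful one: a requirement like the 24-hour lag limit can be turned directly into a check. The following is a minimal Python sketch under stated assumptions. The names `MAX_LAG` and `is_fresh` are hypothetical, and how you learn when the last batch finished is left out entirely.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant: the "at most 24 hours behind production" requirement.
MAX_LAG = timedelta(hours=24)


def is_fresh(last_loaded_at: datetime, now: datetime,
             max_lag: timedelta = MAX_LAG) -> bool:
    """Return True if the analytics copy is within the allowed lag."""
    return now - last_loaded_at <= max_lag


# Illustrative timestamps only, not real data.
now = datetime(2025, 7, 15, 9, 0, tzinfo=timezone.utc)
last_night = datetime(2025, 7, 15, 2, 0, tzinfo=timezone.utc)
two_days_ago = datetime(2025, 7, 13, 2, 0, tzinfo=timezone.utc)

print(is_fresh(last_night, now))    # True: 7 hours behind
print(is_fresh(two_days_ago, now))  # False: 55 hours behind
```

There is no equivalent check you could write for "a system that lets everyone use the data they need".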

But when you're starting out with building your data infrastructure, you're going to see requests that resemble the latter more than the former. People simply do not know what they need. Most importantly, you do not know what you will need. You won't even know what sorts of questions you're going to be asked. At most you have a rough idea.

So the most important thing is to resist the urge to overdesign things. I once gave a talk about solving the problems in front of you and it applies here, because you won't know what infrastructure problems you need to solve until you have some basic infrastructure around that gets in your way and helps you learn what infrastructure problems you actually have. Even if you have previous experience at a "data-first organization" and know what all sorts of super powerful data systems look like, it's very likely you need none of that right now. There's no better way to kill enthusiasm for data than spending 8 months building a "real time analytics platform" to answer questions about a business that works at a weekly or monthly cadence.

All this usually means that your very first data infrastructure should be the most ridiculously low effort thing you can cheaply slap together that works. Very often it's a bottom dollar relational database that mirrors the production data you plan on using. It's well understood tech, IT teams will have no problem managing it for you without help, it's dirt cheap to implement, and it's very general purpose and works for most small-to-medium problems. If your goal is still to quickly get more teams to answer the most basic of data questions, sticking to cheap and easy will still be the way. A bit later, once the flaws and requirements become obvious, you can take the time to more formally design a better system. Maybe that will be six months from now, or three years.
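
To give a sense of how low effort "a cheap database that mirrors production data" can be, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption for illustration: the `mirror_table` helper, the `sales` table, and the in-memory databases standing in for a real production system and a real analytics copy. It takes a full snapshot on every run, which only makes sense while tables are small.

```python
import sqlite3


def mirror_table(src: sqlite3.Connection, dst: sqlite3.Connection,
                 table: str) -> int:
    """Replace dst's copy of `table` with a full snapshot from src.

    Hypothetical helper; `table` must be a trusted name, not user input.
    Returns the number of rows copied.
    """
    # Reuse the source's own CREATE TABLE statement for the copy.
    row = src.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if row is None:
        raise ValueError(f"no such table in source: {table}")
    create_sql = row[0]

    rows = src.execute(f'SELECT * FROM "{table}"').fetchall()

    with dst:  # commits on success, rolls back if an exception is raised
        dst.execute(f'DROP TABLE IF EXISTS "{table}"')
        dst.execute(create_sql)
        if rows:
            placeholders = ", ".join("?" * len(rows[0]))
            dst.executemany(
                f'INSERT INTO "{table}" VALUES ({placeholders})', rows
            )
    return len(rows)


# Stand-ins for a production database and the analytics mirror.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
prod.executemany("INSERT INTO sales VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

analytics = sqlite3.connect(":memory:")
copied = mirror_table(prod, analytics, "sales")
total = analytics.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
print(copied, total)  # 2 29.5
```

Run from a nightly scheduled job, something of this shape keeps analysis queries off the production system. Its flaws (full copies, no history, no incremental loads) are exactly the kind of requirements that become obvious later and justify a more formal design.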

It's a marathon, not a sprint

Getting an organization to adopt the use of data is never a quick thing. Using data to make better decisions can happen at almost every level of activity if we allow it. It is a behavioral and cultural shift – such changes are never fast because groups of people never change that quickly. There are going to be weeks and months that feel like you're going nowhere but setting up meetings and running failed projects. You're going to have projects canceled because of new priorities. You're going to find teams that completely miss the point and try to run rigged experiments that only make them look good. You'll be writing training decks, hosting workshops, and going on roadshow presentations to show what data can do.

But eventually, sooner or later, the bits of value and best practice will pile up. The hundreds of little bug fixes, infrastructure tweaks, and incident postmortems will polish things up. And before you know it, you're going to be working with people who not only want to use data to help their work, but actually have an idea of how to do it. You'll only realize that you've made it to that point with hindsight because there are always more things to be done. But just remember to start small and keep chipping away at it.


Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.

Guest posts: If you’re interested in writing a data-related post, whether to show off work or share an experience, or if you want help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.

"Data People Writing Stuff" webring: Welcomes anyone with a personal site/blog/newsletter/book/etc that is relevant to the data community.


About this newsletter

I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.

All photos/drawings used are taken/created by Randy unless otherwise credited.
