A long ramp down to the commuter train platform at Grand Central Station, NYC

Different levels of optimization problems

Feb 18, 2025

When you get right down to it, data scientists predominantly rely on one meta trick to justify our existence. We are optimizers. Zoom out a bit from the day to day, and we are essentially doing something to max(z) or min(z).

Much of the fancy pants ML and "AI" stuff that exists today is because we figured out algorithms to give us loss functions to minimize. Ordinary Least Squares literally has min(z) in the name. A/B experiments are judged by comparisons against control so that we can incrementally head towards what we hope is a local optimum. All the ancillary stuff like data engineering, data collection, and user research aren't directly optimization actions, but instead things we do to allow us to apply optimizations in the future.
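To make the "min(z) in the name" point concrete, here's a toy sketch (with made-up data) of ordinary least squares framed explicitly as a minimization: the fitted coefficients are literally the argmin of a sum-of-squared-residuals loss.

```python
import numpy as np

# Made-up toy data: y roughly follows 2x + 1 with some noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, size=x.size)

# OLS minimizes z = sum of squared residuals, ||X @ beta - y||^2
X = np.column_stack([np.ones_like(x), x])       # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)    # the argmin of that loss

loss = np.sum((X @ beta - y) ** 2)              # the z we just minimized
print(beta, loss)
```

Any perturbation away from `beta` makes the loss larger, which is the whole point: the "learning" is nothing more than finding the minimum of z.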

But upon thinking more about our "one simple trick," I'm starting to gain an appreciation for how much depth there is to it as we grow and improve in our careers and techniques.

Starting point - optimize

Generally we start out in our careers being asked to help make a number go up or down. The most common ones being to maximize revenue or minimize costs. There are lots of these problems to solve for in a business context and the specific techniques that can be applied to the problem range from extremely simple to novel ones at the cutting edge of research.

To be clear, I'm separating the optimization task at hand ("make number go up"), which can be very straightforward, from the specific methods used to achieve that improvement. What I'm trying to say is that these straightforward problems are our bread and butter. They're easy enough to understand that people completely new to the field can be expected to understand what is being asked of them.

There's little subtlety or nuance in these simple formulations. When taken to extremes, you can easily get into pathological, end-stage capitalism situations: the creation of dark patterns, abuse of users, and all sorts of unethical or even illegal practices. The license to exclude all other factors from consideration is powerful, and it can be very corrupting.

Despite the potential dangers of this way of approaching problems, we are very lucky that industry is full of problems that can be framed under this simple concept of "number go up". We can usually find something to make go up in a way that justifies someone paying us to investigate it.

Leveling up – constrained optimization

But life is never as simple as monotonic loss/reward functions and linear combinations. Many things involve juggling and balancing not only parameters that are antagonistic, but outcome functions that themselves are antagonistic. These problems also abound in industry, and our job here is to help find the balanced maximal points.

An example of such a problem is when we are asked to help maximize profits instead of just revenue. I could theoretically generate more and more revenue for a business by spending increasing amounts on advertising. The obvious problem with such a solution is that it will eventually cost me more to acquire a new customer than I can realistically earn from that customer.
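The revenue-vs-profit tension can be sketched with entirely hypothetical numbers. Here I assume a diminishing-returns response of revenue to ad spend (the square-root curve is an assumption for illustration, not real data): max(revenue) happily spends the whole budget, while max(profit) stops well short of it.

```python
import numpy as np

# Hypothetical toy model: revenue grows with ad spend, but with
# diminishing returns (square-root response is an assumption, not data)
spend = np.linspace(0, 200_000, 2001)   # candidate ad budgets, in dollars
revenue = 500 * np.sqrt(spend)          # max(revenue) says: spend everything
profit = revenue - spend                # max(profit) pushes back on spend

best_spend = spend[np.argmax(profit)]
print(f"Revenue-maximizing spend: {spend[np.argmax(revenue)]:,.0f}")
print(f"Profit-maximizing spend:  {best_spend:,.0f}")
```

Under these made-up numbers, revenue is maximized at the edge of the budget ($200,000), but profit peaks much earlier, where the marginal dollar of advertising stops paying for itself.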

Now, yes, I know you can say that max(profit) is just a plain old optimization problem that encompasses those antagonistic parameters. Mathematically there is little difference. My point is that the awareness that the problem at hand has many hidden constraints that must be articulated marks a significant step in maturity of thinking for a data scientist.

Unlike the first class of optimization, here we are aware that there are tradeoffs being made in the background. It is up to us to identify which factors we are going to explicitly include, and which to exclude from consideration. We even get to decide whether to go looking for more factors that need consideration.

While the work sounds similar, it's actually a pretty big expansion of scope. We now have a say in what is and is not important! Whereas early in our career, we're told explicitly what to value and hold important above all else, we now have enough domain knowledge to have an opinion as to what to value, and where the boundaries and constraints are.

Most people wind up doing this naturally during the course of their work because they've seen things go wrong before and now know enough to stop it ahead of time. Why yes, I've seen efforts to get new customers with large discounts, but the retention rates were so abysmal we wound up losing a ton of money on the effort; here's what we should watch to make sure we stay on track. Yes, we've launched a new feature that dramatically increases retention by forcing users to solve a sudoku before terminating their gym membership, but have you seen the rage in the support calls?

The things that refuse to optimize

Finally, the world isn't all hill climbing. We can spend our entire careers optimizing stuff that lends itself to optimization, but there are plenty of problems that are extremely difficult to even formulate as a max(z) problem.

The most recent example from my life is picking a few lighting fixtures for the house. While there are a few baseline minimum conditions that need to be met – a certain range of color temperature, certain size constraints, whether it mounts to the ceiling or wall, etc. – almost everything else about it is completely left open to our taste. While I could use the baseline constraints to winnow the tens of thousands of potential choices down somewhat, we were still left with literal thousands of lights to flip through. There simply is no algorithm for maximizing "aesthetics".
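The winnowing step itself is easy to write down; it's everything after it that isn't. A tiny sketch with a made-up catalog (the fixture names, fields, and thresholds are all invented for illustration): the hard constraints filter cleanly, but nothing in the surviving rows gives us an objective to rank them by.

```python
# Made-up catalog of fixtures; in reality this would be thousands of rows
fixtures = [
    {"name": "A", "kelvin": 2700, "width_in": 14, "mount": "ceiling"},
    {"name": "B", "kelvin": 5000, "width_in": 12, "mount": "ceiling"},
    {"name": "C", "kelvin": 3000, "width_in": 30, "mount": "wall"},
    {"name": "D", "kelvin": 3000, "width_in": 16, "mount": "ceiling"},
]

def meets_constraints(f):
    # Baseline requirements: warm-ish light, fits the space, ceiling mount
    return (2700 <= f["kelvin"] <= 3500
            and f["width_in"] <= 20
            and f["mount"] == "ceiling")

# Constraints winnow the catalog; taste has to handle the rest
candidates = [f for f in fixtures if meets_constraints(f)]
print([f["name"] for f in candidates])  # A and D survive; no metric ranks them
```

The filter is the whole extent of what can be systematized here; choosing between the survivors is the "vibes" step that refuses to become a max(z).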

Dealing with this was a huge, painful struggle for me because I'd catch myself trying to invent some kind of metric to optimize against, like a specific style with a certain color finish, or a specific dimension or material. But I'd still find lights that looked either ugly, or pretty but ran completely counter to any existing decor in the house. Obviously, all my attempts at systematizing the process failed miserably, because I might like 5 different lights and they all might fit, or not fit, the overall aesthetic of the room.

Ultimately I found myself browsing lighting shops online, flipping through every single page just noting down candidates completely by vibes, then when I got things down to the last handful of candidates, I'd show them to my spouse so we could get into a debate over whether we can agree that any one was "good enough" (again, based entirely on vibes). The process was utter torture.

In more serious work settings, I'm often confronted with this issue when it comes to dealing with very ambiguous situations that involve designing a system early on. How can you tell whether project A or B is going to work better? After all the research is done on the potential impact, costs, and payoffs, there are still enough unquantified unknowns to plague the decision. So what else can we do but pick one and make adjustments as we go?

I'm starting to see that the more actual decisions you have to make, as opposed to merely making recommendations about what a good decision would be, the more you have to fight with this problem. Since I don't wield this sort of authority at work all that often, I've only really hit it in my little side projects – like when I was forced to pick out the countertop for our updated kitchen.

Sometimes, though, while working in such an ambiguous problem space, you do stumble upon a mix of metrics and processes that lets you measure things in a way that turns it into an optimization problem. Congratulations on making headway into taming a novel problem!

The arc of responsibility

I'm not sure why my conception of optimization skills follows an arc of increasing responsibility. It just appears to me that as you gain more experience with how complex systems work, you transition from taking a simplistic view to understanding the system more as a whole, until finally you're being asked to make decisions and create new systems.

I'm not sure if this narrative arc I'm telling is complete either. There might be things further along the path to expanding scope that I just haven't seen yet, so perhaps my views will refine over the years. But for now, it seems complete enough.


Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.

Guest posts: If you’re interested in writing something, a data-related post to either show off work, share an experience, or want help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.

"Data People Writing Stuff" webring: Welcomes anyone with a personal site/blog/newsletter/book/etc that is relevant to the data community.


About this newsletter

I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.

All photos/drawings used are taken/created by Randy unless otherwise credited.

  • randyau.com — Curated archive of evergreen posts. Under re-construction thanks to *waves at everything

Supporting the newsletter

All Tuesday posts to Counting Stuff are always free. The newsletter is self hosted. Support from subscribers is what makes everything possible. If you love the content, consider any of the following ways to support the newsletter:

  • Consider a paid subscription – the self-hosted server/email infra is 100% funded via subscriptions
  • Send a one time tip (feel free to change the amount)
  • Share posts you like with other people!
  • Join the Approaching Significance Discord — where data folk hang out and can talk a bit about data, and a bit about everything else. Randy moderates the discord. We keep a chill vibe.
  • Get merch! If shirts and stickers are more your style — There’s a survivorship bias shirt!