career (102)
-
Dashboard rot as org attention grave markers
Dysfunction is still dysfunction, regardless of how it got there.
-
Effective (training) firehose sipping
Just some habits I've built up
-
Are quant and qual UXR melting into one thing?
Employers are being very picky right now and it sucks
-
From emergency to normalcy again
Things are gonna be okay
-
I dislike persona projects
A rant about failure and magical thinking
-
A generalist in a "specialist" job market
Don't see "Full-stack" flying around that often any more
-
Looking for work, surprise edition
I always told my wife this newsletter was my backup plan...
-
Bootstrapping early (organizational) interest in data
The first steps at making things work, in my usual scrappy way
-
Work is full of blank pages
I didn't see them everywhere until I really started looking. I had forgotten they existed.
-
Crime-ing with Data Science
Apparently we're not in a field that makes it easy to get rich quick =\
-
We're not always "back of house"
No one ever thinks of data science as "customer facing", but that doesn't mean we're never...
-
Anyone* could learn data analysis. Probably.
Thoughts on how people should teach and learn analytics based on the question at hand, and not the other way around.
-
I'll be teaching a logs analysis course in April
It'll be in person, and it's gonna be fun! I promise at least that much.
-
What's a Quantitative User Experience (UX) Researcher, 2025 edition
An update for an evolving job title.
-
A year after Substack, good and bad
An update for a year of changes. In hopes anyone wanting to write more can get something out of it
-
(Not) Building for the long term
Planning is hard to do, especially at fast-paced startups. So how about you spend less time planning and more time breaking things down into useful chunks?
-
Book Review: Data Management in Large-Scale Education Research
Probably a lifesaver if you're in this academic field, especially if you've never collected data before. BUT the later chapters are relevant to people of all skill levels and experience, even in industry.
-
Sometimes, stuff's not worth measuring
A guide to things that aren't worth our time and we need to bap the requests that try to make us work on them.
-
"Working" with inconclusive results
Data work tends to mean getting inconclusive results, which are really inconvenient in industry because they're... nothing... and it's quite inconvenient
-
"Good enough" charts for work
I admire good dataviz, I also know that I'm hopeless at those. So instead I've found a place where it's good enough for the work I do.
-
Technical writing is too important to leave to language models
There are no shortcuts to writing something worth reading
-
Your SQL's probably not that bad (compared to most people)
If you use it all the time at work, you're probably already in the upper percentiles in skill.
-
Being air-dropped into an analysis
Domain knowledge is critical to data work, but we really do get thrown into the deep end at times. Why does that even work at all?
-
The call of LLMs is strong, we get to pick up the pieces later
More fields are getting on the whole LLM synthetic data generation bandwagon. Honestly at this point the best career move seems to be in learning how to pick apart and vet LLM systems instead of fighting them all the time...
-
The basics of project management for data folk
It's impossible to explain project management in a single post, but the general framework for understanding why work projects feel so objectively different from school ones is helpful to make sense of it all.
-
Why I enjoy improving enterprise software
Consumer products are fairly straightforward and interesting, but I guess I like the weirder stuff.
-
Don't expect data to change everything like magic
Many people joining the data science field harbor the belief that they're gonna help change how things work, make stuff magically better. Sadly, we all learn that it's much harder than that.
-
Onboarding and fishing for tacit knowledge
Making transitions smoother
-
Book Review: Solve Any Data Analysis Problem
Good for people starting their analyst careers
-
(Human) networking is making friends
TL;DR: Spread love, joy, or at least helpfulness
-
Stepping outside and looking around
Go out, take a breath, and see what other industries are doing
-
It's OK To Declare Email Bankruptcy!
I'll do it. You should do it too!
-
Working in data as a "meh programmer"
Working around myself
-
Finding work at work on your own
During slow stretches
-
Uhoh, I'm doing a PyData NYC 2023 talk
On Nov 02, 1:30pm
-
Pondering on Process Debt
Maybe it's a thing
-
We don't talk about communicating well
Despite it being critical to our job
-
Failure comes for us all
Gotta learn to handle it
-
Most can learn analysis, but won't become analysts
Because it's different
-
I am terrible at talking about my achievements
But it's not just me
-
Remember to check your stress levels
Before things spin out of control
-
You are not a method
But you can easily become one if you're not careful
-
DS work doesn't have to be purely discovery work
If you like execution-type work like me
-
Learning to go from creating to editing
Just like code reviews
-
Everyone You'll Ever Meet Knows Something You Don't
The corollary to this, of course, is that you know something that they don't!
-
Working with the rhythms of the business
Going with the flow
-
Staying Sharp in Data Science
Can take a lot of unexpected paths
-
Reinforcing our data friend connections while we can
In case "Data Twitter" becomes just "Data"
-
Working with a non-DS manager
It's either a bit of work, or a lot of work
-
Coming up with a talk proposal
Because the hardest part is starting
-
Optimizing for personal portability
Instead of mastering a thing
-
Stories from the last downturn
Life keeps happening
-
We should phase the "SQL Interview" out
For something more general
-
Let's talk a bit about giving interviews
One day you're going to have to
-
Comparing Quant UX Researchers w/ Data Scientists
It's a mess because everything's undefined
-
You're probably on a cutting edge
Of at least something
-
The gap between data science, and UX research
There's soooo much overlap
-
Reflecting upon reflecting
Mirrors upon mirrors
-
Surviving planning from the bottom up
Not all of us sit at the big table
-
We should treat data science as a craft
And stop obsessing on surface technical skills
-
Ways to become a Quantitative UX Researcher, w/o a PhD
It can be done... with effort
-
Vacation is the power to say "no" by not even being there
And that's a good thing
-
Our lives are nested orders of operation puzzles
Practice practicing!
-
Personal growth and brands
Does one even come before another?
-
When scaling yourself goes a bit too far
Because, surprisingly, it can
-
Interview season is upon us
Looking at things from the other side of the table
-
Practical benefits of knowing how obscure systems work
A mini case study
-
Muddling 1/3rd through a career
A bit of reflection
-
Being a broad-spectrum data scientist
Is something of a niche. Heh.
-
Scaling yourself
Because you'll be buried otherwise.
-
You should pick your org chart when looking at a position
We rarely talk about this to newbies
-
Dumb Stonks, Prisoner's Dilemma, and Artists
Sorry, brain's overloaded
-
The only path to be a data scientist is to be human
This job will use everything you are
-
The best parts of data science isn't even the tech
The Human Experience
-
(Maybe) Adopting New Tech
Not everyone's on the bleeding edge
-
Staying afloat as a new-ish solo data scientist
It's hard until you learn to handle yourself, and then it's still hard
-
Navigating working with other teams
Generating value upwards
-
Showing value as a support data scientist
A past experience dump
-
Side gigs and avoiding them
For (your) sanity's sake
-
Questions that make Quant UXRs excited
Everyone tends to complain about their work
-
In search of "Good Enough" data science
I still can't find it at times
-
Dashboards aren't my job, until they are...
It's complicated
-
SME Should Stand for Subject Matter Experience
It's a sliding scale, not an end goal
-
Why we are so tempted to go out of lane?
Work has conditioned us to
-
The Epic Data Fetch Quest
It's dangerous to go alone! Take this.
-
Becoming a Quantitative UX Researcher is messy
Don't expect to land the title easily, heck, don't even go for the title
-
10x data scientist is luckily not "a thing" let's all work to keep it this way
We seriously don't need that sort of nonsense
-
Being the Eyes of Your Organization
Be Aware of the Power You Wield as a Data Person
-
Be Yourself: The Data Scientists You See In Public Are Not Representative
It needs to be said regularly — it’s OK to be nothing like them
-
Emulators, Machine Translation, Self-service skills
Not all AI/ML ethical dilemmas involve life and death…
-
How I wound up being a Quantitative UX Researcher
Not exactly Data Scientist, but close enough you can barely tell the difference. It’s a fairly obscure position. This is, in broad strokes…
community (52)
-
Code review for data (and non-SWE) folks
TL;DR is, your job as a reviewer is to help the reviewee come up with better code than they initially did.
-
DataBS Conference updates!
Updates!
-
No one works with clean slates, we shouldn't write with it either
The audience for messy, one-off examples seems larger than it initially appears. If we are only willing to listen
-
Making "intermediate+" content
It's what the internet needs, and we're the only people who can create it for ourselves and our peers.
-
Discussing failure in public, we should do that more
Social media has got plenty of positive publication bias, and it's a good thing data folks are slightly more open to talking about failures. Slightly.
-
And we keep counting on...
Because I don't know what else to do...
-
We're all the #dataBS, but individually not
A reminder that a community is made of all of us together, including you. We all need to act out the community that we want to be in.
-
Data-Twitter is having a MOMENT on Bluesky right now
Dare I hope Data-bluesky will become a thing?
-
It's 2024. Let's have a WEBRING
Old person revives ancient ritual that connects related sites from data writers. Asks you to submit PR and join.
-
Phew, coming up for air soon
A brief update on what's been going on at the newsletter. Nothing crazy.
-
Summary: Data Mishaps Night 2024
A full summary of all the talks from Data Mishaps Night 2024, lightly anonymized for privacy.
-
Coding... in public... sorta
Who would've thought scheduling random meetings would be hard?
-
(Human) networking is making friends
TL;DR: Spread love, joy, or at least helpfulness
-
Stepping outside and looking around
Go out, take a breath, and see what other industries are doing
-
Calls to actions, "like and subscribe", are (sadly) necessary
However much I hate them, you can't avoid them
-
Data DIY in the age of industrial pickaxes and shovels
Talking about using our tools instead of the tools themselves
-
Everyone You'll Ever Meet Knows Something You Don't
The corollary to this, of course, is that you know something that they don't!
-
Writing posts, maybe as a guest!
Or just for yourself
-
We all make for a good conference experience
Yes, you too
-
Everything is on fire and you should contribute
The Newsletter Version!
-
Reinforcing our data friend connections while we can
In case "Data Twitter" becomes just "Data"
-
Coming up with a talk proposal
Because the hardest part is starting
-
We need to help each other write better
Because knowledge gaps are hard to spot
-
Self-narrating some usability testing for others
A bit of dogfood reporting
-
Building Discords and Community
A mini crash course
-
You're probably on a cutting edge
Of at least something
-
Discussing AI is such a confusing mess
and it's no one's and everyone's fault
-
Celebrating everyone counting things
Nerding out on counting
-
Making space for others
And I'd like to help
-
What Data Folk Were Saying about Zillow
Not a "What happened?" post no one needs to read
-
Research is a team sport, even the analysis bits
Despite it being our primary job
-
DS jargon is just everyone else's jargon
And we're going to absorb even more. MORE!
-
Dumb Stonks, Prisoner's Dilemma, and Artists
Sorry, brain's overloaded
-
Almost a year of data newslettering
Last post of the year!
-
Running small conferences is quirky work
But worth it
-
The only path to be a data scientist is to be human
This job will use everything you are
-
Real Workflows in Data Science
Some of them, at least
-
More Learning in (Semi)-Public
A 3-month dogfooding update
-
I'd like more people to join the broader data community
I want more people to talk to dangit!
-
Making the best of having too many meetings
What else can you do =\
-
Learning as Performance
And as a teaching tool
-
The Many Ways of Learning Git
And other stuff about teaching stuff
-
#Datarant is Tuesday 3/17, Join Us!
It'll be fun, I promise!
-
Drinks ‘n Data Rant this Saturday
A fun little (online) event for data folk
-
Be Yourself: The Data Scientists You See In Public Are Not Representative
It needs to be said regularly — it’s OK to be nothing like them
-
The First Question I Have For Every Data Request
And how I use it to build partnerships with cross-functional teams
data-culture (129)
-
Code review for data (and non-SWE) folks
TL;DR is, your job as a reviewer is to help the reviewee come up with better code than they initially did.
-
Dashboard rot as org attention grave markers
Dysfunction is still dysfunction, regardless of how it got there.
-
Please write docs with a theory of mind
It really shows when people don't do this
-
Velocity straight into a cul-de-sac
Warp speeeeeeeeed
-
Effective (training) firehose sipping
Just some habits I've built up
-
Divorcing data collection from data analysis, slightly
Starting the year with some philosophical wandering!
-
Software interfaces are conversations, measure accordingly
Nothing wrong with building sandcastles, right?
-
Are quant and qual UXR melting into one thing?
Employers are being very picky right now and it sucks
-
From emergency to normalcy again
Things are gonna be okay
-
A generalist in a "specialist" job market
Don't see "Full-stack" flying around that often any more
-
Data-Driven <-> Data-Stuck
We're all just stumbling around doing our best, I guess
-
DataBS Conf attendee registration is open!
Also the first batch of confirmed speakers listed
-
Watching a data team get their credibility nuked in national news
Goodbye BLS as we knew you
-
Something weird is going on and "real-time data" is a topic again somewhere?
Somehow Google Trends for the -topic- of "real-time data" spiked up in the past year and it's just really odd
-
Bootstrapping early (organizational) interest in data
The first steps at making things work, in my usual scrappy way
-
Transmitting user empathy via data
We can do that?
-
As a mostly ad-hoc'er, I've got workflow issues
Please, let's celebrate doing things the ugly way. For our peers that do things just as ugly.
-
Crime-ing with Data Science
Apparently we're not in a field that makes it easy to get rich quick =\
-
We're not always "back of house"
No one ever thinks of data science as "customer facing", but that doesn't mean we're never...
-
Not-ignoring daylight savings time for analysis
While we usually want to avoid touching daylight savings time changes in our datasets, when you're working with people, local time actually does matter.
-
Breaking things, fast movingly
Moving fast and breaking things is the name of the startup game. And as data folk we're called upon to provide guidance while neither we nor anyone else have any real idea what we're doing.
-
Helping other people use metrics frameworks
Metrics frameworks are great tools to help teams think more about how their decisions potentially affect important parts of a business. But we often have to help teams get the finer details.
-
Things to consider when modifying your data systems
As analysts, we push data systems until they creak and groan. Then we modify them to not do that. Then repeat. Here I'm just thinking through the process.
-
Discussing failure in public, we should do that more
Social media has got plenty of positive publication bias, and it's a good thing data folks are slightly more open to talking about failures. Slightly.
-
Antagonistic data systems
The default state of data systems is to assume they're cooperating with our goals, if not just neutral. It's worth remembering that there are systems that are downright hostile to our use.
-
We're all the #dataBS, but individually not
A reminder that a community is made of all of us together, including you. We all need to act out the community that we want to be in.
-
Data-Twitter is having a MOMENT on Bluesky right now
Dare I hope Data-bluesky will become a thing?
-
Sometimes, stuff's not worth measuring
A guide to things that aren't worth our time and we need to bap the requests that try to make us work on them.
-
When product teams think anecdotes is research
Sometimes product teams constantly hear a feature request and then think that everyone will adopt it upon launch. They are then very very disappointed.
-
It's 2024. Let's have a WEBRING
Old person revives ancient ritual that connects related sites from data writers. Asks you to submit PR and join.
-
The importance of looking outside of "the data"
Big shoe brand crashes in value due to a big strategy shift that relies heavily on data... so maybe we should pay attention to it a bit.
-
Talk summary: Designing experiments for maximizing getting things done
A text transcript of my Quant UX Con 2024 talk for those who prefer text over video.
-
Democratizing data might not be about skills
I've been on data democratization efforts countless times now, and they've usually failed despite a lot of coaching and guardrails... so maybe I've been completely missing the point?
-
Thursday data thought: teaching, tutoring, and upskilling others
For this Thursday's Subscriber post, some thoughts bouncing around my head that might eventually turn into a sane post but needs some thinking-out-loud to see where it actually goes.
-
Data tool design as a sign of field dynamism
Post for subscribers -- I think the way we still want our many data tools to interop via common file formats is a sign of dynamism and optimism of future innovation
-
Don't expect data to change everything like magic
Many people joining the data science field harbor the belief that they're gonna help change how things work, make stuff magically better. Sadly, we all learn that it's much harder than that.
-
Simplifying (reasoning about) complex systems
"How hard can it be?" – these are the famous last words of many a data science project. We try not to underestimate the difficulty, but can go too far. Sometimes we just need to know the simple version.
-
When data (and rationality) got in my way
Not everything is rational =(
-
What the heck is "Insight"?
It's something we "deliver" but what is it anyway?
-
Working in data as a "meh programmer"
Working around myself
-
In the future, we'll work with unstructured data at ~Scale~
Where are the limits? Edge cases?
-
I don't know how to move satisfaction
In ways that I want
-
Everything tries to become a one-stop-shop
Data products, all products
-
Counting explosions at Unity
Gamedev, Counting, and Ethics, oh my!
-
Insights do not equal utility
Took me a while to learn this one
-
Most metrics are conditional
And your product folk might not know that
-
We still build tools and should share them more
Because we're not selling products
-
Most can learn analysis, but won't become analysts
Because it's different
-
A day(?) in my life as a Quant UX Researcher
Or a month? Averaged?
-
When the AI comes for my coding job...
It can have it. Please.
-
The reasonable(?) effectiveness of data analysis
Why are we even effective at anything?
-
Square footage is so broken and weird
It's all vibes
-
DS work doesn't have to be purely discovery work
If you like execution-type work like me
-
Data scientist, working without data
Because of course it'll happen some day
-
When are we just speaking for a model
And then catching flak for it
-
There's dashboards all around us that are USED
And they look nothing like the ones I make...
-
Data science has a tool obsession
That we need to balance out
-
We all make for a good conference experience
Yes, you too
-
How do we actually "pull stories out of data"?
It's hard to describe and very branchy
-
Reinforcing our data friend connections while we can
In case "Data Twitter" becomes just "Data"
-
Ways to data-drive yourself into the ground!
It's easy!
-
Seeing the data science work all around us
Everywhere! EVERYWHERE!
-
What if every dashboard self destructed
Would it be so bad?
-
Data driven doesn’t mean data is driving
Beep Beep~~~
-
Making do with the infra we got
Because what choice do we have?
-
Dashboards don't break themselves
Humans break them
-
Stories from the last downturn
Life keeps happening
-
Accidentally trapping ourselves with stats
The insignificance of significance
-
Paper reading time - Forgetting in Data Science
There's a lot
-
Building Discords and Community
A mini crash course
-
Comparing Quant UX Researchers w/ Data Scientists
It's a mess because everything's undefined
-
You're probably on a cutting edge
Of at least something
-
Celebrating everyone counting things
Nerding out on counting
-
The gap between data science, and UX research
There's soooo much overlap
-
Reflecting on how system complexity grows
Flashbacks of past pain
-
Planning quantitative work
Is different from a lot of other work
-
Reflecting upon reflecting
Mirrors upon mirrors
-
Views on dashboards being answers, or not
It's messy because we're messy
-
The utility of an unwatched dashboard
It's not ALL waste. Just mostly waste
-
What Data Folk Were Saying about Zillow
Not a "What happened?" post no one needs to read
-
The many faces of "Production"
It's not all about big, complicated systems
-
Research is a team sport, even the analysis bits
Despite it being our primary job
-
DS jargon is just everyone else's jargon
And we're going to absorb even more. MORE!
-
Every org has a "literature"
And it's a giant mess
-
Go collect some $#*(&% data
You need to. Seriously.
-
Where should a metric "be"?
Having an opinion takes a lot of extra work
-
It's Goodhart's Law again
For a certain class of problems
-
Being a broad-spectrum data scientist
Is something of a niche. Heh.
-
Scaling yourself
Because you'll be buried otherwise.
data-quality (67)
-
Daylight savings broke software (again) and other Time News!
Every so often I just have to indulge in this topic
-
Surveys aren't getting easier to do
It's gotten easier to screw up someone else's survey!
-
Avoid the lure of working on metrics over constructs
I should know better, but even I catch myself doing this.
-
Bridging real measurements to idealized ones
Why is the real world so ... messy =\
-
Making fake data for class, then making them worse🙃
Gyah deadlinessssss!
-
Measuring clouds... okay, cloudy water (and air)
How cloudy is a sample of water or air? Well, someone's figured it out, right?
-
Storage is cheap, but not thinking about logging is expensive
The bad habits of data over-collection run deep.
-
Labeling things by hand when everyone's trying not to
The hot stuff right now is labeling with LLMs, but y'know, maybe consider not?
-
Antagonistic data systems
The default state of data systems is to assume they're cooperating with our goals, if not just neutral. It's worth remembering that there are systems that are downright hostile to our use.
-
Book Review: Data Management in Large-Scale Education Research
Probably a lifesaver if you're in this academic field, especially if you've never collected data before. BUT the later chapters are relevant to people of all skill levels and experience, even in industry.
-
"Working" with inconclusive results
Data work tends to mean getting inconclusive results, which are really inconvenient in industry because they're... nothing... and it's quite inconvenient
-
Collecting Data
A guest post about actually collecting data points in the field and coming up with categories that make sense for collected items.
-
Paper read: kids bringing germs home (sorta)
A short post looking at a 2015 paper that tracked households over a year, with PCR swabs and symptom diaries, to see how often households had certain respiratory infections.
-
The call of LLMs is strong, we get to pick up the pieces later
More fields are getting on the whole LLM synthetic data generation bandwagon. Honestly at this point the best career move seems to be in learning how to pick apart and vet LLM systems instead of fighting them all the time...
-
You're Collecting Too Much Data!
The most instinctive reaction to wanting data is to collect and store as much of it as you can, figure it out later. For many reasons, that's probably a bad idea.
-
A Budget Guide for Analyzing AI Company Funding with AI
Pulling, aggregating, categorizing, and interpreting public data is always a hairy task with lots of details. Here, Howe shows an example of pulling startup funding data w/ the help of AI classifiers.
-
Democratizing data might not be about skills
I've been on data democratization efforts countless times now, and they've usually failed despite a lot of coaching and guardrails... so maybe I've been completely missing the point?
-
Earthquakes bringing in tons of survey data
April 5th 2024 brought a rare earthquake to the Northeast US. That opens up a big rabbit hole for looking at the roots of the USGS "Did You Feel It" survey.
-
Book Review: Solve Any Data Analysis Problem
Good for people starting their analyst careers
-
"Medium indirect sunlight"? What?
Mis-care of plants and miscommunication of information
-
Tools For Fighting The Data Monster
It's dangerous to go alone...
-
The Big Bad Data Monster
Working with Data Quality
-
Organizing Data is Picking What you Care About
And some examples on how to pick
-
Everyone fears the cartesian
And other duplicated row situations
-
Learning what actually underpins your data
Involves forming often bad opinions
-
The mythical single source of truth
We can't help but sell this one
-
Data scientist, working without data
Because of course it'll happen some day
-
A couple of weeks wading in contextless metrics
Data nerd bait
-
We need to calibrate our internal achievement scales too
Because it is/was perf season
-
Ways to data-drive yourself into the ground!
It's easy!
-
Want a DS project? There's health insurance data out there
And no (public) resources for working with the data yet
-
Dashboards don't break themselves
Humans break them
-
We take our units of analysis for granted
Because it's so easy
-
Paper reading time - Forgetting in Data Science
There's a lot
-
Data Management is Context Management
We are destroyers and preservers
-
That time I participated in Nielsen's TV Ratings
As a data nerd, pretty interesting stuff
-
We have the power to be wrong with extreme confidence
A reminder to us all
-
Handling shifting survey questions
It's just bad news for everyone involved
-
Measuring broader impact is extremely hard
It's sorta by definition
-
How do we know our thermometers are correct?
Taken to extremes!
-
Go collect some $#*(&% data
You need to. Seriously.
-
Doing better with Excel
Shooting slightly less of your foot off
-
Paper dive - Replication is even harder
Cleaning data is important!
-
Churn is hard
In all sorts of ways. Except the butter kind.
-
Just where is the minimal stats bar for data science?
At what point are you completely useless?
-
Rejecting the null based on a chart
The dream...
-
Working With Moon Eclipses Part 2
Failures ahoy!
-
Practicing data prep with Wikipedia data
It's fairly realistic, albeit messy, which is the point
-
Data Cleaning IS Analysis, Not Grunt Work
Also, most data cleaning articles suck
-
Sessions for analysis, the eternal fiction
There's no escape
-
Fighting Confirmation Bias
It's always been important. It's just a bit more important these days.
-
Some Gamedev and Shoddy Data Arguments
With sketchy charts!
-
The Epic Data Fetch Quest
It's dangerous to go alone! Take this.
-
Interpreting Email Analytics is Handwavy
What you can do when you actually collect email data
-
Counting is hard, 2019-nCoV edition
It’s not every day we can watch an expert explaining how hard counting is
-
Common Data Science Trap— Getting Systems To Agree
Beware! They will not agree! The question is to what degree
-
Character Encodings — The Pain That Won’t Go Away, Part 2/3: Unicode
This is supposed to save us all, it’s merely the best we’ve got so far
-
Character Encodings —The Pain That Won’t Go Away, Part 1/3: Non-Unicode
How we got here, how we’re not getting out yet, and dealing with it
-
Two Stories About Labeling Data by Hand — It Still Works
Sure it’s a pain in the butt and has its own issues, but human brains are still amazing
-
Balancing Who Handles Data Inconsistency
In Production, things inevitably get wonky, and that’s normal
-
It’s All About Trust: Views on opening up data to your org
“Democratizing” data and insight, open access, data transparency, it goes by a lot of names, many want it, but the correct decision for…
-
Trap DS Projects: Beware of “Easy” Segmentation Projects
99% chance you’re not ready
-
Data Science foundations: Know your data. Really, really, know it
Know your data, where it comes from, what’s in it, what it means. It all starts from there.
experimentation (25)
-
Bridging real measurements to idealized ones
Why is the real world so ... messy =\
-
Labeling things by hand when everyone's trying not to
The hot stuff right now is labeling with LLMs, but y'know, maybe consider not?
-
Playing chess when it's more like poker
Subscriber post with thoughts about some of the dangers of giving people data and having them feel like they "know" how things work when there's plenty they don't know
-
Talk summary: Designing experiments for maximizing getting things done
A text transcript of my Quant UX Con 2024 talk for those who prefer text over video.
-
We should talk about theory to teams more
Industry tends to not care about theory, but we should talk about it a bit more because many aren't averse to it when we talk about it right.
-
Everything's small data again
There once was a time where working with lots of data was so awkward, it became a giant hype cycle. Now, almost nothing is "big" any more.
-
Whatever happened to the multi-armed bandit?
There was a brief period in the early 2010s where data scientists were interested in bandit models... and then you barely hear about them any more.
-
Talk writing: doing experiments better
Bi-weekly Thursday posts are primarily for subscribers and feature more in-progress type work and thoughts from Randy. This week is about experiments.
-
Don't take optimization to a growth fight
Nor vice-versa. We need the right tools for the job.
-
Measuring neutrons in the sea
And how it connects to everyday life. sorta.
-
A day(?) in my life as a Quant UX Researcher
Or a month? Averaged?
-
Heating systems, black boxes, and knowing things about systems
It all starts looking the same
-
Planning quantitative work
Is different from a lot of other work
-
False Discovery Rates in A/B tests
Wait, what?
-
The delicate art of making ourselves wrong
Modeling for incorrectness
-
Hobby trains and actively using failure as a strategy
With lots of photos
-
Just where is the minimal stats bar for data science?
At what point are you completely useless?
-
Rejecting the null based on a chart
The dream...
-
Challenge: Predicting [Lunar] Eclipses
Doing things the incorrect way, for science(?)
-
Smashing Dashboards and Ikea Together
But only in my brain
-
Yeah, getting teams on board w/ experimentation is very tricky.
A lot of it involves making sure fears are addressed. The fear of the repercussions of failure. The fear of finding out they spent a…
-
Data Science in the Trenches: Living w/ Small n
Somewhere, someone’s having this conversation. Right. Now.
-
Succeeding as a data scientist in small companies/startups
It’s nothing like at a big mature company.
guestpost (7)
-
Collecting Data
A guest post about actually collecting data points in the field and coming up with categories that make sense for collected items.
-
Technical writing is too important to leave to language models
There are no shortcuts to writing something worth reading
-
Using Old Tutorials for New Tricks
Learning new programming languages is always tricky. But using tutorials from a language you already know to learn another one makes things surprisingly easier.
-
You're Collecting Too Much Data!
The most instinctive reaction to wanting data is to collect and store as much of it as you can, figure it out later. For many reasons, that's probably a bad idea.
-
A Budget Guide for Analyzing AI Company Funding with AI
Pulling, aggregating, categorizing, and interpreting public data is always a hairy task with lots of details. Here, Howe shows an example of pulling startup funding data w/ the help of AI classifiers.
-
Writing posts, maybe as a guest!
Or just for yourself
hobby (17)
-
Making shiny rocks from man-made crystals
Some pure fun to ring in the new year on
-
Half year of birb data collecting update
Seasonal trends! With CHARTS!
-
Counting the planes overhead
Through the magic of RADIO WAVES~~~~
-
We're counting birbs today
So excited to be able to write about a random fun project again.
-
A personal tool map
Every tool I can remember owning, and what I use it for.
-
Flying as fast as the sun
I got nerd-sniped after being asked if it's possible to fly a plane for a day and keep up with the night side of the Earth.
-
Last minute eclipse travel planning for data nerds
Using tools to figure out where to go in all the chaos
-
"Medium indirect sunlight"? What?
Mis-care of plants and miscommunication of information
-
Make your own space pictures with big telescope data!
Including from the JWST and Hubble!
-
Hobby break! Keyboard building
For nerds who need to see the whole process
-
Learning from running out of memory all the time
HARD. WARES.
-
A weekend playing with DALL-E Mini
So. Much. Weird. Artwork.
-
Life on the abstractions of giants
I love hobbies. This is no surprise to any of you.
-
Hands On: Overdoing a Static Website
Doing things for the fun of it
-
Hobby trains and actively using failure as a strategy
With lots of photos
-
Running small conferences is quirky work
But worth it
learning (135)
-
Half year of birb data collecting update
Seasonal trends! With CHARTS!
-
A generalist in a "specialist" job market
Don't see "Full-stack" flying around that often any more
-
Work is full of blank pages
I didn't see them everywhere until I really started looking. I had forgotten they existed.
-
As a mostly ad-hoc'er, I've got workflow issues
Please, let's celebrate doing things the ugly way. For our peers that do things just as ugly.
-
No one works with clean slates, we shouldn't write with it either
The audience for messy, one-off examples seems larger than it initially appears. If we are only willing to listen
-
We're all carriers of software memes
In every one of us, there's a little gremlin that learned something cool, silly, weird, but oddly useful.
-
Making fake data for class, then making them worse🙃
Gyah deadlinessssss!
-
Anyone* could learn data analysis. Probably.
Thoughts on how people should teach and learn analytics based on the question at hand, and not the other way around.
-
How much light was that?
There are ways to measure how much visible light something is giving off. It's not simple.
-
Different levels of optimization problems
Data scientists have got "one weird trick" and it's called optimization. Luckily there's a surprising amount of depth to the trick.
-
I'll be teaching a logs analysis course in April
It'll be in person, and it's gonna be fun! I promise at least that much.
-
Paper reading: analyzing string figures across the globe
The surprising depth needed to analyze bits of looped string
-
Making "intermediate+" content
It's what the internet needs, and we're the only people who can create it for ourselves and our peers.
-
Discussing failure in public, we should do that more
Social media has got plenty of positive publication bias, and it's a good thing data folks are slightly more open to talking about failures. Slightly.
-
Book Review: Data Management in Large-Scale Education Research
Probably a lifesaver if you're in this academic field, especially if you've never collected data before. BUT the later chapters are relevant to people of all skill levels and experience, even in industry.
-
Do more writing, it is thinking
Says guy who writes more than most of the population... but hear me out.
-
Flying as fast as the sun
I got nerd-sniped after being asked if it's possible to fly a plane for a day and keep up with the night side of the Earth.
-
Paper read: kids bringing germs home (sorta)
A short post looking at a 2015 paper that tracked households over a year, with PCR swabs and symptom diaries, to see how often households had certain respiratory infections.
-
Technical writing is too important to leave to language models
There are no shortcuts to writing something worth reading
-
Your SQL's probably not that bad (compared to most people)
If you use it all the time at work, you're probably already in the upper percentiles in skill.
-
Using Old Tutorials for New Tricks
Learning new programming languages is always tricky. But using tutorials from a language you already know to learn another one makes things surprisingly easier.
-
Being air-dropped into an analysis
Domain knowledge is critical to data work, but we really do get thrown into the deep end at times. Why does that even work at all?
-
The basics of project management for data folk
It's impossible to explain project management in a single post, but the general framework for understanding why work projects feel so objectively different from school ones is helpful to make sense of it all.
-
Thursday data thought: teaching, tutoring, and upskilling others
For this Thursday's Subscriber post, some thoughts bouncing around my head that might eventually turn into a sane post but need some thinking-out-loud to see where they actually go.
-
When data (and rationality) got in my way
Not everything is rational =(
-
Summary: Data Mishaps Night 2024
A full summary of all the talks from Data Mishaps Night 2024, lightly anonymized for privacy.
-
Less coding in public, sorta
More venting out product ideas, but finding some live examples and course correcting
-
Onboarding and fishing for tacit knowledge
Making transitions smoother
-
What the heck is "Insight"?
It's something we "deliver" but what is it anyway?
-
Book Review: Solve Any Data Analysis Problem
Good for people starting their analyst careers
-
(Human) networking is making friends
TL;DR: Spread love, joy, or at least helpfulness
-
Stepping outside and looking around
Go out, take a breath, and see what other industries are doing
-
Networking basics for data work
Just enough networking knowledge to do data work and debug your way out of tight spots.
-
Working in data as a "meh programmer"
Working around myself
-
Reflecting on our toolbox mastery
And is it "enough"?
-
Solving the problems in front of you
A PyData NYC 2023 talk
-
Measuring neutrons in the sea
And how it connects to everyday life. sorta.
-
Uhoh, I'm doing a PyData NYC 2023 talk
On Nov 02, 1:30pm
-
Make your own space pictures with big telescope data!
Including from the JWST and Hubble!
-
Insights do not equal utility
Took me a while to learn this one
-
Craftsmanship is a state of mind
But it's optional
-
How people were (math) precise before modern times
Because thinking about LLMs
-
A lot of 13th century counting
And arithmetic!
-
SPF is a pretty unhelpful unit
And just what the heck is it anyway?
-
Failure comes for us all
Gotta learn to handle it
-
It's still worth running a small server in 2023
For the education!
-
Most can learn analysis, but won't become analysts
Because it's different
-
How does the Air Quality Index work anyways?
It's a scoring function!
-
Hobby break! Keyboard building
For nerds who need to see the whole process
-
Learning what actually underpins your data
Involves forming often bad opinions
-
We should watch people dissect and build products
Because we learn a lot that way
-
Square footage is so broken and weird
It's all vibes
-
Learning to go from creating to editing
Just like code reviews
-
Internalizing baselines is hard
Most of us probably don't even work at it
-
Learning from running out of memory all the time
HARD. WARES.
-
Everyone You'll Ever Meet Knows Something You Don't
The corollary to this, of course, is that you know something that they don't!
-
We all make for a good conference experience
Yes, you too
-
Everything is on fire and you should contribute
The Newsletter Version!
-
How do we actually "pull stories out of data"?
It's hard to describe and very branchy
-
Heating systems, black boxes, and knowing things about systems
It all starts looking the same
-
Staying Sharp in Data Science
Can take a lot of unexpected paths
-
Seeing the data science work all around us
Everywhere! EVERYWHERE!
-
Coming up with a talk proposal
Because the hardest part is starting
-
How to measure a subcontinent
Before GPS or even lasers
-
We need to help each other write better
Because knowledge gaps are hard to spot
-
The world isn't as polished as it appears
So it's ok to crack things open to see
-
Self-narrating some usability testing for others
A bit of dogfood reporting
-
A weekend playing with DALL-E Mini
So. Much. Weird. Artwork.
-
We should phase the "SQL Interview" out
For something more general
-
Let's talk a bit about giving interviews
One day you're going to have to
-
Paper reading time - Forgetting in Data Science
There's a lot
-
Life on the abstractions of giants
I love hobbies. This is no surprise to any of you.
-
Measuring "here", coordinate systems for the Earth
It's complicated
-
Celebrating everyone counting things
Nerding out on counting
-
Making space for others
And I'd like to help
-
Skill Windows into the Data World
Psst... run a survey
-
It's that DS meme plane with dots again
With tons of math going over my head
-
Hands On: Overdoing a Static Website
Doing things for the fun of it
-
Research is a team sport, even the analysis bits
Despite it being our primary job
-
DS jargon is just everyone else's jargon
And we're going to absorb even more. MORE!
-
We should treat data science as a craft
And stop obsessing on surface technical skills
-
Go collect some $#*(&% data
You need to. Seriously.
-
Ways to become a Quantitative UX Researcher, w/o a PhD
It can be done... with effort
-
Our lives are nested orders of operation puzzles
Practice practicing!
-
You, too, should run a few machines
For Science!
-
Personal growth and brands
Does one even come before another?
-
When scaling yourself goes a bit too far
Because, surprisingly, it can
making (60)
-
Counting the planes overhead
Through the magic of RADIO WAVES~~~~
-
The show must go on (busted PC edition)
It does go on and on and on ...
-
Legoland, Ikea, and doing UX work
Some thoughts about the intersection that keeps me up at night a surprising amount
-
DataBS Conf attendee registration is open!
Also the first batch of confirmed speakers listed
-
Remembering how the journey can be the work
Every few years I forget and have to remind myself
-
We have DAGs for our data, but also for our work
It's easy to not realize we manage a DAG surrounding... ourselves
-
Bootstrapping early (organizational) interest in data
The first steps at making things work, in my usual scrappy way
-
Work is full of blank pages
I didn't see them everywhere until I really started looking. I had forgotten they existed.
-
We're counting birbs today
So excited to be able to write about a random fun project again.
-
Making fake data for class, then making them worse🙃
Gyah deadlinessssss!
-
Things to consider when modifying your data systems
As analysts, we push data systems until they creak and groan. Then we modify them to not do that. Then repeat. Here I'm just thinking through the process.
-
A personal tool map
Every tool I can remember owning, and what I use it for.
-
Making "intermediate+" content
It's what the internet needs, and we're the only people who can create it for ourselves and our peers.
-
(Not) Building for the long term
Planning is hard to do, especially at fast-paced startups. So how about you spend less time planning and more time breaking things down into useful chunks?
-
It's 2024. Let's have a WEBRING
Old person revives ancient ritual that connects related sites from data writers. Asks you to submit PR and join.
-
Measuring stuff in practice, wood floor edition
Real world measurement means making compromises in the name of work speed. Because precision is time is money.
-
Collecting Data
A guest post about actually collecting data points in the field and coming up with categories that make sense for collected items.
-
Software system design's got it pretty good
Just some thoughts about how refactoring software is nowhere near as complex as refactoring physical spaces.
-
"Good enough" charts for work
I admire good dataviz, I also know that I'm hopeless at those. So instead I've found a place where it's good enough for the work I do.
-
Last minute eclipse travel planning for data nerds
Using tools to figure out where to go in all the chaos
-
Less coding in public, sorta
More venting out product ideas, but finding some live examples and course correcting
-
Coding... in public... sorta
Who would've thought scheduling random meetings would be hard?
-
Work in progress: a newsletter move
Working on it!
-
In the future, we'll work with unstructured data at ~Scale~
Where are the limits? Edge cases?
-
Reflecting on our toolbox mastery
And is it "enough"?
-
Organizing Data is Picking What you Care About
And some examples on how to pick
-
Uhoh, I'm doing a PyData NYC 2023 talk
On Nov 02, 1:30pm
-
Make your own space pictures with big telescope data!
Including from the JWST and Hubble!
-
Pondering on Process Debt
Maybe it's a thing
-
Craftsmanship is a state of mind
But it's optional
-
A lot of 13th century counting
And arithmetic!
-
We still build tools and should share them more
Because we're not selling products
-
It's still worth running a small server in 2023
For the education!
-
Data DIY in the age of industrial pickaxes and shovels
Talking about using our tools instead of the tools themselves
-
Hobby break! Keyboard building
For nerds who need to see the whole process
-
We should watch people dissect and build products
Because we learn a lot that way
-
How the heck does one measure color?
This is SO HARD!!!!
-
Learning from running out of memory all the time
HARD. WARES.
-
Heating systems, black boxes, and knowing things about systems
It all starts looking the same
-
Want a DS project? There's health insurance data out there
And no (public) resources for working with the data yet
-
The world isn't as polished as it appears
So it's ok to crack things open to see
-
Making space for others
And I'd like to help
-
It's that DS meme plane with dots again
With tons of math going over my head
-
Hands On: Overdoing a Static Website
Doing things for the fun of it
-
Let's play with some Library of Congress data!
And learn about learning about data sets
-
The Complexity Makin' Goods
During a pandemic?!?!?
-
Working With Moon Eclipses Part 2
Failures ahoy!
-
Running small conferences is quirky work
But worth it
-
Practicing data prep with Wikipedia data
It's fairly realistic, albeit messy, which is the point
-
Real Workflows in Data Science
Some of them, at least
-
Let's Get Intentional About Documentation
Because I doubt most people, myself included, are
-
Smashing Dashboards and Ikea Together
But only in my brain
-
Audio silliness in the era of videoconferencing
Mic mic revolution
-
Making Fair Games of Go
Fun is hard
-
Data Science Practice 101: Always Leave An Analysis Paper Trail
It’ll save your butt. Lots.
measurement (102)
-
The era of unscientific management
It's a really annoying turn of events but here we are...
-
Dashboard rot as org attention grave markers
Dysfunction is still dysfunction, regardless of how it got there.
-
Avoid the lure of working on metrics over constructs
I should know better, but even I catch myself doing this.
-
Measuring Snow is Decidedly Not Easy
I've done enough shoveling this weekend. zzzz.
-
Bridging real measurements to idealized ones
Why is the real world so ... messy =\
-
Divorcing data collection from data analysis, slightly
Starting the year with some philosophical wandering!
-
Software interfaces are conversations, measure accordingly
Nothing wrong with building sandcastles, right?
-
People who dared to measure food energy
A peek into the frankly mind-melting complexity surrounding food energy measurement
-
Going back from carats to carobs
There's a whole legend about how the carob became the carat and I wanted to chase that down with my own hands.
-
Data-Driven <-> Data-Stuck
We're all just stumbling around doing our best, I guess
-
Watching a data team get their credibility nuked in national news
Goodbye BLS as we knew you
-
Basic logs analysis via code
Another episode in the ongoing logs analysis primer series
-
Measuring humidity is as messy as how it can make you feel
Heatwave writing means overthinking heatwave measurements
-
Measuring clouds... okay, cloudy water (and air)
How cloudy is a sample of water or air? Well, someone's figured it out, right?
-
Not-ignoring daylight savings time for analysis
While we usually want to avoid touching daylight savings time changes in our datasets, when you're working with people, local time actually does matter.
-
Breaking things, fast movingly
Moving fast and breaking things is the name of the startup game. And as data folk we're called upon to provide guidance while neither we nor anyone else have any real idea what we're doing.
-
How much light was that?
There are ways to measure how much visible light something is giving off. It's not simple.
-
Storage is cheap, but not thinking about logging is expensive
The bad habits of data over-collection run deep.
-
Paper reading: analyzing string figures across the globe
The surprising depth needed to analyze bits of looped string
-
Antagonistic data systems
The default state of data systems is to assume they're cooperating with our goals, if not just neutral. It's worth remembering that there are systems that are downright hostile to our use.
-
Sometimes, stuff's not worth measuring
A guide to things that aren't worth our time and we need to bap the requests that try to make us work on them.
-
When product teams think anecdotes is research
Sometimes product teams constantly hear a feature request and then think that everyone will adopt it upon launch. They are then very very disappointed.
-
Measuring stuff in practice, wood floor edition
Real world measurement means making compromises in the name of work speed. Because precision is time is money.
-
Peculiar measurements: Accumulated Cyclone Energy
Meteorologists have a curious little metric called "accumulated cyclone energy" that they use to compare hurricanes and entire hurricane seasons. This is a dive into what that's about.
-
Playing chess when it's more like poker
Subscriber post with thoughts about some of the dangers of giving people data and having them feel like they "know" how things work when there's plenty they don't know
-
Paper read: kids bringing germs home (sorta)
A short post looking at a 2015 paper that tracked households over a year, with PCR swabs and symptom diaries, to see how often households had certain respiratory infections.
-
Measurement of life things has gotten really cheap
It's never been easier, and cheaper, to measure things in our world, just to actually do it and learn how annoying it can be.
-
Talk summary: Designing experiments for maximizing getting things done
A text transcript of my Quant UX Con 2024 talk for those who prefer text over video.
-
You're Collecting Too Much Data!
The most instinctive reaction to wanting data is to collect and store as much of it as you can, figure it out later. For many reasons, that's probably a bad idea.
-
Democratizing data might not be about skills
I've been on data democratization efforts countless times now, and they've usually failed despite a lot of coaching and guardrails... so maybe I've been completely missing the point?
-
Metrics trees and other mental frameworks
Frameworks for doing things like setting metrics or identifying issues abound. They're useful tools but it's worth knowing not to take them too far.
-
Everything's small data again
There once was a time where working with lots of data was so awkward, it became a giant hype cycle. Now, almost nothing is "big" any more.
-
Earthquakes bringing in tons of survey data
April 5th 2024 brought a rare earthquake to the Northeast US. That opens up a big rabbit hole for looking at the roots of the USGS "Did You Feel It" survey.
-
What the heck is "Insight"?
It's something we "deliver" but what is it anyway?
-
"Medium indirect sunlight"? What?
Mis-care of plants and miscommunication of information
-
Measuring neutrons in the sea
And how it connects to everyday life. sorta.
-
Counting explosions at Unity
Gamedev, Counting, and Ethics, oh my!
-
Insights do not equal utility
Took me a while to learn this one
-
SPF is a pretty unhelpful unit
And just what the heck is it anyway?
-
Calls to actions, "like and subscribe", are (sadly) necessary
However much I hate them, you can't avoid them
-
How does the Air Quality Index work anyways?
It's a scoring function!
-
Learning what actually underpins your data
Involves forming often bad opinions
-
The mythical single source of truth
We can't help but sell this one
-
How the heck does one measure color?
This is SO HARD!!!!
-
Square footage is so broken and weird
It's all vibes
-
A couple of weeks wading in contextless metrics
Data nerd bait
-
When are we just speaking for a model
And then catching flak for it
-
There's dashboards all around us that are USED
And they look nothing like the ones I make...
-
We need to calibrate our internal achievement scales too
Because it is/was perf season
-
We might not see leap seconds after 2035 🤯
For 100 years anyways...
-
What's up with Readability formulas?
Are they just wishful thinking?
-
Ways to data-drive yourself into the ground!
It's easy!
-
Product Work is Intention Hunting Work
A career of chasing ghosts
-
One day, the free (computing) lunch will end
And then we'll have to work harder =O
-
How to measure a subcontinent
Before GPS or even lasers
-
Caching is our friend, until it isn't
And it's everywhere...
-
Dashboards don't break themselves
Humans break them
-
We take our units of analysis for granted
Because it's so easy
-
Accidentally trapping ourselves with stats
The insignificance of significance
-
That time I participated in Nielsen's TV Ratings
As a data nerd, pretty interesting stuff
-
Measuring "here", coordinate systems for the Earth
It's complicated
-
Reflecting upon reflecting
Mirrors upon mirrors
-
We have the power to be wrong with extreme confidence
A reminder to us all
-
Views on dashboards being answers, or not
It's messy because we're messy
-
The utility of an unwatched dashboard
It's not ALL waste. Just mostly waste
-
Handling shifting survey questions
It's just bad news for everyone involved
-
Measuring broader impact is extremely hard
It's sorta by definition
-
How do we know our thermometers are correct?
Taken to extremes!
-
The delicate art of making ourselves wrong
Modeling for incorrectness
-
Hate leap seconds? Imagine a negative one
Because the possibility just got slightly more plausible
-
Communicating changes with percentages is surprisingly hard
Almost bizarrely so
-
What goes into this "Heat Index" thing?
Apparently a huge amount
-
Let's play with some Library of Congress data!
And learn about learning about data sets
-
Paper measurements are an endless pit of mystery
Traditional units ahoy!
-
A Busted Kitchen and Breaking User Journeys
Learning from self inflicted pain
-
Reading between the rows
And collecting data to do that...
-
Showing value as a support data scientist
A past experience dump
-
Sessions for analysis, the eternal fiction
There's no escape
-
Time is annoying? Time durations are worse!
Shocker: Stretching a painful thing out doesn't make it less painful!?
-
Helping others deal with uncertainty and risk
Not how we deal with it ourselves
-
Let's Talk Rice Measurements
The History of a little cup
-
Simple Visualizations Are Pretty Darned Great
If you're not good at Viz, like me, there's hope for us
-
Making Fair Games of Go
Fun is hard
-
Interpreting Email Analytics is Handwavy
What you can do when you actually collect email data
-
Email Analytics: More than you ever need to know
Black boxes are hot again
-
Counting is hard, 2019-nCoV edition
It’s not every day we can watch an expert explaining how hard counting is
-
The User-Agent — That Crazy String Underpinning a Bunch of Analytics
It’s surprising to think we rely on this as much as we do
-
Common Data Science Trap— Getting Systems To Agree
Beware! They will not agree! The question is to what degree
-
Not everyone needs real-time analytics, including you
The art is finding a good cadence for your metrics
-
It’s All About Trust: Views on opening up data to your org
“Democratizing” data and insight, open access, data transparency, it goes by a lot of names, many want it, but the correct decision for…
-
The Metrics Meta-game
As the stewards of good data practice within a company, we data professionals are often asked to help set metrics, there’s a whole…
metrics (35)
-
The era of unscientific management
It's a really annoying turn of events but here we are...
-
Velocity straight into a cul-de-sac
Warp speeeeeeeeed
-
Something weird is going on and "real-time data" is a topic again somewhere?
Somehow Google Trends for the -topic- of "real-time data" spiked up in the past year and it's just really odd
-
Transmitting user empathy via data
We can do that?
-
Helping other people use metrics frameworks
Metrics frameworks are great tools to help teams think more about how their decisions potentially affect important parts of a business. But we often have to help teams get the finer details.
-
The importance of looking outside of "the data"
Big shoe brand crashes in value due to a big strategy shift that relies heavily on data... so maybe we should pay attention to it a bit.
-
Metrics trees and other mental frameworks
Frameworks for doing things like setting metrics or identifying issues abound. They're useful tools but it's worth knowing not to take them too far.
-
Don't take optimization to a growth fight
Nor vice-versa. We need the right tools for the job.
-
I don't know how to move satisfaction
In ways that I want
-
Counting explosions at Unity
Gamedev, Counting, and Ethics, oh my!
-
Most metrics are conditional
And your product folk might not know that
-
How does the Air Quality Index work anyways?
It's a scoring function!
-
The mythical single source of truth
We can't help but sell this one
-
A couple of weeks wading in contextless metrics
Data nerd bait
-
Internalizing baselines is hard
Most of us probably don't even work at it
-
What's up with Readability formulas?
Are they just wishful thinking?
-
Data driven doesn’t mean data is driving
Beep Beep~~~
-
We take our units of analysis for granted
Because it's so easy
-
False Discovery Rates in A/B tests
Wait, what?
-
Views on dashboards being answers, or not
It's messy because we're messy
-
Measuring broader impact is extremely hard
It's sorta by definition
-
Where should a metric be?
Having an opinion takes a lot of extra work
-
Communicating changes with percentages is surprisingly hard
Almost bizarrely so
-
It's Goodhart's Law again
For a certain class of problems
-
Churn is hard
In all sorts of ways. Except the butter kind.
-
A Busted Kitchen and Breaking User Journeys
Learning from self inflicted pain
-
Sessions for analysis, the eternal fiction
There's no escape
-
Email Analytics: More than you ever need to know
Black boxes are hot again
-
Common Data Science Trap— Getting Systems To Agree
Beware! They will not agree! The question is to what degree
-
Not everyone needs real-time analytics, including you
The art is finding a good cadence for your metrics
-
The Metrics Meta-game
As the stewards of good data practice within a company, we data professionals are often asked to help set metrics, there’s a whole…
ml-ai (16)
-
Data work in the fast fashion code era
There's probably more upsides for us than SWEs
-
Doing SQL work with LLM aids as a SQL addict
It could be better, it could be worse
-
Vibe coding is delayed pain
The fact that it CAN work, but only with a ton of aggravation is the worst part of the whole thing...
-
We're counting birbs today
So excited to be able to write about a random fun project again.
-
Labeling things by hand when everyone's trying not to
The hot stuff right now is labeling with LLMs, but y'know, maybe consider not?
-
ML vs code rot
I tried to run some ML stuff for the first time in ages! And I failed! r-o-f-l
-
The call of LLMs is strong, we get to pick up the pieces later
More fields are getting on the whole LLM synthetic data generation bandwagon. Honestly at this point the best career move seems to be in learning how to pick apart and vet LLM systems instead of fighting them all the time...
-
A Budget Guide for Analyzing AI Company Funding with AI
Pulling, aggregating, categorizing, and interpreting public data is always a hairy task with lots of details. Here, Howe shows an example of pulling startup funding data w/ the help of AI classifiers.
-
Whatever happened to the multi-armed bandit?
There was a brief period in the early 2010s where data scientists were interested in bandit models... and then you barely hear about them any more.
-
When the AI comes for my coding job...
It can have it. Please.
-
A weekend playing with DALL-E Mini
So. Much. Weird. Artwork.
-
Discussing AI is such a confusing mess
and it's no one's and everyone's fault
-
Two Stories About Labeling Data by Hand — It Still Works
Sure it’s a pain in the butt and has its own issues, but human brains are still amazing
-
Emulators, Machine Translation, Self-service skills
Not all AI/ML ethical dilemmas involve life and death…
-
Trap DS Projects: Beware of “Easy” Segmentation Projects
99% chance you’re not ready
newsletter (277)
-
A year after Substack, good and bad
An update for a year of changes. In hopes anyone wanting to write more can get something out of it
-
And we keep counting on...
Because I don't know what else to do...
-
Phew, coming up for air soon
A brief update on what's been going on at the newsletter. Nothing crazy.
-
Talk writing: doing experiments better
Bi-weekly Thursday posts are primarily for subscribers and feature more in-progress type work and thoughts from Randy. This week is about experiments.
-
Summary: Data Mishaps Night 2024
A full summary of all the talks from Data Mishaps Night 2024, lightly anonymized for privacy.
-
Book Review: Solve Any Data Analysis Problem
Good for people starting their analyst careers
-
"Medium indirect sunlight"? What?
Mis-care of plants and miscommunication of information
-
(Human) networking is making friends
TL;DR: Spread love, joy, or at least helpfulness
-
Stepping outside and looking around
Go out, take a breath, and see what other industries are doing
-
The mundane parts of migrating a newsletter
Normcore self-hosting a newsletter
-
It's OK To Declare Email Bankruptcy!
I'll do it. You should do it too!
-
Work in progress: a newsletter move
Working on it!
-
Tools For Fighting The Data Monster
It's dangerous to go alone...
-
The Big Bad Data Monster
Working with Data Quality
-
Working in data as a "meh programmer"
Working around myself
-
Refresh your baselines
A brief reminder
-
In the future, we'll work with unstructured data at ~Scale~
Where are the limits? Edge cases?
-
Reflecting on our toolbox mastery
And is it "enough"?
-
Finding work at work on your own
During slow stretches
-
I don't know how to move satisfaction
In ways that I want
-
How the heck do I time a talk
I seriously have no clue
-
Solving the problems in front of you
A PyData NYC 2023 talk
-
Organizing Data is Picking What you Care About
And some examples on how to pick
-
Measuring neutrons in the sea
And how it connects to everyday life. sorta.
-
Cursed with reading systems from data
An unexpected downside
-
Uhoh, I'm doing a PyData NYC 2023 talk
On Nov 02, 1:30pm
-
Make your own space pictures with big telescope data!
Including from the JWST and Hubble!
-
Following up on all-in-ones, plus a mishmash
There's a lot of stuff going on
-
Everything tries to become a one-stop-shop
Data products, all products
-
Pondering on Process Debt
Maybe it's a thing
-
Rev-share validation is weird
And difficult, and nothing new
-
Counting explosions at Unity
Gamedev, Counting, and Ethics, oh my!
-
Insights do not equal utility
Took me a while to learn this one
-
Data design has nothing on process design
And that's what makes it interesting
-
Craftsmanship is a state of mind
But it's optional
-
Most metrics are conditional
And your product folk might not know that
-
How people were (math) precise before modern times
Because thinking about LLMs
-
A lot of 13th century counting
And arithmetic!
-
We still build tools and should share them more
Because we're not selling products
-
Everyone fears the cartesian
And other duplicated row situations
-
An evening of maps. Actively changing maps.
For the fun of it
-
We don't talk about communicating well
Despite it being critical to our job
-
SPF is a pretty unhelpful unit
And just what the heck is it anyway?
-
The value in de-abstracting numbers
Like touching set theoretical grass?
-
Failure comes for us all
Gotta learn to handle it
-
It's still worth running a small server in 2023
For the education!
-
Most can learn analysis, but won't become analysts
Because it's different
-
Calls to actions, "like and subscribe", are (sadly) necessary
However much I hate them, you can't avoid them
-
The data-sphere seems to be drifting apart more
At least, the bits I regularly see
-
A day(?) in my life as a Quant UX Researcher
Or a month? Averaged?
-
How does the Air Quality Index work anyways?
It's a scoring function!
-
I am terrible at talking about my achievements
But it's not just me
-
When the AI comes for my coding job...
It can have it. Please.
-
Remember to check your stress levels
Before things spin out of control
-
Data DIY in the age of industrial pickaxes and shovels
Talking about using our tools instead of the tools themselves
-
Hobby break! Keyboard building
For nerds who need to see the whole process
-
Over-simplified metrics suck
I want MORE damnit!
-
Learning what actually underpins your data
Involves forming often bad opinions
-
You are not a method
But you can easily become one if you're not careful
-
My processes barely save me from myself
Thankfully I've got them, but plenty of people don't
-
The reasonable(?) effectiveness of data analysis
Why are we even effective at anything?
-
We should watch people dissect and build products
Because we learn a lot that way
-
🙄 The AI is coming for my UX job
Maybe? Not really? It's messy
-
When your product's like a tax form
It's about as weird as it sounds
-
The mythical single source of truth
We can't help but sell this one
-
How the heck does one measure color?
This is SO HARD!!!!
-
Square footage is so broken and weird
It's all vibes
-
😔 Everyone'll have an LLM...
What'll it look like then?
-
DS work doesn't have to be purely discovery work
If you like execution-type work like me
-
Data scientist, working without data
Because of course it'll happen some day
-
A couple of weeks wading in contextless metrics
Data nerd bait
-
Mini Recap: Data Mishaps Night
So much fun chaos
-
When are we just speaking for a model
And then catching flak for it
-
Learning to go from creating to editing
Just like code reviews
-
Ugh, the "ai war"(🤮) or whatever is upon us
But for how long?
-
Internalizing baselines is hard
Most of us probably don't even work at it
-
There's dashboards all around us that are USED
And they look nothing like the ones I make...
-
Watching data science in a fraud lawsuit filing is fun!
And slightly infuriating
-
Learning from running out of memory all the time
HARD. WARES.
-
Everyone You'll Ever Meet Knows Something You Don't
The corollary to this, of course, is that you know something that they don't!
-
Writing posts, maybe as a guest!
Or just for yourself
-
Data science has a tool obsession
That we need to balance out
-
Working with the rhythms of the business
Going with the flow
-
We need to calibrate our internal achievement scales too
Because it is/was perf season
-
We all make for a good conference experience
Yes, you too
-
It's sorta odd that data science is as open as it is
Compared to many other data fields
-
Everything is on fire and you should contribute
The Newsletter Version!
-
How do we actually "pull stories out of data"?
It's hard to describe and very branchy
-
Old dog revisits the DS job market out of curiosity
What has happened?
-
We might not see leap seconds after 2035 🤯
For 100 years anyways...
-
Heating systems, black boxes, and knowing things about systems
It all starts looking the same
-
Groups that talk To versus talk AT each other
We're lucky to be the former
-
Staying Sharp in Data Science
Can take a lot of unexpected paths
-
What's up with Readability formulas?
Are they just wishful thinking?
-
Reinforcing our data friend connections while we can
In case "Data Twitter" becomes just "Data"
product (46)
-
Velocity straight into a cul-de-sac
Warp speeeeeeeeed
-
I dislike persona projects
A rant about failure and magical thinking
-
Legoland, Ikea, and doing UX work
Some thoughts about the intersection that keeps me up at night a surprising amount
-
Building less flexibly for better usability
Some musings on designing spaces and software
-
Transmitting user empathy via data
We can do that?
-
What's a Quantitative User Experience (UX) Researcher, 2025 edition
An update for an evolving job title.
-
When product teams think anecdotes are research
Sometimes product teams constantly hear a feature request and then think that everyone will adopt it upon launch. They are then very very disappointed.
-
The importance of looking outside of "the data"
Big shoe brand crashes in value due to a big strategy shift that relies heavily on data... so maybe we should pay attention to it a bit.
-
We should talk about theory to teams more
Industry tends to not care about theory, but we should talk about it a bit more because many aren't averse to it when we talk about it right.
-
A Budget Guide for Analyzing AI Company Funding with AI
Pulling, aggregating, categorizing, and interpreting public data is always a hairy task with lots of details. Here, Howe shows an example of pulling startup funding data w/ the help of AI classifiers.
-
Metrics trees and other mental frameworks
Frameworks for doing things like setting metrics or identifying issues abound. They're useful tools but it's worth knowing not to take them too far.
-
Why I enjoy improving enterprise software
Consumer products are fairly straightforward and interesting, but I guess I like the weirder stuff.
-
Less coding in public, sorta
More venting out product ideas, but finding some live examples and course correcting
-
Don't take optimization to a growth fight
Nor vice-versa. We need the right tools for the job.
-
Coding... in public... sorta
Who would've thought scheduling random meetings would be hard?
-
I don't know how to move satisfaction
In ways that I want
-
Solving the problems in front of you
A PyData NYC 2023 talk
-
Everything tries to become a one-stop-shop
Data products, all products
-
Most metrics are conditional
And your product folk might not know that
-
Calls to actions, "like and subscribe", are (sadly) necessary
However much I hate them, you can't avoid them
-
A day(?) in my life as a Quant UX Researcher
Or a month? Averaged?
-
When the AI comes for my coding job...
It can have it. Please.
-
We should watch people dissect and build products
Because we learn a lot that way
-
When your product's like a tax form
It's about as weird as it sounds
-
Internalizing baselines is hard
Most of us probably don't even work at it
-
There's dashboards all around us that are USED
And they look nothing like the ones I make...
-
Product Work is Intention Hunting Work
A career of chasing ghosts
-
Caching is our friend, until it isn't
And it's everywhere...
-
Self-narrating some usability testing for others
A bit of dogfood reporting
-
The gap between data science, and UX research
There's soooo much overlap
-
Churn is hard
In all sorts of ways. Except the butter kind.
-
A Busted Kitchen and Breaking User Journeys
Learning from self inflicted pain
-
The Complexity Makin' Goods
During a pandemic?!?!?
-
Path dependency is important to know
Knowing how you got into a situation can help you get out
-
Navigating working with other teams
Generating value upwards
-
Smashing Dashboards and Ikea Together
But only in my brain
-
Time is annoying? Time durations are worse!
Shocker: Stretching a painful thing out doesn't make it less painful!?
-
Helping others deal with uncertainty and risk
Not how we deal with it ourselves
-
Questions that make Quant UXRs excited
Everyone tends to complain about their work
-
Being new to the UX part of "Quant UXR" - Overall Process
I'm new to this part too
-
Simple Visualizations Are Pretty Darned Great
If you're not good at Viz, like me, there's hope for us
-
Becoming a Quantitative UX Researcher is messy
Don't expect to land the title easily, heck, don't even go for the title
-
How I wound up being a Quantitative UX Researcher
Not exactly Data Scientist, but close enough you can barely tell the difference. It’s a fairly obscure position. This is, in broad strokes…
statistics (38)
-
Surveys aren't getting easier to do
It's gotten easier to screw up someone else's survey!
-
Watching a data team get their credibility nuked in national news
Goodbye BLS as we knew you
-
What's a Quantitative User Experience (UX) Researcher, 2025 edition
An update for an evolving job title.
-
Uh oh, the Earth *continues* to spin faster
Three years since I wrote about this last time, it's still happening, meaning a negative leap second continues to be a possibility.
-
"Working" with inconclusive results
Data work tends to mean getting inconclusive results, which are really inconvenient in industry because they're... nothing.
-
Measuring stuff in practice, wood floor edition
Real world measurement means making compromises in the name of work speed. Because precision is time is money.
-
Peculiar measurements: Accumulated Cyclone Energy
Meteorologists have a curious little metric called "accumulated cyclone energy" that they use to compare hurricanes and entire hurricane seasons. This is a dive into what that's about.
-
Playing chess when it's more like poker
Subscriber post with thoughts about some of the dangers of giving people data and having them feel like they "know" how things work when there's plenty they don't know
-
Being air-dropped into an analysis
Domain knowledge is critical to data work, but we really do get thrown into the deep end at times. Why does that even work at all?
-
You're Collecting Too Much Data!
The most instinctive reaction to wanting data is to collect and store as much of it as you can, figure it out later. For many reasons, that's probably a bad idea.
-
Everything's small data again
There once was a time where working with lots of data was so awkward, it became a giant hype cycle. Now, almost nothing is "big" any more.
-
Whatever happened to the multi-armed bandit?
There was a brief period in the early 2010s where data scientists were interested in bandit models... and then you barely hear about them any more.
-
When are we just speaking for a model
And then catching flak for it
-
What's up with Readability formulas?
Are they just wishful thinking?
-
Accidentally trapping ourselves with stats
The insignificance of significance
-
That time I participated in Nielsen's TV Ratings
As a data nerd, pretty interesting stuff
-
False Discovery Rates in A/B tests
Wait, what?
-
We have the power to be wrong with extreme confidence
A reminder to us all
-
It's that DS meme plane with dots again
With tons of math going over my head
-
Handling shifting survey questions
It's just bad news for everyone involved
-
The delicate art of making ourselves wrong
Modeling for incorrectness
-
Communicating changes with percentages is surprisingly hard
Almost bizarrely so
-
Paper dive - Replication is even harder
Cleaning data is important!
-
Just where is the minimal stats bar for data science?
At what point are you completely useless?
-
Rejecting the null based on a chart
The dream...
-
Challenge: Predicting [Lunar] Eclipses
Doing things the incorrect way, for science(?)
-
Fighting Confirmation Bias
It's always been important. It's just a bit more important these days.
-
Some Gamedev and Shoddy Data Arguments
With sketchy charts!
-
Making Fair Games of Go
Fun is hard
-
Data Literacy Via COVID-19
Some people are learning data science quickly without realizing it
-
Interpreting Email Analytics is Handwavy
What you can do when you actually collect email data
-
Counting is hard, 2019-nCoV edition
It’s not every day we can watch an expert explaining how hard counting is
-
Data Science in the Trenches: Living w/ Small n
Somewhere, someone’s having this conversation. Right. Now.
technical-culture (142)
-
Code review for data (and non-SWE) folks
TL;DR is, your job as a reviewer is to help the reviewee come up with better code than they initially did.
-
The era of unscientific management
It's a really annoying turn of events but here we are...
-
Dashboard rot as org attention grave markers
Dysfunction is still dysfunction, regardless of how it got there.
-
I dislike persona projects
A rant about failure and magical thinking
-
Looking for work, surprise edition
I always told my wife this newsletter was my backup plan...
-
The show must go on (busted PC edition)
It does go on and on and on ...
-
Remembering how the journey can be the work
Every few years I forget and have to remind myself
-
Doing SQL work with LLM aids as a SQL addict
It could be better, it could be worse
-
We have DAGs for our data, but also for our work
It's easy to not realize we manage a DAG surrounding... ourselves
-
Basic logs analysis via code
Another episode in the ongoing logs analysis primer series
-
Building less flexibly for better usability
Some musings on designing spaces and software
-
Vibe coding is delayed pain
The fact that it CAN work, but only with a ton of aggravation is the worst part of the whole thing...
-
As a mostly ad-hoc'er, I've got workflow issues
Please, let's celebrate doing things the ugly way. For our peers that do things just as ugly.
-
No one works with clean slates, we shouldn't write with it either
The audience for messy, one-off examples seems larger than it initially appears. If we are only willing to listen
-
We're all carriers of software memes
In every one of us, there's a little gremlin that learned something cool, silly, weird, but oddly useful.
-
Measuring clouds... okay, cloudy water (and air)
How cloudy is a sample of water or air? Well, someone's figured it out, right?
-
Anyone* could learn data analysis. Probably.
Thoughts on how people should teach and learn analytics based on the question at hand, and not the other way around.
-
Different levels of optimization problems
Data scientists have got "one weird trick" and it's called optimization. Luckily there's a surprising amount of depth to the trick.
-
I'll be teaching a logs analysis course in April
It'll be in person, and it's gonna be fun! I promise at least that much.
-
Storage is cheap, but not thinking about logging is expensive
The bad habits of data over-collection run deep.
-
ML vs code rot
I tried to run some ML stuff for the first time in ages! And I failed! r-o-f-l
-
Paper reading: analyzing string figures across the globe
The surprising depth needed to analyze bits of looped string
-
(Not) Building for the long term
Planning is hard to do, especially at fast-paced startups. So how about you spend less time planning and more time breaking things down into useful chunks?
-
And we keep counting on...
Because I don't know what else to do...
-
Do more writing, it is thinking
Says guy who writes more than most of the population... but hear me out.
-
The much broader world of datetime formats
There's more to date time formats than YYYY-MM-DD HH:MM:SS. Most of it confusing.
-
Software system design's got it pretty good
Just some thoughts about how refactoring software is nowhere near as complex as refactoring physical spaces.
-
"Good enough" charts for work
I admire good dataviz, I also know that I'm hopeless at those. So instead I've found a place where it's good enough for the work I do.
-
Measurement of life things has gotten really cheap
It's never been easier, and cheaper, to measure things in our world, just to actually do it and learn how annoying it can be.
-
Technical writing is too important to leave to language models
There are no shortcuts to writing something worth reading
-
Your SQL's probably not that bad (compared to most people)
If you use it all the time at work, you're probably already in the upper percentiles in skill.
-
Using Old Tutorials for New Tricks
Learning new programming languages is always tricky. But using tutorials from a language you already know to learn another one makes things surprisingly easier.
-
We should talk about theory to teams more
Industry tends to not care about theory, but we should talk about it a bit more because many aren't averse to it when we talk about it right.
-
Thursday data thought: teaching, tutoring, and upskilling others
For this Thursday's Subscriber post, some thoughts bouncing around my head that might eventually turn into a sane post but needs some thinking-out-loud to see where it actually goes.
-
Why I enjoy improving enterprise software
Consumer products are fairly straightforward and interesting, but I guess I like the weirder stuff.
-
Simplifying (reasoning about) complex systems
"How hard can it be?" – these are the famous last words of many a data science project. We try not to underestimate the difficulty, but can go too far. Sometimes we just need to know the simple version.
-
Talk writing: doing experiments better
Bi-weekly Thursday posts are primarily for subscribers and feature more in-progress type work and thoughts from Randy. This week is about experiments.
-
Summary: Data Mishaps Night 2024
A full summary of all the talks from Data Mishaps Night 2024, lightly anonymized for privacy.
-
Onboarding and fishing for tacit knowledge
Making transitions smoother
-
Networking basics for data work
Just enough networking knowledge to do data work and debug your way out of tight spots.
-
The mundane parts of migrating a newsletter
Normcore self-hosting a newsletter
-
The Big Bad Data Monster
Working with Data Quality
-
Finding work at work on your own
During slow stretches
-
Solving the problems in front of you
A PyData NYC 2023 talk
-
Organizing Data is Picking What you Care About
And some examples on how to pick
-
Everything tries to become a one-stop-shop
Data products, all products
-
Pondering on Process Debt
Maybe it's a thing
-
How people were (math) precise before modern times
Because thinking about LLMs
-
Everyone fears the cartesian
And other duplicated row situations
-
We don't talk about communicating well
Despite it being critical to our job
-
SPF is a pretty unhelpful unit
And just what the heck is it anyway?
-
Failure comes for us all
Gotta learn to handle it
-
It's still worth running a small server in 2023
For the education!
-
I am terrible at talking about my achievements
But it's not just me
-
Remember to check your stress levels
Before things spin out of control
-
You are not a method
But you can easily become one if you're not careful
-
The reasonable(?) effectiveness of data analysis
Why are we even effective at anything?
-
When your product's like a tax form
It's about as weird as it sounds
-
Learning to go from creating to editing
Just like code reviews
-
Working with the rhythms of the business
Going with the flow
-
We need to calibrate our internal achievement scales too
Because it is/was perf season
-
Everything is on fire and you should contribute
The Newsletter Version!
-
We might not see leap seconds after 2035 🤯
For 100 years anyways...
-
Some brief musings on SQL abstractions
They're wonky creatures
-
Working with a non-DS manager
It's either a bit of work, or a lot of work
-
Seeing the data science work all around us
Everywhere! EVERYWHERE!
-
What if every dashboard self destructed
Would it be so bad?
-
One day, the free (computing) lunch will end
And then we'll have to work harder =O
-
We need to help each other write better
Because knowledge gaps are hard to spot
-
Caching is our friend, until it isn't
And it's everywhere...
-
The world isn't as polished as it appears
So it's ok to crack things open to see
-
Optimizing for personal portability
Instead of mastering a thing
-
We should phase the "SQL Interview" out
For something more general
-
Let's talk a bit about giving interviews
One day you're going to have to
-
Data Management is Context Management
We are destroyers and preservers
-
Comparing Quant UX Researchers w/ Data Scientists
It's a mess because everything's undefined
-
Life on the abstractions of giants
I love hobbies. This is no surprise to any of you.
-
Hidden treasure in the timezone database
It's like a storybook
-
Measuring "here", coordinate systems for the Earth
It's complicated
-
Discussing AI is such a confusing mess
and it's no one's and everyone's fault
-
Reflecting on how system complexity grows
Flashbacks of past pain
-
Surviving planning from the bottom up
Not all of us sit at the big table
-
The many faces of "Production"
It's not all about big, complicated systems
-
How do we know our thermometers are correct?
Taken to extremes!
-
Every org has a "literature"
And it's a giant mess
-
We should treat data science as a craft
And stop obsessing on surface technical skills
-
Ways to become a Quantitative UX Researcher, w/o a PhD
It can be done... with effort
-
Hate leap seconds? Imagine a negative one
Because the possibility just got slightly more plausible
-
Vacation is the power to say "no" by not even being there
And that's a good thing
-
What goes into this "Heat Index" thing?
Apparently a huge amount
-
You, too, should run a few machines
For Science!
time (24)
-
Daylight savings broke software (again) and other Time News!
Every so often I just have to indulge in this topic
-
Not-ignoring daylight savings time for analysis
While we usually want to avoid touching daylight savings time changes in our datasets, when you're working with people, local time actually does matter.
-
Uh oh, the Earth *continues* to spin faster
Three years since I wrote about this last time, it's still happening, meaning a negative leap second continues to be a possibility.
-
The much broader world of datetime formats
There's more to date time formats than YYYY-MM-DD HH:MM:SS. Most of it confusing.
-
Phew, coming up for air soon
A brief update on what's been going on at the newsletter. Nothing crazy.
-
It's OK To Declare Email Bankruptcy!
I'll do it. You should do it too!
-
Finding work at work on your own
During slow stretches
-
Working with the rhythms of the business
Going with the flow
-
We might not see leap seconds after 2035 🤯
For 100 years anyways...
-
Hidden treasure in the timezone database
It's like a storybook
-
Planning quantitative work
Is different from a lot of other work
-
Hate leap seconds? Imagine a negative one
Because the possibility just got slightly more plausible
-
Vacation is the power to say "no" by not even being there
And that's a good thing
-
Scaling yourself
Because you'll be buried otherwise.
-
Challenge: Predicting [Lunar] Eclipses
Doing things the incorrect way, for science(?)
-
Side gigs and avoiding them
For (your) sanity's sake
-
Time is annoying? Time durations are worse!
Shocker: Stretching a painful thing out doesn't make it less painful!?
-
Making the best of having too many meetings
What else can you do =\
-
Dates, Times, Calendars— The Universal Source of Data Science Trauma
Some survival tips for all the crazy out there
tooling (52)
-
Please write docs with a theory of mind
It really shows when people don't do this
-
Daylight savings broke software (again) and other Time News!
Every so often I just have to indulge in this topic
-
Data work in the fast fashion code era
There's probably more upsides for us than SWEs
-
Counting the planes overhead
Through the magic of RADIO WAVES~~~~
-
The show must go on (busted PC edition)
It does go on and on and on ...
-
Doing SQL work with LLM aids as a SQL addict
It could be better, it could be worse
-
Basic logs analysis via code
Another episode in the ongoing logs analysis primer series
-
Vibe coding is delayed pain
The fact that it CAN work, but only with a ton of aggravation is the worst part of the whole thing...
-
Things to consider when modifying your data systems
As analysts, we push data systems until they creak and groan. Then we modify them to not do that. Then repeat. Here I'm just thinking through the process.
-
A personal tool map
Every tool I can remember owning, and what I use it for.
-
A year after Substack, good and bad
An update for a year of changes. In hopes anyone wanting to write more can get something out of it
-
ML vs code rot
I tried to run some ML stuff for the first time in ages! And I failed! r-o-f-l
-
Measurement of life things has gotten really cheap
It's never been easier, and cheaper, to measure things in our world, just to actually do it and learn how annoying it can be.
-
Using Old Tutorials for New Tricks
Learning new programming languages is always tricky. But using tutorials from a language you already know to learn another one makes things surprisingly easier.
-
Data tool design as a sign of field dynamism
Post for subscribers -- I think the way we still want our many data tools to interop via common file formats is a sign of dynamism and optimism of future innovation
-
Last minute eclipse travel planning for data nerds
Using tools to figure out where to go in all the chaos
-
Networking basics for data work
Just enough networking knowledge to do data work and debug your way out of tight spots.
-
The mundane parts of migrating a newsletter
Normcore self-hosting a newsletter
-
Work in progress: a newsletter move
Working on it!
-
Tools For Fighting The Data Monster
It's dangerous to go alone...
-
In the future, we'll work with unstructured data at ~Scale~
Where are the limits? Edge cases?
-
Reflecting on our toolbox mastery
And is it "enough"?
-
We still build tools and should share them more
Because we're not selling products
-
Data DIY in the age of industrial pickaxes and shovels
Talking about using our tools instead of the tools themselves
-
How the heck does one measure color?
This is SO HARD!!!!
-
Data science has a tool obsession
That we need to balance out
-
Some brief musings on SQL abstractions
They're wonky creatures
-
What if every dashboard self destructed
Would it be so bad?
-
Making do with the infra we got
Because what choice do we have?
-
Optimizing for personal portability
Instead of mastering a thing
-
Hidden treasure in the timezone database
It's like a storybook
-
Skill Windows into the Data World
Psst... run a survey
-
Every org has a "literature"
And it's a giant mess
-
Doing better with Excel
Shooting slightly less of your foot off
-
Our lives are nested orders of operation puzzles
Practice practicing!
-
Why we reluctantly work with regex
Because, sometimes there's no choice
-
Let's play with some Library of Congress data!
And learn about learning about data sets
-
My favorite file format is ~50 yrs old
tl;dr CSV (and friends)
-
MapReduce for normal folk who don't need it anymore
Because we're in a post-MR world
-
(Maybe) Adopting New Tech
Not everyone's on the bleeding edge
-
Audio silliness in the era of videoconferencing
Mic mic revolution
-
Character Encoding, Part 3 of 3 — Gotchas while working with Unicode
Off by one errors are common in programming, right?
-
Character Encodings — The Pain That Won’t Go Away, Part 2/3: Unicode
This is supposed to save us all, it’s merely the best we’ve got so far
-
Dates, Times, Calendars— The Universal Source of Data Science Trauma
Some survival tips for all the crazy out there
-
Learning SQL 201: Optimizing Queries, Regardless of Platform
TL;DR: Disk is Frickin’ Slow. Network is Worse.
-
It’s OK to use spreadsheets in data science
Because they’re awesome in a bunch of messy data science contexts.
-
Optimization is way out of scope for this rant so… yeah.
Sometimes, index or no index, you’re forced to make some really ugly queries out of necessity given unruly tables or bizarre business…
-
Can we stop with the SQL JOINs venn diagrams insanity?
Really, please, OMG, stop
TopPost (46)
-
Code review for data (and non-SWE) folks
TL;DR is, your job as a reviewer is to help the reviewee come up with better code than they initially did.
-
Your data work is part of work politics
And there's no escaping it
-
DataBS-Conf planning updates and how scrappy conferences are made
Just about a decade of conference organizing smushed into one post
-
Everyone is capable of doing logs analysis
It all starts from something most of us have. Just powered up and tooled up.
-
Labeling things by hand when everyone's trying not to
The hot stuff right now is labeling with LLMs, but y'know, maybe consider not?
-
The importance of looking outside of "the data"
Big shoe brand crashes in value due to a big strategy shift that relies heavily on data... so maybe we should pay attention to it a bit.
-
Talk summary: Designing experiments for maximizing getting things done
A text transcript of my Quant UX Con 2024 talk for those who prefer text over video.
-
We should talk about theory to teams more
Industry tends to not care about theory, but we should talk about it a bit more because many aren't averse to it when we talk about it right.
-
Everything's small data again
There once was a time where working with lots of data was so awkward, it became a giant hype cycle. Now, almost nothing is "big" any more.
-
Don't expect data to change everything like magic
Many people joining the data science field harbor the belief that they're gonna help change how things work, make stuff magically better. Sadly, we all learn that it's much harder than that.
-
Don't take optimization to a growth fight
Nor vice-versa. We need the right tools for the job.
-
What the heck is "Insight"?
It's something we "deliver" but what is it anyway?
-
(Human) networking is making friends
TL;DR: Spread love, joy, or at least helpfulness
-
Solving the problems in front of you
A PyData NYC 2023 talk
-
How people were (math) precise before modern times
Because thinking about LLMs
-
It's still worth running a small server in 2023
For the education!
-
Calls to actions, "like and subscribe", are (sadly) necessary
However much I hate them, you can't avoid them
-
Data scientist, working without data
Because of course it'll happen some day
-
There's dashboards all around us that are USED
And they look nothing like the ones I make...
-
How do we actually "pull stories out of data"?
It's hard to describe and very branchy
-
How to measure a subcontinent
Before GPS or even lasers
-
Data driven doesn’t mean data is driving
Beep Beep~~~
-
Data Management is Context Management
We are destroyers and preservers
-
Hidden treasure in the timezone database
It's like a storybook
-
What Data Folk Were Saying about Zillow
Not a "What happened?" post no one needs to read
-
Go collect some $#*(&% data
You need to. Seriously.
-
Let's Get Intentional About Documentation
Because I doubt most people, myself included, are
-
Data Cleaning IS Analysis, Not Grunt Work
Also, most data cleaning articles suck
-
Time is annoying? Time durations are worse!
Shocker: Stretching a painful thing out doesn't make it less painful!?
-
In search of "Good Enough" data science
I still can't find it at times
-
Let's Talk Rice Measurements
The History of a little cup
-
Email Analytics: More than you ever need to know
Black boxes are hot again
-
Be Yourself: The Data Scientists You See In Public Are Not Representative
It needs to be said regularly — it’s OK to be nothing like them
-
The User-Agent — That Crazy String Underpinning a Bunch of Analytics
It’s surprising to think we rely on this as much as we do
-
Dates, Times, Calendars— The Universal Source of Data Science Trauma
Some survival tips for all the crazy out there
-
Learning SQL 201: Optimizing Queries, Regardless of Platform
TL;DR: Disk is Frickin’ Slow. Network is Worse.
-
Data Science Practice 101: Always Leave An Analysis Paper Trail
It’ll save your butt. Lots.
-
Trap DS Projects: Beware of “Easy” Segmentation Projects
99% chance you’re not ready
-
Can we stop with the SQL JOINs venn diagrams insanity?
Really, please, OMG, stop
-
Data Science foundations: Know your data. Really, really, know it
Know your data, where it comes from, what’s in it, what it means. It all starts from there.
-
Succeeding as a data scientist in small companies/startups
It’s nothing like at a big mature company.