Measurement of life things has gotten really cheap
Recently, our car had a warning light for tire pressure come on. We'd later learn that we had run over a screw, but at the time I realized that the only tire pressure gauge that I owned was a really badly inaccurate one on a hand-operated bike pump. So, off to the internet I went to find a tire pressure gauge, expecting to get one of those pop-up stick gauges that my parents used on their car decades ago.
While those pencil-sized gauges were definitely the cheapest option (under $5) and stated to be accurate within 2%, I was surprised at the sheer amount of choice I had available to me. For less than $10 I could get a digital one with stated accuracy to 0.1 PSI (even if I don't fully believe that last digit). Getting a fancy one that can double as an inflator when attached to an air compressor cost me all of $25.
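To put those two specs side by side, here's a rough sketch of the arithmetic. It assumes the 2% figure is a percentage of the reading (some gauges quote percent of full scale instead) and uses 35 PSI as an illustrative target pressure. Neither detail comes from the product listings.

```python
def error_band_psi(reading_psi: float, percent_accuracy: float) -> float:
    """Worst-case error in PSI for a gauge rated as a percentage of the reading."""
    return reading_psi * percent_accuracy / 100.0

# Hypothetical target pressure; check the placard on your own car.
target_psi = 35.0
stick_error = error_band_psi(target_psi, 2.0)

print(f"Stick gauge: {target_psi} PSI +/- {stick_error:.1f} PSI")
print(f"Digital gauge (as claimed): {target_psi} PSI +/- 0.1 PSI")
```

Under those assumptions the cheap stick gauge is good to well under 1 PSI, which is plenty for deciding whether a tire needs air.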
Our modern world is awash with ridiculously accurate measuring tools. Plus, if you don't care about getting an accuracy certification from an actual legit certifying body, meaning you aren't doing things that require you to know your measurement margin of error, you can get such devices for ridiculously low prices. Sure, the knockoff clones of good equipment have much more uncertainty about the values they give, but they're still pretty darned close.
Kitchen thermometers and scales used to be only found in commercial kitchens but now good ones will only set you back a small amount and make your recipes so much more predictable. Even the cheapest plastic ruler is, for all practical purposes, closer to the national length standards than my ability to hold a ruler up to something to measure accurately. Laser levels, angle finders, laser rulers, digital calipers, sound pressure meters, light meters, anemometers, Geiger counters, even temperature data loggers that track data regularly for a whole month all cost less than $50 for the "good enough for non-certified use" grade stuff. Heck, my sound pressure meter hooks up to my phone and does come with an accuracy certificate, and cost me all of $40.
For more money you can get even more specialized contraptions for measuring things. As an example, a ~$200 traffic/people counter can attach to a doorway and count people entering a space. Some use two sensors and can track people entering/leaving. Air quality sensor packages are in a similar price range and cover all sorts of pollutants. Full-blown home weather monitoring stations with a mix of functionality are just a bit more expensive.
Also, if you want more flexibility and automation, there are somewhat more expensive Data Acquisition (DAQ) systems (decent overview here) that connect sensors that speak standard protocols to your computer for logging and analysis. Using such systems requires a bit more learning and money than simple plug-and-play data loggers, but they're more flexible for it. There are efforts to make data loggers using Raspberry Pi hardware, though once you acquire all the parts it might not be any more cost effective than a commercial solution.
It is easier to measure and understand our world now than at any prior point in history, and many of us don't really even notice because the tools are so ubiquitous and cheap. What this means for us data people is that the cost of data collection at the personal level has fallen through the floor. We can personally collect data about our world to great accuracy, without the need for industrial funding, for whatever uses we can dream up (even if they're super mundane and overkill!).
And what can we do with such data? Learn stuff! Because one of the trickier things about learning data science is appreciating how FREAKIN' HARD it is to collect data. Most of our professional lives, the data is just dropped into our laps, pre-generated and pre-filtered by many layers of engineering decisions. Occasionally we may be asked to collect our own raw data in the form of a survey and already many of us (myself included) moan and complain about the huge amount of cleaning work we need to do to get the data to a usable state.
So guess what happens if we want to do something as simple as logging the temperature of the inside of our refrigerator over time? We have to face the reality of natural sensor jitter, how different parts of a refrigerator have different temperatures, and how opening the door causes the data to jump. Doing analysis on that data is going to require you to make some pretty arbitrary decisions in the name of cleaning.
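Here's a minimal sketch of what those arbitrary decisions can look like in practice. The readings are made up, and the function names, window sizes, and 1.5 degree threshold are all my own illustrative choices rather than any standard. It smooths jitter with a rolling median and flags readings that jump well above the recent baseline, which is one plausible stand-in for "the door was open".

```python
from statistics import median

def rolling_median(values, window=5):
    """Smooth jitter with a centered rolling median (window shrinks at the edges)."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]

def flag_door_events(values, baseline_window=15, threshold_c=1.5):
    """Mark readings that sit well above the median of the preceding readings."""
    flags = []
    for i, v in enumerate(values):
        history = values[max(0, i - baseline_window): i]
        # With no history yet, there is nothing to compare against.
        flags.append(bool(history) and v - median(history) > threshold_c)
    return flags

# Made-up readings in Celsius, one per minute; the bump imitates a door opening.
readings = [3.9, 4.1, 4.0, 4.2, 3.8, 4.0, 6.5, 7.1, 5.2, 4.3, 4.1, 4.0]

smoothed = rolling_median(readings)
door_open = flag_door_events(readings)
kept = [r for r, is_open in zip(readings, door_open) if not is_open]

print(f"Mean of all readings: {sum(readings) / len(readings):.2f} C")
print(f"Mean with flagged readings dropped: {sum(kept) / len(kept):.2f} C")
```

Notice that the 5.2 reading on the tail end of the bump slips under the threshold. Whether that counts as "door open" or "normal fridge" is a judgment call, and every choice like it shifts the final number.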
Try to make a crappy DIY weather station with your own sensors and you'll quickly realize why places like the National Weather Service have established standards for where to place a temperature sensor to get a valid reading. Something you can almost completely ignore while using historic temperature data (unless you find an anomaly in measurement methodology and want to build a case to invalidate a temperature record from 1922).
One long-standing bit of advice I give to people who plan to work with data over their careers is that they need hands-on experience in collecting data for their own use to get a taste of just how much happens at that stage of work. People who don't eventually engage with the inevitable problems of data collection are going to get burnt by anomalous data that they can't explain without digging deep into what's often a messy, undocumented, ad-hoc process.
That sort of experience tends to permeate into every aspect of how you work with data. I've noticed that when I'm interviewing people for UX researcher positions and the topic of data collection comes up, it's very obvious when someone's learned that lesson because it permeates every step of their workflow. They'll make efforts to understand the raw data generation process, they'll check for "impossible" values that engineers are confident can't be generated, they'll do exploratory analysis to check assumptions before even considering what method they'll apply. All of this is stuff you never learn in a data analysis class.
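As a rough illustration of that habit, here's a small sketch of the kind of sanity check I mean, run before any real analysis. The function name, the plausible range, the expected logging interval, and the sample rows (including the -127.0 "error code") are all hypothetical.

```python
from datetime import datetime, timedelta

def audit_readings(rows, low, high, expected_gap):
    """Return (out_of_range, gaps) for a time-ordered list of (timestamp, value) rows."""
    # "not (low <= v <= high)" also catches NaN, which fails every comparison.
    out_of_range = [(ts, v) for ts, v in rows if not (low <= v <= high)]
    gaps = [
        (prev_ts, ts)
        for (prev_ts, _), (ts, _) in zip(rows, rows[1:])
        if ts - prev_ts > expected_gap
    ]
    return out_of_range, gaps

# Made-up room temperature log in Celsius, nominally one reading per minute.
start = datetime(2024, 1, 1, 8, 0)
rows = [
    (start, 21.5),
    (start + timedelta(minutes=1), 21.6),
    (start + timedelta(minutes=2), -127.0),  # hypothetical sensor error code
    (start + timedelta(minutes=9), 21.4),    # seven-minute hole in the log
]

bad_values, gaps = audit_readings(
    rows, low=-10.0, high=50.0, expected_gap=timedelta(minutes=1)
)
print(f"{len(bad_values)} impossible value(s), {len(gaps)} gap(s) in the log")
```

Neither check is clever. The point is that they get run at all, before anyone trusts an average computed from the file.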
So back to cheap sensors. Go grab some and just play with them for the sake of just knowing a bit more about your immediate surroundings. Temperature, humidity, and sound pressure (loudness) are all very accessible and immediately relevant to your life experience. Many people picked up air quality monitors due to concerns about wildfires or COVID. Log that data down with pen and paper if you have to. Then just use it to get some small, probably trivial, insight about your surroundings.
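One wrinkle if you pick sound: decibels are a logarithmic scale, so summarizing a log of readings with a plain average understates the loud moments. A common alternative is to average on the energy scale and convert back. This is a sketch with made-up readings and a function name of my own, not a description of what any particular meter does internally.

```python
import math

def energy_average_db(levels_db):
    """Average decibel readings on the underlying energy scale, then convert back."""
    mean_energy = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)

# Made-up readings: three quiet moments and one passing truck.
readings_db = [50.0, 50.0, 50.0, 80.0]

plain_mean = sum(readings_db) / len(readings_db)
energy_mean = energy_average_db(readings_db)

print(f"Plain average:  {plain_mean:.1f} dB")
print(f"Energy average: {energy_mean:.1f} dB")
```

The two summaries disagree by a wide margin on this toy data, which is a small reminder that even "take the average" is a decision you have to make about your own data.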
What do I mean by trivial insight? Stuff that is a cute little factoid to know, even if it has no practical use. How loud IS the traffic outside my window during the day, and can I even measure a change over time? How do different corners of the room get warmer/cooler differently across the day? It's something you need to put in some work to find the answer to. Maybe it'll help you make a decision in the future. Or maybe it'll give you the world's most boring bit of small talk.
Totally worth it.
Standing offer: If you created something and would like me to review or share it w/ the data community — just email me by replying to the newsletter emails.
Guest posts: If you’re interested in writing a data-related post to show off work or share an experience, or if you need help coming up with a topic, please contact me. You don’t need any special credentials or credibility to do so.
About this newsletter
I’m Randy Au, Quantitative UX researcher, former data analyst, and general-purpose data and tech nerd. Counting Stuff is a weekly newsletter about the less-than-sexy aspects of data science, UX research and tech. With some excursions into other fun topics.
All photos/drawings used are taken/created by Randy unless otherwise credited.
- randyau.com — Curated archive of evergreen posts. Under re-construction thanks to *waves at everything