Thursday data thought: teaching, tutoring, and upskilling others
For this Thursday's Subscriber post, I've got some thoughts bouncing around my head that might eventually turn into an actual post but need some thinking-out-loud to see where they actually go. So this is a reflection of the swirl of anecdotes and idea exploration that comes ahead of a formal post. Comments, opinions, and thoughts from everyone would be super helpful.
Ever since high school I've known something about myself: I can make a decent tutor, but would make an utterly terrible teacher. The reason is quite simple. While I can quite passionately go on for hours explaining, demonstrating, and doing activities on a bunch of topics, I simply have no idea how to teach people who aren't self-motivated enough to learn. I don't know how to do all the hard stuff, like figuring out where student skill levels are or identifying knowledge gaps. I most certainly have no idea how to convince someone who doesn't care to cram things into their head to at least pass a test.
So one day, it occurred to me that I've "tutored" a bunch of people on working with data at the various jobs I've had. In a big enough group, there's almost always a very small handful of people who actively want to learn how to write their own SQL queries, build mini data analysis pipelines, or crunch some interesting numbers in spreadsheets. Most of those people eventually managed to learn what they set out to learn, and it helped in their work. Occasionally they'll hit something they're not confident about and come for some advice.
Usually, I'd pat myself on the back for having helped a handful of people become more data-literate. They can pull data, they have a sense of the importance behind counting the right things, and they have a rough idea of what makes a good experiment. While they can't stand in for a data analyst, they know to ask good questions of a data analyst. Even if I can't do the "democratize data" thing to a broad group, I've helped nudge it forward slightly at the individual level.
But now I wonder whether I even succeeded at that. While it's indisputable that I did teach those people useful skills, the foundations upon which all those skills rest are of uncertain quality. There are bound to be gaps that might be dangerous. For example, while they know that they should have random samples for a test, they might not have the intuition about why randomness matters. They might accidentally bias their sampling while trying to be clever. As a tutor, I'm just filling in the knowledge gaps that happen to present themselves in the moment, like propping up an incomplete house in the woods. It's all patching without checking if the foundation is any good.
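To make the sampling worry a bit more concrete, here's a minimal sketch of the kind of "clever" mistake I have in mind. Everything in it is made up for illustration: the users, the engagement scores, and the `simulate` function are all hypothetical, and the "treatment" deliberately does nothing at all. The only thing that changes is how people get put into the test group.

```python
import random
from statistics import mean


def group_diff(outcomes, in_treatment):
    """Mean outcome of the treatment group minus mean outcome of the control group."""
    treated = [o for o, t in zip(outcomes, in_treatment) if t]
    control = [o for o, t in zip(outcomes, in_treatment) if not t]
    return mean(treated) - mean(control)


def simulate(n_users=10_000, seed=42):
    rng = random.Random(seed)

    # Hypothetical users: each has a baseline engagement score.
    engagement = [rng.gauss(50, 10) for _ in range(n_users)]

    # The outcome is just engagement plus noise. The treatment has NO effect,
    # so any measured "lift" comes from how the groups were formed.
    outcomes = [e + rng.gauss(0, 5) for e in engagement]

    # Random assignment: a coin flip per user.
    random_groups = [rng.random() < 0.5 for _ in range(n_users)]

    # "Clever" assignment: only treat the more engaged half of users,
    # e.g. because they seemed more likely to notice the change.
    cutoff = sorted(engagement)[n_users // 2]
    clever_groups = [e >= cutoff for e in engagement]

    return group_diff(outcomes, random_groups), group_diff(outcomes, clever_groups)


if __name__ == "__main__":
    random_diff, clever_diff = simulate()
    print(f"random assignment: measured lift = {random_diff:.2f}")
    print(f"clever assignment: measured lift = {clever_diff:.2f}")
```

With the coin flip, the measured lift should hover near zero, which is the right answer here. With the hand-picked group, a big "lift" shows up even though nothing was changed, because the groups were different before the test ever started. Someone who only knows the rule "use random samples" has no reason to expect that.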
The same thing goes for when I teach people SQL, or basic programming skills. I'd teach enough to get the job done, as well as do my best to point them to more general concepts and applications. But there's a whole untold sea of knowledge that gets skipped over.
So the thought nagging at me is, have I actually helped teach anyone how to fish? Or did I just give people a live fish in a bag of water, so they feel like they fished a bit but haven't actually learned much? This would be so much easier if data science work wasn't so damn interdisciplinary. Everything hangs off everything else, so there's so much extra stuff to cram in. The more I ponder it, the more impossible it seems to tutor someone into being a junior data scientist on a part-time basis. I can't think of any good way to bring someone up to speed without just throwing them into the deep end of the pool and letting them soak in the complexity.
So while all this is swirling around in my head, along come the constant efforts I see to "democratize data", and the debate over whether putting BI-style guard rails on point-and-clicky dashboard systems like Looker or Tableau might encourage, or somehow teach, people to work with certain kinds of data for specific situations. You'd think my doubts about teaching data skills would mean I lean towards systems with strict guard rails, but I also detest having to set such systems up. It'd be so much better if people and processes were data-literate enough to make use of more flexible systems.
I don't particularly want to re-hash the endless debate about democratizing data. I think those programs always swing back and forth between extremes and create a lot of organizational churn in the process. What I'm more interested in is coming to a better understanding of how data fundamentals can be assessed and then taught to interested folk. Essentially, what's a good fundamental curriculum for the work that we do, and how do I detect gaps in those fundamentals in the people I'm trying to help out?
Yes, there are lots of courses and syllabi for data science work nowadays. But even if I sift through a bunch of them to see what the patterns are, I'm still missing the "evaluation" bit.
Look, teaching's an art that I most certainly don't have talent in. But that doesn't stop me from putting at least some thought into it...