Somehow we're in an age of spellcasting now
This cycle's Thursday subscriber post is just a peculiar observation that has stuck in my brain and is refusing to let go.
As a nerdy kid of the '90s I grew up reading a bunch of fantasy books, as is tradition. Even now, while I read significantly fewer books than I did in those days, I still keep up with interesting anime broadcasts, which of late have offered a decent selection of high fantasy stories worth following.
When it comes to magic in fantasy, there's a handful of tropes that magic systems tend to follow. For example, there's the "advanced technology from lost civilizations = magic" trope, as well as the "magic = a mysterious system of rules and incantations that must be followed to get an effect" trope. The details of course depend on how technical and complicated the author wanted to make things.
One pretty common concept is that magic is an external thing with its own internal rules and logic separate from normal experience. Magicians do research and study to poke at the limits and figure out what can be done with the system. Depending on the story, there's dogma, rituals, and incantations that have been passed down without anyone asking "why is this necessary?" Actually knowing how the magic system works at a fundamental level is often impossible, or seen as the realm of superior beings.
Of late, when I look at modern society and recall these well-worn tropes, I think "that's prompt engineering". You have huge swathes of people treating this literal mathematical/computational system we engineered as a black box where, if you utter the correct spells, it will manifest magical results into the real world. The past couple of days have seen people mocking Marc Andreessen's system prompt, which includes incantations like "don't hallucinate" and "check your work". If you actually understand how LLMs function, you know those statements have about as much effect as telling an intern doing a project "you cannot be wrong". At best, some of those sentences trigger weights in the model that steer it toward a blob of probabilities that engages in less fiction writing. In other cases there might be behavior wired into the system, like running a RAG search, to make an attempt at grounding responses. At its worst, it is a form of self-delusion: trying to make a model do something it is fundamentally incapable of doing.
Either way, I'm not an LLM expert, and in this case don't particularly care whether telling the thing not to hallucinate actually has an effect on the output. Instead, I stand in fascination watching the humans in this human-computer interaction scenario, not the computer.
I find it really funny that, for all the talk from AI supporters about creating a thing that is "like magic", they have indeed created a thing that people interact with like magic – an unknowable system that can only be manipulated with incantations – without actually creating anything besides a device that mechanically chains words together according to probabilities.
I'd be willing to bet that if you asked people what a magical AI experience would feel like, they would talk about how it understands their needs and automates boring work away. They most definitely wouldn't gush about the interaction taking place via a lengthy prompt. So there's a big disconnect going on.
As someone who is interested in humans, users, and UX, while being decidedly not interested in the actual model training and building details, I find this peculiar result fascinating. It's a giant mirror held up to society at large, reflecting back some of the weird things about being human.
Another interesting thing about these LLM systems is that there are two separate ways to manipulate the output. The obvious way is to train the model weights directly; the other is through prompting. There's a big asymmetry in feasibility between the two that I think is contributing to the flourishing of prompt madness we're seeing.
Modern LLMs are completely impractical to train from scratch for anyone without ridiculous amounts of money and resources. At best, individual practitioners can update and patch a pre-trained model. That leaves the majority of potential users with no way to change how models behave outside of prompt manipulation. If there are two levers for manipulating the model, but one requires $100 million in compute, the vast majority of people will pull frantically at the remaining lever.
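To make that asymmetry concrete, here's a minimal sketch – pure illustration, with no real model attached and a made-up tag format – of what the cheap lever actually is: the incantation is just text glued onto the model's input.

```python
# A minimal illustration of the cheap lever: the system prompt is just text
# prepended to the user's message before any of it reaches the model. The
# [SYSTEM]/[USER] tags are made up; real chat models use their own tokens.

def build_model_input(system_prompt: str, user_message: str) -> str:
    # Both levers converge here: the expensive one changed the weights that
    # will read this string; the cheap one only changes the string itself.
    return (
        f"[SYSTEM]\n{system_prompt}\n"
        f"[USER]\n{user_message}\n"
        f"[ASSISTANT]\n"
    )

incantation = "Do not hallucinate. Check your work."
print(build_model_input(incantation, "Summarize this contract for me."))
```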
This kind of "LLM is a black box we invoke with our spells" viewpoint isn't just limited to naive users. There are endless papers being published about how this or that prompting method somehow makes an LLM perform better in certain ways – the various chain-of-thought prompting papers from 2022, for example. These papers are written by researchers who actually understand the underlying models and how they're constructed. Despite that knowledge, the little subarea studying how prompts manipulate the output of LLMs is still seen as a serious line of inquiry, and its results ultimately informed a lot of the weird incantations that end up in system prompts. We don't know exactly how various prompt tricks do what they do, but the effects show up in the benchmark tests.
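For a sense of how thin the "method" in some of these papers is, here's a hedged sketch of zero-shot chain-of-thought prompting in the spirit of Kojima et al. (2022); the stand-in `fake_model` function is mine, not any real API.

```python
# A sketch of zero-shot chain-of-thought prompting: the entire technique is
# appending a trigger phrase to the question. `fake_model` is a placeholder
# for whatever completion endpoint you'd actually call.

def fake_model(prompt: str) -> str:
    return f"(model continues from: {prompt!r})"

def ask_with_cot(question: str) -> str:
    # "Let's think step by step" is the canonical zero-shot incantation
    # from Kojima et al. (2022); it measurably moves benchmark scores.
    return fake_model(f"Q: {question}\nA: Let's think step by step.")

print(ask_with_cot("If I have 3 apples and eat one, how many remain?"))
```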
And sometimes, the incantations that survive are mind boggling. Just a couple of weeks back, OpenAI put out an entire blog post about how "goblins" (and other creatures) kept cropping up in certain responses, enough so that the devs had put firm instructions in their system prompt not to mention those creatures. The corner of the internet I watch was briefly abuzz with quotes of this system prompt snippet:
"Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query."

The modelers at OpenAI say they found a way to remove the constant mention of certain animals in the training phase, so in theory that explicit exclusion isn't strictly necessary any more for certain models. But since the model and the system prompt are separate things, the instruction can stick around for much longer. Presumably, in a few months or years, it will just permanently become part of the incantation and no one will remember why it is even there.
At this point, if you care to poke around, you'll easily find clusters of people trading "tips" for prompting LLMs to accomplish various things, whether it's writing software, classifying text, or whatever. These are completely distinct from the academic discussion, which is grounded in somewhat more rigorous benchmarks and tests, but the end result is going to be the same – clusters of bespoke incantations that people believe do something better. These will be passed down and take on a weird life of their own.
If people keep adopting these tools to the point where their base skills atrophy, we're going to see even more wild incantations sprout up. It is going to be a very wild ride as society at large figures out what it wants to do with this technology.
Incidentally, this 2026 spring season's high fantasy anime of note for me is "Witch Hat Atelier". Some of the animation for spells is particularly breathtaking. The magic system employed there is based on drawing various symbols with a special ink, and the study of magic involves learning how to draw and combine symbols of various meanings to achieve the properties you desire.