In economics, there’s a market failure concept called a “market for lemons”, named after George Akerlof’s famous 1970 paper.
Imagine I have a pretty good used car I’m selling to you. It’s a nice car! The problem is, I can’t prove it’s a nice car. For all you know, my car might be a lemon: a used car that seems fine on the day of purchase, but actually has several problems and breaks down days later, at which point I’ve taken your money and disappeared into the night.
My car isn’t a lemon. It’s a peach. But because I can’t prove it’s not a lemon, I can’t charge peach prices. This sucks for everyone trying to sell a peach.
The average used car price in the market depends on the ratio of peaches to lemons, because every buyer’s offer is discounted by the risk of purchasing a lemon. Say a peach is worth $10,000 and a lemon is worth $2,000: if one car in five is a lemon, a rational buyer will offer at most 0.8 × $10,000 + 0.2 × $2,000 = $8,400 for any given car. As long as there are even a few lemons in the market, the going price for a peach will always be lower than what it’s actually worth.
So if I can’t sell a peach for what it’s worth, I’m just going to take it off the market. I may as well give it to a family member, or drive it till it dies, or sell it privately, off the open market.
And now we have one less peach in the market, which increases a buyer’s risk of purchasing a lemon, which drags the average selling price down, which means more peaches leave the market, and so on until eventually everything collapses and you’re left with a market for lemons.
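That death spiral is easy to watch in a toy simulation. Here’s a sketch with made-up numbers (not Akerlof’s actual model): buyers offer the expected value of a random car, peach owners drift out whenever the offer falls below what their car is worth, and the offer ratchets down until only lemons remain.

```python
# Toy adverse-selection spiral, with invented numbers.
PEACH_VALUE = 10_000  # what a good used car is actually worth
LEMON_VALUE = 2_000   # what a lemon is actually worth

peaches, lemons = 80, 20

while peaches > 0:
    # A risk-neutral buyer offers the expected value of a random car.
    offer = (peaches * PEACH_VALUE + lemons * LEMON_VALUE) / (peaches + lemons)
    print(f"{peaches:>2} peaches, {lemons} lemons -> buyers offer ${offer:,.0f}")
    if offer >= PEACH_VALUE:
        break  # peach owners would happily sell; no spiral
    # The offer is below peach value, so (for illustration) a few more
    # peach owners pull their cars off the market each round.
    peaches -= 10

print("All that's left is a market for lemons." if peaches == 0 else "Stable.")
```

Each round the offer drops, which pushes more peaches out, which drops the offer further. The loop is the paragraph above, in code form.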
The reverse Turing test
Years ago, 4chan proposed the dead internet theory: the internet is “empty and devoid of people” and the U.S. government is leveraging AI to gaslight the entire world. Of course, the internet is (probably) not a government psyop. But the milder form of this theory is that we’re overrun by bots, and most of us assume that’s true at this point. Just look at an Instagram influencer’s followers list.
Over the last year or so, we’ve been deluged with LLM tools: ChatGPT, Bard, Jasper, Moonbeam, and so on. All of these tools are good at generating coherent human-ish text. Some are very good.
Until now, all forms of spam, catfishing, social engineering, scamming, and brigading have been mostly bottlenecked by humans. You can duplicate a message and blast it to half a million people, but that’s easy to spot. There’s always been a sort of trade-off between quantity and quality with spam.
But with LLMs, the spam game has changed.
You thought the first page of Google was bad before? We now have a Google where SEOs pump out billions of extremely dull articles, all with the same talking points and optimized for every long-tail keyword combination possible. And that’s before we account for Google’s own generative AI search results.
Now anyone can relentlessly auto-publish a banal stream of LinkedIn posts, content marketing articles, and tweet threads with the OpenAI → Zapier pipeline. Connect ChatGPT + Stable Diffusion + EbSynth + Murf.ai to publish an endless stream of YouTube videos or TikTok clips, then automatically regurgitate it for every platform you can think of.
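Here’s roughly what one leg of that pipeline looks like; a sketch using the OpenAI Python SDK, where the topic list, the model name, and the publishing step are all hypothetical stand-ins for whatever Zapier or a platform API would do:

```python
# Sketch of an auto-publishing loop. The topics and the final publish step are
# hypothetical; a real pipeline would hand the text to Zapier or a platform API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

topics = ["productivity hacks", "AI trends", "leadership lessons"]

def generate_post(topic: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model works; this name is an assumption
        messages=[
            {"role": "system", "content": "You write upbeat LinkedIn posts."},
            {"role": "user", "content": f"Write a 100-word post about {topic}."},
        ],
    )
    return response.choices[0].message.content

for topic in topics:
    print(generate_post(topic))  # a real pipeline would publish, not print
```

The point isn’t the code; it’s that twenty-odd lines and a cron job replace what used to take a content team.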
“But surely,” you might protest. “Surely people can differentiate humans from bots. I know I can.”
Okay. How would you prove you’re not an LLM? What special little human tricks can you do that a language model can’t?
How do we identify lemons when they’re indistinguishable from peaches?
1. Have an opinion
Language models learn from text around the web. Some people recycle this into “original” creations, which then train other models, which humans use for more creations, and on we go, recycling generic ideas and arguments like an ouroboros of bad advice. In the Stable Diffusion community, this is known as “incestuous merging”.
Ending this cycle requires coming up with an actually original thought or opinion.
(To be clear: having an opinion doesn’t mean being angry or contrarian, though that’s worked for Cards Against Humanity. It just means you’re not creating saccharine content that everybody agrees with all the time. If nobody disagrees with what you’re creating, then there’s no conversation to be had.)
2. Lean into linguistic quirks
The linguist Ferdinand de Saussure distinguished two aspects of language:
- La langue, the formal language. These are words we print in the dictionary and teach in school. La langue existed before you and it will exist long after you.
- La parole, the speech of everyday life. La parole is as diverse and varied as the people who speak it. It’s Thais using “555” for “hahaha”, emojis and slang, creole languages that mash together multiple languages. This is where language evolves.
No language model can keep up with la parole. Using jargon, emojis, unusual phrases, and memes signals humanity.
3. Retreat into the cozyweb
In the 90s, there was the Internet and various intranets. Over time, social media (but let’s be real, mostly Meta) took over and turned the clearweb into an amorphous, indistinguishable platform.
The cozyweb is a modern intranet: instead of a purpose-built walled garden, the cozyweb is assembled more organically. There is no single login page or point of entry. It’s the sum of various Discords and X (formerly known as Twitter) profiles and links copy/pasted into a private Slack channel.
Private socialization will only become more prized. People are already migrating into micro-communities like subreddits and Discord chats, but those have the downside of being open to anyone who knows they exist. Private groups have the advantage of being invite-only.
On the other hand, as private groups splinter and fracture into even smaller groups, you risk echo chambers and further isolation (the narcissism of small differences).
4. Some sort of human identification?
We’ve used CAPTCHAs to screen out robots. What if we reversed that and implemented a way to positively identify humans? Before crypto, proof of work was an anti-spam technique: Hashcash made email senders pay a small price in CPU time per message.
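For reference, proof of work (Hashcash, which Bitcoin later borrowed) fits in a few lines. This is a sketch of the general shape, not the actual Hashcash stamp format: the sender burns CPU hunting for a valid nonce, and the receiver verifies it with a single hash.

```python
# Hashcash-style proof of work: minting a stamp is expensive,
# verifying one costs a single hash. (General idea, not the real format.)
import hashlib
from itertools import count

def mint(resource: str, bits: int = 20) -> str:
    """Grind for a stamp whose SHA-256 digest has `bits` leading zero bits."""
    for nonce in count():
        stamp = f"{resource}:{nonce}"
        digest = hashlib.sha256(stamp.encode()).digest()
        if int.from_bytes(digest, "big") >> (256 - bits) == 0:
            return stamp

def verify(stamp: str, bits: int = 20) -> bool:
    digest = hashlib.sha256(stamp.encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0

stamp = mint("alice@example.com")  # ~a million hashes on average at 20 bits
print(stamp, verify(stamp))        # checking it is instant
```

That asymmetry, costly to produce in bulk but trivial to check, is exactly what you’d want from any “proof of human” scheme.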
There are several problems with any human-verification scheme, not least that it requires a verifying institution. Who would you trust to do this? A university? A government? Would you have to send someone your ID? What implications would that have for privacy and security?