The price of a single AI token has fallen 1,000-fold in three years. What cost $60 per million tokens in the GPT-3 era now runs at just $0.06 for equivalent performance.
Yet enterprise AI bills have tripled. Prices fell, and spending went up. This paradox has a name: Jevons Paradox. In 1865, British economist William Stanley Jevons observed that improvements in coal efficiency didn't reduce coal consumption; they caused it to explode. That's exactly what we're watching happen with AI.
Here's what everyone assumed
"If AI costs come down, our AI bill will shrink." Sounds obvious, right? If token prices drop by half, you should be able to do the same work for half the cost.
The actual data went the opposite direction. Enterprise AI spending jumped from $11.5B in 2024 to $37B in 2025, roughly 3.2x the prior year (about a 220% increase), while per-token costs dropped 1,000x.
On the a16z podcast, Marc Andreessen called AI "the biggest tech revolution I've ever experienced." He also said "the unit cost of intelligence is collapsing faster than Moore's Law." And then — in the same breath — "the market is still very early".
Those two statements have to be read together. When costs drop, demand explodes — that's the heart of Jevons Paradox. A single agentic workflow burns 50 to 500x more tokens than a simple chat. Cheaper tokens unlock more complex tasks, which require even more tokens. And 72% of real AI costs aren't even in your model invoice — they're in orchestration, retries, and monitoring.
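The arithmetic behind that paragraph is simple enough to sketch. The snippet below is a minimal illustration, not a pricing model: it assumes spend is just price times token volume, and it treats the 72% figure as meaning the model invoice is 28% of total cost. The function names and the $10,000 invoice are hypothetical.

```python
def spend_multiplier(price_factor: float, volume_factor: float) -> float:
    """Spend = price x volume, so the two multipliers compose by multiplication."""
    return price_factor * volume_factor


def total_cost_from_invoice(model_invoice: float, model_share: float = 0.28) -> float:
    """Scale a model invoice up to total AI cost.

    Assumes the invoice is `model_share` of the total (0.28 if 72% of
    cost sits in orchestration, retries, and monitoring).
    """
    if not 0 < model_share <= 1:
        raise ValueError("model_share must be in (0, 1]")
    return model_invoice / model_share


# Token price halves, but a chat is replaced by an agent using 50x the tokens.
print(spend_multiplier(0.5, 50))    # 25.0
# Same price cut at the top of the 50-500x range.
print(spend_multiplier(0.5, 500))   # 250.0
# A hypothetical $10,000 model invoice implies roughly $35,714 in total cost.
print(round(total_cost_from_invoice(10_000)))  # 35714
```

Halving the price only halves the bill if volume stays flat. Once the workload shifts from chat to agents, the volume multiplier dominates.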
China lit the price war on fire
There's one more variable in this equation: China.
Andreessen openly acknowledged on the podcast that "DeepSeek surprised Silicon Valley". And the benchmarks back it up: models from DeepSeek, Kimi (Moonshot AI), Qwen (Alibaba), and ByteDance reached near-parity with Claude 3.5 Sonnet and GPT-4o within just 12 months of those models' release.
| | Top US Models | Top Chinese Models |
|---|---|---|
| Overall Benchmark | Claude Opus 4.6 — 88pts | DeepSeek V4 Pro — 87pts |
| Coding Score | Claude: 93.9% (SWE-Bench) | DeepSeek: 91.2% (SWE-Bench) |
| API Price Gap | Baseline (100%) | ~10–13x cheaper |
| Open Source | Closed source | Open weights, self-hostable |
The performance gap is 5–7% on benchmarks. The cost gap is more than 10x. Andreessen sees this as "reshaping global price competition through an open-source strategy".
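One way to weigh a small quality gap against a large price gap is cost per solved task. The sketch below is a rough illustration under loud assumptions: it treats the SWE-Bench scores from the table as per-attempt solve rates, assumes both models use the same number of tokens per attempt, and assumes failed attempts are paid for but produce nothing. The function and constant names are hypothetical.

```python
BASELINE_SOLVE_RATE = 0.939    # Claude, SWE-Bench score from the table
CHALLENGER_SOLVE_RATE = 0.912  # DeepSeek, SWE-Bench score from the table


def cost_per_solved_task(relative_price: float, solve_rate: float) -> float:
    """Expected spend per solved task when failed attempts still cost money."""
    if not 0 < solve_rate <= 1:
        raise ValueError("solve_rate must be in (0, 1]")
    return relative_price / solve_rate


def effective_cost_advantage(price_gap: float) -> float:
    """How many times cheaper the challenger is per solved task.

    `price_gap` is the raw API price ratio (10 means 10x cheaper).
    """
    baseline = cost_per_solved_task(1.0, BASELINE_SOLVE_RATE)
    challenger = cost_per_solved_task(1.0 / price_gap, CHALLENGER_SOLVE_RATE)
    return baseline / challenger


print(round(effective_cost_advantage(10), 1))  # 9.7
print(round(effective_cost_advantage(13), 1))  # 12.6
```

Under these assumptions the quality gap shaves only a few percent off the raw price advantage. Real workloads may differ, which is why testing on specific workflows matters more than headline benchmarks.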
The age of Chinese AI researchers is also worth noting. As Andreessen mentioned, the lead researchers are 22–24 years old. That's not inexperience — it means they carry no preconceptions about existing paradigms. These teams pushed GPU efficiency to the extreme, achieving comparable performance at a fraction of the cost.
According to a16z's projection, the current GPU shortage will flip to oversupply within five years, as big tech companies build their own chips and AMD and Chinese manufacturers scale production. When that happens, AI unit costs fall even further, and Jevons Paradox goes into overdrive.
Andreessen's core argument
He called AI "the biggest tech revolution of my lifetime" — bigger than the internet, comparable to electricity and the microprocessor. Yet he simultaneously said the market is "still very early". How can hundreds of millions of ChatGPT users still be "early"? His point: the product forms are still immature. Jevons Paradox hasn't even really started — far more use cases still have to emerge before it reaches full force.
What to audit in your business right now
- Break down your AI spend structure
  Separate your model API costs from everything else. 72% of real AI costs live outside the model invoice: in orchestration, retries, and monitoring. If your bill looks wrong, start here.
- Simulate token consumption before deploying agents
  Agentic workflows burn 50–500x more tokens than simple chats. Run a small-scale pilot first, measure actual tokens per task, and use that as your budget baseline.
- A/B test open-source and Chinese models
  DeepSeek V4 Pro is 5–7% behind top US models on benchmarks and 10–13x cheaper on API costs. Start with specific workflows rather than a wholesale switch. Self-hosting is an option too.
- Set monthly AI budget caps
  Uber burned through its annual AI budget in four months and had to impose a $1,500/employee monthly cap. Knowing about Jevons Paradox doesn't help if you have no hard spending limits. Set caps by team and function.
- Reinvest savings into new automation
  Don't book token price savings as cost reduction; reinvest them into new workflow automation. That's how you use Jevons Paradox offensively. If you're not planning this now, a competitor will capture that edge first.
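The pilot-then-cap steps in the checklist above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a billing tool: `plan_budget`, `BudgetPlan`, the pilot token counts, the $3 per million token price, and the task volume are all hypothetical, and the 0.28 model share is just the 72% overhead figure turned around.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class BudgetPlan:
    tokens_per_task: float  # baseline measured in the pilot
    monthly_cost: float     # projected total cost, overhead included
    within_cap: bool        # True if the projection fits the cap


def plan_budget(pilot_token_counts, tasks_per_month, price_per_million_tokens,
                monthly_cap, model_share=0.28):
    """Project monthly cost from pilot measurements and compare it to a cap."""
    tokens_per_task = mean(pilot_token_counts)
    model_cost = tokens_per_task * tasks_per_month * price_per_million_tokens / 1_000_000
    # Assumption: the model invoice is only `model_share` of total cost.
    total_cost = model_cost / model_share
    return BudgetPlan(tokens_per_task, total_cost, total_cost <= monthly_cap)


# Hypothetical pilot: four measured tasks, 2,000 tasks/month, $3 per million tokens.
pilot = [40_000, 55_000, 70_000, 35_000]
chat = plan_budget(pilot, 2_000, 3.0, monthly_cap=1_500.0)
print(chat.tokens_per_task, round(chat.monthly_cost, 2), chat.within_cap)

# The same workload as an agentic workflow at the low end of the 50-500x range.
agent = plan_budget([n * 50 for n in pilot], 2_000, 3.0, monthly_cap=1_500.0)
print(round(agent.monthly_cost, 2), agent.within_cap)
```

In this made-up scenario the chat workload fits under the cap and the agentic version overshoots it by a wide margin, which is the case a hard limit is meant to catch before the invoice arrives.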
More to explore
- Marc Andreessen's 2026 Outlook: AI Timelines, US vs. China, and The Price of AI (a16z.com)
  Andreessen's 81-minute a16z AMA podcast, with his full take on intelligence cost collapse and the China competition.
- 2026 Early Interview with Marc Andreessen: AI Revolution Just Started, Intelligence Price Collapsing (36kr.com)
  36kr's English summary of Andreessen's key statements on cost collapse and the rise of Chinese AI.
- a16z's 2026 Outlook: Shortages Will Eventually Lead to Surpluses (36kr.com)
  36kr's analysis of the GPU oversupply thesis and structural changes in the AI chip market.
- The Inference Cost Paradox: Why Generative AI Spending Surged 320% Despite Per-Token Costs Dropping 1,000x (arturmarkus.com)
  A deep data analysis of how Jevons Paradox is playing out in enterprise AI spending.
- AI Token Cost Over Time: Down 99.7%, Bills Up 3x (navyaai.com)
  NavyaAI's data report on the token price paradox.
- Best Chinese LLMs in 2026: DeepSeek V4, Kimi K2.6, Qwen and Every Model Ranked (benchlm.ai)
  Benchmark comparisons between Chinese and US AI models, with cost analysis.



