They deployed AI coding tools to 5,000 engineers. Four months later, the annual budget was gone. This is Uber's story. It happened in April 2026.

30-second summary
Deploy AI coding tools Token consumption spikes 18.6× Budget gone in 4 months Licenses forcibly revoked What happens without governance

Uber is not the only one

Microsoft went through the same thing. They gave engineers Claude Code in December 2025 — then canceled most licenses six months later. The reason was simple: token-based bills were draining the annual AI budget far ahead of schedule.

18.6×
Per-developer token consumption growth in 9 months
$40K
One engineer's monthly token bill (peak)
68%
Orgs with no visibility into per-developer AI costs

NVIDIA VP Bryan Catanzaro put it bluntly: "For my team, the cost of compute is far beyond the costs of the employees." Venture investor Jason Calacanis revealed his org's Claude API agent costs hit $300/day — annualized at $109,500. That is an employee salary.

Here is how Uber got there. They deployed Claude Code to 5,000 engineers. Adoption climbed from 32% in February to 84% by March. About 70% of committed code came from AI. Productivity metrics looked great. But COO Andrew Macdonald said the link to better consumer features was not there yet.

Why cheaper tokens actually cost more

Here is the thing — this sounds paradoxical but it is actually happening. Goldman Sachs forecasts token consumption will grow 24× by 2030. Token prices may fall 90% in that window — but total costs will still go up.

Two reasons. First, agentic AI uses 10× more tokens for the same task. It does not just answer a question — it plans, writes code, verifies, rewrites, and loops. Second, cheaper tokens mean more consumption. Economists call this the Jevons paradox: when coal engines got more efficient, coal consumption went up, not down.

OpenAI shifted the conversation

OpenAI's Alexander Embiricos said: "Our conversations are never about capability anymore. Now it is about spending visibility, auditability, token controls, and model efficiency."

Faros AI research found something interesting. Engineers who used the most tokens were about 2× more productive. But they consumed 10× the tokens. Productivity went up — but costs went up way more. Bug rates and rewrite frequency went up too. That is why ROI calculations get complicated.

Uncontrolled deployment Governed deployment
Cost predictability 85% miss targets by 10%+ 60–80% reduction possible
Visibility 68% cannot track per-dev costs Real-time rollup by team, project, model
Model selection Everyone defaults to premium models Auto-routing by task complexity
Budget limits Only 12% have any budget controls Per-team and per-user caps with alerts

3-step token governance framework to start today

The fix is not cutting the tools — it is controlling them. Cursor launched Organizations on June 3, 2026, targeting exactly this problem. The Linux Foundation's Tokenomics Foundation formally launches July 2026 to create open standards.

  1. Visibility first: real-time dashboards by team
    Move from monthly invoices to real-time dashboards breaking down consumption by team, project, and model. Tools like Datadog, New Relic, and Pay-i do this. Cursor Organizations gives a single-pane view across the org. One month of data is usually enough to see exactly where the money is going.
  2. Model routing: match model to task complexity
    Route simple summarization and repetitive tasks to cheap models ($0.04–0.10/M tokens). Reserve premium models ($100–180/M) for complex multi-file work. The price spread between cheapest and most expensive is up to 4,500×. Done right, this alone cuts 60–80% of costs. Segment by team function too — engineering and product get frontier models; marketing and finance get restricted access.
  3. Budget caps: set per-team and per-user limits explicitly
    Chamath Palihapitiya's rule: without limits, costs spiral fast, and agents need to demonstrate at least 2× the productivity of other staff to justify the spend. Set consumption caps at the API key level with alert-then-block flows. Cursor Enterprise ships a 3-tier budget hierarchy (group, team, org) out of the box.

Still running without controls?

One healthcare enterprise consumed 1 trillion tokens in 6 months, generating $6M+ in unplanned costs. That is what unconstrained agent deployment looks like. "Deploy now, govern later" does not work here.

Want to go deeper?

The token bill comes due TechCrunch deep-dive into the industry scramble over AI coding costs techcrunch.com

Uber burned its AI budget in 4 months — COO questions ROI Fortune interview with the Uber COO on the disconnect between AI spend and consumer value fortune.com

AI Token Cost Enterprise: Stop Budget Blowouts in 2026 Practical governance frameworks and model routing strategies elvex.com

Cursor Organizations: Govern Enterprise AI Coding at Scale Full breakdown of Cursor 3-tier governance hierarchy digitalapplied.com

Microsoft Cancels Claude Code Licenses, Pushes Engineers to Copilot CLI The backstory and cost comparison behind Microsoft's decision opentools.ai

Microsoft AI Cost Problem: Using the tech is more expensive than paying employees Fortune analysis of the enterprise AI cost paradox fortune.com