A developer opened his IDE dashboard and saw "PCW: 98%" — meaning 98% of his code was supposedly written by AI. That felt off, so he ran experiments. The result? When he typed 49 characters by hand, the system credited only 46 to him. A copy-paste of his own work? Zero credit. The AI took 100% of the score.
So what actually happened?
In April 2026, William O'Connell published his findings. His company's Windsurf dashboard showed his "% new code written by Windsurf" at 98%. His gut said maybe 10–20%. So if AI was producing 49x more code than he was, why hadn't he blown through his token budget? Why hadn't he been promoted (or fired)?
Windsurf's own blog states: "Customers should expect PCW values of 85%+, often 95%+. This is not a hallucination." But once you dig into how it's computed, "not a hallucination" gets a lot harder to defend.
O'Connell tried mitmproxy first, but Windsurf encodes traffic with protobuf. Luckily the analytics dashboard exposed user_bytes, codeium_bytes, total_bytes, and percent_code_written in plain JS. From there, he uncovered five biases.
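The relationship between those counters can be sketched as below. The formula is an assumption inferred from the field names (only user_bytes, codeium_bytes, total_bytes, and percent_code_written come from the dashboard itself), and the byte counts are illustrative:

```javascript
// Hedged sketch of how percent_code_written plausibly falls out of the
// exposed counters. The formula is an assumption, not Windsurf's code.
function percentCodeWritten({ codeium_bytes, total_bytes }) {
  if (total_bytes === 0) return 0;
  // Multiply before dividing to keep integer byte counts exact.
  return (codeium_bytes * 100) / total_bytes;
}

// Illustrative numbers: 46 credited human bytes against ~2.3 KB of
// AI-attributed churn lands the dashboard on the reported 98%.
const pcw = percentCodeWritten({ codeium_bytes: 2254, total_bytes: 2300 });
console.log(pcw); // 98
```

Every bias below skews one of these counters, so small per-keystroke distortions compound directly into the headline percentage.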
Five ways Windsurf's PCW shaves down human contribution
1. Auto-closed brackets/quotes don't count as typing. Type 49 chars, get credit for 46.
2. Pasting doesn't increment user_bytes. Move your own code to another file? Zero credit.
3. Refactors get fully credited to AI. Ask AI to move a function you wrote? 100% AI.
4. Sessions reset memory. After restart, the editor forgets where each line came from.
5. "Measured at commit time" — not really. The counters move while you type.
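Biases 1 and 2 are easy to model. A toy counter (our construction to illustrate the mechanism, not Windsurf's actual implementation) that skips auto-closed characters and pastes reproduces the 49-typed, 46-credited result:

```javascript
// Toy attribution counter illustrating biases 1 and 2. This is an assumed
// mechanism for demonstration, not Windsurf's real code.
function countUserBytes(events) {
  let userBytes = 0;
  for (const ev of events) {
    // Bias 1: characters the editor auto-closed are not credited.
    if (ev.type === "keystroke" && !ev.autoClosed) userBytes += 1;
    // Bias 2: "paste" events contribute nothing, even for your own code.
  }
  return userBytes;
}

// 49 keypresses where the editor auto-closed 3 characters (e.g. ), }, "),
// followed by pasting your own function into another file.
const events = [
  ...Array.from({ length: 46 }, () => ({ type: "keystroke", autoClosed: false })),
  ...Array.from({ length: 3 }, () => ({ type: "keystroke", autoClosed: true })),
  { type: "paste", text: "function mine() { /* moved, not rewritten */ }" },
];
console.log(countUserBytes(events)); // 46
```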
The decisive test: O'Connell typed one line into human_file.js (49 chars), then asked AI to write a similar one in ai_file.js. He then pasted his own function into the AI file and asked AI to copy his function back. Result: AI was credited with more than twice as much code as he was, even though both files were almost identical in length.
Cursor's "AI Share of Committed Code" is built on cleaner git plumbing, but it has its own failure mode. O'Connell pasted a 100-line JS file and asked Cursor to convert double quotes to single quotes. AI touched 49 of 93 non-blank lines — yet Cursor reported 100% of all 100 lines as AI-authored. Different mechanism, same direction: AI share gets inflated.
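One plausible way to get that outcome is attributing at file granularity instead of line granularity. The coarse policy below is our guess at the failure mode, not Cursor's actual plumbing:

```javascript
// Two attribution policies. The per-file one is a hypothesis about how a
// quote-style change over 49 of 93 non-blank lines becomes "100% AI-authored".
function aiSharePerLine(lines) {
  const touched = lines.filter((l) => l.aiTouched).length;
  return (touched * 100) / lines.length;
}

function aiSharePerFile(lines) {
  // Coarse policy: if AI edited the file at all, credit every line to AI.
  return lines.some((l) => l.aiTouched) ? 100 : 0;
}

// 100-line file where AI touched the first 49 lines.
const file = Array.from({ length: 100 }, (_, i) => ({ aiTouched: i < 49 }));
console.log(aiSharePerLine(file)); // 49
console.log(aiSharePerFile(file)); // 100
```

The gap between the two numbers is exactly the inflation O'Connell observed.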
Why this is more than a metrics-accuracy problem
In January 2026, Anthropic's Boris Cherny posted on X that 100% of his code is now written by Claude, and "pretty much 100%" company-wide. Satya Nadella claimed 30% at Microsoft. Google: 75%. Numbers like these make great press releases — and great fundraising material for AI vendors.
Then there's the METR randomized controlled trial: 16 senior open-source developers, real PRs in their own repos. The AI-allowed group was 19% slower. The kicker? They believed they'd been 20% faster. Self-reported speedup is essentially noise.
GitClear analyzed 211 million changed lines across five years. Refactoring dropped from 25% (2021) to under 10% (2024). Copy/paste rose from 8.3% to 12.3% — the first time in the dataset's history that "pasted" lines outnumbered "moved" lines. Volume went up while the codebase measurably degraded.
| | Vendor metrics | What you should track instead |
|---|---|---|
| Unit | Bytes / lines (volume) | PR cycle time, post-merge fix rate (quality) |
| Bias direction | Inflates AI share | Neutral — validated post-merge |
| When measured | On keystroke / commit | 7–30 days after merge |
| What it justifies | "AI does it all — cut headcount" | "Where is verification debt accumulating?" |
| Legal exposure | "Most code isn't copyrightable" | Conservative human-attribution |
The damage isn't numerical accuracy — it's narrative. The sentence "90% of our code is AI" creates a gut feeling executives can't un-feel. They start asking "why do we need this many engineers?" And U.S. courts have made clear AI-generated work isn't copyrightable, so "most of our code is AI" is a legal-team nightmare too.
Even on Korean dev forums, the same Goodhart's Law warning is showing up: AI code-acceptance rates as KPIs incentivize accepting without review. And one CTO summed it up — "AI writes code in one minute, then humans spend ten reviewing it." The metric lies, but the cost shifts to review time anyway.
The PR-review checklist that catches the real signal
- Treat dashboard PCW/AI Share numbers as directional only. Even Windsurf calls it a "directional proxy." Absolute values are meaningless. Track trends within the same team, same tool, same quarter. Never compare across tools or teams.
- Read the diff layout first. AI-authored PRs often touch files they didn't need to touch (the "100 lines all AI" pattern). When the diff is unusually wide, ask "what was the actual intent?" before reviewing logic.
- Suspect tests that look too clean. METR found AI tends to write self-fulfilling tests with hardcoded values. If assertions echo input, or there's only a happy path, that's a red flag. Add one failing case and watch what breaks.
- Wire up duplicate-code detection in CI. GitClear's 4x copy-paste explosion is the fastest signal to catch. jscpd, SonarQube duplications, or even a grep script in GitHub Actions works. AI tends to rewrite similar functions instead of reusing existing ones.
- Make AI defend its own code. Adam Ferrari's pattern: feed the PR diff back to a model and ask why each change was needed and what could break. If "the author" can't explain it, you didn't save reviewer time — you deferred it.
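For the duplicate-detection item, jscpd or SonarQube is the right production tool; as a sense of what they do, here is a toy detector (our sketch, not a substitute) that flags any 3-line window appearing more than once:

```javascript
// Toy copy-paste detector in the spirit of jscpd. Flags any window of
// `windowSize` trimmed, non-blank lines that appears more than once.
function findDuplicateBlocks(files, windowSize = 3) {
  const seen = new Map(); // window text -> first "file:line" location
  const duplicates = [];
  for (const [name, text] of Object.entries(files)) {
    const lines = text.split("\n").map((l) => l.trim()).filter((l) => l.length > 0);
    for (let i = 0; i + windowSize <= lines.length; i++) {
      const key = lines.slice(i, i + windowSize).join("\n");
      if (seen.has(key)) {
        duplicates.push({ at: `${name}:${i + 1}`, firstSeen: seen.get(key) });
      } else {
        seen.set(key, `${name}:${i + 1}`);
      }
    }
  }
  return duplicates;
}

// Hypothetical file contents: b.js re-creates a block from a.js verbatim,
// the "rewrite instead of reuse" pattern GitClear measured.
const report = findDuplicateBlocks({
  "a.js": "const x = 1;\nconst y = 2;\nreturn x + y;",
  "b.js": "// AI rewrote this instead of importing it\nconst x = 1;\nconst y = 2;\nreturn x + y;",
});
console.log(report.length); // 1
```

Real tools normalize identifiers and token streams as well; even this crude version surfaces verbatim paste growth if run per PR in CI.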
One-liner for managers
When someone reports "X% of our code is AI," ask two questions: ① How is that number computed? (PCW? Cursor Share? a custom definition?) ② How have post-merge fix rate and rollback rate moved in the same quarter? If you can't answer both, don't use the number for decisions.
Going deeper
- Your AI Might be Lying to Your Boss (williamoconnell.me). O'Connell's original write-up: the full reverse-engineering of Windsurf and Cursor metrics.
- Percentage of Code Written (windsurf.com). Windsurf's official PCW explainer: 85–95% is "normal," plus six caveats.
- METR: Early-2025 AI on Experienced Developers (metr.org). RCT with 16 senior devs: 19% slowdown, 20% perceived speedup.
- GitClear AI Copilot Code Quality 2025 (gitclear.com). 211M lines analyzed: 4x copy-paste growth, refactoring down from 25% to 10%.
- Anthropic and OpenAI engineers claim 100% AI code (fortune.com). Boris Cherny and Roon say none of their code is hand-written anymore.
- Quantifying AI Coding Impact (adamferrari.substack.com). Adam Ferrari on PCW limits and alternative metrics.
- AI Code Review Reliability — Engineering Org Principles (brunch.co.kr). Korean enterprise case: AI review as a first pass, with senior review layered above it.