The people who built Claude Code shared how their own team actually uses skills. This isn't just a "here's how to build one" guide — it's a breakdown of what works and what doesn't after running hundreds of skills in production. It was written by Thariq Shihipar — an engineer on the Claude Code team and the person behind the 1.2M-view "Claude Code is All You Need."
What is this about?
Anthropic develops Claude Code internally while also being its biggest user. Nine teams — security, legal, infrastructure, frontend, data, and more — each build and use their own skills, totaling hundreds.
What was shared this time comes in two parts. First, Best Practices and a Skills guide added to the official Claude Code documentation. Second, internal operational experience that Thariq shared on X. If the existing 33-page PDF guide was about "how to build skills," this is about "what actually matters after running hundreds of them in production".
The core message is clear — a skill's success or failure is almost entirely determined by the design phase: define 3 use cases before writing a single line of code.
What's actually changing?
The existing 33-page guide was "here's what skills are and how to build them." This release is on a different level — it covers what works and what doesn't in practice, and how to manage skills at team scale.
| | Existing 33-Page Guide | Internal Playbook (This Release) |
|---|---|---|
| Focus | Skill concepts and how to build them | Patterns and anti-patterns from running hundreds |
| Design principles | SKILL.md format explanation | Define 3 use cases first → then write |
| Quality control | Basic testing mentioned | 3-stage verification: trigger / function / performance |
| Team sharing | Paths mentioned only | Enterprise → Personal → Project priority system + plugin distribution |
| Context management | Briefly mentioned | 2% character budget rule + 500-line cap + progressive disclosure |
| Trigger control | Basic explanation | Forced triggers via Hooks + disable-model-invocation separation |
What's especially impressive is the 70% rule. It's a pattern that appears consistently across Anthropic's internal teams — Claude Code reliably handles about 70% of implementation work, and humans handle the remaining 30%. That 30% is precisely skill design, verification loops, and edge case handling.
The essentials: How to get started
- **Start by defining 3 use cases.** Before building a skill, write out 3 scenarios: "who says what, and what result should come out." Anthropic says skill quality drops dramatically when this step is skipped. Example scenario: "Review this PR" → 3 parallel agents review security / code quality / efficiency → consolidated report.
- **Write the description precisely.** Claude loads skills via "progressive disclosure": first, only the name + description go into the system prompt (~100 tokens), and only when a user request matches does Claude read the full SKILL.md. If the description is vague, the skill won't trigger; if it's too broad, it fires at the wrong times.
- **Put a verification loop in the body.** This is what Anthropic emphasizes most. Add an execute → verify → fix → re-verify loop as a checklist in the skill body. For example, a code generation skill should explicitly state "after generation, run lint → run tests → fix on failure → re-run."
- **Keep it under 500 lines.** When SKILL.md gets too long, Claude misses instructions. Put detailed content in a `references/` folder and keep only the table of contents and key directives in SKILL.md. If reference files are long, put a table of contents at the top so Claude can grasp the structure when it reads `head -100`.
- **Share with the team.** Put skills in `.claude/skills/` and commit to git — the whole team can use them. Personal ones go in `~/.claude/skills/`, and org-wide distribution uses managed settings. Priority order is Enterprise > Personal > Project.
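Putting those points together, a minimal SKILL.md might look like the sketch below. The skill name, description wording, and checklist steps are illustrative (based on the PR-review scenario and verification-loop advice above), not a real Anthropic skill:

```markdown
---
name: pr-review
description: Reviews a pull request for security, code quality, and efficiency.
  Use when the user asks to review a PR, a diff, or recent changes.
---

# PR Review

1. Read the diff for the current branch.
2. Review it for security issues, code quality, and efficiency.
3. Verification loop: run lint → run tests → fix any failure → re-run until clean.
4. Produce a consolidated report of findings.
```

Note that the trigger conditions live entirely in the `description`, while the body is only execution steps — matching the progressive-disclosure model described above.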
5 Lessons Anthropic Learned in Production
Extracted from the official docs and Thariq's posts — key lessons from operating hundreds of skills.
Lesson 1: Don't write "When to Use" in the SKILL.md body
By the time the body loads, the trigger decision has already been made from the description. A "this skill is used when…" section in the body gets read only after the skill has already fired, wasting context. Put only "how to execute" in the body.
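As an illustrative contrast (the frontmatter fields are from the skills docs; the skill itself is hypothetical), trigger information belongs in the description, which is what Claude actually sees before firing:

```yaml
# Good: the trigger conditions live in the description (loaded up front).
name: db-migrations
description: Creates and runs database schema migrations. Use when the user
  asks to change the schema or mentions migrations.
# The body below this frontmatter then contains only execution steps —
# no "When to Use" section.
```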
Lesson 2: Skills with side effects must be manual-trigger only
Set disable-model-invocation: true. Skills like deployment, commits, or sending Slack messages should never auto-fire with Claude saying "looks like the code is ready, I'll deploy it."
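A minimal sketch of such a frontmatter block, assuming the skill is then invoked only when the user explicitly calls it (the skill name and description are illustrative):

```yaml
---
name: deploy
description: Deploys the current branch to production.
# Side effects: never let Claude fire this on its own.
disable-model-invocation: true
---
```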
Lesson 3: Too many skills means nothing works
Claude manages the skill list within a 2% budget (minimum 16,000 characters) of the context window. If skills exceed this budget, some get excluded entirely. Check with /context.
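The budget rule as stated — 2% of the context window, with a 16,000-character floor — can be sketched as simple arithmetic. This is an interpretation of the rule above, not Anthropic's actual implementation:

```python
def skill_metadata_budget(context_window_chars: int) -> int:
    """Character budget for skill names + descriptions:
    2% of the context window, but never below 16,000 characters
    (per the rule described above)."""
    return max(int(context_window_chars * 0.02), 16_000)

# A 1M-character window yields a 20,000-character budget;
# a 500K-character window falls back to the 16,000 floor.
print(skill_metadata_budget(1_000_000))  # 20000
print(skill_metadata_budget(500_000))    # 16000
```

If your installed skills' combined metadata exceeds this number, expect some to silently drop out of the list — hence the advice to check with `/context`.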
Lesson 4: Use Hooks to boost trigger probability
The #1 reason a skill doesn't fire is Claude failing to match the description. Hooks let you directly append skill recommendations to user input — like auto-inserting "CRITICAL SKILLS: session-management" when someone says "help me implement this feature."
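A sketch of this pattern using Claude Code's `UserPromptSubmit` hook, whose command receives the prompt as JSON on stdin and whose stdout is added to the context. The exact matching logic, the `jq` pipeline, and the skill name are illustrative:

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.prompt' | grep -qi 'implement' && echo 'CRITICAL SKILLS: session-management' || true"
          }
        ]
      }
    ]
  }
}
```

Here, any prompt containing "implement" gets the skill recommendation appended before Claude sees it, sidestepping description matching entirely.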
Lesson 5: Use context: fork to protect the main context
Skills like research or code review read lots of files and pollute the main context. Setting context: fork runs them in an isolated sub-agent, returning only a summary of results.
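A frontmatter sketch for such a skill (the name and description are illustrative; `context: fork` is the setting named above):

```yaml
---
name: code-review
description: Reviews code across the repository and reports findings.
# Run in an isolated sub-agent: file reads stay out of the main
# context, and only a summary of results is returned.
context: fork
---
```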
Real-World Skill Structure Examples
Based on patterns Anthropic actually uses, here's an effective skill structure for team-scale operation.
| Skill Type | Examples | Key Setting | Location |
|---|---|---|---|
| Reference (knowledge) | api-conventions, legacy-system-context | user-invocable: false (Claude-only) | Project (.claude/skills/) |
| Task (execution) | deploy, fix-issue, pr-summary | disable-model-invocation: true | Project (.claude/skills/) |
| Personal workflow | explain-code, debug-session | Default (auto + manual both) | Personal (~/.claude/skills/) |
| Org standard | code-review, security-audit | Deployed via managed settings | Enterprise |
The key here is clearly separating "reference skills" from "task skills". References are knowledge that Claude uses as context, while tasks are workflows that users explicitly execute. Mix them up and skills either fire at the wrong time or don't fire when needed.
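Mapped onto the filesystem, the layout from the table might look like this (directory names follow the paths given earlier; the individual skill names are illustrative):

```text
.claude/skills/                # project skills, committed to git
  api-conventions/SKILL.md     # reference: user-invocable: false
  deploy/SKILL.md              # task: disable-model-invocation: true
  pr-summary/SKILL.md          # task: disable-model-invocation: true
~/.claude/skills/              # personal skills
  explain-code/SKILL.md        # default: auto + manual invocation
  debug-session/SKILL.md
```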
How this differs from the previous claude-skills-guide post
The previous post (Claude Skills 33-Page Guide: Complete Breakdown) covered Anthropic's published PDF guide — focusing on the concept of skills, SKILL.md writing, and basic structure. This post covers the internal production experience that came after. It's a synthesis of official doc updates + Thariq's X thread + community analysis.