The people who built Claude Code shared how their own team actually uses skills. This isn't just a "here's how to build one" guide — it's a breakdown of what works and what doesn't after running hundreds of skills in production. It was written by Thariq Shihipar — an engineer on the Claude Code team and the person behind the 1.2M-view "Claude Code is All You Need."
What is this about?
Anthropic develops Claude Code internally while also being its biggest user. Nine teams — security, legal, infrastructure, frontend, data, and more — each build and use their own skills, totaling hundreds.
What was shared this time comes in two parts. First, Best Practices and a Skills guide added to the official Claude Code documentation. Second, internal operational experience that Thariq shared on X. If the existing 33-page PDF guide was about "how to build skills," this is about "what actually matters after running hundreds of them in production".
The core message is clear — a skill's success or failure is almost entirely determined by the design phase: define 3 use cases before writing a single line of code.
What's actually changing?
The existing 33-page guide was "here's what skills are and how to build them." This release is on a different level — it covers what works and what doesn't in practice, and how to manage skills at team scale.
| | Existing 33-Page Guide | Internal Playbook (This Release) |
|---|---|---|
| Focus | Skill concepts and how to build them | Patterns and anti-patterns from running hundreds |
| Design principles | SKILL.md format explanation | Define 3 use cases first → then write |
| Quality control | Basic testing mentioned | 3-stage verification: trigger / function / performance |
| Team sharing | Paths mentioned only | Enterprise → Personal → Project priority system + plugin distribution |
| Context management | Briefly mentioned | 2% character budget rule + 500-line cap + progressive disclosure |
| Trigger control | Basic explanation | Forced triggers via Hooks + disable-model-invocation separation |
What's especially impressive is the 70% rule. It's a pattern that appears consistently across Anthropic's internal teams — Claude Code reliably handles about 70% of implementation work, and humans handle the remaining 30%. That 30% is precisely skill design, verification loops, and edge case handling.
The essentials: How to get started
- **Start by defining 3 use cases.** Before building a skill, write out 3 scenarios: "who says what, and what result should come out." Anthropic says skill quality drops dramatically when this step is skipped. Example scenario: "Review this PR" → 3 parallel agents review security / code quality / efficiency → consolidated report.
- **Write the description precisely.** Claude loads skills via "progressive disclosure": first, only the name + description go into the system prompt (~100 tokens), and only when a user request matches does Claude read the full SKILL.md. If the description is vague, the skill won't trigger; if it's too broad, it fires at the wrong times.
- **Put a verification loop in the body.** This is what Anthropic emphasizes most. Add an execute → verify → fix → re-verify loop as a checklist in the skill body. For example, a code generation skill should explicitly state "after generation, run lint → run tests → fix on failure → re-run."
- **Keep it under 500 lines.** When SKILL.md gets too long, Claude misses instructions. Put detailed content in a `references/` folder and keep only the table of contents and key directives in SKILL.md. If reference files are long, put a table of contents at the top so Claude can grasp the structure when it reads `head -100`.
- **Share with the team.** Put skills in `.claude/skills/` and commit to git — the whole team can use them. Personal ones go in `~/.claude/skills/`, and org-wide distribution uses managed settings. Priority order is Enterprise > Personal > Project.
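Putting those points together, a minimal SKILL.md might look like the sketch below. The skill name, description wording, and checklist steps are illustrative (based on the PR-review scenario and verification-loop advice above), not a real Anthropic skill:

```markdown
---
name: pr-review
description: Reviews a pull request for security, code quality, and efficiency.
  Use when the user asks to review a PR, a diff, or recent changes.
---

# PR Review

1. Read the diff for the current branch.
2. Review it for security issues, code quality, and efficiency.
3. Verification loop: run lint → run tests → fix any failure → re-run until clean.
4. Produce a consolidated report of findings.
```

Note that the trigger conditions live entirely in the `description`, while the body is only execution steps — matching the progressive-disclosure model described above.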
5 Lessons Anthropic Learned in Production
Extracted from the official docs and Thariq's posts — key lessons from operating hundreds of skills.
Lesson 1: Don't write "When to Use" in the SKILL.md body
By the time the body loads, the trigger decision has already been made from the description. A "this skill is used when…" section in the body gets read only after the skill has already fired, wasting context. Put only "how to execute" in the body.
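As an illustrative contrast (the frontmatter fields are from the skills docs; the skill itself is hypothetical), trigger information belongs in the description, which is what Claude actually sees before firing:

```yaml
# Good: the trigger conditions live in the description (loaded up front).
name: db-migrations
description: Creates and runs database schema migrations. Use when the user
  asks to change the schema or mentions migrations.
# The body below this frontmatter then contains only execution steps —
# no "When to Use" section.
```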
Lesson 2: Skills with side effects must be manual-trigger only
Set disable-model-invocation: true. Skills like deployment, commits, or sending Slack messages should never auto-fire with Claude saying "looks like the code is ready, I'll deploy it."
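A minimal sketch of such a frontmatter block, assuming the skill is then invoked only when the user explicitly calls it (the skill name and description are illustrative):

```yaml
---
name: deploy
description: Deploys the current branch to production.
# Side effects: never let Claude fire this on its own.
disable-model-invocation: true
---
```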
Lesson 3: Too many skills means nothing works
Claude manages the skill list within a 2% budget (minimum 16,000 characters) of the context window. If skills exceed this budget, some get excluded entirely. Check with /context.
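The budget rule as stated — 2% of the context window, with a 16,000-character floor — can be sketched as simple arithmetic. This is an interpretation of the rule above, not Anthropic's actual implementation:

```python
def skill_metadata_budget(context_window_chars: int) -> int:
    """Character budget for skill names + descriptions:
    2% of the context window, but never below 16,000 characters
    (per the rule described above)."""
    return max(int(context_window_chars * 0.02), 16_000)

# A 1M-character window yields a 20,000-character budget;
# a 500K-character window falls back to the 16,000 floor.
print(skill_metadata_budget(1_000_000))  # 20000
print(skill_metadata_budget(500_000))    # 16000
```

If your installed skills' combined metadata exceeds this number, expect some to silently drop out of the list — hence the advice to check with `/context`.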
Lesson 4: Use Hooks to boost trigger probability
The #1 reason a skill doesn't fire is Claude failing to match the description. Hooks let you directly append skill recommendations to user input — like auto-inserting "CRITICAL SKILLS: session-management" when someone says "help me implement this feature."
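A sketch of this pattern using Claude Code's `UserPromptSubmit` hook, whose command receives the prompt as JSON on stdin and whose stdout is added to the context. The exact matching logic, the `jq` pipeline, and the skill name are illustrative:

```json
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.prompt' | grep -qi 'implement' && echo 'CRITICAL SKILLS: session-management' || true"
          }
        ]
      }
    ]
  }
}
```

Here, any prompt containing "implement" gets the skill recommendation appended before Claude sees it, sidestepping description matching entirely.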
Lesson 5: Use context: fork to protect the main context
Skills like research or code review read lots of files and pollute the main context. Setting context: fork runs them in an isolated sub-agent, returning only a summary of results.
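A frontmatter sketch for such a skill (the name and description are illustrative; `context: fork` is the setting named above):

```yaml
---
name: code-review
description: Reviews code across the repository and reports findings.
# Run in an isolated sub-agent: file reads stay out of the main
# context, and only a summary of results is returned.
context: fork
---
```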
Real-World Skill Structure Examples
Based on patterns Anthropic actually uses, here's an effective skill structure for team-scale operation.
| Skill Type | Examples | Key Setting | Location |
|---|---|---|---|
| Reference (knowledge) | api-conventions, legacy-system-context | user-invocable: false (Claude-only) | Project (.claude/skills/) |
| Task (execution) | deploy, fix-issue, pr-summary | disable-model-invocation: true | Project (.claude/skills/) |
| Personal workflow | explain-code, debug-session | Default (auto + manual both) | Personal (~/.claude/skills/) |
| Org standard | code-review, security-audit | Deployed via managed settings | Enterprise |
The key here is clearly separating "reference skills" from "task skills". References are knowledge that Claude uses as context, while tasks are workflows that users explicitly execute. Mix them up and skills either fire at the wrong time or don't fire when needed.
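Mapped onto the filesystem, the layout from the table might look like this (directory names follow the paths given earlier; the individual skill names are illustrative):

```text
.claude/skills/                # project skills, committed to git
  api-conventions/SKILL.md     # reference: user-invocable: false
  deploy/SKILL.md              # task: disable-model-invocation: true
  pr-summary/SKILL.md          # task: disable-model-invocation: true
~/.claude/skills/              # personal skills
  explain-code/SKILL.md        # default: auto + manual invocation
  debug-session/SKILL.md
```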
How this differs from the previous claude-skills-guide post
The previous post (Claude Skills 33-Page Guide: Complete Breakdown) covered Anthropic's published PDF guide — focusing on the concept of skills, SKILL.md writing, and basic structure. This post covers the internal production experience that came after. It's a synthesis of official doc updates + Thariq's X thread + community analysis.