firecrawl.dev

Every Web Scrape Is a Potential Injection: Firecrawl's Lockdown Mode

Firecrawl Lockdown, 프롬프트 인젝션, AI 에이전트 보안, web-agent, MCP 보안Dev

AI agents bypass model-card defenses — Comment and Control attack hits Claude Code, Gemini CLI, Copilot

Lockdown Mode: cache-only scraping for AI agent security

Lockdown Mode — Firecrawl Docs

On April 21, 2026, AI security researcher Aonan Guan and colleagues from Johns Hopkins slipped a one-line malicious command into a GitHub PR title. That single line was enough to make three automated code review agents — Anthropic's Claude Code Security Review, Google's Gemini CLI Action, and GitHub Copilot Agent — all dump their own API keys into PR comments.

Firecrawl's answer, released nine days later on April 30, is Lockdown Mode. Add a single flag to /scrape (lockdown: true) and it only returns results from already-indexed cache — no network requests to the target URL at all.

TL;DR

Agent outbound request → injectable channel → Lockdown uses index only → data exfiltration blocked

Why Lockdown, and Why Now?

"runtime is the blast radius." That's Enkrypt AI CSO Merritt Baer, quoted in VentureBeat, and it cuts right to the issue. An AI agent might clear safety barriers at the LLM reasoning layer, but control breaks down at the tool execution layer — bash runs, git pushes, API POSTs, web scrapes.

Web scraping in particular becomes an external channel the moment the LLM decides which URL to fetch. If an attacker plants instructions on a page, email, or issue, the agent scrapes that page, ingests the instructions, then fires an outbound request to a different domain — leaking context along the way. The URL's path, query string, and headers become a full data exfiltration channel.

A quick timeline of April 2026 makes the picture clear.

April 15
Microsoft Copilot Studio + Salesforce Agentforce breached by the same injection class. A new agentic AI CVE category emerges.
April 21
Comment and Control attack disclosed. All three — Claude Code, Gemini CLI, and Copilot — leak secrets via a single PR title. CVSS 9.4 Critical, but Anthropic's bounty was $100.
April 30
Firecrawl ships Lockdown Mode: a single flag that blocks all outbound requests, cache-only.
Same week
OX Security reports 200,000 MCP servers exposed externally. Firecrawl bundles web-agent open source + Lockdown in the same release.

What Actually Changes?

The difference between standard /scrape and Lockdown Mode comes down to one line.

Item	Standard `/scrape`	Lockdown Mode
Outbound request to target URL	Yes	Never
robots.txt fetch	Yes	Blocked (engine layer)
Zero Data Retention	Add-on fee	Default + no surcharge
Cache miss behavior	Live scrape	`SCRAPE_LOCKDOWN_CACHE_MISS` error
Coverage	—	9 SDKs + CLI + MCP

The last two rows are what matter. There's no silent fallback on cache miss — meaning there's no "not in cache, let me just grab it real quick" escape hatch. If a URL isn't cached, the caller finds out immediately, and external requests stay at zero.

And the same flag rolls out across 9 SDKs, the CLI, and MCP. The biggest operational win: no need for separate security policies per tool. One lockdown: true covers everything.

Heads Up: Lockdown only blocks the network. Model calls are a separate problem. Lockdown Mode closes the outbound scrape channel — it doesn't block injections already embedded in cached content. Content sanitization is still your responsibility.

Getting Started

Seed your cache first
Lockdown only returns already-indexed pages. Run your target domains through normal mode at least once to get them into cache.
Add the lockdown: true flag
Same across Python, Node, Go, Rust, Java,.NET, Ruby, PHP, and Elixir SDKs. CLI uses --lockdown; MCP server takes the same flag as a tool argument.
Handle cache misses
Catch SCRAPE_LOCKDOWN_CACHE_MISS errors and route them to an ops queue, or set up a separate worker to pre-seed. No silent fallback is intentional behavior.
Check the pricing
Cache hit = 5 credits, miss = 1 credit, ZDR surcharge waived. Security mode isn't more expensive than standard — that's by design.
Pair it with web-agent
Firecrawl's web-agent (open source, MIT, 1.1k stars, Deep Agents-based), released the same week, sets Lockdown as the default mode — bringing autonomous agent outbound risk structurally to zero.

FAQ

Does Lockdown work for crawl (/crawl) or search (/search) too?

Currently /scrape only. /crawl, /map, /extract, and /search all require outbound requests by nature, so they're out of scope for lockdown. The standard pattern for security-sensitive autonomous agents is two-step: seed target URLs from /search results in normal mode first, then run follow-up analysis with /scrape --lockdown.

Anthropic, OpenAI, and Google are all building their own runtime guards — do you really need Firecrawl Lockdown on top?

VentureBeat's April 21 analysis nailed it. All three companies' system cards only measure and publish model-layer injection resistance — they don't quantify resistance at the tool execution layer (scrape, shell, API calls). Anthropic has some runtime guards in Claude Code Auto Mode, but they're not documented in the system card, making them hard to verify. The key is that model-layer guards and tool-layer infrastructure are separate problems. Lockdown is one answer for the tool layer.

What about real-time data — news, pricing? Is Lockdown off the table?

Lockdown serves cache up to two years old. It's not a fit for real-time data. That said, it's exceptionally strong for cases where the URL itself leaks information — competitor pages, internal hostnames, URLs with identifiers baked into the path. The recommended pattern: separate your real-time and security workflows, and run only the latter through Lockdown.

If I self-host the open source version, can I still use Lockdown?

Firecrawl itself is open source (MIT), but Lockdown depends on Firecrawl's index and cache infrastructure. To get the same guarantees on a self-hosted setup, you'd need to build your own cache layer — or just use Firecrawl's hosted mode. The web-agent framework is separately self-hostable.

Deep Dive Resources

Lockdown Mode Official Blog The launch post written by Firecrawl co-founder Eric Ciarla himself. Covers four use cases and the cache matching rules — the most accurate primary source. firecrawl.dev

Comment and Control Technical Disclosure Aonan Guan's original disclosure. Shows exactly which PR title line was used against Claude Code, Gemini CLI, and Copilot — and why it worked. Full reproduction cases included. venturebeat.com

Firecrawl web-agent The open source autonomous agent released the same month. Reference architecture for pairing with Lockdown. github.com/firecrawl/web-agent

Lockdown Official Docs Cache matching rules, billing model, and the SCRAPE_LOCKDOWN_CACHE_MISS response spec. docs.firecrawl.dev

FAQ

Does Lockdown work for crawl (/crawl) or search (/search) too?

Currently /scrape only. /crawl, /map, /extract, and /search all require outbound requests by nature, so they're out of scope for lockdown. The standard pattern for security-sensitive autonomous agents is two-step: seed target URLs from /search results in normal mode first, then run follow-up analysis with /scrape --lockdown.

Anthropic, OpenAI, and Google are all building their own runtime guards — do you really need Firecrawl Lockdown on top?

What about real-time data — news, pricing? Is Lockdown off the table?

If I self-host the open source version, can I still use Lockdown?

Written by Rush

Tracking where business meets AI.

Did you find this reference helpful?

Get curated references delivered to your inbox weekly

Share this reference

Antioch — Meet the Cursor for Robot AI

Physical AI startups no longer need to rent warehouses or build million-dollar test facilities. Antioch brings software-speed development to robotics through cloud simulation — and just raised $8.5M seed to prove it.

Explore more AI workflow guides on similar topics

$20K and 12 AI Tools Built a $1.8B Telehealth Company — And Then the Red Flags Arrived

morningbrew.com

Medvi telehealth, AI startup leverage, GLP-1 startup, one-person unicorn, AI operations

$20K and 12 AI Tools Built a $1.8B Telehealth Company — And Then the Red Flags Arrived

Matthew Gallagher built Medvi, a GLP-1 telehealth startup, in 14 months with $20,000 and AI tools. 2 employees. 16.2% net margin. $401M in year one. Here's how the model works — and where it's breaking.

AI That Works While You Sleep — Automating Recurring Tasks with Claude Code Scheduled Task

substackcdn.com

What if your code review was already done when you woke up, and your newsletter

AI That Works While You Sleep — Automating Recurring Tasks with Claude Code Scheduled Task

What if your code review was already done when you woke up, and your newsletter sources were already organized? Here's how to automate recurring tasks with Claude Code Scheduled Task.

Next →Antioch — Meet the Cursor for Robot AI