Harvey's AI legal agents kept making the same mistake — for 3 weeks straight. File-type quirks, tool workarounds — wiped clean after every session. After Dreaming? Completion rates jumped 6x.
Why doesn't an agent remember what it learned yesterday?
Honestly, this is the first wall you hit when you actually run agents in production. You nail the prompt, wire up the tools, and it seems to work — then a few days later the same mistakes start repeating.
The reason is simple. AI agents have no cross-session memory by default. Every session starts from a blank slate. Harvey's case is textbook — their agents rediscovered the same file-format quirks and tool-call workarounds every single session, failed, and then forgot again.
The classic fixes were either: write the memory yourself (prompt engineering / system prompts) or fine-tune the model. The first breaks down at scale. The second costs a fortune and takes months. Dreaming found the middle ground. Let the agent refine its own memory.
How is Dreaming different from regular memory?
One line: regular memory is written by the developer. Dreaming is rewritten by the agent itself.
Dreaming is a background process that fires between agent sessions. It scans past sessions and memory stores, looking for exactly 3 types of patterns.
- Recurring mistakes
If the agent keeps making the same error, Dreaming extracts the failure pattern. For Harvey, this was file-format quirks and tool-call failure modes. - Workflows the agent converges on
Dreaming captures the approaches agents naturally gravitate toward across sessions. These "validated workflows" get saved as playbooks — so the next agent skips trial-and-error and starts from proven methods. - Preferences shared across the team
When multiple agents work as a team, patterns discovered by one agent get propagated to the whole fleet. This surfaces insights no single agent could see alone.
Anthropic's Alex Albert framed it this way:
"A very similar thing is happening with dreaming — instead of you manually creating the skill from your experience working with Claude, the model is doing it."
— Alex Albert, Anthropic Research Product
No code changes. No model weight updates. The output is just plain-text notes and playbooks — text files you can read, edit, or delete at any time. Developers can choose auto-update mode or require human review before any change takes effect.
| Manual memory | Fine-tuning | Dreaming | |
|---|---|---|---|
| Operator effort | Write it yourself | Data prep + training | Automatic (review option) |
| Learning scope | Single agent | Whole model | Shared across agent team |
| Cost | Labor cost | Very high | Included in Managed Agents |
| Auditability | High | Low | High (text files) |
| Latency to reflect | Instant | Weeks to months | Auto between sessions |
Two more features announced alongside Dreaming at Code with Claude 2026 on May 6:
Also launched: Outcomes + Multiagent Orchestration
Outcomes (public beta) — Developers define success rubrics. A separate grader agent evaluates output against the rubric in an isolated context window. Internal tests showed up to +10 points task success rate, +8.4% on.docx and +10.1% on.pptx outputs.
Multiagent orchestration (public beta) — A lead agent decomposes complex tasks across specialist subagents running in parallel. Netflix is using this to analyze logs from hundreds of builds simultaneously.
How to get started
- Access Claude Managed Agents
Managed Agents is Anthropic's cloud-hosted agent runtime, launched April 9, 2026. Access it through the Claude developer dashboard with an API key. Currently available for Teams and Enterprise plans. - Enable Memory first
Dreaming works alongside the Memory feature — Memory needs to be enabled first. Memory is in public beta and available immediately. You can configure memory scope per agent or per team in settings. - Define success criteria with Outcomes
For Dreaming to know what counts as a mistake, you need a definition of success. Set up a rubric in Outcomes first. Concrete criteria like "output must be.docx format" or "summary must be under 500 words" work best. - Request Dreaming access
Dreaming is still in research preview — you'll need to submit an access request via the Claude developer dashboard. Once approved, choose between auto-update and review-before-apply modes. - Monitor memory changes
Check the playbooks your agents are writing in Claude Console regularly. Look for unintended pattern learning and edit or delete memory entries as needed — they're plain text files.
Dreaming is still in research preview
Outcomes, multiagent orchestration, Memory, and Webhooks are all in public beta and available now. Dreaming alone requires a separate access request. Also worth noting: persistent structured memory expands the attack surface for prompt-injection and memory-poisoning attacks. If your agent processes untrusted external content, consider the risk of memory contamination.
Dig deeper
New in Claude Managed Agents: dreaming, outcomes, and multiagent orchestration Anthropic's official blog post. Full details on how Dreaming, Outcomes, and multiagent orchestration work, plus Harvey, Netflix, Wisedocs, and Spiral case studies. claude.com
Scaling Managed Agents: Decoupling the brain from the job Anthropic engineering blog on the architecture behind Managed Agents — virtualizing agent components the way an OS virtualizes hardware. anthropic.com
Anthropic Launches Dreaming for Claude Agents at Code with Claude 2026 Detailed English review with Harvey, Wisedocs, Spiral, and Netflix case breakdowns and Outcomes performance metrics. letsdatascience.com
Anthropic's Claude Agents Can Now "Dream" Technical background and security analysis including memory-poisoning attack vectors. quasa.io
Anthropic will let its managed agents dream The New Stack's coverage of how Dreaming positions Anthropic against OpenAI and Google in the agent infrastructure race. thenewstack.io
Anthropic adds self-improving 'dreaming' system to Claude Managed Agents YourStory's overview of the announcement with additional context on the competitive landscape. yourstory.com




