That "what's for lunch?" message you left on Slack at your old job? It's being sold as AI training data. For up to $100,000 a pop.
TL;DR
Internal communication data from defunct startups — Slack archives, emails, Jira tickets — is being traded as premium training data for AI agents. Shutdown specialists like SimpleClosure broker these deals, with nearly 100 transactions completed in the past year. The catch: employees never consented to any of it.
What's Going On?
AI companies have struck a new vein of training gold: Slack archives, email threads, Jira tickets, and internal documents from companies that no longer exist — what the industry calls "operational exhaust."
Why dead companies specifically? Simple. Early LLMs trained on public internet data — Wikipedia, Reddit, digitized books. But according to former OpenAI chief scientist Ilya Sutskever, all of that was exhausted by late 2024. What the industry is building now is "agentic AI" — models that actually do work. Training those models requires data showing how real people actually work: making decisions, collaborating, solving problems.
Enter "reinforcement learning gyms" (RL gyms) — simulated office environments built from real company data where AI agents practice doing work. Anthropic has reportedly discussed spending up to $1 billion on RL gyms this year, and there are already 50+ startups in the space.
AfterQuery, for example, sells off-the-shelf "worlds" like "Big Tech World" and "Finance World" to AI labs. A sample task: plan a coworker's birthday party — except another coworker is secretly planning one too, and the AI agent has forgotten when the birthday actually is. It has to message colleagues, investigate, and decide whether to collaborate or bail.
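To make the idea of a "gym" concrete, here is a minimal sketch of the birthday-party task as a toy environment. AfterQuery's actual environments are not public, so everything here is hypothetical: the `BirthdayPartyEnv` class, its two actions, and the reward values are illustrative assumptions, loosely following the reset/step pattern common in reinforcement learning toolkits.

```python
class BirthdayPartyEnv:
    """Toy episode (hypothetical): the agent should learn the date before planning."""

    ACTIONS = ("ask_colleague", "plan_party")

    def reset(self):
        # The agent starts out having forgotten the birthday.
        self.knows_date = False
        self.done = False
        return {"knows_date": self.knows_date}

    def step(self, action):
        if self.done:
            raise RuntimeError("episode finished; call reset()")
        if action not in self.ACTIONS:
            raise ValueError(f"unknown action: {action}")
        if action == "ask_colleague":
            # Investigating costs nothing here and reveals the date.
            self.knows_date = True
            reward = 0.0
        else:
            # Planning ends the episode; planning blind is penalized.
            self.done = True
            reward = 1.0 if self.knows_date else -1.0
        return {"knows_date": self.knows_date}, reward, self.done


env = BirthdayPartyEnv()
env.reset()
_, r1, done1 = env.step("ask_colleague")
_, r2, done2 = env.step("plan_party")
print(r1, done1, r2, done2)
```

A real gym would replace the single `ask_colleague` action with free-form messages to simulated coworkers, which is where archives of real workplace conversations come in.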
Why Does This Matter?
The market is being driven by startup shutdown specialists. SimpleClosure launched Asset Hub this week, and competitor Sunset offers similar services.
| | Public Web Data | Defunct Startup Internal Data |
|---|---|---|
| Data type | Wikipedia, Reddit, news articles | Slack chats, emails, Jira tickets, code commits |
| Availability | Exhausted by late 2024 | Steady supply from ongoing startup failures |
| Work context | Fragmented | High-context: decisions, workflows, collaboration linked together |
| Agentic AI suitability | Low | High — reflects real work processes |
| Price per deal | Nearly free (crawling) | $10K–$100K per company |
| Privacy risk | Public data | Identifiable employees, no consent obtained |
According to Sunset CEO Brendan Mahony, pricing depends on company size, age, and "data richness" — how well internal data links together. A Jira ticket tied to a specific code commit is worth more than a standalone document. Healthcare and finance data commands a premium.
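One way to picture "data richness" is as the share of artifacts that connect to something else. Sunset has not published a formula, so the sketch below is only an illustration: the `Artifact` type, the `PROJ-123` style ticket-key pattern, and the `linked_fraction` metric are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Assumed ticket-key shape such as "PAY-12"; real archives vary.
TICKET_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


@dataclass
class Artifact:
    kind: str   # e.g. "ticket", "commit", "doc"
    ident: str  # ticket key, commit hash, or file name (assumed unique)
    text: str


def linked_fraction(artifacts):
    """Fraction of artifacts tied to at least one other artifact via a ticket key."""
    ticket_ids = {a.ident for a in artifacts if a.kind == "ticket"}
    linked = set()
    for a in artifacts:
        for key in TICKET_KEY.findall(a.text):
            if key in ticket_ids and key != a.ident:
                # Both the referencing artifact and the ticket count as linked.
                linked.add(a.ident)
                linked.add(key)
    return len(linked) / len(artifacts) if artifacts else 0.0


sample = [
    Artifact("ticket", "PAY-12", "Checkout fails for expired cards"),
    Artifact("commit", "a1b2c3d", "PAY-12: handle expired card error"),
    Artifact("doc", "roadmap.md", "Q3 planning notes"),
]
score = linked_fraction(sample)
print(round(score, 2))  # 0.67: the ticket and commit are linked, the doc stands alone
```

In this toy archive the commit that names its ticket raises the score, while the standalone roadmap document does not, which mirrors the pricing logic Mahony describes.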
Privacy is the elephant in the room. Marc Rotenberg, founder of the Center for AI and Digital Policy, warns: "People have become so dependent on these new internal messaging tools like Slack… It's not generic data. It's identifiable people." The Center sent a letter to the Senate Commerce Committee urging the FTC to step up oversight.
Companies promise anonymization, but a 2020 study by researchers at Google, OpenAI, and several universities showed that LLMs can memorize training data verbatim and regurgitate it when prompted the right way. Scrubbing PII from a career's worth of work data "isn't an on-off switch," industry experts warn.
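A small sketch shows why pattern-based scrubbing falls short. The two regular expressions and the sample message below are invented for illustration and do not represent any vendor's actual anonymization pipeline.

```python
import re

# Simplified patterns (assumptions): US-style SSNs and basic email addresses.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def scrub(message):
    """Replace values that match a known PII pattern with a placeholder."""
    message = SSN.sub("[SSN]", message)
    return EMAIL.sub("[EMAIL]", message)


raw = (
    "Send the form to dana@example.com, my SSN is 123-45-6789. "
    "Dana from payroll is out for surgery Friday."
)
clean = scrub(raw)
print(clean)
```

The structured identifiers are replaced, but the second sentence passes through untouched: a first name, a team, and a health detail together can still identify a person, and no simple pattern catches that combination.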
What You Should Do About It
- Check your data rights when leaving a job
  Re-read your employment agreement and NDA. IP assignment and "selling internal comms to third parties" are separate issues. If there's no explicit consent clause, you may have grounds to push back.
- Stop putting sensitive info in Slack and email
  Review your habits around sharing personal data (SSNs, health info, salary details) in work messengers. Digital footprints outlive companies.
- Companies: Establish data disposal policies
  Plan how internal data gets handled during shutdown. GDPR, CCPA, and emerging regulations should inform your deletion and sales policies.
- Founders: Know what your data is worth
  If you're winding down, explore platforms like SimpleClosure's Asset Hub or Sunset. But verify their anonymization is truly "rock solid" before signing off.
- Watch the RL gym market
  This is core infrastructure for the agentic AI era. Key players: Prime Intellect (valued at $1B+), Fleet (in talks at $750M valuation).
Go Deeper
Forbes — AI's New Training Data: Your Old Work Slacks and Emails
The definitive piece with firsthand interviews from SimpleClosure's CEO and cielo24's former CEO. Full details on deal sizes and anonymization processes.
Gizmodo — Failed Companies Are Selling Old Slack Chats
Concise summary connecting the data-selling trend to Gallup survey results on employees' ethical resistance to AI.
TechSpot — Data from Failed Startups Finds Second Life
The best structural breakdown of how RL gyms work and why agentic AI specifically needs this kind of data.
Fast Company — Shuttered Startups Are Selling Old Slack Chats
Strong employee-perspective coverage citing Gallup and Checkr surveys on workplace privacy concerns.
OpenAI/Google Study — Extracting Training Data from LLMs (2020)
The foundational research proving LLMs can memorize and output training data verbatim. The technical basis for why anonymization alone isn't enough.




