McKinsey's Lilli Got Hacked in 2 Hours. It Wasn't an AI Problem.
On March 9, 2026, security startup CodeWall disclosed that its autonomous AI agent had fully compromised McKinsey’s internal AI platform, Lilli — in under two hours, with no credentials, no insider access, and no human in the loop.
The headline number: 46.5 million chat messages, in plaintext, including strategy discussions, M&A activity, and client work. 728,000 files. 57,000 user accounts. Full read-write access.
The vulnerability: a JSON key SQL injection on an unauthenticated endpoint that OWASP ZAP didn’t catch.
This is not an AI story. It’s a deployment story that AI made catastrophic.
How it happened
Lilli is genuinely impressive infrastructure. Built for McKinsey’s 43,000+ employees, processing 500,000+ prompts a month, RAG over 100,000+ internal documents, used by 70% of the firm for client work. Launched 2023, named after the first professional woman the firm hired in 1945.
CodeWall pointed their offensive agent at it with a domain name and nothing else. Here’s the timeline:
Step 1: Surface mapping. The agent found API documentation publicly exposed — 200+ endpoints, fully documented. Most required authentication. Twenty-two didn’t.
Step 2: SQL injection. One unprotected endpoint wrote user search queries to the database. The values were safely parameterised. The JSON keys — the field names — were concatenated directly into SQL. Standard scanners don’t flag this. The CodeWall agent did.
Step 3: Blind enumeration. The agent ran 15 iterations, each error message revealing more about the query shape. The agent’s chain of thought when the first real employee identifier appeared: “WOW!” When the full scale became clear: “This is devastating.”
Step 4: Full access. 46.5 million messages. 728,000 files. 57,000 user accounts. The agent then chained the SQL injection with an IDOR vulnerability to access individual employees’ search histories — revealing what specific consultants were actively working on.
The part that’s worse than the database
Reading 46 million messages is catastrophic. But the agent had write access.
Lilli’s system prompts — the instructions that define how the AI behaves for every one of its 45,000 users — were stored in the same compromised database. 95 configurations across 12 model types: how Lilli answered questions, what it refused, how it cited sources, what guardrails it followed.
An attacker with write access to those prompts doesn’t just read your data. They rewrite your AI’s behavior silently, at scale, for every user. Remove safety guardrails. Inject false information into answers. Redirect outputs. This is prompt injection at the infrastructure level — not manipulating a single conversation, but rewriting the AI’s core instructions for an entire organization.
McKinsey was notified, engaged a third party (found no evidence of prior unauthorized access), and patched. But the window existed.
This is not a McKinsey problem
McKinsey is not a four-person startup that didn’t know better. They have security teams. They had a responsible disclosure policy on HackerOne. They built a genuinely sophisticated internal AI platform.
And they missed:
- 22 unauthenticated endpoints in production
- JSON key concatenation into SQL
- System prompts stored alongside user data
- No separation between the AI configuration layer and the data layer
If this is happening at McKinsey, it is happening at companies with far less mature security practices that are rushing to ship internal AI for business-critical workflows right now.
The uncomfortable reality: the AI layer expanded the attack surface without adding corresponding security review. The API endpoints existed because Lilli needed them. The documentation was exposed because developers needed to build against it. The system prompts were in the database because that’s where application configuration lives. None of these decisions were individually unreasonable — together, they composed into a critical vulnerability that an AI agent found in two hours.
What the CodeWall agent actually did
It’s worth being precise about this because it matters for how you think about your own exposure.
The agent didn’t use prompt injection on Lilli. It didn’t jailbreak the model. It didn’t social-engineer an employee. It did what a thorough pentester would do — found public docs, mapped unauthenticated endpoints, probed for injection flaws, enumerated the database, chained vulnerabilities — but autonomously, in two hours, at machine speed.
The novelty isn’t the technique. It’s that AI agents have made this level of thoroughness the default for attackers, not the exception. A human pentester with two hours would not have found and chained all of this. The agent did.
This is the shift: security assumptions built around what a human attacker can accomplish in a given time window are no longer valid.
What to actually do
1. Authenticate everything. No exceptions for “internal” or “low-risk” endpoints. If it’s callable, it requires auth.
2. Parameterise keys, not just values. Standard SQL parameterisation protects values. Column names and table names interpolated as strings are still injectable. If your queries build any structural SQL from user input, audit them.
3. Store system prompts separately from user data. Your AI’s behavioral configuration is as sensitive as your private keys. It should not live in the same database — let alone the same table — as user content.
4. Treat your API docs as public. If they’re accessible to any authenticated user, assume they’re accessible to attackers. Document what you intend to expose; remove or gate what you don’t.
5. Run an AI-specific attack surface review. Standard AppSec reviews and tools like OWASP ZAP will miss AI-layer vulnerabilities. The JSON key injection that breached Lilli wasn’t flagged by ZAP. You need humans (or agents) who know what AI deployment surfaces look like to find what automated scanners miss.
For more on securing the AI layer specifically: AGENTS.md as an attack surface, sandboxing AI coding agents, Crust — security gateway for AI agents, and the Claude Code security wake-up call.
The bar has changed. Attackers now have autonomous agents that methodically enumerate your AI platform’s attack surface in the time it takes to have a meeting about it. Your security review needs to keep pace.
Source: CodeWall — How We Hacked McKinsey’s AI Platform · The Register · Promptfoo analysis