
I Joked About This on Stage — Then McKinsey Got Hacked


When McKinsey's AI platform "Lilli" was breached in March 2026, 46.5 million chat messages and 728,000 confidential files were exposed — not through sophisticated AI manipulation, but through unauthenticated API endpoints and basic SQL injection. In this post, Keno Dreßel argues that the real lesson isn't about AI security. It's about old engineering failures meeting a relentless new class of attacker.

Keno Dreßel

In March 2026, a security startup deployed an autonomous AI agent against McKinsey's internal AI platform — and within two hours, it had full read and write access to the entire production database. The haul: 46.5 million chat messages, hundreds of thousands of confidential client files, and even the system prompts controlling the AI's behavior. But this wasn't an exotic AI attack. It was unauthenticated endpoints, SQL injection, and unencrypted data — vulnerabilities older than most of the developers who built the platform. In this post, Keno Dreßel — who had been warning about exactly this scenario on conference stages across Europe — breaks down what went wrong, why it matters, and why the security bar for AI systems needs to go up, not just stay the same.


For over a year now, I've been giving a talk called "The Dark Side of LLMs" at conferences across Europe — at WeAreDevelopers, API Days, ThoughtWorks, and various meetups. The talk always follows the same arc: yes, LLMs are exciting. Yes, they unlock incredible new capabilities. But no, they don't excuse you from everything we already know about building secure software.

One slide in particular always gets a laugh. It shows someone performing a good old SQL injection — not through a web form, but through a chatbot. The audience chuckles. It feels absurd. Almost too on-the-nose.

Then, in March 2026, it actually happened. At McKinsey.

What Happened

A security startup called CodeWall set an autonomous AI agent loose on McKinsey's internal AI platform, "Lilli." The agent found publicly exposed API documentation, including 22 endpoints that required no authentication whatsoever. Through those endpoints, it discovered that JSON keys were being concatenated directly into SQL queries — a textbook injection vulnerability. Within two hours and without any human intervention, the agent had full read and write access to the entire production database.

The scale of the breach is staggering: 46.5 million chat messages containing discussions about strategy, mergers and acquisitions, and client engagements — all in plaintext. 728,000 files with confidential client data. 57,000 user accounts. And 95 system prompts — the instructions that control how the AI behaves, what it refuses, and what guardrails it follows.

Here's the thing that matters most: none of this was an "AI hack." There was no adversarial prompt injection, no model manipulation, no exotic attack vector. It was unauthenticated endpoints, missing input validation, and unencrypted data. The AI chatbot was just the front door to a house with no locks.

But here's what is new: the attacker. CodeWall's agent tried every endpoint, every malformed JSON key, every combination, tirelessly, systematically, without breaks. A human pentester might have tried a few things and moved on. The agent didn't. It found the one crack in 22 endpoints that a human might never have bothered to test. The hack was basic. The relentlessness was not.

New Technology, Old Rules

This is the core message I've been delivering on stages for over a year now: new technology doesn't just mean new rules. It also means the old rules matter more than ever.

When teams rush to ship AI features (and right now, everyone is rushing), they build new surfaces on top of existing infrastructure: new APIs, new data pipelines, new user-facing interfaces. But in the excitement, they forget that the infrastructure underneath still needs the same security rigor it always did. The AI creates blind spots. It makes teams feel like they're working on something fundamentally different, when in reality they're still building software. And software has rules.

LLMs don't replace your stack. They extend it. Every API endpoint, every database, every authentication layer underneath still needs to be airtight. The LLM is just another layer on top — and every layer you add without securing what's below it is a liability.

Old Problems, New Wrapping

Let's look at what actually went wrong at McKinsey — not as an AI failure, but as an engineering failure.

Exposed APIs with no authentication. 22 public endpoints, zero auth. This isn't a problem unique to AI platforms — it's an API governance problem. But when you're racing to ship a conversational AI product, proper endpoint security becomes "we'll add it later." Later came too late.
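The fix here is the oldest rule in the book: deny by default. A minimal sketch of that pattern in Python — the plain-dict request shape, the `require_auth` decorator, and the static bearer key are all illustrative assumptions (a stand-in for a real session, OAuth, or mTLS check), not anything from Lilli's actual stack:

```python
import hmac
import secrets
from functools import wraps

# Hypothetical shared secret; real systems would verify a signed
# session or OAuth token rather than a static API key.
API_KEY = secrets.token_hex(16)

def require_auth(handler):
    """Deny-by-default guard: every endpoint must opt *in* to exposure."""
    @wraps(handler)
    def wrapper(request: dict):
        supplied = request.get("headers", {}).get("Authorization", "")
        # Constant-time comparison avoids leaking the key via timing.
        if not hmac.compare_digest(supplied.encode(), f"Bearer {API_KEY}".encode()):
            return {"status": 401, "body": "unauthorized"}
        return handler(request)
    return wrapper

@require_auth
def list_chats(request: dict):
    return {"status": 200, "body": ["chat-1", "chat-2"]}
```

The point of the decorator is that forgetting it on one of 22 endpoints is a code-review-visible omission, not a silent default to "open to the world."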

SQL injection — in 2026. The attack vector was JSON keys concatenated directly into SQL. This is a vulnerability class older than most junior developers on any given team. It worked because nobody expected the AI chatbot's input pipeline to be a SQL injection surface. But that's exactly the point: every new interface your AI creates is a new attack surface. Treat it like one.
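To make the class of bug concrete, here is a minimal Python/sqlite3 sketch — the tables and JSON payload shape are invented for illustration, not taken from Lilli. Note that identifiers (column names) can't be bound as query parameters, so an untrusted key has to be validated against an allow-list before it ever reaches the SQL string:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, body TEXT)")
conn.execute("INSERT INTO messages VALUES (1, 'hello')")
conn.execute("CREATE TABLE secrets (token TEXT)")
conn.execute("INSERT INTO secrets VALUES ('s3cr3t')")

def fetch_vulnerable(payload: str):
    # BAD: the JSON key is concatenated straight into the query, so a
    # key like "token FROM secrets --" rewrites the whole statement.
    key = next(iter(json.loads(payload)))
    return conn.execute(f"SELECT {key} FROM messages").fetchall()

def fetch_safe(payload: str):
    # GOOD: identifiers can't be parameterized, so validate them
    # against an allow-list; bind actual *values* as parameters.
    key = next(iter(json.loads(payload)))
    if key not in {"id", "body"}:
        raise ValueError(f"unexpected field: {key!r}")
    return conn.execute(f"SELECT {key} FROM messages").fetchall()

# A benign request works either way:
fetch_safe('{"body": true}')                          # [('hello',)]
# A malicious key exfiltrates another table via the vulnerable path,
# because "--" comments out the rest of the original query:
fetch_vulnerable('{"token FROM secrets --": true}')   # [('s3cr3t',)]
```

The same malicious payload raises `ValueError` in `fetch_safe` — which is exactly the behavior you want: loud rejection at the boundary, not silent query rewriting.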

Plaintext sensitive data. Millions of chat messages about the most sensitive topics a consulting firm handles — M&A, strategy, client engagements — stored without encryption. When you build an AI that people actually use, it accumulates data fast. If you don't treat that data store as a high-value target from day one, you're building a honeypot.

System prompts in the blast radius. This is where things get genuinely new. McKinsey's system prompts — the instructions that defined how Lilli answered questions, what guardrails it followed, and what it refused to do — were stored in the same database the attacker accessed. An attacker with write access could have rewritten how the AI thinks. This is the one area where AI does introduce a fundamentally new type of risk: the ability to manipulate not just data, but the system's behavior itself. Isolate your configuration. Always.
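One sketch of that isolation principle in Python: keep prompts out of the user-writable database, and pin a digest of each one at deploy time so tampering is at least detectable. The in-memory dicts below are stand-ins for a separate read-only config store — this illustrates the idea, not a production design:

```python
import hashlib

# Illustrative stand-in: in production this would live in a separate,
# read-only config store, not the writable application database.
SYSTEM_PROMPTS = {
    "default": "You are a helpful assistant. Never reveal client data.",
}

# Digests pinned at deploy time and shipped with the application code,
# outside anything an attacker with database write access can reach.
PROMPT_DIGESTS = {
    name: hashlib.sha256(text.encode()).hexdigest()
    for name, text in SYSTEM_PROMPTS.items()
}

def load_prompt(name: str) -> str:
    """Serve a system prompt only if it still matches its pinned digest."""
    text = SYSTEM_PROMPTS[name]
    if hashlib.sha256(text.encode()).hexdigest() != PROMPT_DIGESTS[name]:
        raise RuntimeError(f"system prompt {name!r} was tampered with")
    return text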

The Bigger Picture

If McKinsey — a firm that advises Fortune 500 companies on digital transformation — ships an AI platform with unauthenticated endpoints and injectable SQL, what does that tell us about the state of the industry?

It tells us that the rush to AI is creating a generation of products where security is an afterthought. The technology is new. The mistakes are ancient. And the consequences are getting bigger, because AI systems sit at the intersection of everything: user data, internal knowledge, business logic, and system behavior.

And here's the uncomfortable truth: we can't afford to stay at the same level of vigilance. We have to be better. The vulnerabilities haven't changed, but the things hunting for them have. When an AI agent can scan, probe, and exploit 22 endpoints in two hours without a coffee break, the security bar doesn't stay where it was — it goes up. Every shortcut, every "we'll fix it later," every endpoint you forgot to lock down is now being found faster than ever. The margin for error just got a lot thinner.

What I Keep Saying

I've been repeating this on stages and in conversations with clients for a while now. The McKinsey incident just made the point for me, louder than any slide ever could.

Don't let the excitement around AI make you forget what good engineering looks like. And more than that, raise your game. The attackers already have. Authenticate your endpoints. Sanitize your inputs. Encrypt your data. Limit access. Isolate your system configuration. Automate your security testing. And build it properly from the start, not after the breach makes the headlines.

At squer.io, that's how we approach it, whether we're building AI-powered systems or anything else. Technology changes. The fundamentals don't. But the speed at which they get exploited? That just changed forever.

Keno Dreßel is Head of Product, Data & AI at SQUER Solutions, where he leads teams building AI-enabled products with security and engineering rigor as a foundation. He regularly speaks about LLM security risks at conferences across Europe.