
The Security Control That Disappears When You Need It Most

Mar 30 · 11 min read
Controlling the pipe isn't the problem. It is what's inside it that matters.

“The attacker only needs to be right once. The defender has to be right every time.”

- Convention, repeated until nobody questions it

“What if we flipped it?”

- The question this article exists to answer


There is a particular kind of security architecture that has become so embedded in how we think about data protection that questioning it feels almost rude. It goes something like this: build walls around the data. Encrypt it at rest. Encrypt it in transit. Control who can access it. Monitor the perimeters. Trust the infrastructure. Hope the contracts hold.


For decades, this worked well enough. Data lived in databases and file systems that your team controlled. It moved between servers you owned, through networks you monitored, to endpoints you managed. The walls were imperfect, but they were yours. You knew where the doors were because you built them.


Then AI arrived and walked straight through every wall in the building.


Not maliciously. Not through some exotic vulnerability. AI walked through your walls because you held the door open and invited it in. You had to. For an AI model to reason on your data, to generate the output you are paying for, the data must be presented in cleartext at the point of inference. Decrypted. Readable. Fully exposed. No vendor did anything wrong here. Large language models are built this way. The data has to be naked for the AI to see it.


Which means every security control you built around the pipe (the encryption, the access controls, the zero-retention contracts, the SOC 2 certifications) is temporarily absent at the exact moment the data is being used. The one moment that matters most is the one moment your security architecture takes a break.


Consider for a moment how strange that is. We have collectively spent billions on infrastructure security, and in my view the entire model depends on an assumption that the data will only be processed inside trusted environments by trusted systems. The moment you send data to an external AI platform, that assumption collapses. Not partially. Completely.


The pipe is not the problem

To be clear, there is nothing wrong with encryption. TLS, AES-256, end-to-end encryption in transit. These are necessary, well-engineered, and should be in place everywhere. Pipe protection works. The trouble is that it stops working at the exact point where AI begins. The gap between “protected in transit” and “exposed at inference” is where the entire risk sits.


Think about it this way. Imagine you are transporting a confidential document. You put it in a locked briefcase (encryption at rest). You hire an armoured courier to deliver it (encryption in transit). The courier arrives at the destination, the briefcase is opened, and the document is placed on a desk in a room that you do not own, operated by people you have never met, governed by a contract that says they promise not to read it. That contract is your zero-retention agreement. That room is the inference layer.


Every control you applied was about the journey. None of it protected the document at the destination. And the destination is the entire point of the exercise.


That is what protecting the pipe looks like in practice. The pipe (the infrastructure, the transport layer, the access controls, the contractual commitments) covers everything between the data’s origin and its use. But the data itself arrives at the point of inference unchanged, unprotected, and fully readable. A model vulnerability, a training data extraction attack, a misconfigured retention policy, a subpoena in a foreign jurisdiction, a breach of the downstream platform. Any one of these exposes your data in its original form.


The pipe did its job perfectly. The data was never protected at all.


Run through the security architecture of any major AI platform and the pattern is identical. Harvey, the legal AI platform valued at over $11 billion, encrypts data in transit and at rest. They hold SOC 2 and ISO 27001 certifications. They enforce zero-retention inference contracts and require FIDO2 hardware token authentication. Best-in-class pipe protection, by any reasonable standard. And yet, for Harvey’s AI to do its job (analysing client-privileged legal documents, drafting contract clauses, reviewing case law) the data must be presented to the model in cleartext. Every security control is momentarily absent at the point of inference. Harvey’s security posture is genuinely strong. That’s precisely the point. Even the best pipe has a structural ceiling, and AI operates right at that ceiling.


Two camps, and only one of them knows the other exists

Every security vendor, every AI platform, every DLP tool in the market falls into one of two categories. Those that protect the pipe, and those that protect the data. The overwhelming majority are in the first camp. Firewalls. VPNs. Encryption layers. Role-based access controls. Zero-trust network architectures. Data loss prevention tools that scan outbound traffic for patterns and block what looks suspicious. These are pipe protectors. They secure the infrastructure through which data travels and assume that if the infrastructure holds, the data is safe.


The second camp is almost empty. I believe that’s not for lack of logic, but because until AI forced data out of controlled environments and into third-party inference engines, the pipe was genuinely sufficient. The threat model simply never demanded anything else.


Now it does. And the market has not caught up.


The gap here is conceptual, not technical. Most organisations evaluating AI security are asking the wrong question. They are asking: “How do we control access to the AI platform?” when they should be asking: “What happens to our data once it arrives there?”


The first question leads you to more pipe. Better access controls, tighter policies, stricter contracts. All of which are good. None of which change the fundamental reality that the data is unprotected at the point of inference.


The second question leads you somewhere different entirely. It leads you to ask whether the data itself can be transformed before it leaves your environment, such that what arrives at the AI platform is useful for inference but useless if compromised. Not redacted (which breaks the AI’s ability to reason). Not tokenised (which signals to an attacker that the data has been altered). But replaced with contextually realistic alternatives that preserve the AI’s analytical utility while making the original identities, values, and relationships invisible.


That is what it means to protect the data, not the pipe. The protection travels with the data. Downstream systems can fail, contracts can be breached, vendors can have a bad day. The data remains safe regardless, because the protection is intrinsic to the data itself.


Everyone has to be lucky all the time

There is a phrase that security professionals have been repeating for so long it has become almost liturgical: “The attacker only needs to be right once. The defender has to be right every time.” It is usually invoked to explain why breaches are inevitable, why defence is inherently harder than offence, why the house always loses in the long run.


What is interesting is that the same asymmetry applies not just to attackers and defenders, but to the security architecture itself. Every organisation relying on pipe protection is playing the defender’s side of this equation against their own security stack. Every layer has to hold. Every contract has to be honoured. Every retention policy has to be enforced. Every access control has to function correctly. Every downstream vendor has to maintain their certifications. Every employee has to follow the policy (which, as I have written about before, is roughly as reliable as asking everyone in the office to wash their hands by putting up a sign in the toilets).


One layer breaks and everything is exposed. The architecture might be beautifully designed, but the data behind it was never changed. It sits in its original, readable form, waiting for the one moment when the last lock fails.


Think about the operational reality of maintaining this model at scale. A typical enterprise using AI across legal, finance, HR, and customer operations might be sending data to half a dozen AI platforms, each with its own security posture, its own retention policies, its own jurisdictional exposure, its own subcontractors. Every one of those platforms is a pipe that needs to hold. Every one of those vendor contracts is a promise that needs to be kept. Every one of those compliance certifications needs to remain current, audited, and accurate. The surface area is not shrinking. With every new AI integration, it expands.


This is passive security. Security by accumulation. Stack enough layers, negotiate enough contracts, tick enough compliance boxes, and hope the aggregate holds. Every vendor selling you passive security needs you to be lucky every single time. Every firewall rule, every access policy, every API configuration, every downstream vendor’s SOC 2 audit. All of it has to hold, simultaneously, indefinitely. The defender’s dilemma stops being a theoretical framing when you run the numbers. For every organisation using pipe-based security for AI workloads, this is operational reality.


Transform once. Be right forever.

Now consider the alternative. Instead of building ever more elaborate defences around untransformed data, you transform the data once, at the boundary of your environment, before it ever leaves. Sensitive identifiers (names, account numbers, addresses, case references, medical record numbers) are replaced with contextually realistic aliases. Not gibberish. Not [REDACTED]. Not tokens that scream “this data has been protected.” Realistic, plausible alternatives that allow the AI to reason normally while making the original values invisible.


Ahmed Al-Mazrouei becomes Khalid Al-Mansouri. Account number 1234567 becomes 2671958. The AI processes the data, generates its output, and the aliases are reversed on the way back in.


The original data never left your environment. What the AI saw was useful fiction.
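The round trip described above can be sketched in a few lines of Python. This is a toy illustration with hypothetical helper names, not the AliasPath implementation: a real alias engine would handle entity detection, collision avoidance, format preservation, and consistent mappings across documents, rather than naive string substitution.

```python
# Illustrative sketch of boundary-layer pseudonymisation.
# The alias pairs below are the article's own examples; the function
# names and the substitution strategy are assumptions for clarity.

def pseudonymise(text: str, alias_map: dict[str, str]) -> str:
    """Replace real identifiers with aliases before text leaves the boundary."""
    for real, alias in alias_map.items():
        text = text.replace(real, alias)
    return text

def reverse(text: str, alias_map: dict[str, str]) -> str:
    """Restore original identifiers in the AI's response on the way back in."""
    for real, alias in alias_map.items():
        text = text.replace(alias, real)
    return text

alias_map = {
    "Ahmed Al-Mazrouei": "Khalid Al-Mansouri",  # contextually realistic name
    "1234567": "2671958",                       # plausible account number
}

outbound = pseudonymise(
    "Review account 1234567 held by Ahmed Al-Mazrouei.", alias_map
)
# The AI platform only ever sees the aliased text:
# "Review account 2671958 held by Khalid Al-Mansouri."

inbound = reverse(
    "Account 2671958 (Khalid Al-Mansouri) shows unusual activity.", alias_map
)
# Back inside the boundary, the original identifiers are restored.
```

The alias map never leaves your environment, which is what makes the downstream copy "useful fiction": anyone who compromises the AI platform holds records that parse normally but resolve to no real identity.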


In this model, the defender’s dilemma inverts. You do not need to be lucky every time. You need to be right once, at the point of transformation. After that, the data carries its own protection. The downstream platform gets breached? The attacker finds records that look entirely genuine but contain no real identities. A model is compromised? The training data it absorbed is fictional. The zero-retention promise turns out to be worthless? So is whatever was retained. A foreign jurisdiction subpoenas the AI provider’s records? They hand over plausible nonsense.


None of this depends on anything outside your control. The AI vendor’s contract, their infrastructure, their employees’ compliance with policy. All of that becomes someone else’s problem. Your protection comes down to one decision, made once, at your boundary.


There is, I think, an elegance to this that matters beyond the technical argument. Security teams are drowning in complexity. Every new AI platform means another vendor assessment, another set of contractual negotiations, another monitoring surface, another set of promises to track.


Active security at the data layer collapses that complexity into a single control point. You still want encryption, you still want access controls. But with active security in place, the entire downstream stack becomes survivable. A downstream failure goes from catastrophic to irrelevant.


They have to stay lucky. You only have to be right once.


Why this changes how you evaluate everything

Once you see this distinction (pipe protection versus data protection, passive security versus active security) it becomes very difficult to unsee it. Every vendor presentation, every compliance checklist, every security architecture review starts resolving into one of two categories. Is this protecting the infrastructure? Or is this protecting the data itself?


The answers are illuminating. Your encryption? Pipe. Your firewall rules? Pipe. Your zero-trust architecture? Pipe. Your vendor’s SOC 2 Type II certification? Pipe. Their ISO 27001? Pipe. Their contractual commitment to zero-retention inference? Pipe. Their GDPR data processing agreement? Pipe. All necessary. All incomplete. All dependent on every layer holding simultaneously.


Active security, transforming the data itself before it leaves your boundary, is the layer that makes the rest of the stack survivable. The pipe still matters. But when (not if) a pipe breaks, the data that flows through the crack is already protected. The breach happens, and the exposure doesn’t. The attacker gets in, and what they find is useless.


The regulatory landscape has already moved here. Regulators are increasingly asking the question that passive security cannot answer. GDPR Article 32 requires “appropriate technical and organisational measures.” The EU AI Act requires demonstrable data governance controls. The UK ICO has been clear that contractual measures alone are insufficient for international data transfers. The FCA and PRA are asking regulated firms to demonstrate technical controls over AI data processing, not just policies. The direction of travel is unmistakable: regulators want to know what you did to the data, not what you promised about the pipe.


Pseudonymisation, specifically, is one of the few technical measures explicitly named in GDPR as a safeguard. Far from a nice-to-have, pseudonymisation is a recognised, regulation-grade control. And for AI workloads, where data must leave your environment to be processed, it could well be the only control that actually addresses the regulator’s core concern: what happened to the personal data at the point it was exposed to a third-party system?


The question nobody seems to be asking yet

Here is what I believe to be the most underexplored dimension of this entire discussion. We have an industry that has spent the better part of two decades building increasingly sophisticated pipe protection. Billions in investment. Thousands of vendors. Entire compliance frameworks built around the assumption that controlling the infrastructure is equivalent to protecting the data. And for the most part, it was. The two were effectively synonymous when data lived inside your perimeter.


AI broke that synonymy. The data now routinely leaves your perimeter, by design, as a feature, because the whole point is for a third-party model to process it. And yet the security conversation has barely shifted. We are still evaluating AI platforms primarily on the strength of their pipe: their certifications, their contracts, their retention policies, their encryption layers. We are still asking “is the pipe secure?” when the question that actually matters is “is the data protected even if the pipe isn’t?”


In my honest opinion, the most interesting risk in AI data governance is not a breach. It is the system working exactly as designed, with data that nobody thought to protect because the threat model assumed the pipe would hold. The data was never stolen. It was processed, in cleartext, by a system that did precisely what it was asked to do, in an environment you do not control, under a jurisdiction you may not have considered, with protections that disappeared the moment the model needed to read the input.


Nobody failed at execution. The structure was wrong from the start.


And structure failures are the hardest to fix because they are invisible to everyone operating inside the frame. When your entire security evaluation process equates infrastructure controls with data protection, you will keep investing in better pipes and wondering why the regulator keeps asking harder questions. Your pipes are fine. The regulator has simply moved to a different frame, one where the question has become “what did you do to the data before it left?”


That is the structure shift I believe is happening now. And the organisations that recognise it early will not just be better protected, they may well be the ones setting the terms of the conversation for everyone else.


Two principles. One decision.

So here is where we land.


Principle one: protect the data, not the pipe. Every organisation using AI externally should be asking what happens to their data at the point of inference, not just how it gets there. If the answer is “it arrives in cleartext and we rely on the vendor’s infrastructure to protect it,” then the data is not protected. The pipe is. And the pipe is not the data.


Principle two: transform once and be right forever. The defender’s dilemma is real, but it is not inevitable. Active security, transforming data at the boundary before it leaves your environment, converts a perpetual gamble into a single decision. The downstream stack fails? The data remains safe. The vendor gets breached? The exposure is fictional. The regulator asks what technical measures you applied? You have an answer that does not begin with “we trusted the contract.”


Neither principle is complicated, technically obscure, or anywhere near PhD territory. What they require is a willingness to look at the existing security model for AI and ask one uncomfortable question: if every control we have protects the pipe, and the pipe is absent at the moment the data is actually used, what exactly are we protecting?


The organisations that ask that question now will be the ones that don’t have to answer a much harder question later.


That is the landscape now. The only real question is whether your data protection strategy is built for a world where AI stays inside your walls, or the world we actually live in.


Rob Westmacott is the founder of Contextul and the creator of AliasPath™, an active data security layer for AI workloads. He writes about AI governance, data protection, and the gap between policy and enforcement.
