
How to stop an automation from creating duplicates

An automation creates duplicates when a trigger re-fires, a step retries after a write went through, or two runs overlap. The fix is a deduplication gate in front of every side effect, keyed on a value derived from the business event such as an order ID, not on a value generated inside the run. The destinations operators write to, like Google Sheets, Airtable, a CRM, or an inbox, accept no idempotency key of their own, so you enforce it in the workflow.

Alexey Yushkin, Founder, GENERAL INFORMATICS (3 min read)

An automation creates duplicates for one of three reasons: the trigger fired more than once, a step retried after a write had already gone through, or two scheduled runs read the same record before either finished. The fix is the same in all three cases. Put a deduplication check in front of every step that creates, sends, or charges, and key that check on a value derived from the business event, like an order ID or a webhook's event ID, not on a fresh value generated inside the run. The catch most guides skip: the tools operators actually write to, Google Sheets, Airtable, a CRM, an inbox, accept no idempotency key of their own, so you enforce the check in the workflow, not the destination.

That last point is the whole reason this is harder than it looks. Stripe and a handful of billing and email APIs accept an Idempotency-Key header and collapse repeats for you. The systems where most duplicates actually land do not. Append a row to a sheet twice and you get two rows. Send a Gmail message twice and the customer gets two emails. There is no header to fix that. The dedup logic has to live one step earlier, in your flow, before the side effect runs.

Where duplicates actually come from

Before you build anything, name the source. The three causes need the same fix but they look different in the run history, and knowing which one you have tells you what to key on.

The first is a re-fired trigger. Webhooks use at-least-once delivery, which means a provider that does not get a fast 2xx back will send the same event again. A form behind a slow page gets submitted twice by an impatient user. A polling trigger with a too-short interval reads the same new record on two consecutive polls. In every case the automation starts twice for one real-world event.

The second is a retry of a non-idempotent write. The step failed, the platform re-ran it, and the original attempt had actually succeeded. This is the one that turns one failed run into two charges. It has its own full treatment in "When should an automation retry a failed step," so I will not re-derive it here. The short version: a write that creates or sends something is only safe to retry if a repeat of it is recognized as the same operation.

The third is an overlapping read window. Two scheduled runs, or a manual run on top of a scheduled one, both query "records added since last time," and because neither has finished writing its results back, they both pick up the same record and both process it. This is the quiet one. It does not show up as an error anywhere. You just find two of something and cannot explain it. The duplicate-trigger and webhook-double-send failure modes are covered alongside the others in "Why automations silently break."

The fix is a key derived from the event, not generated in the run

Here is the part the top results get wrong. Most guides tell you to "add an idempotency key" and then show a node that generates a UUID. A UUID generated inside the run is useless for this. It is different on every execution, so the second attempt's key never matches the first, and the gate that was supposed to catch the duplicate waves it straight through. You have built a lock and thrown away the only thing it was meant to recognize.

A dedup key has to be deterministic: identical every time the same business event happens, and distinct across different events. That means you derive it from the data, not invent it. If the same order is processed twice, both runs must compute the same key from that order. The simplest correct keys are the IDs the source system already assigns.

| Side effect | Bad key (changes every run) | Key derived from the event |
|---|---|---|
| Create a CRM contact | a fresh UUID | the lead's email, lowercased and trimmed |
| Send an order confirmation | a random message ID | order_id plus "confirmation" |
| Append a row to a sheet | the current row count or now() | the source record's primary ID |
| Post a Slack alert for a ticket | the timestamp | ticket_id plus the status it alerts on |
| Charge a card | a UUID generated per attempt | the invoice or order ID |
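
As a minimal sketch of how the right-hand column might be computed, the Python below builds keys from values the event already carries. The helper names `normalize_email` and `dedup_key`, and the sample IDs, are hypothetical and only illustrate the idea; hashing is optional and is used here just to get a fixed-length string.

```python
import hashlib
import uuid


def normalize_email(email: str) -> str:
    """Lowercase and trim so ' Ann@Example.com ' and 'ann@example.com' match."""
    return email.strip().lower()


def dedup_key(*parts: str) -> str:
    """Hash the event's own identifiers into a fixed-length key.

    The same parts always give the same key; different parts give
    different keys.
    """
    # The unit separator keeps ("a", "bc") distinct from ("ab", "c").
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Keys built from data the event already had (sample values are made up).
contact_key = dedup_key("crm_contact", normalize_email("  Ann@Example.com "))
confirmation_key = dedup_key("order-1001", "confirmation")
alert_key = dedup_key("ticket-77", "escalated")

# A second run for the same event computes the same key...
assert confirmation_key == dedup_key("order-1001", "confirmation")
# ...while a UUID generated inside the run does not match the last attempt.
assert uuid.uuid4() != uuid.uuid4()
```

The point of the last two lines is the contrast: the derived key is reproducible from the event alone, and the generated one is not.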

Read the right-hand column and the pattern is obvious: the key is something that was already true about the event before your automation ran. The order had an ID. The webhook carried an event.id. The lead had an email. You are not creating identity, you are reusing the identity that already exists. Once you have that key, the gate is mechanical. Before the side effect, look the key up in a store. If it is there, stop. If it is not, record it, then run the step. The only real design decisions left are where the store lives and how long it remembers, and the platforms differ sharply on both.
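
The gate described above can be sketched in a few lines of Python. This version assumes the store is a SQLite table (the table name `seen_keys` and the function names are hypothetical); in Zapier, n8n, or Make the same role would be played by whatever storage the platform offers. The lookup and the record are folded into a single insert so that two overlapping runs cannot both pass the check.

```python
import sqlite3
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the table that remembers processed keys."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS seen_keys (key TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def run_once(conn: sqlite3.Connection, key: str,
             side_effect: Callable[[], None]) -> bool:
    """Run side_effect only if key is new. Returns True if it ran."""
    try:
        # The PRIMARY KEY turns "look up, then record" into one atomic step:
        # the insert fails if the key is already there.
        with conn:
            conn.execute("INSERT INTO seen_keys (key) VALUES (?)", (key,))
    except sqlite3.IntegrityError:
        return False  # key already recorded: this is a duplicate, stop
    side_effect()
    return True


# Usage: the second call with the same key is dropped.
store = open_store()
sent = []


def send_confirmation() -> None:
    sent.append("confirmation for order-1001")


first = run_once(store, "order-1001:confirmation", send_confirmation)
second = run_once(store, "order-1001:confirmation", send_confirmation)
```

Recording the key before running the step follows the order given above. The trade-off is that a crash between the record and the side effect skips the action rather than repeating it, which is usually the safer failure for charges and emails but is a choice worth making deliberately.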

What each tool dedupes for you, and what it does not

The dangerous assumption is that your platform already handles this. It handles a slice of it, at the trigger, and leaves the rest to you. Here is the honest map.

| Tool | What it dedupes natively | What it does not |
|---|---|---|
| Zapier | Polling triggers dedupe on the item's id field; a polled item fires once and Zapier remembers seen IDs | Webhook (Catch Hook) triggers and every action step. Nothing after the trigger is deduped for you |
| n8n | The Remove Duplicates node, in "Remove Items Processed in Previous Executions" mode, remembers a key across runs and drops repeats | It is not automatic. You place the node, pick the key field, and set the scope (per-node or per-workflow) yourself |
| Make | Nothing at the action level | You build the gate with a Data store: search for the key, branch if found, write the key when you proceed |
| Destination API (Stripe and similar) | An Idempotency-Key header collapses repeats and returns the original result for about 24 hours | Only where the API supports it. Most CRMs, sheets, and email sends do not, so the header has nowhere to go |

Zapier's trigger dedup is genuinely useful and worth understanding precisely: it works only when your data carries a unique id, and it covers the trigger, not the actions downstream. If your Zap's trigger is a webhook rather than a poll, even that protection is off. n8n gives you the most direct tool of the three, a node built for exactly this, but it does nothing until you add it and tell it what to key on. Make leans on its Data store, which is the same hand-built pattern as a check-then-write in any other system. The through-line is simple. Trigger-level dedup is partial and conditional. Action-level dedup, the kind that stops a duplicate charge or a duplicate email, is on you in every one of these tools.

This is also why "just turn on the platform's duplicate handling" is not an answer. There is no switch that covers the action steps. There is a node, a Data store, or an API header, and each one needs a deterministic key you chose on purpose.
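
Where the destination does accept an Idempotency-Key header, the same deterministic key goes into it. The sketch below only builds the request and does not send it. The function name, the `charge:` prefix, the placeholder API key, and the choice of Stripe's PaymentIntents endpoint are illustrative assumptions; check the parameters against Stripe's own documentation before relying on them.

```python
from urllib.parse import urlencode


def build_charge_request(order_id: str, amount_cents: int, api_key: str) -> dict:
    """Describe a charge request whose idempotency key comes from the order."""
    return {
        "method": "POST",
        "url": "https://api.stripe.com/v1/payment_intents",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            # Derived from the order, so a retry sends the identical key.
            "Idempotency-Key": f"charge:{order_id}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "body": urlencode({"amount": amount_cents, "currency": "usd"}),
    }


# Two attempts for the same order carry the same header value.
attempt_1 = build_charge_request("order-1001", 4200, "sk_test_placeholder")
attempt_2 = build_charge_request("order-1001", 4200, "sk_test_placeholder")
```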

How long should the gate remember a key

A dedup store is only as good as its memory. Remember a key for too short a window and the late duplicate sails through after the entry has expired. Remember it too long with a key that is supposed to repeat, and you block a legitimate run.

The sizing rule: remember a key for at least as long as the duplicate can plausibly arrive. Stripe sets the reference point by pruning idempotency keys after about 24 hours, which is tuned to a normal retry window. Match that logic to your source. If a webhook provider re-delivers failed events for up to a day, your store has to outlast a day. If a polling overlap can only happen within a few minutes, a short window is fine and keeps the store small.
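
To make the window concrete, here is a small in-memory sketch of a store that forgets keys after a fixed time. The class name `ExpiringKeyStore` and its method are hypothetical, and a real workflow would keep this in persistent storage rather than in process memory. The `now` argument exists only so the behavior can be shown without waiting.

```python
import time
from typing import Optional


class ExpiringKeyStore:
    """Remembers each key for ttl_seconds, then lets it through again."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        self._expires_at = {}  # key -> time at which the entry is forgotten

    def seen_or_remember(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if key is still remembered; otherwise record it."""
        now = time.time() if now is None else now
        # Drop entries whose window has passed.
        self._expires_at = {k: t for k, t in self._expires_at.items() if t > now}
        if key in self._expires_at:
            return True
        self._expires_at[key] = now + self.ttl_seconds
        return False


HOUR = 3600
store_24h = ExpiringKeyStore(ttl_seconds=24 * HOUR)

first_delivery = store_24h.seen_or_remember("evt_123", now=0)          # new
redelivery_1h = store_24h.seen_or_remember("evt_123", now=1 * HOUR)    # caught
redelivery_25h = store_24h.seen_or_remember("evt_123", now=25 * HOUR)  # missed
```

The third call shows the failure the sizing rule guards against: a re-delivery that arrives after the entry has expired is treated as new.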

Then check the nature of the key itself. A key that should occur exactly once in the life of the business, like an order ID or an invoice number, can be remembered for the life of that record with no downside. A key that is meant to recur, like a daily-summary job keyed only on the date, must be allowed to fire again tomorrow, so the key needs the date baked in rather than a fixed string. Get this backwards and you build the opposite bug: an automation that refuses to run when it should, which is harder to notice than a duplicate because nothing happens at all. Pair the gate with a clear run record so that when you do find two of something, you can tell which run created which, and when.
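
The difference between a once-ever key and a recurring key is easy to see side by side. The function names and key prefixes in this sketch are hypothetical.

```python
from datetime import date


def order_confirmation_key(order_id: str) -> str:
    """Once-ever key: safe to remember for the life of the order."""
    return f"order-confirmation:{order_id}"


def daily_summary_key(run_date: date) -> str:
    """Recurring key: the date is part of the key, so tomorrow's run is new."""
    return f"daily-summary:{run_date.isoformat()}"


today_key = daily_summary_key(date(2024, 5, 1))
tomorrow_key = daily_summary_key(date(2024, 5, 2))

# A re-run on the same day collapses; the next day's run is allowed.
assert today_key == daily_summary_key(date(2024, 5, 1))
assert today_key != tomorrow_key
```

A fixed string such as "daily-summary" with no date would pass the gate once and then block every later run for as long as the store remembers it.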

What to do next

Open your highest-stakes automation and find every step that creates, sends, charges, or deletes. For each one, answer two questions in order. What deterministic value identifies the business event behind this step? And where will I store that value so I can check it before the step runs? If the answer to the first is "a UUID I generate in the flow," you do not have a dedup key yet, you have a placeholder. Go back to the source data and find the ID that was already there.

Then decide the window. Match it to how long a duplicate can arrive from your specific trigger, and make sure any key that is meant to recur carries the recurring part inside it. We build this gate into every workflow automation system we ship: a deterministic key on the event, a check-then-write in front of every side effect, and a window sized to the trigger, so a re-delivery or a retry collapses into a single action instead of a second charge. If you have an automation that keeps producing duplicates and you cannot pin down which of the three sources is doing it, send us the flow and we will trace it.


SOURCES & CITATIONS

  1. Stripe, "Idempotent requests." https://docs.stripe.com/api/idempotent_requests
  2. Zapier, "How deduplication works in Zapier." https://docs.zapier.com/platform/build/deduplication
  3. n8n Docs, "Remove Duplicates node documentation." https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.removeduplicates/

About Alexey Yushkin

Alexey is the founder of GENERAL INFORMATICS LLC. He designs and ships AI and automation systems for businesses and operators across the US.

