Frequently Asked Questions

Why does my Zapier or n8n workflow run twice?

Two causes are common. Either a webhook provider re-delivered the same event, because most webhooks promise at-least-once delivery and will retry, or, on self-hosted n8n in queue mode, the main process handled the job alongside a worker. The fix for the first is an idempotency check keyed on the event id; the fix for the second is setting N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true, which stops the main process from handling production webhooks and leaves that work to the webhook processors and workers.

What is an idempotency key in plain terms?

A stable, unique id for a single event that you check before you do anything irreversible. If you have already processed that id, you skip the step. It is what stops a retried webhook from charging a card or sending an invoice twice. Stripe, for example, recommends logging the event ids you have processed and ignoring repeats.

How do I stop a flow from charging a customer twice?

Put the dedupe check before the side-effecting step, not after. Take the upstream event id, look it up against a store of ids you have already handled, and only proceed if it is new. In n8n use the Remove Duplicates node, in Zapier a Storage or Filter check on the id, in Make a Data Store lookup.

Where do failed automation runs go if I do not handle them?

By default, many failures just disappear from view. The item never arrives and no one notices until a customer complains. Make can send failed runs to an Incomplete Executions queue with the Break directive, n8n can route failures to an error workflow, and Zapier surfaces them in Zap History with optional Autoreplay. Set one of these up so failures are visible and retryable.

Does this matter if I only run a few automations?

Yes, because the cost scales with incidents, not with volume. A single double-charge or a single lead that never reached your CRM can cost more than the automation saved all month. Low volume does not protect you; it just means the failure is rarer and easier to miss when it happens.

Why automations silently break: 6 failure modes and fixes

Most automations do not fail with a red error. They fail quietly. The same invoice goes out twice, a lead never reaches the CRM, a customer gets charged again when a webhook retries. These are not random bugs. They are six recurring failure modes, and each one has a specific fix in n8n, Zapier, and Make. If your flow touches money, messages, or records, you have to design for them before they cost you a client.

The hard part is that these failures are invisible while you build. A workflow that double-sends looks perfect in testing, because you only fired the trigger once. The problem shows up later, in production, when the provider retries or two records land in the same second. Here is the full list, what each one looks like from the operator's seat, and how to close it in each tool.

The six ways an automation fails silently

Read this as symptoms first. You almost never see the technical cause. You see the result a customer or a teammate reports.

Failure mode	What you actually see	Root cause
Duplicate trigger firing	The flow ran twice for one event	The trigger fired more than once, or two processes both picked up the job
Partial failure mid-flow	Half the steps ran, half did not	A step failed after an earlier irreversible step already committed
No idempotency key	The same source record processed twice	Nothing checks whether this exact item was already handled
Webhook retry double-send	A duplicate charge, email, or message	The provider re-delivered the same event, as most webhooks are allowed to
Polling race condition	New records skipped or reprocessed	The source API returned items out of order or without a stable cursor
No dead-letter handling	An item silently never arrived	A run failed and the failed item was discarded with no error you would notice

Five of these six are about doing something twice or not at all. That is the whole game in automation reliability. The systems you connect mostly promise to deliver each message at least once, which is a polite way of saying sometimes more than once. Your job is to make a second delivery harmless.

The fix grid: n8n, Zapier, and Make

Most reliability advice is written for one platform, or for developers wiring raw webhooks in code. Operators run no-code and low-code tools. Here are the same six failure modes, each mapped to the specific node, setting, or pattern that fixes it in the three tools small businesses actually use.

Failure mode	Fix in n8n	Fix in Zapier	Fix in Make
Duplicate trigger	On self-hosted queue mode set `N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true` so only workers run jobs; on Cloud this is handled	Polling triggers dedupe on the `id` field automatically; instant (REST hook) triggers do not, so add a Filter or Storage check	Check a Data Store for the record key before the action, or tighten the schedule so two runs cannot overlap
Partial failure	Order steps so the irreversible one is last; add an Error Trigger workflow to catch and log the rest	Reorder so the irreversible step is last; Autoreplay retries transient failures	Set the failing module to Break so the run goes to Incomplete Executions and resumes from that module, not the start
No idempotency key	Remove Duplicates node, "Remove Items Processed in Previous Executions", keyed on the event id (history defaults to 10,000 items)	Use the source record's stable `id` as the dedupe key; for updates, synthesize `id + "-" + updatedAt`	Data Store keyed on the event id; look it up and only continue if it is new
Webhook retry double-send	Dedupe on the provider's event id before the side-effecting node	Storage check on the event id before the action step	Data Store lookup on the event id before the action
Polling race	Sort the source newest-first, or dedupe on a monotonic key with Remove Duplicates	Return items in reverse chronological order keyed on `id`; that is how Zapier dedupes polling	Sort the search module and store a last-seen id or timestamp as a cursor
No dead-letter	Error Trigger workflow that writes failures to a table or Slack; set node-level retries	Watch Zap History and route failures to a "failed items" sheet; enable Autoreplay on paid plans	Break directive sends failures to the Incomplete Executions queue for retry; enable storing incomplete executions in scenario settings
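
The polling-race row is the hardest one to picture, so here is a minimal sketch of it in Python rather than in any of the three tools. It assumes a hypothetical source whose records carry an `id` and an `updated_at` timestamp in one consistent UTC ISO 8601 format, so that string comparison matches time order. `PollState`, `dedupe_key`, and `new_records` are illustrative names, not part of n8n, Zapier, or Make.

```python
from dataclasses import dataclass, field


@dataclass
class PollState:
    """What the flow remembers between polls (hypothetical structure)."""
    cursor: str = ""  # newest updated_at handled so far, for the next query
    seen_keys: set = field(default_factory=set)


def dedupe_key(record: dict) -> str:
    # Composite key, so an update to the same record counts as a new item.
    return f"{record['id']}-{record['updated_at']}"


def new_records(batch: list, state: PollState) -> list:
    """Return only unhandled records, oldest first, and advance the cursor."""
    fresh = []
    # Sort explicitly: the source may return items in any order.
    for record in sorted(batch, key=lambda r: r["updated_at"]):
        key = dedupe_key(record)
        if key in state.seen_keys:
            continue  # already handled in an earlier poll
        state.seen_keys.add(key)
        fresh.append(record)
    if fresh:
        state.cursor = max(state.cursor, fresh[-1]["updated_at"])
    return fresh
```

The dedupe decision rests on the remembered keys, not on the cursor alone, so a record that shows up late or shares a timestamp with the cursor is still handled exactly once.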

This grid is the article. The rest is one worked example and the checklist you can run on a flow you already have.

A worked example: the Stripe payment that charges twice

Take a common small-business flow. A customer pays through Stripe, and your automation creates an invoice record and posts a "new sale" message to Slack. You test it once, it works, you ship it.

Two weeks later a customer emails: they got two invoices for one purchase. Your Slack channel shows the sale twice. Here is what happened. Stripe states plainly that a webhook endpoint "might occasionally receive the same event more than once," and it retries delivery for up to three days with exponential backoff in live mode. Your flow has no memory. Every time the event arrives, it creates another invoice.

The fix is one step, placed first. After the webhook trigger, before the invoice and the Slack message, you add a dedupe gate keyed on the Stripe event id, the value that starts with evt_. If you have seen that id before, stop. If it is new, record it and continue. In n8n that is the Remove Duplicates node set to remove items processed in previous executions. In Zapier it is a Storage lookup or a Filter on the event id. In Make it is a Data Store check. Stripe itself recommends exactly this: log the event ids you have processed, and do not process already-logged events.
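
If you were wiring the gate by hand instead of using a built-in node, it could look roughly like this. It is a sketch under assumptions: the `handled_events` table, the function names, and the `invoices` list standing in for the real invoice and Slack steps are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the table of handled event ids (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS handled_events ("
        "event_id TEXT PRIMARY KEY, "
        "seen_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn


def is_new_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Record the id and report whether this is its first delivery."""
    # INSERT OR IGNORE leaves the table unchanged when the id already exists.
    cursor = conn.execute(
        "INSERT OR IGNORE INTO handled_events (event_id) VALUES (?)",
        (event_id,),
    )
    conn.commit()
    return cursor.rowcount == 1


def handle_webhook(conn: sqlite3.Connection, event: dict, invoices: list) -> str:
    """Gate first, side effects second."""
    if not is_new_event(conn, event["id"]):
        return "skipped duplicate"
    # Stand-in for creating the invoice and posting to Slack.
    invoices.append(event["id"])
    return "processed"
```

Delivering the same `evt_` id twice runs the invoice step once. Note that the id is recorded before the side effect, so a crash between the two leaves the event marked as handled; that gap is what the failure path later in this article is for.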

One detail matters for the store you dedupe against. Because Stripe retries for up to three days, the record of handled ids has to outlive that window. n8n's Remove Duplicates node keeps a default history of 10,000 items, which is fine for most small-business volume. If you genuinely process more than 10,000 events in three days, raise the history size or move the dedupe to a database table. Most operators never hit that, but it is the kind of thing that bites at the worst time, so size it on purpose.
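
The sizing check is simple arithmetic. Here is a rough sketch, assuming steady daily volume and one stored id per event; the function names are illustrative.

```python
RETRY_WINDOW_DAYS = 3     # Stripe's live-mode retry window, as described above
DEFAULT_HISTORY = 10_000  # default history size of n8n's Remove Duplicates node


def history_needed(events_per_day: int, window_days: int = RETRY_WINDOW_DAYS) -> int:
    """Ids the store must remember so a retry inside the window is recognized."""
    return events_per_day * window_days


def history_is_enough(events_per_day: int, history_size: int = DEFAULT_HISTORY) -> bool:
    """True when the store still holds every id that could be retried."""
    return history_needed(events_per_day) <= history_size
```

At 200 events a day the store needs to remember 600 ids, well inside the default. At 4,000 a day it needs 12,000, so the default history could forget ids while Stripe can still retry them.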

The order of operations problem

Idempotency stops repeats. It does not save you from a flow that dies halfway. That is the partial-failure mode, and the cheapest defense costs nothing: reorder your steps.

Put the irreversible action last. If your flow charges a card, sends an email, or creates an external record, do everything that can fail cheaply first, then do the one thing you cannot take back. A flow that validates the data, looks up the customer, builds the message, and only then sends it will fail before the send when something is wrong. A flow that sends first and validates second has already done the damage when it errors.
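
The same ordering, sketched in Python. Everything here is a stand-in: `run_flow`, the `customers` lookup table, and the `outbox` list that plays the role of the real send are hypothetical names chosen for illustration.

```python
class ValidationError(Exception):
    """Raised when the incoming data is not fit to act on."""


def run_flow(order: dict, customers: dict, outbox: list) -> None:
    """Cheap, repeatable steps first; the irreversible send last."""
    # 1. Validate: fails without side effects.
    if order.get("amount", 0) <= 0:
        raise ValidationError("amount must be positive")
    # 2. Look up the customer: raises KeyError if unknown, still no side effects.
    email = customers[order["customer_id"]]
    # 3. Build the message in memory.
    message = f"Invoice for {order['amount']} sent to {email}"
    # 4. Only now do the one thing that cannot be taken back.
    outbox.append(message)  # stand-in for the real send
```

A bad amount or an unknown customer stops the run with an empty outbox. Swap steps 1 and 4 and the same bad input has already sent something by the time the error appears.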

When you cannot reorder, catch the failure instead of letting it vanish. This is the difference between Make's Break directive and its default behavior. Make gives you five error directives: Ignore, Resume, Commit, Rollback, and Break. Break is the one that matters for reliability, because it sends the failed run to an Incomplete Executions queue where you can fix the issue and reprocess it without losing the data. The default is to discard. n8n's equivalent is an Error Trigger workflow that fires when any run fails and logs the item somewhere you will see it. Zapier surfaces failures in Zap History, with Autoreplay to retry transient ones on paid plans. Pick one per tool and turn it on. A failure you can see and retry is an inconvenience. A failure that disappears is a lost customer.
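
Stripped of tool names, all three mechanisms do the same thing: park the failed item with its error instead of dropping it, then let you run it again. Here is a minimal sketch of that pattern, with hypothetical function names and a plain list standing in for the queue, table, or sheet you would actually use.

```python
from typing import Callable


def run_with_dead_letter(items: list, step: Callable, dead_letters: list) -> list:
    """Run step on each item; park failures instead of dropping them."""
    results = []
    for item in items:
        try:
            results.append(step(item))
        except Exception as exc:  # broad on purpose: nothing should vanish
            dead_letters.append({"item": item, "error": repr(exc)})
    return results


def retry_dead_letters(step: Callable, dead_letters: list) -> list:
    """Reprocess parked items; anything that fails again is parked again."""
    pending = [entry["item"] for entry in dead_letters]
    dead_letters.clear()
    return run_with_dead_letter(pending, step, dead_letters)
```

One failing item no longer takes the rest of the batch with it, and the parked entry keeps both the data and the reason, which is what you need to fix the cause before retrying.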

The six-line pre-launch reliability check

Before you turn on any flow that touches money, messages, or customer records, run it against these six questions. We use a version of this on every client build, including the lead-routing pipelines we ship.

1. Does every irreversible step have an idempotency key, a stable id you check before acting?
2. Is the irreversible action the last step, after everything that can fail cheaply?
3. If the trigger is a webhook, are you assuming it can fire the same event more than once?
4. When a run fails halfway, does the item land somewhere visible and retryable?
5. On self-hosted n8n in queue mode, is N8N_DISABLE_PRODUCTION_MAIN_PROCESS=true so the main process is not double-running jobs?
6. Do you get an alert when a run fails, rather than finding out when a customer complains?

If you cannot answer yes to a line, that line is your next hour of work. Question one and question four catch the most expensive failures, so start there.

How to harden a flow you already have

You do not have to rebuild anything. Open your highest-stakes automation, the one that touches payments or leads, and walk it through the grid above. Find the first side-effecting step and ask what happens if the trigger fires twice. If the answer is "two of something the customer sees," add the dedupe gate. Then check where a failed item goes today. If the answer is "nowhere," wire up the error path for your tool.
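
The "what happens if the trigger fires twice" question can also be asked mechanically. The sketch below delivers one event twice to a handler and counts the customer-visible results. The handlers and the in-memory `seen` set are hypothetical stand-ins; a real flow would keep handled ids in a persistent store.

```python
def count_side_effects(handler, event: dict, deliveries: int = 2) -> int:
    """Deliver the same event several times and count what the customer would see."""
    effects = []
    for _ in range(deliveries):
        handler(event, effects)
    return len(effects)


def naive_handler(event: dict, effects: list) -> None:
    # Acts on every delivery, like a flow with no memory.
    effects.append(f"invoice for {event['id']}")


def make_gated_handler():
    seen = set()  # in-memory stand-in for a persistent store of handled ids

    def gated_handler(event: dict, effects: list) -> None:
        if event["id"] in seen:
            return  # duplicate delivery: do nothing
        seen.add(event["id"])
        effects.append(f"invoice for {event['id']}")

    return gated_handler
```

The naive handler produces two invoices for one event and the gated one produces one. In a no-code tool the equivalent test is to replay the same webhook payload against a test copy of the flow and count what lands downstream.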

This is unglamorous work, and it is the difference between an automation that saves you time and one that quietly creates cleanup. We build workflow automation systems with these defenses built in from the start, and we also run free reviews of flows operators have already built. Bring the one that scares you most, and we will tell you where it can fail twice.

About Alexey Yushkin