Retry

A retry is a repeated attempt at a failed delivery. In webhook systems, retries follow a backoff schedule (typically exponential) until the delivery succeeds or exhausts its retry budget. After exhaustion, the delivery moves to a dead-letter queue (DLQ).
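The lifecycle described above (attempt, wait, retry, dead-letter on exhaustion) can be sketched in a few lines. This is a minimal illustration, not any provider's implementation: `deliver_with_retries`, the `send` callback, and the in-memory `dead_letters` list are all hypothetical names chosen for the example.

```python
from typing import Callable, Iterable, List


def deliver_with_retries(
    send: Callable[[dict], bool],
    event: dict,
    delays: Iterable[float],
    dead_letters: List[dict],
    sleep: Callable[[float], None],
) -> bool:
    """Attempt delivery once, then retry after each delay in `delays`.

    `send` returns True on success. When every attempt fails, the event
    is appended to `dead_letters` (a stand-in for a real DLQ).
    """
    if send(event):
        return True
    for delay in delays:
        sleep(delay)  # wait out the backoff interval before retrying
        if send(event):
            return True
    # Retry budget exhausted: park the event for later inspection or replay.
    dead_letters.append(event)
    return False
```

Passing `sleep` as a parameter (for example `time.sleep` in a real worker) keeps the loop easy to test without actually waiting.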

Exponential backoff

Naive constant-interval retries (every minute, forever) overload downstream systems and recover poorly from longer outages. Exponential backoff lengthens the wait after each failed attempt, in the textbook case by doubling it. Production schedules are often rounded and grow faster than strict doubling, for example: 30s, 1m, 5m, 15m, 1h, 4h, 12h, 24h. Early retries catch transient blips; later retries catch longer outages without hammering recovering systems.
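A strict doubling schedule, as opposed to the hand-tuned steps listed above, can be computed from the attempt number. The sketch below assumes a 30-second base delay and a 24-hour cap to match the endpoints of that example; `backoff_delay` and its parameters are illustrative names, not part of any provider's API.

```python
def backoff_delay(
    attempt: int,
    base: float = 30.0,
    factor: float = 2.0,
    cap: float = 86400.0,
) -> float:
    """Return the wait in seconds before retry number `attempt` (0-based).

    The delay grows as base * factor**attempt and is clamped to `cap`
    so that very late retries never wait longer than one day.
    """
    return min(cap, base * factor ** attempt)


# First five waits of a doubling schedule: 30s, 1m, 2m, 4m, 8m.
schedule = [backoff_delay(n) for n in range(5)]
```

The cap matters because uncapped doubling reaches impractical waits quickly: starting from 30 seconds, the twelfth retry would already wait more than a day.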

Retry behavior varies by provider:

Stripe — retries for up to 3 days
GitHub — stores failed deliveries for manual or API-driven redelivery; automatic redelivery requires your own scheduled workflow
Slack — retries 3 times: nearly immediately, after 1 minute, and after 5 minutes
Shopify — retries failed deliveries up to 8 times in about 4 hours, then the subscription can be removed if failures continue

Why retries need idempotency

Every retry can produce a duplicate from the consumer's perspective. The first attempt might have succeeded but the response was lost; the retry hits the consumer again. Without idempotency, the consumer processes the same event twice.
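One common way to make a consumer idempotent is to remember the IDs of events it has already processed and skip repeats. The sketch below assumes each event carries a unique `id` field and uses an in-memory set for brevity; a real consumer would need a durable store. `IdempotentConsumer` and `handler` are hypothetical names.

```python
from typing import Callable, Set


class IdempotentConsumer:
    """Wraps a handler so that each event ID is processed at most once."""

    def __init__(self, handler: Callable[[dict], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()  # assumption: in-memory, lost on restart

    def handle(self, event: dict) -> bool:
        """Process `event` unless its ID was seen; return True if processed."""
        event_id = event["id"]
        if event_id in self._seen:
            return False  # duplicate delivery: acknowledge without reprocessing
        self._handler(event)
        # Record the ID only after the handler succeeds, so a handler
        # failure leaves the event eligible for the next retry.
        self._seen.add(event_id)
        return True
```

With this wrapper, a retry that arrives after a lost response is acknowledged but not processed a second time.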

For agents specifically, a duplicate delivery is expensive (it consumes tokens) and can repeat side effects (tool calls). Idempotency at both the relay layer and the agent's tool-call layer is the discipline that keeps cost and side effects bounded.
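At the tool-call layer, one possible approach is to derive a deterministic key from the triggering event, the tool name, and the arguments, then reuse the stored result when the same key appears again. This is a sketch under those assumptions: `tool_call_key` and `ToolCallLedger` are hypothetical names, and the in-memory dictionary stands in for durable storage.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def tool_call_key(event_id: str, tool: str, args: Dict[str, Any]) -> str:
    """Build a stable key; sort_keys makes it independent of argument order."""
    payload = json.dumps(
        {"event": event_id, "tool": tool, "args": args}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToolCallLedger:
    """Runs each distinct (event, tool, args) combination at most once."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}  # assumption: in-memory only

    def run(
        self,
        event_id: str,
        tool: str,
        args: Dict[str, Any],
        fn: Callable[..., Any],
    ) -> Any:
        key = tool_call_key(event_id, tool, args)
        if key not in self._results:
            # First time this call is seen: execute the side effect.
            self._results[key] = fn(**args)
        # Retries get the recorded result without repeating the side effect.
        return self._results[key]
```

If the same event is redelivered and the agent issues the same tool call, the ledger returns the earlier result instead of, say, sending a second email.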

For the broader retry discussion, see "Webhook DLQs: design and recovery patterns".
