Design: making retries visible in the UI
Patterns we use so automatic retries feel predictable and honest instead of random and frustrating.
Retries are one of our favorite reliability tools.
They are also a common source of user confusion.
From the system’s perspective:
- we see transient errors and timeouts
- we add retries with backoff
- graphs smooth out
From the user’s perspective, retries can look like:
- a button that sometimes works and sometimes doesn’t
- a spinner that hangs for a while and then resolves
- duplicate actions they didn’t mean to trigger twice
This post is about how we design UI around retries so they feel predictable and honest.
Constraints
- We already had retries implemented in many places.
- We didn’t want to expose all the internal complexity.
- Some flows were more sensitive (payments, irreversible actions) than others.
What we changed
We started from three questions, which map to the first three patterns below:
- What does the user see while we’re retrying?
- How do we avoid doing the same work twice?
- What do we tell the user when we stop?
1. Show that work is in progress, not stuck
For actions where we retry automatically, we:
- show a clear "in progress" state (e.g., "Saving…" rather than a generic spinner)
- keep feedback visible: the button may be disabled, but we still show status text and, when appropriate, a small status indicator
If the action might take more than a couple of seconds, we:
- use language that sets expectations ("This can take up to a minute")
- consider signaling that we are still trying ("Still trying…"), without attempt counts that imply a guarantee
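As a minimal sketch of the copy rules above, the status text can be chosen from how long the action has been pending. The function name `pendingCopy`, the option names, and the thresholds are all hypothetical placeholders, not our actual component API.

```typescript
// Hypothetical options; the thresholds and copy are placeholders to tune per flow.
interface PendingCopyOptions {
  verb: string;            // action-specific label, e.g. "Saving"
  expectation: string;     // e.g. "This can take up to a minute"
  slowAfterMs: number;     // when to start setting expectations
  retryingAfterMs: number; // when to say we are still trying
}

// Returns the status text for an action that has been pending for elapsedMs.
function pendingCopy(elapsedMs: number, opts: PendingCopyOptions): string {
  if (elapsedMs >= opts.retryingAfterMs) {
    // No attempt counts here: "attempt 3 of 5" reads like a promise.
    return `Still trying… ${opts.expectation}.`;
  }
  if (elapsedMs >= opts.slowAfterMs) {
    return `${opts.verb}… ${opts.expectation}.`;
  }
  return `${opts.verb}…`;
}
```

The point of keeping this a pure function of elapsed time is that the same copy shows up whether the delay comes from a slow request or from retries happening underneath.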
2. Make actions idempotent from the user’s point of view
We design flows so that:
- pressing a button twice doesn’t run the action twice in a harmful way
- automatic retries reuse the same request identity under the hood
In the UI, this means:
- we show the action as "pending" rather than making the user guess whether clicking again is safe
- we prevent obvious double-submits by:
  - disabling the primary action while work is active
  - showing a clear way to cancel, when safe
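One way to picture "reuse the same request identity" is a retry loop that takes a single idempotency key for the user's action and passes it unchanged on every attempt. This is a sketch under assumptions: `submitOnce`, `Send`, and the option names are hypothetical, and it assumes the server deduplicates requests by that key.

```typescript
// Hypothetical shape of one attempt's outcome.
interface SendResult {
  ok: boolean;
  retryable: boolean; // only meaningful when ok is false
}

// Hypothetical transport: sends the request tagged with the given key.
type Send = (idempotencyKey: string) => Promise<SendResult>;

interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs one logical user action. The key is created once by the caller
// (when the user clicks) and reused for every automatic retry.
async function submitOnce(
  idempotencyKey: string,
  send: Send,
  opts: RetryOptions,
): Promise<{ ok: boolean; attempts: number }> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const result = await send(idempotencyKey); // same key on every attempt
    if (result.ok) return { ok: true, attempts: attempt };
    if (!result.retryable) return { ok: false, attempts: attempt };
    if (attempt < opts.maxAttempts) {
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      await sleep(opts.baseDelayMs * 2 ** (attempt - 1));
    }
  }
  return { ok: false, attempts: opts.maxAttempts };
}
```

While the returned promise is unsettled, the UI can show the action as pending and keep the primary button disabled, so the user never has to guess whether clicking again is safe.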
3. Be honest when we give up
Retries can’t last forever.
When we stop, we:
- tell the user what happened in plain language
- make a concrete suggestion ("You can try again" or "We’ll keep trying and email you")
- say what we know about the outcome, rather than leaving it ambiguous whether the action partially succeeded
Where possible, we:
- show the current state of the underlying entity ("Your change has not been saved")
- log enough context that support can tell what happened later
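To illustrate, the final message can be built from what we actually know about the entity, with the support context produced alongside it. Everything here is a hypothetical sketch: `describeGiveUp`, the `KnownState` values, and the exact copy are illustrative rather than our production strings.

```typescript
// What we know about the underlying entity when retries stop.
type KnownState = "saved" | "not_saved" | "unknown";

interface GiveUpContext {
  actionId: string;
  attempts: number;
  lastError: string;
  knownState: KnownState;
  willKeepTrying: boolean; // true if background retries continue
}

interface GiveUpResult {
  userMessage: string;
  supportLog: Record<string, string | number | boolean>;
}

function describeGiveUp(ctx: GiveUpContext): GiveUpResult {
  // State the outcome first, in plain language.
  const stateCopy: Record<KnownState, string> = {
    saved: "Your change has been saved.",
    not_saved: "Your change has not been saved.",
    unknown: "We could not confirm whether your change was saved.",
  };

  // Then make one concrete suggestion.
  let nextStep = "You can try again.";
  if (ctx.knownState === "saved") {
    nextStep = "You do not need to do anything else.";
  } else if (ctx.willKeepTrying) {
    nextStep = "We’ll keep trying and email you.";
  }

  return {
    userMessage: `${stateCopy[ctx.knownState]} ${nextStep}`,
    // Enough context for support to reconstruct what happened later.
    supportLog: {
      actionId: ctx.actionId,
      attempts: ctx.attempts,
      lastError: ctx.lastError,
      knownState: ctx.knownState,
      willKeepTrying: ctx.willKeepTrying,
    },
  };
}
```

Suggesting "try again" in the unknown case is only reasonable if the action is idempotent, which is why this pattern depends on the previous one.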
4. Separate user errors from system errors
Retries don’t help with user mistakes.
We distinguish:
- errors where another attempt might succeed (network, transient server issues)
- errors where the input is invalid or the action is blocked (permissions, validation)
The UI reflects this:
- user errors get specific, actionable messages and no automatic retries
- system errors get bounded retries and clear status copy
This keeps users from feeling like the system is "just trying again" on input they can fix.
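The split can be sketched as a small classifier that decides, per failure, whether automatic retries apply. The names `Failure`, `Handling`, and `classifyFailure` are hypothetical, and the status mapping is an assumption (network failures, 408, 429, and any 5xx count as system errors); a real service may need a different list.

```typescript
type Failure =
  | { kind: "network" }               // no response at all (offline, timeout)
  | { kind: "http"; status: number }; // server responded with an error status

interface Handling {
  category: "system" | "user";
  autoRetry: boolean;
}

// Assumption: network failures, 408, 429, and 5xx may succeed on another attempt.
function isSystemFailure(failure: Failure): boolean {
  if (failure.kind === "network") return true;
  return failure.status === 408 || failure.status === 429 || failure.status >= 500;
}

function classifyFailure(failure: Failure): Handling {
  if (isSystemFailure(failure)) {
    // Bounded retries plus clear status copy.
    return { category: "system", autoRetry: true };
  }
  // Validation, permissions, and similar: show a specific message, never retry.
  return { category: "user", autoRetry: false };
}
```

For example, `classifyFailure({ kind: "http", status: 422 })` yields no automatic retry, so the UI goes straight to an actionable validation message.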
5. Make retry behavior discoverable for support
Support often gets the first report when retries behave badly.
We surfaced retry behavior in internal tools:
- a short note in the support view for key flows ("We retry this action up to N times over M minutes")
- recent status for retry-heavy actions ("Last attempt at HH:MM", "Next scheduled retry")
This helps support:
- set expectations with users
- know when it’s safe to advise "try again" vs escalate
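The "up to N times over M minutes" note can be derived from the retry policy instead of written by hand, so it stays accurate when the policy changes. This is a rough sketch assuming a plain exponential backoff with no jitter; `RetryPolicy`, `supportNote`, and `lastAttemptNote` are hypothetical names, and the window ignores the time each request itself takes.

```typescript
interface RetryPolicy {
  maxRetries: number;  // retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  factor: number;      // multiplier applied to each later delay
}

// Sum of the waits between attempts (request durations not included).
function totalRetryWindowMs(policy: RetryPolicy): number {
  let total = 0;
  for (let i = 0; i < policy.maxRetries; i++) {
    total += policy.baseDelayMs * Math.pow(policy.factor, i);
  }
  return total;
}

function supportNote(policy: RetryPolicy): string {
  const minutes = Math.ceil(totalRetryWindowMs(policy) / 60000);
  const unit = minutes === 1 ? "minute" : "minutes";
  return `We retry this action up to ${policy.maxRetries} times over about ${minutes} ${unit}.`;
}

// Formats a timestamp as HH:MM in UTC to avoid time zone ambiguity.
function lastAttemptNote(lastAttempt: Date): string {
  const hhmm = lastAttempt.toISOString().slice(11, 16);
  return `Last attempt at ${hhmm} UTC`;
}
```

With 4 retries, a 10 second base delay, and a factor of 2, the waits are 10 + 20 + 40 + 80 = 150 seconds, so the note rounds up to "about 3 minutes".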
Results / Measurements
We looked at a few signals after updating our retry patterns in key flows:
- support tickets that boiled down to "did my click work?" dropped
- we saw fewer duplicate submissions in flows where we added clearer pending states and idempotent handling
- user research sessions showed less confusion around long-running operations once the UI set better expectations
These weren’t dramatic shifts, but they were enough to confirm that the design changes were moving us in the right direction.
Takeaways
- Retries are a UX concern, not just a backend implementation detail.
- Users need to see that the system is trying, not stuck.
- Idempotent actions and clear pending states prevent double work.
- Honest messages when we stop retrying build more trust than ambiguous failures.