DESIGN | 2022-09-08 | BY MARA SABOGAL

Building async flows that feel responsive

Patterns we use to make async work—emails, background jobs, slow checks—feel predictable instead of flaky.

design, async, ux, reliability

Async flows are where reliability and UX intersect.

From the user’s perspective, they look like:

"We’ll email you a link"
"Your export is on the way"
"We’re processing your request"

From the system’s perspective, they’re a chain of:

queued jobs
retries
dependencies

If we design only for the backend, async work can feel mysterious:

sometimes the email shows up
sometimes the export never arrives
sometimes the "processing" screen hangs and then silently succeeds later

This post is about patterns we use so async flows feel predictable instead of flaky.

Constraints

We already had a job system; we weren’t replacing it.
Many async flows touched external providers (email, storage, payment), so some variability was unavoidable.
We needed patterns that worked on both desktop and mobile, including in low-connectivity conditions.

What we changed

We focused on three areas:

Clear promises to the user.
Visible state and progress.
Recovery paths when async work fails.

1. Clear promises

We started by tightening the language we used.

Instead of vague statements like "We’re working on it," we:

specified what would happen ("We’ll email you a link to download your export.")
set expectations about time ("This usually takes a few minutes.")
made failure explicit ("If nothing arrives in 15 minutes, you can retry from this page.")

In the UI, this means:

fewer loading spinners with no context
more short, direct sentences about what to expect
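
A promise like "if nothing arrives in 15 minutes, you can retry from this page" implies a rule the page has to enforce. Here is a minimal sketch of that rule, assuming the page knows when the request was made and whether it has completed. The function name `should_offer_retry` and the constant `RETRY_WINDOW` are hypothetical, chosen for illustration.

```python
from datetime import datetime, timedelta

# Matches the window promised in the UI copy (15 minutes in the example above).
RETRY_WINDOW = timedelta(minutes=15)


def should_offer_retry(requested_at: datetime, now: datetime, completed: bool) -> bool:
    """Show the retry action only once the promised window has passed."""
    if completed:
        # Nothing to retry: the work already finished.
        return False
    return now - requested_at >= RETRY_WINDOW
```

Keeping the window in one constant makes it easier to keep the copy and the behavior in agreement.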

2. Visible state and progress

We mapped each async flow to explicit states:

pending
in progress
completed
failed (with reason)
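
One way to model these states is a small state machine. This is a sketch under assumptions, not our production code: the `AsyncRequest` class, its field names, and the allowed transitions (including a retry moving a failed request back to pending) are illustrative.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class State(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions for this sketch; a retry moves FAILED back to PENDING.
ALLOWED = {
    State.PENDING: {State.IN_PROGRESS, State.FAILED},
    State.IN_PROGRESS: {State.COMPLETED, State.FAILED},
    State.FAILED: {State.PENDING},
    State.COMPLETED: set(),
}


@dataclass
class AsyncRequest:
    request_id: str
    state: State = State.PENDING
    failure_reason: Optional[str] = None
    requested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    completed_at: Optional[datetime] = None

    def transition(self, new_state: State, reason: Optional[str] = None) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot move from {self.state.value} to {new_state.value}")
        if new_state is State.FAILED and not reason:
            # "failed (with reason)": a failure without a reason is not a valid state.
            raise ValueError("a failed request must carry a reason")
        self.state = new_state
        self.failure_reason = reason if new_state is State.FAILED else None
        if new_state is State.COMPLETED:
            self.completed_at = datetime.now(timezone.utc)
```

Rejecting invalid transitions in one place means the badge the user sees cannot drift into a state the backend never intended.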

Then we made those states visible:

a history or "recent activity" section where users can see pending and completed tasks
status badges that reflect backend state
timestamps ("requested at", "completed at")

For longer-running tasks, we added:

progress indicators when we had real progress information
otherwise, a simple acknowledgment that the request is still active
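
The rule for longer-running tasks can be sketched as a single function: show a percentage only when real progress numbers exist, and fall back to a plain acknowledgment otherwise. The function name `status_line` and the exact wording are assumptions for illustration.

```python
from typing import Optional


def status_line(done: Optional[int], total: Optional[int]) -> str:
    """Return the status text for a long-running task."""
    if done is not None and total:
        # Real progress information: clamp to 0..100 so bad counts never show "140%".
        percent = min(100, max(0, round(100 * done / total)))
        return f"{percent}% complete"
    # No trustworthy numbers (missing or zero total): acknowledge instead of guessing.
    return "Still working on your request"
```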

This reduced support tickets that boiled down to "Did my request go through?"

3. Recovery paths

Async flows will fail sometimes.

We designed recovery as part of the flow instead of an afterthought:

a clear "try again" action that cannot create duplicate work
idempotent operations on the backend so retries are safe
specific error messages ("we couldn’t reach your email provider") when appropriate
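
A common way to make retries safe is an idempotency key: the client sends the same key for every attempt at the same request, and the backend starts the work at most once per key. The sketch below is a simplified, in-memory illustration under that assumption. `ExportService`, `request_export`, and the `start_job` callable are hypothetical names, and a real system would need a durable store with a uniqueness guarantee instead of a dictionary.

```python
from typing import Callable, Dict


class ExportService:
    def __init__(self, start_job: Callable[[str], str]) -> None:
        # start_job is a stand-in for whatever enqueues the real work; it returns a job id.
        self._start_job = start_job
        # In-memory map for illustration only; not safe across processes or restarts.
        self._jobs_by_key: Dict[str, str] = {}

    def request_export(self, idempotency_key: str, user_id: str) -> str:
        """Start an export, or return the existing job if this key was seen before."""
        existing = self._jobs_by_key.get(idempotency_key)
        if existing is not None:
            return existing
        job_id = self._start_job(user_id)
        self._jobs_by_key[idempotency_key] = job_id
        return job_id
```

The key is generated once per user intent, for example when the export page loads, and reused for every click of "try again" on that request, so repeated clicks map to the same job.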

We also made it easy for support to see the same states users see.

When someone writes in about a missing email or export, support can:

look up the request
see its current state and error history
trigger a safe retry if that’s part of the design

Results / Measurements

After we rolled out these patterns to a few key flows, we saw:

Fewer "did it work?" tickets. When users could see pending requests and states, they were less likely to ask if a button click "took."
Clearer incident impact. During incidents affecting email or job queues, we could quantify how many async flows were delayed and communicate that clearly.
Better user outcomes on retries. Idempotent backends plus explicit retry buttons meant users could resolve some issues on their own.
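
Quantifying delayed flows during an incident becomes a simple query once every request has an explicit state and a "requested at" timestamp. Here is a rough sketch, assuming requests are available as `(flow_name, state, requested_at)` tuples. The function name `count_delayed` and the threshold are illustrative, not a description of our actual tooling.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, Tuple

# States that mean the work has not finished yet.
UNFINISHED = ("pending", "in progress")


def count_delayed(
    requests: Iterable[Tuple[str, str, datetime]],
    now: datetime,
    threshold: timedelta,
) -> Counter:
    """Count unfinished requests per flow that are older than the threshold."""
    delayed: Counter = Counter()
    for flow, state, requested_at in requests:
        if state in UNFINISHED and now - requested_at > threshold:
            delayed[flow] += 1
    return delayed
```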

We also learned where more work was needed:

some flows still lacked visible history, especially older ones
time estimates were tricky; we had to avoid giving false precision
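
One way to avoid false precision is to map a measured typical duration onto a few coarse phrases instead of showing a number. The sketch below assumes a typical duration in seconds is available. The function name `describe_wait`, the bucket boundaries, and the wording are assumptions for illustration.

```python
def describe_wait(typical_seconds: float) -> str:
    """Turn a typical duration into deliberately coarse user-facing copy."""
    if typical_seconds < 60:
        return "This usually takes less than a minute."
    if typical_seconds < 10 * 60:
        return "This usually takes a few minutes."
    if typical_seconds < 60 * 60:
        return "This can take up to an hour."
    # Beyond an hour, promise a notification instead of a duration.
    return "This can take a while. We'll email you when it's done."
```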

Takeaways

Async work is part of the UX. If users can’t see it or recover from failures, it feels unreliable.
Clear promises about what will happen and roughly when reduce anxiety.
Explicit states and history make both user support and incident response easier.
Designing safe retries and idempotent operations turns some failures into minor bumps instead of dead ends.