Story: the accessibility regression our visual tests missed
Visual regression tests said everything was fine. Keyboard users and screen readers disagreed. This is how we found and fixed the gap.
What happened
A release changed the layout of a critical settings page.
It wasn’t glamorous work: we were reworking a long, scrolling form into something more modular.
We had modern tooling in place:
- visual regression tests for the main flows
- automated checks for basic contrast and ARIA attributes
- a design system with accessible components
The screenshots looked fine.
The automated checks passed.
The page shipped.
A few days later, support started forwarding a specific pattern of complaints:
- "I can’t tell what field I’m in."
- "When I tab, the focus jumps in a weird order."
- "My screen reader is reading parts of the form out of order."
All from people using keyboard navigation or assistive tech.
Our visual tests had passed because, visually, the page looked better.
The regression lived in the interaction model.
Where the tests stopped
Our visual tests compared before/after screenshots.
They verified:
- labels were present
- fields aligned roughly the same way
- major elements didn’t disappear
Our automated accessibility checks verified:
- required ARIA attributes existed
- color contrast stayed above thresholds
What neither set of checks verified:
- focus order
- keyboard traps
- the reading order for screen readers
We had assumed that using the design system correctly was enough.
We learned it wasn’t, especially once we started composing components into more complex layouts.
What we changed
1. Add focus order to the definition of done
We updated our design and engineering checklists for any form or multi-step flow:
- tab order matches visual and logical order
- focus indicators are visible and consistent
- modals and overlays trap focus correctly and restore it when closed
Designers started annotating focus paths in their specs for complex screens.
Engineers added simple tests where possible:
- unit or integration tests that simulate key presses and assert which element is focused (see the sketch below)
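A minimal sketch of such a test, assuming React, React Testing Library, user-event, and jest-dom; the SettingsForm component and its field labels are hypothetical stand-ins for whatever form is under test:

```tsx
import { render, screen } from "@testing-library/react";
import userEvent from "@testing-library/user-event";
import "@testing-library/jest-dom";
// Hypothetical component under test.
import { SettingsForm } from "./SettingsForm";

test("tab order follows the visual order of the settings form", async () => {
  const user = userEvent.setup();
  render(<SettingsForm />);

  // Walk the form with Tab and assert where focus lands at each step.
  await user.tab();
  expect(screen.getByLabelText(/display name/i)).toHaveFocus();

  await user.tab();
  expect(screen.getByLabelText(/email/i)).toHaveFocus();

  await user.tab();
  expect(screen.getByRole("button", { name: /save/i })).toHaveFocus();
});
```

Because user-event moves focus through the actual DOM order, a refactor that reorders the markup or introduces a stray tabindex fails this test even when the screenshot is pixel-identical.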
2. Include screen reader passes in critical flows
We don’t have the capacity to run full manual audits on every screen.
We do have the capacity to run focused checks on critical flows (like settings, billing, and recovery):
- use a screen reader to navigate the page
- listen for confusing or out-of-order announcements
- verify that grouped fields (like address sections) are announced in a way that makes sense
We treat this like performance spot-checks: targeted, repeatable, and part of the definition of done for those flows.
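For the grouped-fields check in particular, the markup pattern we listen for is roughly the one below (a sketch with hypothetical field names, not our actual component): wrapping related inputs in a fieldset with a legend so screen readers announce the group name alongside each field label.

```tsx
// Sketch only: the key parts are the <fieldset> and <legend>, which screen
// readers generally announce as context when focus enters the group.
export function ShippingAddressFields() {
  return (
    <fieldset>
      <legend>Shipping address</legend>
      <label>
        Street
        <input name="street" autoComplete="street-address" />
      </label>
      <label>
        City
        <input name="city" autoComplete="address-level2" />
      </label>
    </fieldset>
  );
}
```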
3. Extend our automation just enough
We added a small layer on top of our existing tests:
- smoke tests that move focus through a page and assert that it doesn’t get trapped
- checks that headings and landmarks appear in an expected order
These don’t replace manual checks, but they catch the most obvious regressions; one such smoke test is sketched below.
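A sketch of that layer, assuming Playwright; the /settings route, the Tab budget, the distinct-stop threshold, and the heading names are all placeholders to adjust per page:

```ts
import { test, expect } from "@playwright/test";

test("settings page: focus is not trapped and headings read in order", async ({ page }) => {
  await page.goto("/settings"); // hypothetical route

  // Press Tab a bounded number of times and record each focus stop.
  const stops = new Set<string>();
  for (let i = 0; i < 40; i++) {
    await page.keyboard.press("Tab");
    const stop = await page.evaluate(() => {
      const el = document.activeElement;
      if (!el || el === document.body) return "body";
      return `${el.tagName}:${el.id || el.textContent?.trim().slice(0, 20) || ""}`;
    });
    stops.add(stop);
  }

  // A keyboard trap shows up as focus cycling among a handful of elements.
  expect(stops.size).toBeGreaterThan(5);

  // Headings should appear in the order the page is meant to be read.
  const headings = await page.locator("h1, h2").allTextContents();
  expect(headings).toEqual(["Settings", "Profile", "Notifications"]); // hypothetical headings
});
```

The distinct-stop threshold is a crude heuristic, but it turns a keyboard trap from a support ticket into a failing CI check.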
4. Capture accessibility bugs as operational issues
In our incident and bug triage, we:
- started tagging accessibility regressions explicitly
- looked at their impact in terms of support load and task completion, not just "UI correctness"
For this incident, we measured:
- increased support contacts from keyboard-only users
- completion rates for the settings change before and after the fix
This gave the issue the same weight we’d give to a performance regression.
5. Feed the lessons back into the design system
The specific bug involved a component used across multiple pages.
We:
- fixed the component to enforce a safer default focus order
- updated its documentation with explicit accessibility guidance
- added one example focused on keyboard-only navigation
The goal was to make the easiest way to use the component also the safest; a sketch of that kind of default follows.
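The component itself isn’t reproduced here, but the flavor of the change looks something like this sketch (React, with illustrative names, not our design system’s actual API): the overlay moves focus into itself when it opens and hands focus back when it closes, so every caller gets that behavior without writing any focus code.

```tsx
import { useEffect, useRef, type ReactNode } from "react";

// Illustrative sketch: the overlay owns the focus bookkeeping so callers
// can't forget it. A production version would also keep Tab cycling inside
// the panel while it is open.
export function Overlay({ children, onClose }: { children: ReactNode; onClose: () => void }) {
  const panelRef = useRef<HTMLDivElement>(null);

  useEffect(() => {
    // Remember what had focus before the overlay opened, then move focus in.
    const previouslyFocused = document.activeElement as HTMLElement | null;
    panelRef.current?.focus();

    // On unmount (close), hand focus back to where the user was.
    return () => previouslyFocused?.focus();
  }, []);

  return (
    <div role="dialog" aria-modal="true" tabIndex={-1} ref={panelRef}>
      {children}
      <button onClick={onClose}>Close</button>
    </div>
  );
}
```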
Takeaways
- Visual correctness is not interaction correctness.
- Automated accessibility checks are useful, but they don’t guarantee a good experience for keyboard and screen reader users.
- Adding a small number of focused manual checks in critical flows can catch high-impact regressions.
- Design systems need to encode not just how things look, but how they behave for different input and assistive technologies.