Decision record: Standardizing rollout knobs across services
We chose a common set of rollout controls so deploy tools, runbooks, and dashboards speak the same language across services.
Context
Each service had its own idea of "rollout controls."
Some used:
- percentage-based rollout
- region-based toggles
- cohort-specific flags
Others relied on:
- time-based feature gates
- manual config changes
Deploy tools exposed some, but not all, of these controls.
Runbooks referenced knobs that didn’t exist in the UI.
During incidents and high-risk launches, this inconsistency cost us time and confidence.
We decided to standardize a core set of rollout "knobs" across services.
Decision
We adopted a small, shared vocabulary and control surface for rollouts:
- Traffic percentage: gradually move a percentage of traffic from old to new behavior.
- Region or zone: enable or disable behavior per region/zone.
- Cohort: target specific cohorts (internal users, beta groups, plan tiers).
These knobs:
- are exposed consistently in deploy tools and dashboards
- map to underlying feature flags and configuration in a predictable way
- are documented in runbooks
Services can still have additional, domain-specific controls, but they build on top of this shared base.
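As an illustration of what "map to underlying feature flags and configuration in a predictable way" could look like, here is a minimal sketch. The `RolloutKnobs` class, its field names, the `to_flag_config` helper, the `<service>.rollout.*` key scheme, and the `checkout` service are all hypothetical; this record does not prescribe a concrete schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutKnobs:
    """The three shared knobs. Field names are assumptions, not a spec."""
    traffic_percent: int = 0              # share of traffic on the new behavior
    regions: frozenset = frozenset()      # regions/zones where it is enabled
    cohorts: frozenset = frozenset()      # targeted cohorts, e.g. "internal"

    def __post_init__(self):
        # Reject values that no deploy tool should be able to submit.
        if not 0 <= self.traffic_percent <= 100:
            raise ValueError("traffic_percent must be in 0..100")


def to_flag_config(service: str, knobs: RolloutKnobs) -> dict:
    """Translate knobs into flat flag keys under one assumed naming scheme."""
    prefix = f"{service}.rollout"
    return {
        f"{prefix}.traffic_percent": knobs.traffic_percent,
        # Sort so the same knobs always produce the same config.
        f"{prefix}.regions": sorted(knobs.regions),
        f"{prefix}.cohorts": sorted(knobs.cohorts),
    }


# Example: a hypothetical "checkout" service at 25% in one region.
config = to_flag_config(
    "checkout",
    RolloutKnobs(25, frozenset({"eu-west"}), frozenset({"internal", "beta"})),
)
```

Because every service would use the same key scheme in this sketch, a dashboard or runbook could locate any service's rollout state from the service name alone.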
Consequences
Upsides
- Clearer mental model. Engineers and on-call can:
  - talk about "rolling back 50%" or "turning off in region X" using a shared language
  - rely on deploy tools to support these operations consistently
- Better automation. Tools for safe rollouts and automated canaries can:
  - assume these knobs exist
  - work across more services with less custom logic
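The automation benefit can be sketched as a canary loop that depends only on the traffic percentage knob, not on any one service. The `run_canary` function, the step schedule, and the choice to roll back to 0% on a failed check are assumptions made for illustration; they are not taken from this record.

```python
from typing import Callable, Iterable

# Assumed step schedule; a real tool would make this configurable.
DEFAULT_STEPS = (1, 5, 25, 50, 100)


def run_canary(
    set_traffic_percent: Callable[[int], None],
    is_healthy: Callable[[], bool],
    steps: Iterable[int] = DEFAULT_STEPS,
) -> int:
    """Step traffic up; on a failed health check, roll back to 0.

    Returns the traffic percentage left in place when the loop ends.
    """
    current = 0
    for percent in steps:
        set_traffic_percent(percent)
        current = percent
        if not is_healthy():
            # Assumed policy: full rollback rather than holding the last step.
            set_traffic_percent(0)
            return 0
    return current


# Usage with stand-ins for a service's knob and its health signal.
applied = []
final = run_canary(applied.append, lambda: True)

checks = iter([True, True, False])   # third health check fails
applied_bad = []
final_bad = run_canary(applied_bad.append, lambda: next(checks))
```

The loop takes two callables rather than a service client, so any service that exposes the standard knob could plug in without custom logic in the tool.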
Downsides / costs
- Migration effort. Teams need to:
  - map existing flags and configs to the standard knobs
  - update deploy workflows and runbooks
- Limit on bespoke patterns. Not every clever rollout pattern fits neatly, and some will need adaptation.
Guardrails
We defined:
- how these knobs interact (e.g., cohort rollout within a region-based rollout)
- expectations for observability (dashboards show breakdowns along these dimensions)
We also agreed that services could propose extensions, but the core knobs should remain small and stable.
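One possible reading of the interaction rule "cohort rollout within a region-based rollout" is sketched below. The precedence shown (region as the outer gate, then cohort, then traffic percentage), the rule that an empty cohort set means "all cohorts", and the names `RolloutKnobs`, `bucket`, and `is_enabled` are assumptions; the actual interaction rules live in the guardrail definitions, which this record does not reproduce.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutKnobs:
    """Hypothetical container for the three shared knobs."""
    traffic_percent: int      # 0..100
    regions: frozenset        # regions where the new behavior may run
    cohorts: frozenset        # targeted cohorts; empty means all (assumed)


def bucket(key: str) -> int:
    """Map a stable key (e.g. a user id) to a bucket in 0..99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(knobs: RolloutKnobs, user_id: str, region: str, cohort: str) -> bool:
    """Apply the knobs from outermost to innermost gate."""
    # 1. Region is the outer gate: nothing runs outside enabled regions.
    if region not in knobs.regions:
        return False
    # 2. Cohort narrows the audience within an enabled region.
    if knobs.cohorts and cohort not in knobs.cohorts:
        return False
    # 3. Traffic percentage applies last, using a deterministic bucket
    #    so the same user gets the same answer on every request.
    return bucket(user_id) < knobs.traffic_percent


knobs = RolloutKnobs(100, frozenset({"eu-west"}), frozenset({"beta"}))
in_scope = is_enabled(knobs, "user-1", "eu-west", "beta")
wrong_region = is_enabled(knobs, "user-1", "us-east", "beta")
wrong_cohort = is_enabled(knobs, "user-1", "eu-west", "free")
```

Evaluating in a fixed order keeps dashboard breakdowns by region and cohort consistent with what users actually see.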