Decision record: Centralizing feature flag evaluation
We decided to move flag evaluation into a shared service instead of letting every client evaluate flags on its own.
Context
Our feature flag usage grew organically.
At first, flags lived in a single service. Evaluation happened close to where the behavior changed.
Over time, more systems added flags:
- web and mobile clients
- backend services
- batch jobs
Each of these systems had its own way to fetch and evaluate flags.
The result:
- inconsistent behavior across platforms
- unclear ownership for which flags existed and what they did
- complex incident debugging when a flag flip affected only some clients
We faced two concrete problems during incidents:
- Flags behaved differently in different environments because local evaluation code drifted.
- It was hard to answer the simple question "What is the current state of this flag everywhere?"
Decision
We decided to centralize feature flag evaluation into a shared service and treat that service as part of the core architecture.
Concretely:
- flag definitions, targeting rules, and evaluation logic live in one place
- clients (web, mobile, backend) call the flag service or use thin SDKs that delegate evaluation to it
- the service exposes clear APIs and auditing for flag state and changes
Clients may cache results or precompute values, but they no longer implement their own evaluation logic beyond simple, documented cases.
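As a minimal sketch of what such a thin SDK might look like, the Python below delegates every evaluation to the flag service and caches only the returned result for a short time. The names `ThinFlagClient` and `fetch_decision` and the 5-second TTL are hypothetical, chosen for illustration. The transport is injected as a callable so the sketch runs without a real service.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical transport: takes (flag_key, user_id) and returns the decision
# computed by the central flag service. A real SDK would make a network call here.
FetchDecision = Callable[[str, str], bool]


class ThinFlagClient:
    """Delegates evaluation to the flag service; caches results, never rules."""

    def __init__(
        self,
        fetch_decision: FetchDecision,
        ttl_seconds: float = 5.0,  # assumed value, not taken from the decision
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_decision
        self._ttl = ttl_seconds
        self._clock = clock
        # (flag_key, user_id) -> (decision, expiry time)
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def is_enabled(self, flag_key: str, user_id: str) -> bool:
        key = (flag_key, user_id)
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and cached[1] > now:
            # Still fresh: reuse the service's earlier answer.
            return cached[0]
        # Expired or missing: ask the service again; no local rule evaluation.
        decision = self._fetch(flag_key, user_id)
        self._cache[key] = (decision, now + self._ttl)
        return decision
```

In this sketch the TTL bounds how long a client can disagree with the service after a flag flip, which is the trade-off that any client-side caching has to make explicit.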
Consequences
Upsides
- Consistency. The same rules apply across environments.
- Visibility. We can answer:
  - which flags exist
  - who owns them
  - what their current values are in different segments
- Operational control. During incidents, we can:
  - see who changed what, when
  - flip flags globally or per-segment through a single interface
Downsides / costs
- New dependency. The flag service becomes another critical system. We need:
  - SLOs for its availability and latency
  - clear failover and degradation behavior
- Migration work. Existing clients must:
  - remove local evaluation code
  - adopt the shared APIs or SDKs
- Performance considerations. We have to:
  - ensure low-latency responses
  - design caching strategies that don’t reintroduce inconsistency
Guardrails
We set a few rules:
- Clients must define what happens when the flag service is unavailable (default behaviors).
- New flags must include metadata (owner, lifetime, emergency action) and be created through the shared system.
- The flag service must be observable: metrics for latency, error rates, and evaluation volume per service.
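The first two rules above can be sketched in code. The Python below is illustrative only: `FlagMetadata`, `validate_metadata`, and `evaluate_or_default` are hypothetical names, and the field formats (for example, an expiry date standing in for the flag's lifetime) are assumptions rather than part of the decision.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


@dataclass(frozen=True)
class FlagMetadata:
    key: str
    owner: str             # team or person accountable for the flag
    expires_on: date       # assumed representation of "lifetime"
    emergency_action: str  # what to do with the flag in an incident; assumed free text


def validate_metadata(meta: FlagMetadata, today: date) -> List[str]:
    """Return a list of problems; an empty list means the flag may be created."""
    problems = []
    if not meta.owner.strip():
        problems.append("owner is required")
    if meta.expires_on <= today:
        problems.append("lifetime must end in the future")
    if not meta.emergency_action.strip():
        problems.append("emergency action is required")
    return problems


def evaluate_or_default(fetch: Callable[[], bool], default: bool) -> bool:
    """Ask the flag service; fall back to the client's declared default on failure."""
    try:
        return fetch()
    except (ConnectionError, TimeoutError):
        # Assumed policy: if the service cannot be reached, use the default.
        # A real client would also record the failure so it shows up in the
        # service's error-rate metrics.
        return default
```

Passing the default explicitly at the call site is one way to make the "service unavailable" behavior a visible decision by the client rather than an accident of error handling.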
The decision does not require every toggle in the system to go through this service.
It focuses on flags that affect user-visible behavior or incident response.