SECURITY · 2022-06-17 · BY STORECODE

Decision record: Moving secrets out of env vars

We decided to move application secrets out of long-lived environment variables and into a managed secrets system.

security, secrets, configuration, operations

Context

Our applications, like many, started with configuration in environment variables.

This included:

Over time, this pattern accumulated sharp edges:

The operational cost showed up during incidents:

We could not quickly rotate a compromised credential.
We didn’t have a single view of which services depended on which secrets.
Local development and staging sometimes used production-like secrets by accident.

We needed a more disciplined approach.

Decision

We decided to move sensitive application secrets out of long-lived environment variables and into a managed secrets system.

Concretely:

Secrets are stored in a dedicated secrets manager, not in application configs or deployment manifests.
Applications retrieve secrets at startup (or on demand) via authenticated calls to the secrets system.
Environment variables remain for non-sensitive configuration and for references (e.g., secret names), not for secret values themselves.
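
As a rough illustration of the split between references and values, the sketch below resolves secret references from the environment at startup. The `_SECRET_NAME` suffix, the `load_secrets` helper, and the `fetch` callable are assumptions made for this example; `fetch` stands in for whatever authenticated call the chosen secrets manager provides.

```python
import os
from typing import Callable, Dict, Mapping

# Hypothetical convention: a variable ending in _SECRET_NAME holds a
# reference (the secret's name), never the secret value itself.
REF_SUFFIX = "_SECRET_NAME"


def load_secrets(env: Mapping[str, str], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Resolve every *_SECRET_NAME reference in env by calling fetch."""
    secrets: Dict[str, str] = {}
    for key, ref in env.items():
        if key.endswith(REF_SUFFIX) and ref:
            logical_name = key[: -len(REF_SUFFIX)]
            secrets[logical_name] = fetch(ref)
    return secrets


if __name__ == "__main__":
    # In-memory stand-in for the secrets manager, for demonstration only.
    fake_store = {"billing/db-password": "not-a-real-secret"}
    env = dict(os.environ, DB_PASSWORD_SECRET_NAME="billing/db-password")
    resolved = load_secrets(env, fake_store.__getitem__)
    # Log which secrets were loaded, never their values.
    print("DB_PASSWORD" in resolved)
```

Passing `fetch` in as a parameter keeps the startup code independent of any particular secrets tool, which matches the point that the properties matter more than the product.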

We evaluated several options and chose one that:

The specific tool matters less than the properties: auditable, revocable, and scriptable.

Benefits:

Easier rotation. We can rotate secrets centrally and roll them out without rebuilding images or editing multiple config files.
Better auditing. We have logs of when and where secrets are accessed.
Tighter access control. Each service has a scoped identity that grants access only to the secrets it needs.
Safer development environments. We can provision lower-privilege secrets for non-production use without copying production values.
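
One way an application might pick up a centrally rotated value without a rebuild or restart is to cache each secret for a short time and re-fetch it on expiry. The sketch below is only an illustration: the `CachedSecret` class, the five-minute default TTL, and the `fetch` callable are assumptions, not a description of our actual client.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Hold a secret in memory and re-fetch it once the TTL has elapsed."""

    def __init__(
        self,
        name: str,
        fetch: Callable[[str], str],  # stand-in for the secrets manager call
        ttl_seconds: float = 300.0,   # assumed value; tune per secret
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._name = name
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Cache miss or expired entry: ask the secrets system again,
            # so a rotated value is picked up within one TTL.
            self._value = self._fetch(self._name)
            self._expires_at = now + self._ttl
        return self._value
```

The TTL bounds how long a rotated-out value can stay in use, at the cost of one extra call to the secrets system per interval.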

Costs:

Operational complexity. Applications must handle the failure modes of the secrets system (e.g., transient unavailability).
Migration work. We needed to:
- identify all existing secrets in environment variables
- move them into the secrets system
- update applications to fetch them correctly
Bootstrap questions. The system that fetches secrets needs its own trust path (e.g., an instance identity or initial credential).
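
Handling transient unavailability usually comes down to a bounded retry around the fetch call. The following is a minimal sketch under stated assumptions: `SecretsUnavailable` is a hypothetical exception that the client is assumed to raise for retryable errors, and the attempt count and delays are placeholder values.

```python
import time
from typing import Callable


class SecretsUnavailable(Exception):
    """Hypothetical error for a transient failure of the secrets system."""


def fetch_with_retry(
    fetch: Callable[[str], str],
    name: str,
    attempts: int = 4,
    base_delay: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(name), retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch(name)
        except SecretsUnavailable:
            if attempt == attempts - 1:
                # Out of retries: fail loudly rather than start without the secret.
                raise
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Only the transient error type is retried; a permission error or a missing secret should surface immediately, since retrying will not fix it.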

To keep the new system from becoming another source of drift, we set a few rules:

New services must use the secrets system from the start.
Adding or changing a secret requires updating a small inventory document that maps secrets to services.
Incident runbooks include a "secrets" section:
- where the secret lives
- how to rotate it
- how to verify the new value is in use
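
The inventory rule is easier to enforce if a script can compare the document against what services actually request. The sketch below assumes a hypothetical inventory shape (secret name mapped to a list of service names) and a hypothetical `used` mapping (service name mapped to the secrets it requests); neither is the real format of our inventory document.

```python
from typing import Dict, List, Set


def find_unlisted(inventory: Dict[str, List[str]], used: Dict[str, Set[str]]) -> List[str]:
    """Return "service -> secret" pairs that are in use but missing from the inventory."""
    problems: List[str] = []
    for service, names in sorted(used.items()):
        for name in sorted(names):
            if service not in inventory.get(name, []):
                problems.append(f"{service} -> {name}")
    return problems


def services_for(inventory: Dict[str, List[str]], secret: str) -> List[str]:
    """Answer the incident question: which services depend on this secret?"""
    return sorted(inventory.get(secret, []))


if __name__ == "__main__":
    # Illustrative data only; the names are made up.
    inventory = {"billing/db-password": ["billing-api", "billing-worker"]}
    used = {"billing-api": {"billing/db-password"}, "reports": {"billing/db-password"}}
    print(find_unlisted(inventory, used))
    print(services_for(inventory, "billing/db-password"))
```

A check like `find_unlisted` could run in CI so the inventory cannot silently fall behind, while `services_for` gives the single view of dependencies that was missing during incidents.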

We also decided not to move everything:

Non-sensitive configuration stays in environment variables for simplicity.
We avoid using the secrets system for values that change constantly or are better represented as data in a database.

This decision makes some workflows more explicit, but it pays off during the rare but critical moments when a secret must change quickly and safely.