Using LLM tools to assist incident retrospectives
How we use LLM-based tools to help with retrospectives—clustering themes, drafting sections—while keeping humans in charge of conclusions.
Running good incident retrospectives is work.
For complex incidents, we may have:
- long chat threads
- multiple dashboards and screenshots
- several partial narratives from different teams
Our goal in a retro is to:
- understand what happened
- agree on what we learned
- identify changes we want to make
We experimented with internal LLM tools to help with some of the mechanics, without outsourcing judgment.
Constraints
- Incident participants must feel safe; we don’t want tools that sound like they are grading people.
- We redact or summarize sensitive data before sending anything to a model (a sketch of this step follows the list).
- Humans own the final retro document and action items.
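To make the second constraint concrete, here is a minimal sketch of the kind of redaction pass we mean, assuming a simple regex-based scrub. The patterns, sample messages, and the `redact` helper are illustrative, not our production pipeline.

```python
import re

# Hypothetical redaction patterns; a real pipeline would be tuned to the kinds
# of sensitive data that actually show up in our incident channels.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),           # email addresses
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip-address>"),  # IPv4 addresses
    (re.compile(r"(?i)\b(secret|token|password)\s*[:=]\s*\S+"), r"\1=<redacted>"),
]

def redact(text: str) -> str:
    """Replace obviously sensitive substrings before text leaves our systems."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

raw_chat_messages = [
    "paging oncall, db-primary at 10.0.3.17 is not responding",
    "alice@example.com restarted it; token=abc123 was rotated afterwards",
]
sanitized = [redact(message) for message in raw_chat_messages]
print(sanitized)
```

Whatever the exact mechanism, the rule is the same: only sanitized text ever reaches a model.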
What we changed
1. Use tools to cluster themes, not blame
We feed sanitized incident artifacts into an internal tool that can:
- cluster related comments or observations
- highlight repeated patterns (e.g., "missing owner", "confusing alert")
We explicitly do not ask for:
- "root cause" statements
- judgments about individuals or teams
The output is a set of suggested themes, which facilitators can accept, merge, or ignore; a rough sketch of this clustering step follows.
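The clustering tool itself is internal, but the shape of the step is roughly as follows. This is a hedged sketch: TF-IDF vectors stand in for whatever embeddings the real tool uses, and the observations, distance threshold, and scikit-learn calls are illustrative.

```python
from collections import defaultdict

from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Sanitized observations pulled from the incident channel (illustrative only).
observations = [
    "the alert fired but nobody knew which team owned the service",
    "ownership of the failing job was unclear for the first hour",
    "the alert text was confusing and did not link to a runbook",
    "dashboard for the queue lagged several minutes behind reality",
    "the paging alert wording made it hard to tell what was broken",
]

# Stand-in for the embedding step: TF-IDF vectors keep the example runnable.
vectors = TfidfVectorizer().fit_transform(observations).toarray()

# Group similar observations; the threshold is a knob a facilitator would tune.
labels = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.8, metric="cosine", linkage="average"
).fit_predict(vectors)

themes = defaultdict(list)
for label, text in zip(labels, observations):
    themes[label].append(text)

# Facilitators review the suggested groupings and accept, merge, or ignore them.
for label, items in themes.items():
    print(f"theme {label}:")
    for item in items:
        print(f"  - {item}")
```

The important design choice is that the output is only a grouping; naming the themes and deciding what they mean stays with the facilitator.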
2. Draft structure, not conclusions
We ask tools to propose:
- a candidate outline for the retro doc (sections, headings)
- bullet-point summaries of events we already agree on
Facilitators then:
- edit for accuracy and nuance
- fill in analysis and conclusions
This reduces the time spent retyping known facts and leaves more time for discussion; a sketch of such a drafting prompt follows.
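Below is a sketch of what this kind of drafting prompt can look like. The wording and the `agreed_facts` list are illustrative, and the actual call to our internal tool is omitted.

```python
# Sketch of a drafting prompt; the call to the internal tool is not shown, and
# the wording here is illustrative rather than the exact prompt we use.
agreed_facts = [
    "14:02 latency alert fired for the checkout service",
    "14:15 rollback of the 13:50 deploy started",
    "14:40 error rate back to baseline",
]

prompt = (
    "You are helping structure an incident retrospective document.\n"
    "Propose:\n"
    "1. A candidate outline for the document (section headings only).\n"
    "2. Bullet-point summaries of the agreed facts below, in neutral language.\n"
    "Do NOT propose root causes, conclusions, action items, or judgments "
    "about individuals or teams.\n\n"
    "Agreed facts:\n" + "\n".join(f"- {fact}" for fact in agreed_facts)
)

print(prompt)  # sent to the drafting tool; facilitators edit whatever comes back
```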
3. Keep action-item generation human-led
We avoid asking tools to "suggest fixes."
Instead, we:
- use clustered themes as prompts in the meeting ("we saw several mentions of X")
- let participants propose and debate actions
This keeps ownership of work with the people who will do it.
4. Be transparent about usage
We are explicit in retro invites and docs about:
- which tools we’re using
- what inputs they see
- what outputs they produce
Participants can opt out of having certain parts of conversations used as inputs.
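How opt-out is recorded depends on the chat platform. As one hedged sketch, if each message carried an opt-out flag and we also tracked participants who opted out entirely, the filter applied before anything reaches a tool might look like this; all names and fields are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Message:
    author: str
    text: str
    opted_out: bool = False  # how opt-out is actually recorded is an assumption here

def retro_inputs(messages: list[Message], opted_out_authors: set[str]) -> list[str]:
    """Drop anything a participant asked to exclude before it reaches any tool."""
    return [
        m.text
        for m in messages
        if not m.opted_out and m.author not in opted_out_authors
    ]

thread = [
    Message("alice", "the alert fired twice before anyone was paged"),
    Message("bob", "I was debugging something unrelated at the time", opted_out=True),
    Message("carol", "the runbook link in the alert was stale"),
]

print(retro_inputs(thread, opted_out_authors={"dave"}))
```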
Results / Measurements
We look for:
- whether retros happen more reliably for significant incidents
- whether participants feel they have more time for discussion
- whether themes feel more consistent across incidents
Early feedback:
- facilitators appreciated help with organizing raw material
- participants valued seeing patterns highlighted across incidents
- some skepticism remained, which we treat as healthy pressure to keep the tools tightly scoped
Takeaways
- LLM tools can help with the mechanics of retrospectives—clustering and drafting—but should not drive conclusions.
- Being transparent about what tools do and don’t do helps maintain trust.
- Keeping action items and analysis human-led keeps ownership where it belongs.