- Canary deployment should be used.
- Open question: which subset of servers should act as canaries?
- As implemented, checks are not suitable for installation steps: a failing check does not prevent subsequent checks in the same stage from running. (Maybe installation steps should be a scap3 task instead?)
- When scap deploy offers to roll back (e.g., due to a check error), it's not clear to me whether checks are run again after the rollback. Needs investigation.
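
To illustrate why run-everything semantics are a poor fit for installation steps, here is a minimal Python sketch contrasting the two behaviours. It is a toy model, not scap's actual code: `run_all`, `run_fail_fast`, and the `STEPS` list are hypothetical names, and the commands are placeholders that only exit with a fixed status.

```python
import subprocess
import sys

# Hypothetical "steps": each is a command that exits with a fixed status.
# The second step simulates a failing installation step.
STEPS = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(1)"],
    [sys.executable, "-c", "raise SystemExit(0)"],
]


def run_all(steps):
    """Run every step regardless of earlier failures (check-like behaviour)."""
    results = []
    for argv in steps:
        returncode = subprocess.run(argv).returncode
        results.append((argv, returncode))
    return results


def run_fail_fast(steps):
    """Stop at the first failing step (what installation steps need)."""
    results = []
    for argv in steps:
        returncode = subprocess.run(argv).returncode
        results.append((argv, returncode))
        if returncode != 0:
            break  # later steps must not run on top of a failed one
    return results


if __name__ == "__main__":
    print([rc for _, rc in run_all(STEPS)])
    print([rc for _, rc in run_fail_fast(STEPS)])
```

With run-everything semantics the third step still executes after the second has failed; with fail-fast semantics it never starts, which is the property an installation sequence relies on.
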
xref: https://wikitech.wikimedia.org/wiki/Incidents/2024-09-16_logstash_unavailability
xref: T374880