Below you will find pages that utilize the taxonomy term “Stress”
The Hidden Costs of On-Call: False Alarms
The video of my LISA17 talk is posted on YouTube.
On-call teams, postmortems, and the costs of downtime are well-covered topics in DevOps. What’s rarely discussed is the cost of false alarms in your alerting. This noise hinders the team’s ability to handle real issues effectively. What are these hidden costs, and how do you eliminate false alarms?
While you’re at LISA17, how many monitoring emails do you expect to receive?
Reducing the Stresses of On-Call
Being on-call is stressful. It feels like the future of the company, or at the very least your job, depends on your vigilance. When will the pager alert come? How bad will it be?
Where is this stress coming from? Urgency: whoever is on-call typically has only a certain amount of time to respond to an incident, and the idea of responding late is stressful for many. There’s also an implied urgency in that down = bad, so services should be restored as quickly as possible.