Audit notification settings for monitoring systems
Systematically audit the notification settings of all of our monitoring services. For each system, answer the following questions:
- What conditions, services, or subsystems are monitored?
- Are urgent, high-severity conditions instantly distinguishable from minor ones?
- Who is able to act on the alert? Are they subscribed to the system? For high-severity alerts, are they subscribed in a manner that will immediately get their attention (e.g., phone/SMS)?
Monitoring services relevant to the subsystems we are responsible for include:
- GraceDB
- Nagios/Icinga
- Sentry
DoD: a table in the docs or on a wiki page that answers the above questions for each monitoring service.
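
One possible way to bootstrap that table is sketched below. It only generates an empty skeleton with one row per service and one column per question. The names `AuditRow`, `render_table`, and `SERVICES`, the "TBD" placeholder, and the choice of a Markdown table are assumptions made for illustration, not an agreed format.

```python
from dataclasses import dataclass


@dataclass
class AuditRow:
    """One row of the audit table; every answer starts as a placeholder."""
    service: str
    monitored: str = "TBD"                 # What conditions/services/subsystems are monitored?
    severity_distinguishable: str = "TBD"  # Are urgent alerts distinguishable from minor ones?
    responders_subscribed: str = "TBD"     # Who can act, and how are they notified?


def render_table(rows):
    """Render the audit rows as a Markdown table (hypothetical layout)."""
    lines = [
        "| Service | What is monitored? | Urgent vs. minor distinguishable? | Responders and subscription method |",
        "| --- | --- | --- | --- |",
    ]
    for r in rows:
        lines.append(
            f"| {r.service} | {r.monitored} | {r.severity_distinguishable} | {r.responders_subscribed} |"
        )
    return "\n".join(lines)


# Services listed in this ticket.
SERVICES = ["GraceDB", "Nagios/Icinga", "Sentry"]

table = render_table([AuditRow(name) for name in SERVICES])
print(table)
```

The printed skeleton can be pasted into the docs or wiki and filled in as each service is audited.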