Look into Grafana alerts with Mattermost integration
Grafana alerting never worked out well for us. However, we can do something similar to how the GWiStat bot works, i.e., query databases via the Grafana API in a CGI script and push messages to Mattermost when certain alert thresholds are reached, for example:
- median latency across jobs goes above 15 seconds
- max RAM across jobs goes above 4 GB
- max time since last metric goes above 10 seconds
- loud missed injections
This would be a great way to learn about Influx, Grafana, and writing CGI scripts.
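
A minimal sketch of the threshold check and the Mattermost push, assuming the metric values have already been fetched (the Grafana/Influx query itself is not shown). The metric names, `find_alerts`, `build_payload`, and `post_to_mattermost` are hypothetical; the only external piece assumed is a Mattermost incoming webhook, which accepts a JSON body with a `text` field. The "loud missed injections" check is left out because it would need injection data rather than a single numeric threshold.

```python
import json
import urllib.request

# Thresholds taken from the list above; the metric names are hypothetical.
THRESHOLDS = {
    "median_latency_s": 15.0,
    "max_ram_gb": 4.0,
    "max_time_since_last_metric_s": 10.0,
}


def find_alerts(metrics, thresholds=THRESHOLDS):
    """Return one message per metric whose value exceeds its threshold."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        # Metrics missing from the input are skipped rather than treated as alerts.
        if value is not None and value > limit:
            alerts.append(f"{name} = {value:g} (threshold {limit:g})")
    return alerts


def build_payload(alerts):
    """Encode the alert messages as a Mattermost incoming-webhook JSON body."""
    text = "Alert thresholds exceeded:\n" + "\n".join(f"- {a}" for a in alerts)
    return json.dumps({"text": text}).encode("utf-8")


def post_to_mattermost(webhook_url, alerts, timeout=10):
    """POST the alerts to an incoming webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=build_payload(alerts),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A script run on a schedule or as a CGI handler could call `find_alerts` on the latest values and call `post_to_mattermost` only when the returned list is non-empty, so that nothing is posted while all metrics are within range.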
Edited by Rebecca Ewing