OpenShift 4: AlertManager + Slack Configuration

https://unsplash.com/@chrisliverani

Objective

Sending alert to Slack and defined proper channel for each alert type.

Pre-Requisites:

  • Three channels created in slack
    • #watchdog
    • #clusterwatch
  • Incoming webhook configured for each of the channel.

Default openshift-monitoring stack only monitor platform specific object and target. For application, there is technology-preview features here.

Steps

1. Get each of the webhook URL from above pre-req.

2. Create alertmanager.yaml as below example:

"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "watchdog_receiver"
  "slack_configs":
  - "api_url": "https://hooks.slack.com/services/XXXX/XXXX/XXXX"
    "channel": "#watchdog"
- "name": "cluster_receiver"
  "slack_configs":
  - "api_url": "https://hooks.slack.com/services/XXXX/XXXX/XXXX"
    "channel": "#clusterwatch"
# set a proper receiver if you want a default receiver for all alerts
- "name": "null"
"route":
  "group_by": ['alertname', 'cluster', 'service']
  # Below interval as for lab purpose, please set proper value for your environment.
  "group_interval": "1m"
  "group_wait": "10s"
  "repeat_interval": "2m"
  # set a proper receiver if you want a default receiver for all alerts
  "receiver": "null"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "watchdog_receiver"
    
  - "match_re":
      "namespace": "^(openshift|kube)$"
    "receiver": "cluster_receiver"
  # To get alertname, browse to Prometheus UI > Alerts 
  - "match_re":
      "alertname": "^(Cluster|Cloud|Machine|Pod|Kube|MCD|Alertmanager|etcd|TargetDown|CPU|Node|Clock|Prometheus|Failing|Network|IPTable)$"
    "receiver": "cluster_receiver"
  - "match":
      "severity": "critical"
    "receiver": "cluster_receiver"

3. Replace a new secret:

# oc -n openshift-monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml --dry-run -o=yaml |  oc -n openshift-monitoring replace secret --filename=-

4. Optionally restart all AlertManager pod:

# oc delete po -l alertmanager=main -n openshift-monitoring

Resources

  • https://github.com/prometheus/alertmanager/blob/master/doc/examples/simple.yml
  • https://prometheus.io/docs/alerting/configuration

Muhammad Aizuddin Zali

Red Hat APAC-SEATH Senior Platform Consultant for OpenShift.

You may also like...

%d bloggers like this: