OpenShift 4: AlertManager + Slack Configuration

Objective

Sending alert to Slack and defined proper channel for each alert type. Pre-Requisites:

Three channels created in slack #watchdog #clusterwatch
Incoming webhook configured for each of the channel.

Default openshift-monitoring stack only monitor platform specific object and target . For application, there is technology-preview features [here](https://docs.openshift.com/container-platform/4.3/monitoring/monitoring-your-own-services.html) .

Steps

Get each of the webhook URL from above pre-req.
Create alertmanager.yaml as below example:

"global":
  "resolve_timeout": "5m"
"receivers":
- "name": "watchdog_receiver"
  "slack_configs":
  - "api_url": "https://hooks.slack.com/services/XXXX/XXXX/XXXX"
    "channel": "#watchdog"
- "name": "cluster_receiver"
  "slack_configs":
  - "api_url": "https://hooks.slack.com/services/XXXX/XXXX/XXXX"
    "channel": "#clusterwatch"
# set a proper receiver if you want a default receiver for all alerts
- "name": "null"
"route":
  "group_by": ['alertname', 'cluster', 'service']
  # Below interval as for lab purpose, please set proper value for your environment.
  "group_interval": "1m"
  "group_wait": "10s"
  "repeat_interval": "2m"
  # set a proper receiver if you want a default receiver for all alerts
  "receiver": "null"
  "routes":
  - "match":
      "alertname": "Watchdog"
    "receiver": "watchdog_receiver"
    
  - "match_re":
      "namespace": "^(openshift|kube)$"
    "receiver": "cluster_receiver"
  # To get alertname, browse to Prometheus UI > Alerts 
  - "match_re":
      "alertname": "^(Cluster|Cloud|Machine|Pod|Kube|MCD|Alertmanager|etcd|TargetDown|CPU|Node|Clock|Prometheus|Failing|Network|IPTable)$"
    "receiver": "cluster_receiver"
  - "match":
      "severity": "critical"
    "receiver": "cluster_receiver"

Replace a new secret:

# oc -n openshift-monitoring create secret generic alertmanager-main --from-file=alertmanager.yaml --dry-run -o=yaml |  oc -n openshift-monitoring replace secret --filename=-

Optionally restart all AlertManager pod:

# oc delete po -l alertmanager=main -n openshift-monitoring

Resources

Muhammad Aizuddin Zali

RHCA | AppDev & Platform Consultant | DevSecOps

Note

Disclaimer: The views expressed and the content shared in all published articles on this website are solely those of the respective authors, and they do not necessarily reflect the views of the author’s employer or the techbeatly platform. We strive to ensure the accuracy and validity of the content published on our website. However, we cannot guarantee the absolute correctness or completeness of the information provided. It is the responsibility of the readers and users of this website to verify the accuracy and appropriateness of any information or opinions expressed within the articles. If you come across any content that you believe to be incorrect or invalid, please contact us immediately so that we can address the issue promptly.