OpenShift 4: DHCPD High Availability

Objective

One of the supporting components for installing OpenShift 4 is DHCP. This article shows how to make that DHCP service resilient and highly available.

  • Using DHCP to provide:
    • static leases
    • denial of any unknown client
    • unlimited lease time (an anti-pattern from a hardening standpoint)
    • master-slave clustering
    • MAC-address-based PXE boot config serving
    • cluster communication secured with OMAPI.

NOTE: Ensure both servers are synced to the same NTP source; clock skew is critical in any clustered system.
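
One way to check the clock requirement, as a minimal sketch that assumes both nodes run chrony (an assumption; the article does not say which NTP client is used). The hypothetical helper `offset_within` reads `chronyc tracking` output on stdin and succeeds only if the reported system time offset is below a limit in seconds. The 1-second default is an arbitrary choice.

```shell
# Reads `chronyc tracking` output on stdin.
# Succeeds if a "System time" line is present and its offset
# (always printed as a positive number) is below the limit.
offset_within() {
  awk -v limit="${1:-1}" '
    /^System time/ { found = 1; if ($4 + 0 >= limit + 0) bad = 1 }
    END { exit ((found && !bad) ? 0 : 1) }
  '
}

# Usage on each DHCP node (requires chrony):
#   chronyc tracking | offset_within 1 && echo "clock OK"
```

Running it on dhcp01 and dhcp02 before starting dhcpd gives a quick indication of whether either node has drifted.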

HOSTNAME                    IP ADDRESS       ROLE
dhcp01.local.bytewise.my    192.168.50.80    Primary
dhcp02.local.bytewise.my    192.168.50.81    Secondary

Steps

1. Install packages on both nodes:

#> dnf -y install dhcp-server tftp-server syslinux

2. On dhcp01 node, populate /etc/dhcp/dhcpd.conf:

Some of this DHCP configuration (the PXE settings) comes from another post.

#### Clustering main settings ####
failover peer "dhcp-failover-peer" {
  primary; # I am the primary server
  address 192.168.50.80; # this is my address
  port 647;
  peer address 192.168.50.81; # this is my slave's address
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  split 255;  # don't split addresses.

}

#### Securing this cluster ####
omapi-port 7911;
omapi-key omapi_secret;

key omapi_secret {
     algorithm hmac-md5;
     secret changemeusingdnsseckeygen==;
}

#### Main service settings ####
ignore client-updates;
authoritative;
allow booting;
allow bootp;
deny unknown-clients; # don't serve any MAC address not defined here
default-lease-time -1; # infinite lease, as close to a static IP as it gets.
max-lease-time -1; # infinite lease, as close to a static IP as it gets.

subnet 192.168.50.0 netmask 255.255.255.0 {
 
        option routers 192.168.50.1;
        option domain-name-servers 192.168.50.30;
        option ntp-servers time.unisza.edu.my;
        option domain-search "local.bytewise.my","ocp4.local.bytewise.my";
        filename "pxelinux.0";
        next-server 192.168.50.30;
        host bootstrap { hardware ethernet 52:54:00:7d:2d:b1; fixed-address 192.168.50.60; option host-name "bootstrap"; }
        host master01 { hardware ethernet 52:54:00:7d:2d:b2; fixed-address 192.168.50.61; option host-name "master01"; }
        host master02 { hardware ethernet 52:54:00:7d:2d:b3; fixed-address 192.168.50.62; option host-name "master02"; }
        host master03 { hardware ethernet 52:54:00:7d:2d:b4; fixed-address 192.168.50.63; option host-name "master03"; }
        host worker01 { hardware ethernet 52:54:00:7d:2d:b5; fixed-address 192.168.50.64; option host-name "worker01"; }
        host worker02 { hardware ethernet 52:54:00:7d:2d:b6; fixed-address 192.168.50.65; option host-name "worker02"; }
        host worker03 { hardware ethernet 52:54:00:7d:2d:c1; fixed-address 192.168.50.66; option host-name "worker03"; }
        pool {
          failover peer "dhcp-failover-peer";
          range 192.168.50.50 192.168.50.59; # same range as on dhcp02; keeps clear of the fixed addresses
         }
}
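
The `secret` in the key stanza above is a placeholder and has to be replaced with a real base64-encoded value. As a hedged alternative to the key tool the placeholder alludes to, the sketch below builds a secret from random bytes with standard utilities. `gen_omapi_secret` is a hypothetical helper name, and the 32-byte default is an arbitrary choice, not something the article prescribes.

```shell
# Print a base64-encoded secret built from N random bytes (default 32)
# on a single line.
gen_omapi_secret() {
  head -c "${1:-32}" /dev/urandom | base64 | tr -d '\n'
  echo
}

# Usage: paste the printed value after "secret" in the key stanza.
#   gen_omapi_secret
```

Keep the generated value out of version control and restrict read access to /etc/dhcp/dhcpd.conf, since anyone holding the key can talk to the OMAPI port.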

3. On dhcp02, populate /etc/dhcp/dhcpd.conf:

#### Clustering main settings ####
failover peer "dhcp-failover-peer" {
  secondary; # I am the secondary server
  address 192.168.50.81; # this is my IP
  port 647;
  peer address 192.168.50.80; # this is my primary server's address
  peer port 647;
  max-response-delay 60;
  max-unacked-updates 10;
  mclt 3600;
  #split 255;  # don't split addresses. The secondary doesn't need this line.

}

#### Securing this cluster ####
omapi-port 7911;
omapi-key omapi_secret;

key omapi_secret {
     algorithm hmac-md5;
     secret 1289hsdfoinaf0913n12==;
}

#### Main service settings ####
ignore client-updates;
authoritative;
allow booting;
allow bootp;
deny unknown-clients; # don't serve any MAC address not defined here
default-lease-time -1; # infinite lease, as close to a static IP as it gets.
max-lease-time -1; # infinite lease, as close to a static IP as it gets.

subnet 192.168.50.0 netmask 255.255.255.0 {

        option routers 192.168.50.1;
        option domain-name-servers 192.168.50.30;
        option ntp-servers time.unisza.edu.my;
        option domain-search "local.bytewise.my","ocp4.local.bytewise.my";
        filename "pxelinux.0";
        next-server 192.168.50.30; 
        host bootstrap { hardware ethernet 52:54:00:7d:2d:b1; fixed-address 192.168.50.60; option host-name "bootstrap"; }
        host master01 { hardware ethernet 52:54:00:7d:2d:b2; fixed-address 192.168.50.61; option host-name "master01"; }
        host master02 { hardware ethernet 52:54:00:7d:2d:b3; fixed-address 192.168.50.62; option host-name "master02"; }
        host master03 { hardware ethernet 52:54:00:7d:2d:b4; fixed-address 192.168.50.63; option host-name "master03"; }
        host worker01 { hardware ethernet 52:54:00:7d:2d:b5; fixed-address 192.168.50.64; option host-name "worker01"; }
        host worker02 { hardware ethernet 52:54:00:7d:2d:b6; fixed-address 192.168.50.65; option host-name "worker02"; }
        host worker03 { hardware ethernet 52:54:00:7d:2d:c1; fixed-address 192.168.50.66; option host-name "worker03"; }
        pool {
          failover peer "dhcp-failover-peer";
          range 192.168.50.50 192.168.50.59;
         }
}
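
The host declarations and the pool range are meant to be shared by both peers, so it is easy for the two files to drift apart. The sketch below compares the `host` and `range` declarations of two config files. `shared_decls` and `same_decls` are hypothetical helper names, and the sketch assumes each declaration sits on a single line, as in the listings above.

```shell
# Print the host and range declarations of a config file with
# trailing comments removed, whitespace normalized, and lines sorted.
shared_decls() {
  grep -E '^[[:space:]]*(host|range)[[:space:]]' "$1" |
    sed -E 's/#.*$//; s/[[:space:]]+/ /g; s/^ //; s/ $//' |
    sort
}

# Succeed only if both files declare the same hosts and ranges.
same_decls() {
  a=$(shared_decls "$1")
  b=$(shared_decls "$2")
  [ "$a" = "$b" ]
}

# Usage on dhcp01, after copying dhcp02's file to a scratch path
# (/tmp/dhcp02.conf is a hypothetical location):
#   same_decls /etc/dhcp/dhcpd.conf /tmp/dhcp02.conf || echo "configs differ"
```

This only checks the shared declarations. For a syntax check of the whole file, `dhcpd -t -cf /etc/dhcp/dhcpd.conf` can be run on each node before starting the service.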

4. On both DHCP nodes, allow the failover port through firewalld:

#> firewall-cmd --add-port=647/tcp --permanent
#> firewall-cmd --reload
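
To confirm the rule is active after the reload, the output of `firewall-cmd --list-ports` can be checked. A minimal sketch follows; `port_listed` is a hypothetical helper name, and it assumes the command prints the open ports as a space-separated list of PORT/PROTO entries.

```shell
# Reads a space-separated port list on stdin (as printed by
# `firewall-cmd --list-ports`) and succeeds if the given
# PORT/PROTO entry appears as an exact item.
port_listed() {
  tr -s ' ' '\n' | grep -qxF "$1"
}

# Usage on each DHCP node:
#   firewall-cmd --list-ports | port_listed 647/tcp && echo "failover port open"
```

The exact match avoids false positives such as 6470/tcp being mistaken for 647/tcp.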

5. Now enable and start dhcpd on both nodes and inspect the logs:

[root@dhcp01 ~]# systemctl enable dhcpd --now

[root@dhcp01 ~]# journalctl -f -u dhcpd
-- Logs begin at Fri 2020-03-13 03:04:55 EDT. --
Mar 13 04:37:45 dhcp01.local.bytewise.my dhcpd[7339]: Sending on   Socket/fallback/fallback-net
Mar 13 04:37:45 dhcp01.local.bytewise.my dhcpd[7339]: failover peer dhcp-failover-peer: I move from normal to startup
Mar 13 04:37:45 dhcp01.local.bytewise.my dhcpd[7339]: Server starting service.
Mar 13 04:37:45 dhcp01.local.bytewise.my systemd[1]: Started DHCPv4 Server Daemon.
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: failover peer dhcp-failover-peer: peer moves from normal to communications-interrupted
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: failover peer dhcp-failover-peer: I move from startup to normal
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: balancing pool 5585ba479de0 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-own (+/-)1
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: balanced pool 5585ba479de0 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-misbal 2
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: failover peer dhcp-failover-peer: peer moves from communications-interrupted to normal
Mar 13 04:37:47 dhcp01.local.bytewise.my dhcpd[7339]: failover peer dhcp-failover-peer: Both servers normal


[root@dhcp02 ~]# systemctl enable dhcpd --now

[root@dhcp02 ~]# journalctl -f -u dhcpd
-- Logs begin at Fri 2020-03-13 03:05:35 EDT. --
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: Sending on   Socket/fallback/fallback-net
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: failover peer dhcp-failover-peer: I move from normal to startup
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: Server starting service.
Mar 13 04:38:15 dhcp02.local.bytewise.my systemd[1]: Started DHCPv4 Server Daemon.
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: failover peer dhcp-failover-peer: peer moves from normal to communications-interrupted
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: failover peer dhcp-failover-peer: I move from startup to normal
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: balancing pool 565389d31db0 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-own (+/-)1
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: balanced pool 565389d31db0 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-misbal 2
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: failover peer dhcp-failover-peer: peer moves from communications-interrupted to normal
Mar 13 04:38:15 dhcp02.local.bytewise.my dhcpd[6348]: failover peer dhcp-failover-peer: Both servers normal

6. Let's stop dhcpd on dhcp01 and monitor the dhcp02 logs:

[root@dhcp01 ~]# systemctl stop dhcpd
[root@dhcp01 ~]# systemctl status dhcpd
● dhcpd.service - DHCPv4 Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: man:dhcpd(8)
           man:dhcpd.conf(5)

Mar 13 04:45:55 dhcp01.local.bytewise.my dhcpd[7370]: failover peer dhcp-failover-peer: Both servers normal
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: peer dhcp-failover-peer: disconnected
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: failover peer dhcp-failover-peer: I move from normal to communications-interrupted
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: failover peer dhcp-failover-peer: peer moves from normal to normal
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: failover peer dhcp-failover-peer: I move from communications-interrupted to normal
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: failover peer dhcp-failover-peer: Both servers normal
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: balancing pool 55bc12c73070 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-own (+/-)1
Mar 13 04:46:15 dhcp01.local.bytewise.my dhcpd[7370]: balanced pool 55bc12c73070 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-misbal 2
Mar 13 04:46:45 dhcp01.local.bytewise.my systemd[1]: Stopping DHCPv4 Server Daemon...
Mar 13 04:46:45 dhcp01.local.bytewise.my systemd[1]: Stopped DHCPv4 Server Daemon.


[root@dhcp02 ~]# journalctl -n 3 -f -u dhcpd
-- Logs begin at Fri 2020-03-13 03:05:35 EDT. --
Mar 13 04:46:15 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: Both servers normal
Mar 13 04:46:45 dhcp02.local.bytewise.my dhcpd[6583]: peer dhcp-failover-peer: disconnected
Mar 13 04:46:45 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: I move from normal to communications-interrupted
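
Rather than reading the journal by eye, the current local failover state can be pulled out of these messages. The sketch below is one way to do it; `local_failover_state` is a hypothetical helper name, and it relies only on the "I move from X to Y" wording visible in the logs above.

```shell
# Reads dhcpd journal lines on stdin and prints the most recent local
# failover state, taken from "I move from X to Y" messages.
# Prints nothing if no such message is present.
local_failover_state() {
  sed -n 's/.*failover peer [^:]*: I move from [^ ]* to \([^ ]*\)$/\1/p' | tail -n 1
}

# Usage on either node:
#   journalctl -u dhcpd --no-pager | local_failover_state
```

With dhcp01 stopped, running this on dhcp02 against the log lines shown above yields communications-interrupted.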


7. Let's start dhcpd on dhcp01 again and watch the dhcp02 log report that communication is back to normal:

[root@dhcp01 ~]# systemctl start dhcpd
[root@dhcp01 ~]# systemctl status dhcpd
● dhcpd.service - DHCPv4 Server Daemon
   Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; disabled; vendor preset: disabled)
   Active: active (running) since Fri 2020-03-13 04:50:55 EDT; 53s ago
     Docs: man:dhcpd(8)
           man:dhcpd.conf(5)
 Main PID: 7390 (dhcpd)
   Status: "Dispatching packets..."
    Tasks: 1 (limit: 26213)
   Memory: 5.3M
   CGroup: /system.slice/dhcpd.service
           └─7390 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -group dhcpd --no-pid

Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: Sending on   Socket/fallback/fallback-net
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: failover peer dhcp-failover-peer: I move from normal to startup
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: Server starting service.
Mar 13 04:50:55 dhcp01.local.bytewise.my systemd[1]: Started DHCPv4 Server Daemon.
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: failover peer dhcp-failover-peer: peer moves from normal to communications-interrupted
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: failover peer dhcp-failover-peer: I move from startup to normal
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: balancing pool 562518153070 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-own (+/-)1
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: balanced pool 562518153070 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-misbal 2
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: failover peer dhcp-failover-peer: peer moves from communications-interrupted to normal
Mar 13 04:50:55 dhcp01.local.bytewise.my dhcpd[7390]: failover peer dhcp-failover-peer: Both servers normal



[root@dhcp02 ~]# journalctl -f -u dhcpd
-- Logs begin at Fri 2020-03-13 03:05:35 EDT. --
Mar 13 04:46:15 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: Both servers normal
Mar 13 04:46:45 dhcp02.local.bytewise.my dhcpd[6583]: peer dhcp-failover-peer: disconnected
Mar 13 04:46:45 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: I move from normal to communications-interrupted
Mar 13 04:50:55 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: peer moves from normal to normal
Mar 13 04:50:55 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: I move from communications-interrupted to normal
Mar 13 04:50:55 dhcp02.local.bytewise.my dhcpd[6583]: failover peer dhcp-failover-peer: Both servers normal
Mar 13 04:50:55 dhcp02.local.bytewise.my dhcpd[6583]: balancing pool 55f8f66e0040 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-own (+/-)1
Mar 13 04:50:55 dhcp02.local.bytewise.my dhcpd[6583]: balanced pool 55f8f66e0040 192.168.50.0/24  total 10  free 5  backup 5  lts 0  max-misbal 2

Summary

Having a highly available DHCP cluster is essential to ensure clients get their IP addresses without interruption when one of the nodes goes down for whatever reason. In this article we have covered how to achieve that HA and minimize the impact of a DHCP server going down.

(Cover Image : https://unsplash.com/@barkiple)

Muhammad Aizuddin Zali

Red Hat APAC-SEATH Senior Platform Consultant for OpenShift.
