Unanimous Cluster Health Checks
- Last Updated: October 8, 2024
When configuring an FQDN, one of the available options is Unanimous Cluster Health Checks. If this option is enabled and any IP address fails health checking, the other IP addresses in that FQDN which belong to the same cluster are also marked as down. As a result, the IP addresses which belong to the same cluster within a specific FQDN are either all up or all down. For example, example.com has addresses 172.21.58.101, 172.21.58.102, and 172.21.58.103, which all belong to cluster cl58:
- If 172.21.58.101 fails, the unanimous policy forces 172.21.58.102 and 172.21.58.103 down as well.
- When 172.21.58.101 comes back, the unanimous policy brings back 172.21.58.102 and 172.21.58.103 along with it.
At any given time, either all three addresses are available or all three addresses are down.
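The all-up-or-all-down behavior can be sketched in a few lines of Python. This is only an illustrative model, not LoadMaster code: the function name `unanimous_states` and the two dictionaries are assumptions made for the example.

```python
def unanimous_states(cluster_of, passed):
    """Return the reported state of each address under a unanimous policy.

    cluster_of: maps IP address -> cluster name (hypothetical structure).
    passed:     maps IP address -> True if its own health check passed.
    """
    # A cluster counts as failed as soon as any one of its addresses fails.
    failed_clusters = {cluster_of[ip] for ip, ok in passed.items() if not ok}
    # Every address in a failed cluster is reported down, healthy or not.
    return {
        ip: "down" if cluster in failed_clusters else "up"
        for ip, cluster in cluster_of.items()
    }


cluster_of = {
    "172.21.58.101": "cl58",
    "172.21.58.102": "cl58",
    "172.21.58.103": "cl58",
}

# 172.21.58.101 fails its check, so all three addresses are reported down.
one_failed = unanimous_states(
    cluster_of,
    {"172.21.58.101": False, "172.21.58.102": True, "172.21.58.103": True},
)

# Once 172.21.58.101 passes again, all three come back together.
all_passed = unanimous_states(cluster_of, {ip: True for ip in cluster_of})
```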
The same approach applies to site failure mode with manual recovery. Manual recovery causes a failed address to be disabled so that the administrator can re-enable it after fixing the problem. When Unanimous Cluster Health Checks is enabled, the failure of one address causes all three addresses to be disabled.
The unanimous policy ignores disabled addresses. So, if you know that an address is down, and for whatever reason you want to continue using the other addresses that belong to the same cluster, you can disable the failed address and the unanimous policy will not force down the other addresses with it.
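As a rough illustration of how disabled addresses drop out of the policy, the sketch below lets only enabled addresses mark their cluster as failed. The tuple layout and the name `states_ignoring_disabled` are hypothetical and chosen for this example only.

```python
def states_ignoring_disabled(addresses):
    """addresses: list of (ip, cluster, passed, enabled) tuples (hypothetical layout)."""
    # Only enabled addresses can mark their cluster as failed.
    failed_clusters = {
        cluster
        for _, cluster, passed, enabled in addresses
        if enabled and not passed
    }
    states = {}
    for ip, cluster, passed, enabled in addresses:
        if not enabled:
            states[ip] = "disabled"  # skipped by the unanimous policy
        elif cluster in failed_clusters:
            states[ip] = "down"
        else:
            states[ip] = "up"
    return states


result = states_ignoring_disabled([
    ("172.21.58.101", "cl58", False, False),  # known bad, disabled by the administrator
    ("172.21.58.102", "cl58", True, True),
    ("172.21.58.103", "cl58", True, True),
])
```

In this model, 172.21.58.102 and 172.21.58.103 stay up because the failing address is disabled and therefore never counted.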
When Unanimous Cluster Health Checks is enabled, some configuration changes may cause FQDN addresses to be forced down or brought back up. For example, if an address is forced down and you remove it from the cluster while the unanimous policy is in effect, the address should come back up. Similarly, if you add an address to a cluster where the unanimous policy is in effect and one of the addresses is down, the new address should be forced down. This change may not occur immediately, but it should happen the next time health checking occurs.
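One way to picture why these changes take effect at the next health check is to treat each check as recomputing every state from the current configuration. The following sketch assumes that model; `health_check_pass` and its data structures are illustrative and do not describe how LoadMaster is implemented.

```python
def health_check_pass(cluster_of, passed):
    """Recompute every address state from the current configuration.

    cluster_of: IP -> cluster name, or None for an address outside any cluster.
    passed:     IP -> True if that address passed its own health check.
    """
    # Clusters that contain at least one address failing its own check.
    failed = {
        cluster_of[ip]
        for ip, ok in passed.items()
        if not ok and cluster_of[ip] is not None
    }
    states = {}
    for ip, cluster in cluster_of.items():
        if not passed[ip]:
            states[ip] = "down"
        elif cluster in failed:
            states[ip] = "forced down"
        else:
            states[ip] = "up"
    return states


config = {"172.21.58.101": "cl58", "172.21.58.102": "cl58"}
checks = {"172.21.58.101": False, "172.21.58.102": True}
before = health_check_pass(config, checks)

# Removing a forced-down address from the cluster lets it come back up.
config["172.21.58.102"] = None
after_removal = health_check_pass(config, checks)

# Adding a healthy address to a cluster with a failure forces it down.
config["172.21.58.103"] = "cl58"
checks["172.21.58.103"] = True
after_addition = health_check_pass(config, checks)
```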
If addresses with the Checker set to None are combined with addresses that have health checking configured, the addresses with no health checking are not forced down, but they can be forcibly disabled if the Site Recovery Mode is set to Manual. For example, say there are three addresses:
- 172.21.58.101 with a Checker of Cluster Checks
- 172.21.58.102 with a Checker of Cluster Checks
- 172.21.58.103 with a Checker of None
If site failure handling is off or automatic, the failure of 172.21.58.101 causes 172.21.58.102 to be forced down, but 172.21.58.103 remains up. The rationale is that if you do not want health checking on 172.21.58.103 then it should remain up.
However, if the Site Recovery Mode is set to Manual, failure of 172.21.58.101 causes both 172.21.58.102 and 172.21.58.103 to be disabled, along with 172.21.58.101. For site recovery, all addresses are disabled, even the ones with no health checking configured. This keeps traffic away from the problem data center until the system administrators fix it. It does not conflict with having addresses with no health checking, because an address can be up but disabled.
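The two outcomes for the three-address example can be compared side by side with a small model. It is a sketch under stated assumptions: the function `resolve`, the `manual_recovery` flag, and the string states are hypothetical names for this illustration, and all addresses are taken to belong to one cluster.

```python
def resolve(checkers, failed_ips, manual_recovery):
    """Model the result for one cluster with mixed checkers.

    checkers:        IP -> Checker name; "None" means no health checking.
    failed_ips:      set of IPs whose own health check failed.
    manual_recovery: True when Site Recovery Mode is set to Manual.
    """
    if not failed_ips:
        return {ip: "up" for ip in checkers}
    if manual_recovery:
        # Manual recovery disables every address, checked or not.
        return {ip: "disabled" for ip in checkers}
    # Otherwise only addresses with health checking are forced down.
    return {
        ip: "up" if checker == "None" else "down"
        for ip, checker in checkers.items()
    }


checkers = {
    "172.21.58.101": "Cluster Checks",
    "172.21.58.102": "Cluster Checks",
    "172.21.58.103": "None",
}

automatic = resolve(checkers, {"172.21.58.101"}, manual_recovery=False)
manual = resolve(checkers, {"172.21.58.101"}, manual_recovery=True)
```

Here `automatic` leaves 172.21.58.103 up while the two checked addresses go down, and `manual` disables all three.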