5.4. Configuring Failover Domains
A failover domain is a named subset of cluster nodes that are eligible to run a cluster service in the event of a node failure. A failover domain can have the following characteristics:
Unrestricted — Allows you to specify that a subset of members are preferred, but that a cluster service assigned to this domain can run on any available member.
Restricted — Allows you to restrict the members that can run a particular cluster service. If none of the members in a restricted failover domain are available, the cluster service cannot be started (either manually or by the cluster software).
Unordered — When a cluster service is assigned to an unordered failover domain, the member on which the cluster service runs is chosen from the available failover domain members with no priority ordering.
Ordered — Allows you to specify a preference order among the members of a failover domain. Ordered failover domains select the node with the lowest priority number first. That is, the node with a priority number of "1" has the highest priority and is therefore the most preferred node in the failover domain. After that node, the next preferred node is the one with the next higher priority number (for example, "2"), and so on.
Failback — Allows you to specify whether a service in the failover domain should fail back to the node that it was originally running on before that node failed. Configuring this characteristic is useful in circumstances where a node repeatedly fails and is part of an ordered failover domain. In that circumstance, if a node is the preferred node in a failover domain, it is possible for a service to fail over and fail back repeatedly between the preferred node and another node, causing severe impact on performance.
The failback characteristic is applicable only if ordered failover is configured.
Changing a failover domain configuration has no effect on currently running services.
Failover domains are not required for operation.
By default, failover domains are unrestricted and unordered.
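As a rough illustration of how these characteristics map onto configuration, the following fragment sketches an ordered, unrestricted failover domain with failback disabled, which is one way to avoid the repeated fail over and fail back described above. It uses the attribute names from the skeleton shown later in this section. The domain name example_nofailback is hypothetical, and the reading of nofailback="1" as "do not fail back automatically" is an assumption that should be checked against the annotated cluster schema.

```xml
<failoverdomains>
    <!-- ordered="1": members are preferred by ascending priority number -->
    <!-- restricted="0": the service may still run on members outside this list -->
    <!-- nofailback="1": assumed to prevent automatic failback to a returning preferred node -->
    <failoverdomain name="example_nofailback" nofailback="1" ordered="1" restricted="0">
        <failoverdomainnode name="node-01.example.com" priority="1"/>
        <failoverdomainnode name="node-02.example.com" priority="2"/>
    </failoverdomain>
</failoverdomains>
```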
In a cluster with several members, using a restricted failover domain can minimize the work of setting up the cluster to run a cluster service (such as httpd), which requires you to set up the configuration identically on all members that run the cluster service. Instead of setting up the entire cluster to run the cluster service, you need to set up only the members in the restricted failover domain that you associate with the cluster service.
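A restricted domain for such a service might look like the following sketch. The domain name example_restricted and the service name example_httpd are hypothetical. The fragment also assumes that a service is tied to a failover domain through a domain attribute on the service element, which this section does not show. Consult the annotated cluster schema before relying on it.

```xml
<rm>
    <failoverdomains>
        <!-- restricted="1": only the listed members may run services assigned to this domain -->
        <failoverdomain name="example_restricted" nofailback="0" ordered="0" restricted="1">
            <failoverdomainnode name="node-01.example.com" priority="1"/>
            <failoverdomainnode name="node-02.example.com" priority="1"/>
        </failoverdomain>
    </failoverdomains>
    <!-- Assumption: the domain attribute associates the service with the failover domain -->
    <service name="example_httpd" domain="example_restricted"/>
</rm>
```

With a layout like this, httpd would need to be configured identically only on node-01 and node-02, not on every cluster member.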
To configure a preferred member, you can create an unrestricted failover domain comprising only one cluster member. Doing that causes a cluster service to run on that cluster member primarily (the preferred member), but allows the cluster service to fail over to any of the other members.
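A minimal sketch of such a preferred-member domain, assuming the attribute names from the skeleton later in this section and a hypothetical domain name example_preferred:

```xml
<failoverdomains>
    <!-- Unrestricted domain with a single member: node-01 is preferred, -->
    <!-- but the service can still fail over to any other cluster member -->
    <failoverdomain name="example_preferred" nofailback="0" ordered="0" restricted="0">
        <failoverdomainnode name="node-01.example.com" priority="1"/>
    </failoverdomain>
</failoverdomains>
```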
To configure a failover domain, use the following procedure:
Open /etc/cluster/cluster.conf at any node in the cluster.
Add the following skeleton section within the rm element for each failover domain to be used:
<failoverdomains>
    <failoverdomain name="" nofailback="" ordered="" restricted="">
        <failoverdomainnode name="" priority=""/>
        <failoverdomainnode name="" priority=""/>
        <failoverdomainnode name="" priority=""/>
    </failoverdomain>
</failoverdomains>
The number of failoverdomainnode elements depends on the number of nodes in the failover domain. The skeleton failoverdomain section in the preceding text shows three failoverdomainnode elements (with no node names specified), signifying that there are three nodes in the failover domain.
In the failoverdomain section, provide the values for the elements and attributes. For descriptions of the elements and attributes, refer to the failoverdomain section of the annotated cluster schema. The annotated cluster schema is available at /usr/share/doc/cman-X.Y.ZZ/cluster_conf.html (for example, /usr/share/doc/cman-3.0.12/cluster_conf.html) on any of the cluster nodes. For an example of a failoverdomains section, refer to Example 5.8, “A Failover Domain Added to cluster.conf”.
Update the config_version attribute by incrementing its value (for example, changing from config_version="2" to config_version="3").
Save /etc/cluster/cluster.conf.
(Optional) Validate the file against the cluster schema (cluster.rng) by running the ccs_config_validate command. For example:
[root@example-01 ~]# ccs_config_validate
Configuration validates
Run the cman_tool version -r command to propagate the configuration to the rest of the cluster nodes.
Example 5.8. A Failover Domain Added to cluster.conf
<cluster name="mycluster" config_version="3">
    <clusternodes>
        <clusternode name="node-01.example.com" nodeid="1">
            <fence>
                <method name="APC">
                    <device name="apc" port="1"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="node-02.example.com" nodeid="2">
            <fence>
                <method name="APC">
                    <device name="apc" port="2"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="node-03.example.com" nodeid="3">
            <fence>
                <method name="APC">
                    <device name="apc" port="3"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice agent="fence_apc" ipaddr="apc_ip_example" login="login_example" name="apc" passwd="password_example"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">
                <failoverdomainnode name="node-01.example.com" priority="1"/>
                <failoverdomainnode name="node-02.example.com" priority="2"/>
                <failoverdomainnode name="node-03.example.com" priority="3"/>
            </failoverdomain>
        </failoverdomains>
    </rm>
</cluster>
The failoverdomains section contains a failoverdomain section for each failover domain in the cluster. This example has one failover domain. In the failoverdomain line, the name (name) is specified as example_pri. In addition, it specifies that failback is not disabled (nofailback="0"), that failover is ordered (ordered="1"), and that the failover domain is unrestricted (restricted="0").