Master node failure recovery

Other than storage-related services (database, git server, and object storage), all core Anaconda Enterprise services are resilient to master node failure.

To maintain operation of Anaconda Enterprise in the event of a master node failure, locate /opt/anaconda/ on the master node on a redundant disk array, or back it up frequently, to avoid data loss. See Backup/Restore for more information.
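
As an illustration of the "backed up frequently" option, the following is a minimal Python sketch that archives a directory into a timestamped .tar.gz file. It is not the procedure described in Backup/Restore, which remains the authoritative reference. The function name backup_directory and the destination path /mnt/backups are hypothetical, chosen only for this example, and the sketch assumes the archive is written to storage that does not reside on the master node itself.

```python
import tarfile
import time
from pathlib import Path


def backup_directory(source_dir, backup_root):
    """Archive source_dir into a timestamped .tar.gz under backup_root.

    Hypothetical helper for illustration only; it is not part of
    Anaconda Enterprise and does not replace the Backup/Restore procedure.
    """
    source = Path(source_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"source directory not found: {source}")

    # Create the destination directory if it does not exist yet.
    dest_root = Path(backup_root)
    dest_root.mkdir(parents=True, exist_ok=True)

    # Timestamp the archive name so earlier backups are not overwritten.
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archive_path = dest_root / f"{source.name}-{stamp}.tar.gz"

    # Store entries relative to the directory name, not the absolute path.
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source, arcname=source.name)

    return archive_path


# Example call (paths are assumptions; adjust to your environment):
# backup_directory("/opt/anaconda/", "/mnt/backups")
```

A script like this could be run on a schedule (for example, from cron). Files that are being written while the archive is created may be captured in an inconsistent state, so treat this as a sketch rather than a complete backup strategy.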

To restore Anaconda Enterprise operations in the event of a master node failure, complete the following steps:

  1. Create a new master node
  2. Restore data from a backup

Create a new master node

Follow the installation process for adding a new cluster node, described in the section Unattended installation.

NOTE: To create the new master node, specify --role=master instead of --role=worker.

Restore data from a backup

After the installation of the new master node is complete, follow the instructions in the restore data section.