For KVM/libvirt compute node recovery, refer to the section above; the guide below may be applicable to other hypervisors.
The first step is to identify the VMs on the affected hosts, using tools such as a combination of nova list and nova show, or euca-describe-instances. The following example uses the EC2 API to show instance i-000015b9, which is running on node np-rcc54:
i-000015b9 at3-ui02 running nectarkey (376, np-rcc54) 0 m1.xxlarge 2012-06-19T00:48:11.000Z 115.146.93.60
First, you can review the status of the host using the nova database; some of the important information is highlighted below. This example converts an EC2 API instance ID into an OpenStack ID; if you used the nova commands, you can substitute the ID directly. You can find the credentials for your database in /etc/nova/nova.conf.
SELECT * FROM instances WHERE id = CONV('15b9', 16, 10) \G;
*************************** 1. row ***************************
 created_at: 2012-06-19 00:48:11
 updated_at: 2012-07-03 00:35:11
 deleted_at: NULL
...
         id: 5561
...
power_state: 5
   vm_state: shutoff
...
   hostname: at3-ui02
       host: np-rcc54
...
       uuid: 3f57699a-e773-4650-a443-b4b37eed5a06
...
 task_state: NULL
...
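The CONV('15b9', 16, 10) expression in the query converts the hexadecimal suffix of the EC2 instance ID (i-000015b9) into the decimal primary key that nova stores. The same conversion can be checked from the shell:

```shell
# Convert the hex suffix of EC2 instance ID i-000015b9 to decimal.
# This should match the id column (5561) in the nova database.
printf '%d\n' 0x15b9
```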
Armed with the information about which VMs were on the failed host, determine which compute host the affected VMs should be moved to. In this case, the VM will be moved to np-rcc46, which is achieved with this database command:
UPDATE instances SET host = 'np-rcc46' WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06';
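To confirm the move took effect before going further, you can read the row back; for example:

```
SELECT host FROM instances WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06';
```

The host column should now show np-rcc46.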
Next, if you are using a hypervisor that relies on libvirt (such as KVM), it is a good idea to update the libvirt.xml file (found in /var/lib/nova/instances/[instance ID]). The important changes to make are to change the DHCPSERVER value to the IP address of the compute host that is the VM's new home, and to update the VNC IP if it isn't already 0.0.0.0.
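As a rough illustration, the relevant parts of libvirt.xml look something like the fragment below. The exact element layout and filter name vary between nova and libvirt releases, and the IP address shown is hypothetical:

```xml
<!-- fragment of /var/lib/nova/instances/[instance ID]/libvirt.xml -->
<!-- network filter: point DHCPSERVER at the new compute host's IP -->
<filterref filter="nova-instance-instance-000015b9">
  <parameter name="DHCPSERVER" value="10.0.0.46"/>  <!-- hypothetical new host IP -->
</filterref>
<!-- VNC: listen on all interfaces so the console is reachable after the move -->
<graphics type="vnc" port="-1" autoport="yes" listen="0.0.0.0"/>
```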
Next, reboot the VM:
$ nova reboot --hard 3f57699a-e773-4650-a443-b4b37eed5a06
In theory, the above database update and nova reboot command are all that is required to recover a VM from a failed host.
However, if further problems occur, consider looking at recreating the network filter configuration using virsh, restarting the nova services, or updating the vm_state and power_state in the nova database.
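If the instance remains stuck, the state columns can be reset directly. The following is a hedged sketch, assuming the same instance as above and that power state 1 corresponds to RUNNING in your nova release's power-state mapping; verify both against your environment before running it:

```
UPDATE instances SET vm_state = 'active', power_state = 1, task_state = NULL
WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06';
```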