How to migrate instances from Xen to KVM

This post is a short one; it basically contains only two steps. The goal is to be able to launch a Xen instance on a KVM host.

Currently, we are testing KVM as a hypervisor for our private (OpenStack) cloud as an alternative to our existing Xen hosts. Of course, we want to avoid building new instances or importing a bunch of snapshots into Glance. Instead, we would like to keep the existing instances and modify them so they can run on KVM.

There’s actually not too much to do: if you try to boot a Xen instance on a KVM host, you’ll see dracut run into timeouts because it can’t find the disk(s), something like

[...] Warning: dracut-initqueue timeout - starting timeout scripts

and then it’ll get stuck trying to map devices.

The solution is to add the virtio kernel modules to the initrd of the image. To do so, boot the instance on a Xen host again and find the modules by running

xen-vm:~ # find /lib/modules/ -name '*virt*'
/lib/modules/4.4.36-8-default/kernel/sound/pci/oxygen/snd-virtuoso.ko
/lib/modules/4.4.36-8-default/kernel/drivers/char/virtio_console.ko
/lib/modules/4.4.36-8-default/kernel/drivers/char/hw_random/virtio-rng.ko
/lib/modules/4.4.36-8-default/kernel/drivers/gpu/drm/virtio
/lib/modules/4.4.36-8-default/kernel/drivers/gpu/drm/virtio/virtio-gpu.ko
/lib/modules/4.4.36-8-default/kernel/drivers/dma/virt-dma.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_balloon.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_ring.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_input.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_pci.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_mmio.ko
/lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio.ko
/lib/modules/4.4.36-8-default/kernel/drivers/block/virtio_blk.ko
/lib/modules/4.4.36-8-default/kernel/drivers/net/virtio_net.ko
/lib/modules/4.4.36-8-default/kernel/drivers/net/caif/caif_virtio.ko
/lib/modules/4.4.36-8-default/kernel/drivers/scsi/virtio_scsi.ko
/lib/modules/4.4.36-8-default/kernel/net/9p/9pnet_virtio.ko
/lib/modules/4.4.36-8-default/kernel/virt

and then add all of them, or just a subset, to the initrd:

xen-vm:~ # dracut --add-drivers "virtio_balloon virtio_ring virtio_input virtio_pci virtio_mmio virtio virtio_blk virtio_net caif_virtio virtio_scsi" --force
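If you don't want to type the module names by hand, the list can be derived from the find output. The following is only a sketch: virtio_module_names is a hypothetical helper, not part of dracut, and it picks up every .ko file with "virtio" in its name, which is a broader set than the hand-picked list above.

```shell
#!/bin/sh
# Hypothetical helper: read module paths (one per line) on stdin and print
# a space-separated list of virtio module names for --add-drivers.
virtio_module_names() {
    # Keep only .ko files whose file name contains "virtio", strip the
    # directory and the .ko suffix, de-duplicate, and join with spaces.
    sed -n 's|^.*/\([^/]*virtio[^/]*\)\.ko$|\1|p' \
        | LC_ALL=C sort -u \
        | paste -sd ' ' -
}

# Possible usage on the Xen guest (assumes uncompressed .ko files as shown above):
#   drivers=$(find /lib/modules/"$(uname -r)" -name '*virtio*.ko' | virtio_module_names)
#   dracut --add-drivers "$drivers" --force
```

Directories such as drivers/virtio and unrelated hits such as snd-virtuoso.ko are dropped, because they either lack the .ko suffix or don't contain "virtio".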

You can check whether the modules have been included in the initrd:

xen-vm:~ # xzcat /boot/initrd-4.4.36-8-default | cpio -t | grep virt
lib/modules/4.4.36-8-default/kernel/drivers/block/virtio_blk.ko
lib/modules/4.4.36-8-default/kernel/drivers/net/caif/caif_virtio.ko
lib/modules/4.4.36-8-default/kernel/drivers/net/virtio_net.ko
lib/modules/4.4.36-8-default/kernel/drivers/scsi/virtio_scsi.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_balloon.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_input.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_mmio.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_pci.ko
lib/modules/4.4.36-8-default/kernel/drivers/virtio/virtio_ring.ko
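Instead of comparing the listing by eye, the check can be scripted. This is a minimal sketch; missing_modules is a hypothetical helper that reads a file listing like the one above on stdin and prints every requested module that is not in it.

```shell
#!/bin/sh
# Hypothetical helper: read an initrd file listing on stdin and print each
# module named in the arguments that does not appear in the listing.
missing_modules() {
    listing=$(cat)
    for mod in "$@"; do
        # A module counts as present if some path ends in /<name>.ko
        printf '%s\n' "$listing" | grep -q "/${mod}\.ko\$" || printf '%s\n' "$mod"
    done
}

# Possible usage, with the initrd name from the example above:
#   xzcat /boot/initrd-4.4.36-8-default | cpio -t | missing_modules virtio_blk virtio_pci virtio_net
```

No output means that all requested modules were found. Matching on the leading slash keeps caif_virtio.ko from being mistaken for virtio.ko.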

If you then launch this instance on a KVM host it should boot without problems.

There is one more thing to do if your instances run in OpenStack: you need to manipulate the nova database. Be really careful with that!


Table instance_system_metadata

Find the entries for your instance, containing hypervisor_type and kernel_id (if applicable):

MariaDB [nova]> select id,instance_uuid,`key`,value from instance_system_metadata where instance_uuid='c0d108c5';
+-------+---------------+-----------------------+----------+
| id    | instance_uuid | key                   | value    |
+-------+---------------+-----------------------+----------+
| 19811 | c0d108c5      | image_kernel_id       | 66a1daec |
| 19812 | c0d108c5      | image_hypervisor_type | xen      |
| 19813 | c0d108c5      | image_disk_format     | raw      |
...

Please note that I truncated the IDs for readability.
Change the kernel_id and hypervisor_type:

MariaDB [nova]> update instance_system_metadata set value='' where id='19811';
MariaDB [nova]> update instance_system_metadata set value='kvm' where id='19812';

Table instances

The device names still contain references to Xen (/dev/xvda) and there’s another reference to the kernel_id:

MariaDB [nova]> select uuid,kernel_id,host,display_name,root_device_name from instances where uuid='c0d108c5';
+----------+-----------+----------+--------------+------------------+
| uuid     | kernel_id | host     | display_name | root_device_name |
+----------+-----------+----------+--------------+------------------+
| c0d108c5 | 66a1daec  | compute3 | xen-vm       | /dev/xvda        |
+----------+-----------+----------+--------------+------------------+

Change the values for kernel_id, root_device_name and the host:

MariaDB [nova]> update instances set kernel_id='' where uuid='c0d108c5';
MariaDB [nova]> update instances set root_device_name='/dev/vda' where uuid='c0d108c5';
MariaDB [nova]> update instances set host='compute1' where uuid='c0d108c5';
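Since typos in these statements can hurt, it may help to generate them for one instance and review them before anything touches the database. The following is only a sketch under a few assumptions: nova_xen_to_kvm_sql is a hypothetical helper, the metadata rows are addressed by key name instead of by row id, the tables are assumed to support transactions, and the arguments are assumed to be trusted input. The backticks around key are there because KEY is a reserved word in MariaDB.

```shell
#!/bin/sh
# Hypothetical helper: print the statements from the two table sections
# above for one instance, wrapped in a transaction.
# Arguments: instance UUID, target KVM compute host.
nova_xen_to_kvm_sql() {
    uuid=$1
    host=$2
    cat <<EOF
START TRANSACTION;
UPDATE instance_system_metadata SET value='' WHERE instance_uuid='$uuid' AND \`key\`='image_kernel_id';
UPDATE instance_system_metadata SET value='kvm' WHERE instance_uuid='$uuid' AND \`key\`='image_hypervisor_type';
UPDATE instances SET kernel_id='', root_device_name='/dev/vda', host='$host' WHERE uuid='$uuid';
COMMIT;
EOF
}

# Possible usage (the backup file name is an assumption):
#   mysqldump nova instances instance_system_metadata > nova-backup.sql
#   nova_xen_to_kvm_sql c0d108c5 compute1          # review the output first
#   nova_xen_to_kvm_sql c0d108c5 compute1 | mysql nova
```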

These changes were sufficient to launch a former Xen instance on a KVM host. Please note that this was a test instance; the impact of manipulating the database like this in a production environment is hard to predict. I’ll update this article if we face any problems regarding this topic. Please leave a comment if you have something to add or if something about the migration steps is unclear.

Update:

In the meantime we upgraded our whole private cloud to KVM hypervisors, and all of the instances could be migrated successfully from Xen to KVM. During an upgrade of the OpenStack software, however, we had to move instances from one compute node to another in order to upgrade the nodes one by one. That is where we faced the first problems with this kind of migration. When trying a cold migration (or resize) of an instance, nova shows this error:

control:~ # openstack server resize --flavor 746d48df-461f-4f26-bdd5-55968dd49952 77d26eb2-aa20-424c-aaac-30d1e412e79f
No valid host was found. No valid host found for resize

Digging into this with debug logs gives a hint about where to look (/var/log/nova/nova-scheduler.log):

[...]
2017-11-08 17:29:34.631 12009 DEBUG nova.scheduler.filters.image_props_filter [req-57bf3266-82d9-4934-8ac3-82fc4c4c4ef1 89c5dcc8793d4867bae22d50e51e16b3 90c403f317ee47feb0dad58461e76fb1 - - -] Instance contains properties ImageMetaProps(hw_architecture=<?>,[...] img_config_drive=<?>,img_hv_requested_version=<?>,img_hv_type='xen',[...] os_type=<?>) that are not provided by the compute node supported_instances [...] or hypervisor version 2009000 do not match _instance_supported [...]
[...] Filter ImagePropertiesFilter returned 0 hosts [...]
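The debug log is long, so a small filter can pull out just the hypervisor type that the scheduler believes an instance requests. This is a rough sketch; requested_hv_types is a hypothetical helper, and it relies on the log format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read scheduler log lines on stdin and print the
# distinct img_hv_type values mentioned by the image properties filter.
requested_hv_types() {
    grep 'image_props_filter' \
        | sed -n "s/.*img_hv_type='\([^']*\)'.*/\1/p" \
        | LC_ALL=C sort -u
}

# Possible usage on the control node:
#   requested_hv_types < /var/log/nova/nova-scheduler.log
```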

The interesting part is img_hv_type='xen': this instance still has a reference to its former hypervisor type. I had to update the “nova_api” database to be able to resize/migrate this instance. The output of the SQL query is too large for this page, so I shortened it for readability.

# Get instance specs 
MariaDB [nova_api]> SELECT instance_uuid,spec FROM request_specs WHERE instance_uuid='77d26eb2-aa20-424c-aaac-30d1e412e79f';
[...]
| 77d26eb2-aa20-424c-aaac-30d1e412e79f | {"nova_object.version": [...] "ImageMetaProps", "nova_object.data": {"img_owner_id": "2e3c3f3822124a3fa9fd905164f519ae", "img_hv_type": "xen"}, "nova_object.namespace": "nova"},[...] "nova_object.namespace": "nova"} |
[...]
# Update database (with complete string for "spec")
MariaDB [nova_api]> UPDATE request_specs SET spec='{"nova_object.version": [...] "ImageMetaProps", "nova_object.data": {"img_owner_id": "2e3c3f3822124a3fa9fd905164f519ae", "img_hv_type": "kvm"}, "nova_object.namespace": "nova"},[...] "nova_object.namespace": "nova"}' where instance_uuid='77d26eb2-aa20-424c-aaac-30d1e412e79f';

The only difference is the value of “img_hv_type”, which is now “kvm”. This little change did the trick; resize and cold migration work fine now.
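Pasting the complete spec string back is error-prone. An alternative that I would consider is to let the database swap only that one value with the SQL string function REPLACE(). The sketch below just prints such a statement; request_spec_fix_sql is a hypothetical helper, the UUID is assumed to be trusted input, and the pattern assumes the exact spelling and spacing "img_hv_type": "xen" seen in the query output above.

```shell
#!/bin/sh
# Hypothetical helper: print an UPDATE that swaps only the hypervisor type
# inside the stored spec string. Argument: instance UUID.
request_spec_fix_sql() {
    uuid=$1
    printf '%s\n' "UPDATE request_specs SET spec=REPLACE(spec, '\"img_hv_type\": \"xen\"', '\"img_hv_type\": \"kvm\"') WHERE instance_uuid='$uuid';"
}

# Dry run of the same substitution on a shortened sample of the spec:
#   printf '%s\n' '{"img_hv_type": "xen"}' | sed 's/"img_hv_type": "xen"/"img_hv_type": "kvm"/'
#
# Possible usage, after reviewing the printed statement:
#   request_spec_fix_sql 77d26eb2-aa20-424c-aaac-30d1e412e79f | mysql nova_api
```

If the pattern does not occur in the spec, REPLACE() leaves the string unchanged, so re-running the SELECT afterwards is a good way to confirm the change.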
