SUSE Cloud: nova-compute down

This article refers to a three-node environment based on SUSE Cloud 5, consisting of one admin node and one control node (both on SLES11-SP3) and one compute node on SLES12.
If you decide to run your compute node on SLES12, you are likely to face some difficulties getting your cloud working. One of them concerns the nova-compute service.

Background

After spending a lot of time configuring the network and the repositories, I finally managed to run the cloud installation script. From there, installing the control and compute nodes was pretty easy. The service deployment was also quite simple, since I just used the default settings. At that point I seemed to have a working cloud.

Possible Problems

I created a project, uploaded an image and wanted to start an instance. But the creation was cancelled with an error:

No valid host was found

You can find this message in /var/log/nova/nova-conductor.log on your control node:

NoValidHost exception with message: 'No valid host was found.'
Setting instance to ERROR state.

or in the nova-compute.log on the compute node.

Solution

While searching the web I found a hint that a stopped service could be the cause. Indeed, the command nova-manage service list showed nova-compute in an error state (XXX instead of the smiley):

Binary          Host                Zone      Status   State  Updated_At
nova-conductor  d00-16-3e-00-20-01  internal  enabled  :-)    2015-05-12 15:08:52.887225
nova-compute    d0c-c4-7a-06-71-f0  nova      enabled  XXX    2015-05-12 15:08:53.572586
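If you have many services, checking the State column by eye gets tedious. The following is a minimal sketch, not part of SUSE Cloud or nova, that scans a listing like the one above and reports every service whose state is not the smiley. The function name find_down_services and the assumption that the first five whitespace-separated columns are Binary, Host, Zone, Status and State are mine, based only on the output shown here.

```python
def find_down_services(listing):
    """Return (binary, host) pairs whose State column is not ':-)'.

    Assumes each service row has at least five whitespace-separated
    columns in the order Binary, Host, Zone, Status, State.
    """
    down = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows that are too short.
        if len(fields) < 5 or fields[0] == "Binary":
            continue
        binary, host, _zone, _status, state = fields[:5]
        if state != ":-)":
            down.append((binary, host))
    return down


# Sample input copied from the listing in this article.
SAMPLE = """\
Binary Host Zone Status State Updated_At
nova-conductor d00-16-3e-00-20-01 internal enabled  :-) 2015-05-12 15:08:52.887225
nova-compute d0c-c4-7a-06-71-f0 nova enabled XXX 2015-05-12 15:08:53.572586
"""

if __name__ == "__main__":
    for binary, host in find_down_services(SAMPLE):
        print(f"{binary} on {host} is down")
```

On a real system you would feed it the saved output of the listing command instead of SAMPLE.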

The log files on the compute node contained a hint about this state:

May 11 10:07:07 d0c-c4-7a-06-71-f0 openstack-nova-compute[371]: Starting nova-compute..done
May 11 10:07:08 d0c-c4-7a-06-71-f0 libvirtd[1087]: End of file while reading data: Input/output error
May 11 10:07:08 d0c-c4-7a-06-71-f0 openstack-nova-compute[494]: Shutting down nova-compute..done

The nova-compute service was started and then shut down again immediately. These messages occurred repeatedly in the logs.
Digging a little deeper into nova-compute.log, I found another hint:

2015-05-11 10:07:08.100 380 AUDIT nova.service [-] Starting compute node (version 2014.2.3-2014.2.3.dev5)
2015-05-11 10:07:08.171 380 WARNING nova.virt.libvirt.driver [-] URI xen:/// does not support full set of host capabilities: this function is not supported by the connection driver: virConnectBaselineCPU
2015-05-11 10:07:08.172 380 WARNING nova.virt.libvirt.driver [-] The libvirt driver is not tested on xen/x86_64 by the OpenStack project and thus its quality can not be ensured.

It was just a warning, but I figured libvirt could be the problem. My colleague helped me find this bug.
I edited the affected lines in /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py on the compute node and restarted the service with the command service openstack-nova-compute restart. After that, nova-compute was running.

I opened a service request for this issue. The result was a PTF, which I tested successfully. As far as I know, the changes still aren’t included in the SUSE channels, and without that PTF installed I still get the same messages.

Having worked and experimented with SUSE Cloud for a couple of weeks now, I have run into the same error message for different reasons. For example, if you upload an improper image and try to start an instance, you can get the same error (“No valid host was found”). It’s always a good idea to check /var/log/nova/nova-compute.log on your compute node for possible causes.
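To make that log check quicker, here is a small sketch that keeps only the lines at the more serious log levels. It assumes the line layout seen in the excerpts above (date, time, process ID, level, then the message). The function name select_log_lines and the default set of levels are my own choice, not something nova provides.

```python
def select_log_lines(lines, levels=("WARNING", "ERROR", "CRITICAL", "TRACE")):
    """Return the log lines whose fourth field is one of the given levels.

    Assumed layout per line: date, time, pid, level, message.
    """
    selected = []
    for line in lines:
        fields = line.split(None, 4)
        if len(fields) >= 4 and fields[3] in levels:
            selected.append(line.rstrip("\n"))
    return selected


# Sample lines copied from the nova-compute.log excerpt in this article.
SAMPLE_LOG = [
    "2015-05-11 10:07:08.100 380 AUDIT nova.service [-] Starting compute node (version 2014.2.3-2014.2.3.dev5)",
    "2015-05-11 10:07:08.171 380 WARNING nova.virt.libvirt.driver [-] URI xen:/// does not support full set of host capabilities: this function is not supported by the connection driver: virConnectBaselineCPU",
    "2015-05-11 10:07:08.172 380 WARNING nova.virt.libvirt.driver [-] The libvirt driver is not tested on xen/x86_64 by the OpenStack project and thus its quality can not be ensured.",
]

if __name__ == "__main__":
    for entry in select_log_lines(SAMPLE_LOG):
        print(entry)
```

Because the function accepts any iterable of lines, you can also pass it an open file object for /var/log/nova/nova-compute.log on the compute node.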
