OpenStack with Ceph: failing nova snapshots

I have another experience to share with Stackers who use Ceph as their storage backend. In this environment Ceph is used for Glance, Nova and Cinder. I won’t get into configuration details; I assume you’re familiar with this kind of setup. So let’s dive into the issue.

I’m referring to OpenStack Ocata. This environment has been upgraded continuously over the last few years: it started as a demo on Kilo and went into production on Liberty. A couple of months ago it was successfully upgraded from Mitaka to Ocata (via Newton). Some time later I decided to do some cleaning: removing deprecated options from config files, applying new config files and so on.

In particular, this log statement from glance-api.log caught my attention:

Option "show_multiple_locations" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
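
If you want to see every option a service flags this way before touching the config, a small script can collect them from the log. The following is a minimal sketch, assuming each warning sits on a single line and always uses the wording shown above. The function name and the sample log lines are made up for illustration.

```python
import re

# Matches the deprecation warning wording quoted above.
DEPRECATION_RE = re.compile(
    r'Option "(?P<option>[^"]+)" from group "(?P<group>[^"]+)" is deprecated'
)


def deprecated_options(log_lines):
    """Return a sorted list of unique (group, option) pairs found in the log."""
    found = set()
    for line in log_lines:
        match = DEPRECATION_RE.search(line)
        if match:
            found.add((match.group("group"), match.group("option")))
    return sorted(found)


# Illustrative sample; only the warning text itself is taken from the post.
SAMPLE_LOG = [
    'Option "show_multiple_locations" from group "DEFAULT" is deprecated '
    'for removal. Its value may be silently ignored in the future.',
    'some unrelated log line',
    'Option "show_multiple_locations" from group "DEFAULT" is deprecated '
    'for removal. Its value may be silently ignored in the future.',
]

for group, option in deprecated_options(SAMPLE_LOG):
    print("[{}] {}".format(group, option))
```

A list like this only tells you what is flagged, not whether it is safe to drop, which is exactly the trap described in this post.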

I double-checked with the Ceph docs:

Any OpenStack version except Mitaka

If you want to enable copy-on-write cloning of images, also add under the [DEFAULT] section:

show_image_direct_url = True

For Mitaka only

To enable image locations and take advantage of copy-on-write cloning for images, add under the [DEFAULT] section:

show_multiple_locations = True
show_image_direct_url = True

So I decided to set show_multiple_locations to false and restarted the service. Everything seemed fine, and nobody complained about anything (yet).

But apparently nobody used the Nova snapshot function for quite a while, because some weeks later a user tried to take a snapshot of a running instance and it failed with no helpful message in the Horizon dashboard (not really surprising). Uploading new images to Glance still worked just fine, and there were no complaints from the Cinder services either.
Looking into the logs I found a hint:

[req-5bd2fef2-2155-4a89-b346-e20fb0b0d14a df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e 2e3c3f3822124a3fa9fd905164f519ae - - -] Failed to snapshot image
 Traceback (most recent call last):
   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1626, in snapshot
     purge_props=False)
   File "/usr/lib/python2.7/site-packages/nova/image/api.py", line 132, in update
     purge_props=purge_props)
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 733, in update
     _reraise_translated_image_exception(image_id)
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 1050, in _reraise_translated_image_exception
     six.reraise(type(new_exc), new_exc, exc_trace)
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 731, in update
     image = self._update_v2(context, sent_service_image_meta, data)
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 745, in _update_v2
     image = self._add_location(context, image_id, location)
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 630, in _add_location
     location, {})
   File "/usr/lib/python2.7/site-packages/nova/image/glance.py", line 168, in call
     result = getattr(controller, method)(*args, **kwargs)
   File "/usr/lib/python2.7/site-packages/glanceclient/v2/images.py", line 340, in add_location
     response = self._send_image_update_request(image_id, add_patch)
   File "/usr/lib/python2.7/site-packages/glanceclient/common/utils.py", line 535, in inner
     return RequestIdProxy(wrapped(*args, **kwargs))
   File "/usr/lib/python2.7/site-packages/glanceclient/v2/images.py", line 324, in _send_image_update_request
     data=json.dumps(patch_body))
   File "/usr/lib/python2.7/site-packages/glanceclient/common/http.py", line 294, in patch
     return self._request('PATCH', url, **kwargs)
   File "/usr/lib/python2.7/site-packages/glanceclient/common/http.py", line 277, in _request
     resp, body_iter = self._handle_response(resp)
   File "/usr/lib/python2.7/site-packages/glanceclient/common/http.py", line 107, in _handle_response
     raise exc.from_response(resp, resp.content)
 ImageNotAuthorized: Not authorized for image e99b2dfd-db33-4475-a51f-af4b913a7041.

Apparently a function that adds a location to an image could not be executed, which seems closely related to the config option I had deactivated…
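
To make the failing call a bit more concrete: judging by the traceback, add_location in glanceclient sends a PATCH request with a JSON-patch body to Glance. The sketch below only builds such a body and sends nothing. The helper name and the placeholder fsid, pool and image name are assumptions for illustration, and the rbd:// URL layout is what I would expect for an RBD-backed location, not something copied from this environment.

```python
import json


def build_add_location_patch(url, metadata=None):
    """Build a JSON-patch document that appends one location to an image.

    Hypothetical helper: it mirrors the kind of body that ends up in the
    PATCH request, but it is not taken from glanceclient itself.
    """
    return [{
        "op": "add",
        "path": "/locations/-",
        "value": {"url": url, "metadata": metadata or {}},
    }]


# Placeholder values, not from the environment described in this post.
snapshot_url = "rbd://{fsid}/{pool}/{image}/snap".format(
    fsid="00000000-0000-0000-0000-000000000000",
    pool="images",
    image="example-image-id",
)

patch_body = build_add_location_patch(snapshot_url)
print(json.dumps(patch_body, indent=2))
```

With show_multiple_locations disabled, Glance refuses this kind of location update, and Nova reports the refusal as the ImageNotAuthorized error seen in the log.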

During the snapshot creation I could see that the snapshot had been created successfully, but the flattening process failed as soon as Glance had to update the image properties (location details of the new image).
Following the stack trace I found the explanation in /usr/lib/python2.7/site-packages/nova/image/glance.py:

def _add_location(self, context, image_id, location):
    # 'show_multiple_locations' must be enabled in glance api conf file.

So it’s not Mitaka anymore, but the code still requires this option? Strange, but alright, I gave it another shot. After setting show_multiple_locations back to true, both cold and live snapshots worked just fine.
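
Since it’s easy to drop the option again during the next cleanup, a quick check of glance-api.conf can help. Here is a minimal sketch using Python’s configparser. The sample config text and the function name are assumptions for illustration, and it only looks at the [DEFAULT] section of a single file, so it ignores anything oslo.config would pick up from other sources.

```python
import configparser

# The two options the Ceph docs quoted above list for copy-on-write cloning.
COW_OPTIONS = ("show_image_direct_url", "show_multiple_locations")

# Illustrative config text mirroring the broken state described in this post.
SAMPLE_GLANCE_API_CONF = """
[DEFAULT]
show_image_direct_url = True
show_multiple_locations = False
"""


def disabled_cow_options(conf_text):
    """Return the options from COW_OPTIONS that are missing or not true."""
    # interpolation=None keeps '%' characters in values from being expanded;
    # strict=False tolerates repeated keys in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return [
        option
        for option in COW_OPTIONS
        if not parser.getboolean("DEFAULT", option, fallback=False)
    ]


for option in disabled_cow_options(SAMPLE_GLANCE_API_CONF):
    print("not enabled in [DEFAULT]: {}".format(option))
```

For a real file you would read its contents and pass the text to the function; the path depends on your distribution.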
An answer from the mailing list cleared things up:

[…] show_multiple_locations is deprecated, but its removal has been
postponed, so you should continue to use it (but keep an eye on the
Glance release notes).
[…]

You can find more information about this here. Basically, the option is a security concern and the plan is to replace it with policies, but that would require major refactoring.

But I’m still wondering: if show_multiple_locations was supposedly only relevant for Mitaka, yet snapshots don’t work without it in a newer version, how are other environments configured, and do snapshots work there? Please feel free to comment if you’d like to share your experience regarding this config option.

Disclaimer, and please leave a comment below

And as always with such articles: the steps above do work for us, but needn’t work for you. So if anything goes wrong while you try to reproduce the above procedure, it’s not our fault, but yours. And it’s yours to fix it! But whether it works for you or not, please leave a comment below so that others will know.
