OpenStack: Upgrade to high availability (Part V)

If you read the previous posts of this little series (hopefully you found them interesting), this last post might interest you as well, because it concludes the series. I’ll focus on the main aspects we had to consider when the actual migration was about to happen. The Pacemaker cluster was up and running, as were all the required services.

To be able to import the upgraded database, the Galera cluster had to be bootstrapped with empty OpenStack databases. Only the keystone-manage setup and bootstrap commands were executed, to create all the required directories under /etc/keystone (e.g. credential-keys, fernet-keys):

keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
keystone-manage credential_setup --keystone-user keystone --keystone-group keystone
keystone-manage bootstrap --bootstrap-password KEYSTONE_PASS \
--bootstrap-admin-url http://controller.domain:5000/v3/ \
--bootstrap-internal-url http://controller.domain:5000/v3/ \
--bootstrap-public-url http://controller.domain:5000/v3/ \
--bootstrap-region-id RegionOne
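
The empty databases themselves had been created on the Galera cluster beforehand. A minimal sketch of what that looks like for Keystone (the database user and password are placeholders, and the same has to be repeated for nova, neutron, cinder, glance and so on):

MariaDB [(none)]> CREATE DATABASE keystone;
MariaDB [(none)]> GRANT ALL PRIVILEGES ON keystone.* TO 'keystone'@'%' IDENTIFIED BY 'KEYSTONE_DBPASS';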

Before the migration could start we declared a maintenance window for the old cloud environment and asked all colleagues not to create or delete any instances, or do anything similar that would alter the database. A fresh database dump was created and imported into the “ocata-vm” (see the sketch below) to replay all the upgrade steps again. Luckily it all went well, and much quicker this time since I had practiced it multiple times. :-)
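
The dump and import themselves are nothing special. Roughly (a sketch with placeholder hostnames, file name and database list, assuming mysqldump on the old control node):

old-control:~ # mysqldump --single-transaction --databases keystone glance nova neutron cinder > openstack-dump.sql
ocata-vm:~ # mysql < openstack-dump.sql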

The database was imported into the new controller successfully; I just had to edit the Keystone endpoints again to switch from ocata-vm to the virtual controller.domain (a quick sketch of that follows below). One service after the other was started, beginning with Apache (Keystone and the Horizon dashboard). There were several small issues to resolve, but eventually the new cloud environment was up and running, and even the login of our LDAP domain users to the Horizon dashboard worked as expected! Now it was time to migrate the instances, including the switch from linuxbridge to openvswitch.
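
Editing the endpoints can be done with the CLI; a sketch (the endpoint IDs are placeholders, and depending on the release it may be easier to simply recreate the endpoints):

controller01:~ # openstack endpoint list --service keystone
controller01:~ # openstack endpoint set --url http://controller.domain:5000/v3/ ENDPOINT_ID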

Neutron

The first thing to change was the Neutron database, to reflect our new network architecture with openvswitch. We have a mix of self-service and (flat) provider networks; this is an excerpt of our network segments:

# Old neutron table
MariaDB [neutron]> select network_type,physical_network,segmentation_id from neutron.networksegments;
+--------------+------------------+-----------------+
| network_type | physical_network | segmentation_id |
+--------------+------------------+-----------------+
| flat         | dmz              | NULL            |
| vlan         | physnet1         | 1137            |
| vlan         | physnet1         | 1169            |
| flat         | prod             | NULL            |
| vlan         | physnet1         | 1199            |
| flat         | pxe              | NULL            |
| vlan         | physnet1         | 1130            |
| flat         | team             | NULL            |
| flat         | floating         | NULL            |
+--------------+------------------+-----------------+

The previously flat provider networks, which had no segmentation_id, now required one, and their network_type changed from “flat” to “vlan”.

# Update database for provider networks
MariaDB [neutron]> update networksegments set physical_network='provider',network_type='vlan',segmentation_id='300' where id='UUID';

# Update database for self-service networks
MariaDB [neutron]> update networksegments set physical_network=NULL,network_type='vxlan' where network_id='UUID';

# Repeat for all networks
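# A sketch, not necessarily what we ran: since all self-service segments lived
# on physnet1, they could probably also be converted with one bulk statement
MariaDB [neutron]> update networksegments set network_type='vxlan',physical_network=NULL where network_type='vlan' and physical_network='physnet1';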

# New neutron table
MariaDB [neutron]> select network_type,physical_network,segmentation_id from neutron.networksegments;
+--------------+------------------+-----------------+
| network_type | physical_network | segmentation_id |
+--------------+------------------+-----------------+
| vlan         | provider         | 100             |
| vxlan        | NULL             | 1137            |
| vxlan        | NULL             | 1169            |
| vlan         | provider         | 300             |
+--------------+------------------+-----------------+

The previous flat networks were now mapped to the br-provider bridge we configured in openvswitch, while all the self-service network traffic would be handled internally via VXLAN without a physical network. That seemed to work quite well; newly created instances had their interfaces successfully attached to openvswitch. For existing instances we first had to figure out what other changes were necessary, but I’ll get to that later.
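
For reference, the relevant part of the openvswitch agent configuration on our nodes looks roughly like this (a sketch; apart from the bridge mapping itself, the file name and the tunnel endpoint IP are placeholders):

# openvswitch_agent.ini
[ovs]
bridge_mappings = provider:br-provider
local_ip = TUNNEL_ENDPOINT_IP

[agent]
tunnel_types = vxlan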

Another thing to keep in mind is to disable the routers of the self-service networks in the old environment if you want floating IPs to be reachable in the new cloud; otherwise the old control node will still reply to client requests and you’ll end up with broken connections. This was not a problem for provider networks, of course, because they all have external routers. I could have shut down the old control node, but people were still working in the old environment since we wanted to migrate the machines one by one. This procedure required some planning because some of our projects contain VMs in shared self-service networks, so we couldn’t simply disable the router. All instances that weren’t required to be up all the time were shut down so we could decrease the downtime for the rest of the instances in those networks.
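
Disabling such a router on the old control node is a one-liner (a sketch; the router ID is a placeholder):

old-control:~ # openstack router set --disable ROUTER_ID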

Cinder

All existing volumes in the Cinder database still pointed to the old control node, so more database changes were necessary to make those volumes manageable. But it was straightforward:

MariaDB [cinder]> update volumes set host=replace(host,'old-control', 'controller') where not status='deleted';
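
Depending on the Cinder release there is also a cinder-manage helper that achieves the same without touching the database directly (a sketch; the exact host@backend strings are assumptions, check the options available in your version first):

controller01:~ # cinder-manage volume update_host --currenthost old-control@rbd --newhost controller@rbd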

There’s one more thing to be aware of when Cinder is configured to be highly available: usually a Cinder back-end is identified by the control node it runs on, so in our case we would have ended up with two different Cinder back-ends (controller01 and controller02), only one of which would be up at a time because cinder-volume is a stateful service. In case of a fail-over, the volumes managed by the failed back-end would become unavailable (or at least not manageable). The solution was to configure Cinder to use the same (virtual) hostname on both control nodes:

# cinder.conf
[DEFAULT]
[...]
host = controller

controller01:~ # openstack volume service list
+------------------+--------------------+------+---------+-------+
| Binary           | Host               | Zone | Status  | State |
+------------------+--------------------+------+---------+-------+
| cinder-scheduler | controller         | nova | enabled | up    |
| cinder-backup    | controller         | nova | enabled | up    |
| cinder-volume    | controller@rbd     | nova | enabled | up    |
| cinder-volume    | controller@rbd2    | nova | enabled | up    |
| cinder-volume    | controller@ceph-ec | nova | enabled | up    |
+------------------+--------------------+------+---------+-------+

This fix for Cinder was easy and left us with a fail-safe highly available Cinder service.

Nova

Now we’re getting to the most important part: migrating the instances to the new compute nodes. To migrate an instance, it was first shut down properly in the old cloud to prevent data loss. Then the new database had to be altered slightly (compute05 is one of the newly installed compute nodes in the new environment). These are the required steps to make it work (comments inline):

# Update the (new) database
MariaDB [nova]> update instances set host='compute05',node='compute05.domain' where uuid='UUID';

# Hard reboot forces a recreation of the xml definition on that compute node
controller01:~ # nova reboot --hard <UUID>

# Because of cached network settings the instance boots with wrong interface
# excerpt from nova-compute.log
2020-08-12 INFO os_vif [...] Successfully plugged vif VIFBridge
(active=True,address=fa:16:3e:f9:82:bf,bridge_name='brq5b999016-b6',
has_traffic_filtering=True,id=b0483deb-f89b-4e59-8cdc-b285a22d2ef8,
network=Network(5b999016-b679-4e08-82e0-2d1c569015c2),
plugin='linux_bridge',port_profile=<?>,preserve_on_delete=True,vif_name='tapb0483deb-f8')

# The cached interface is stored here
MariaDB [nova]> select * from nova.instance_info_caches where instance_uuid='UUID';

# DON'T CHANGE THIS TABLE!
# Instead detach and reattach the interface

controller01:~ # nova interface-list UUID
+------------+--------------------------------------+--------------------------------------+----------------+-------------------+
| Port State | Port ID                              | Net ID                               |IP addresses    | MAC Addr          |
+------------+--------------------------------------+--------------------------------------+----------------+-------------------+
| ACTIVE     | b0483deb-f89b-4e59-8cdc-b285a22d2ef8 | 5b999016-b679-4e08-82e0-2d1c569015c2 | xxx.xxx.xxx.xx | fa:16:3e:f9:82:bf |
+------------+--------------------------------------+--------------------------------------+----------------+-------------------+

# Instances were shutdown before detaching and attaching interfaces
controller01:~ # nova interface-detach UUID b0483deb-f89b-4e59-8cdc-b285a22d2ef8

# nova-compute.log
Successfully unplugged vif VIFBridge(active=True,address=fa:16:3e:f9:82:bf,
bridge_name='brq5b999016-b6',has_traffic_filtering=True,id=b0483deb-f89b-4e59-8cdc-b285a22d2ef8,
network=Network(5b999016-b679-4e08-82e0-2d1c569015c2),plugin='linux_bridge',port_profile=<?>,
preserve_on_delete=True,vif_name='tapb0483deb-f8')

INFO nova.network.neutron [...] [instance: UUID] Port b0483deb-f89b-4e59-8cdc-b285a22d2ef8 from network info_cache 
is no longer associated with instance in Neutron. Removing from network info_cache.

# reattach interface
controller01:~ # nova interface-attach --port-id b0483deb-f89b-4e59-8cdc-b285a22d2ef8 UUID

# compute node reports
Successfully plugged vif VIFBridge(active=True,address=fa:16:3e:f9:82:bf,
bridge_name='qbrb0483deb-f8',has_traffic_filtering=True,id=b0483deb-f89b-4e59-8cdc-b285a22d2ef8,
network=Network(5b999016-b679-4e08-82e0-2d1c569015c2),plugin='ovs',
port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tapb0483deb-f8')

# start instance
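# e.g. with the nova CLI ("openstack server start UUID" works as well)
controller01:~ # nova start UUID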

You see, Neutron cleans up after itself and removes the network cache. Reattaching the interface resulted in a correct configuration and the instances were reachable again.

However, for some of our instances in self-service networks Neutron not only removed the cache but deleted the port entirely! That was unexpected and I don’t really have a clue why it happened, but since the old control node with the port information was still available, we simply recreated the deleted port(s) and attached them to the instances. For a few instances it wasn’t even necessary to detach the port; for some reason they booted with the correct interface configuration on the first attempt, and I have no idea why.
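
Recreating such a port boils down to something like this (a sketch with placeholder values; the MAC address and fixed IP are taken from the port data on the old control node, the recreated port itself gets a new ID):

controller01:~ # openstack port create --network NET_ID --mac-address MAC_ADDRESS \
--fixed-ip ip-address=IP_ADDRESS recovered-port
controller01:~ # nova interface-attach --port-id NEW_PORT_ID UUID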

Although it took quite a lot of preparation and manual intervention, it eventually worked: every single instance could be migrated successfully! There are still some unresolved issues, for example openvswitch (or rather the VXLAN encapsulation of the self-service networks) creates more overhead than the previous linuxbridge/VLAN setup, resulting in broken connections in self-service networks. The default MTU is 1500, but after some debugging we figured out that our instances in self-service networks require an MTU of 1422. Until we find a better solution, our workaround is to change the network’s MTU:

controller01:~ # openstack network set --mtu 1422 NETWORK_ID
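
Incidentally, 1422 is exactly a path MTU of 1472 minus the 50 bytes of VXLAN (IPv4) overhead, so adjusting the MTU calculation in the Neutron configuration might be the cleaner long-term fix. A sketch of the relevant options (the values are assumptions derived from our observations, not something we currently run):

# neutron.conf
[DEFAULT]
global_physnet_mtu = 1500

# ml2_conf.ini
[ml2]
path_mtu = 1472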

That was it: our long journey of upgrading our single-control-node environment to a highly available Pacemaker cluster came to a successful end. I haven’t heard any complaints (yet) from my colleagues, so it seems to still be working quite well. :-)

I hope you read this with interest or even enjoyed it! As I already mentioned in a previous post, all of this was prepared for our specific environment, so the documented steps will (most likely) not apply to a different setup. If anything goes wrong while you try to reproduce the above procedure, it’s not our fault, but yours, and it’s yours to fix! But whether it works for you or not, please leave a comment below so that others will know.
