Cephadm: migrate block.db/block.wal to new device

A couple of years ago, before cephadm took over Ceph deployments, we wrote an article about migrating DB/WAL devices from slow to fast devices. The procedure has become much easier since then, thanks to ceph-bluestore-tool (or alternatively, ceph-volume). Keep in mind that cephadm-managed clusters typically run their daemons in containers, so the migration of the DB/WAL needs to be performed within the OSD containers.

To keep it brief, I’ll only focus on the DB device (block.db); migrating to a separate WAL device (block.wal) is very similar.

# Create Volume Group
ceph:~ # vgcreate ceph-db /dev/vdf

# Create Logical Volume
ceph:~ # lvcreate -L 5G -n ceph-osd0-db ceph-db
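The 5G in the example above is just what fits this demo cluster. As a hedged rule of thumb (an assumption here, not something the article prescribes), block.db is often sized at a small percentage of the data device; the arithmetic is trivial to sketch:

```shell
# Back-of-the-envelope sizing sketch (assumption: ~2% of the data device
# for block.db is a common starting point; check the BlueStore sizing
# guidance for your Ceph release before committing to a size).
data_bytes=$((100 * 1024 * 1024 * 1024))   # e.g. a 100 GiB data device
db_bytes=$((data_bytes * 2 / 100))          # 2% of the data device
echo "suggested block.db size: $((db_bytes / 1024 / 1024)) MiB"
```

For a 100 GiB data device this suggests a 2048 MiB (2 GiB) DB LV; scale the `data_bytes` value to your actual device.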

# Set the noout flag for OSD.0
ceph:~ # ceph osd add-noout osd.0

# Stop the OSD
ceph:~ # ceph orch daemon stop osd.0

# Enter the OSD shell
ceph:~ # cephadm shell --name osd.0

# Get the OSD's FSID
[ceph: root@ceph /]# OSD_FSID=$(ceph-volume lvm list 0 | awk '/osd fsid/ {print $3}')
[ceph: root@ceph /]# echo $OSD_FSID
fb69ba54-4d56-4c90-a855-6b350d186df5
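The awk filter simply picks the third whitespace-separated field of the "osd fsid" line. You can sanity-check that filter outside the cluster against a captured listing (the excerpt below is illustrative of the `ceph-volume lvm list` layout, not verbatim output):

```shell
# Illustrative excerpt of `ceph-volume lvm list 0` output (layout assumed,
# not verbatim); the fsid is the third field of the "osd fsid" line.
sample='====== osd.0 =======
      block device              /dev/ceph-block/osd-block-0
      osd fsid                  fb69ba54-4d56-4c90-a855-6b350d186df5
      osd id                    0'

OSD_FSID=$(printf '%s\n' "$sample" | awk '/osd fsid/ {print $3}')
echo "$OSD_FSID"   # -> fb69ba54-4d56-4c90-a855-6b350d186df5
```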

# Create the DB device
[ceph: root@ceph /]# ceph-volume lvm new-db --osd-id 0 --osd-fsid $OSD_FSID --target ceph-db/ceph-osd0-db

# Migrate the DB to the new device
[ceph: root@ceph /]# ceph-volume lvm migrate --osd-id 0 --osd-fsid $OSD_FSID --from /var/lib/ceph/osd/ceph-0/block --target ceph-db/ceph-osd0-db

# Exit the shell and start the OSD
ceph:~ # ceph orch daemon start osd.0

# Unset the noout flag
ceph:~ # ceph osd rm-noout osd.0

To verify the new configuration, you can inspect the OSD’s metadata:

ceph:~ # ceph osd metadata 0 -f json | jq -r '.bluefs_dedicated_db,.devices'
1
vdb,vdf

We can confirm that the OSD now has a dedicated DB device.

You can also check the OSD’s perf dump:

ceph:~ # ceph tell osd.0 perf dump bluefs | jq -r '.[].db_total_bytes,.[].db_used_bytes'
5368700928
47185920
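If raw byte counters are hard to read at a glance, a one-liner can turn the two values from the perf dump into a utilization figure (the numbers below are the ones reported above):

```shell
# Compute DB utilization from the two counters shown above
db_total_bytes=5368700928   # db_total_bytes as reported by perf dump
db_used_bytes=47185920      # db_used_bytes as reported by perf dump
awk -v t="$db_total_bytes" -v u="$db_used_bytes" \
    'BEGIN {printf "DB usage: %.1f%% of %.1f GiB\n", 100*u/t, t/2^30}'
```

With the values above this reports roughly 0.9% usage of the 5 GiB DB LV.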

That’s it, the OSD’s DB is now on a different device!

Migrate DB back to main device

If you’re looking for the reverse, that’s also possible. Although this works with ceph-volume as well, for the sake of variety I’ll show the way with ceph-bluestore-tool:

# Set the noout flag
ceph:~ # ceph osd add-noout osd.0

# Stop the OSD
ceph:~ # ceph orch daemon stop osd.0

# Enter the shell
ceph:~ # cephadm shell --name osd.0

# Migrate DB to main device
[ceph: root@ceph /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0/ --command bluefs-bdev-migrate --devs-source /var/lib/ceph/osd/ceph-0/block.db --dev-target /var/lib/ceph/osd/ceph-0/block
inferring bluefs devices from bluestore path
 device removed:1 /var/lib/ceph/osd/ceph-0/block.db

# IMPORTANT: Exit the shell and remove the DB's Logical Volume before you start the OSD! Otherwise the OSD will use it again because of the LV tags.
ceph:~ # lvremove /dev/ceph-db/ceph-osd0-db

# Alternatively, delete the LV tags of the DB LV before starting the OSD.

# Start the OSD
ceph:~ # ceph orch daemon start osd.0

# Unset the noout flag
ceph:~ # ceph osd rm-noout osd.0

Verify the results:

ceph:~ # ceph osd metadata 0 -f json | jq -r '.bluefs_dedicated_db,.devices'
0
vdb

The provided steps were performed on a Reef cluster (version 18.2.4).

Disclaimer: As always with such articles: the steps above work for us, but they needn’t work for you. So if anything goes wrong while you try to reproduce the above procedure, it’s not our fault, but yours. And it’s yours to fix! But whether it works for you or not, please leave a comment below so that others will know.
