Bug #63858

ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use

Added by Gary Ritzer 4 months ago. Updated 28 days ago.

Status: Pending Backport
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy, reef
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are using Rook v1.10 with Ceph v17.2.6 on a hand-built Kubernetes cluster in AWS (not EKS). We needed to increase the storage capacity of this cluster, so I grew the EBS volumes attached to our nodes from 50GiB each to 150GiB. After deleting the pod for osd.0, the `expand-bluefs` container ran and logged two errors about reading the OSD label, but did not appear to fail: the pod continued to start up and the OSD became available. The "using 0x1902b80000(100 GiB)" in the log below is clearly wrong, since it isn't possible to have 100GiB in use on a volume that was only 50GiB. As the `ceph osd df` output shows, the total size of the OSD was updated correctly, but the miscalculated space in use is reported as well.

# Logs from expand-bluefs container
inferring bluefs devices from bluestore path
1 : device size 0x2580000000 : using 0x1902b80000(100 GiB)
Expanding DB/WAL...
1 : expanding  from 0xc80000000 to 0x2580000000
2023-12-19T16:20:26.575+0000 7f95c1d9b880 -1 bluestore(/var/lib/ceph/osd/ceph-0) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-0: (21) Is a directory
2023-12-19T16:20:26.575+0000 7f95c1d9b880 -1 bluestore(/var/lib/ceph/osd/ceph-0) unable to read label for /var/lib/ceph/osd/ceph-0: (21) Is a directory
# Before resize:
bash-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS
 2   nvme  0.04880   1.00000   50 GiB   38 MiB  7.9 MiB   0 B   31 MiB   50 GiB  0.08  1.05   57      up
 0   nvme  0.04880   1.00000   50 GiB   35 MiB  9.9 MiB   0 B   25 MiB   50 GiB  0.07  0.96   62      up
 1   nvme  0.04880   1.00000   50 GiB   37 MiB   11 MiB   0 B   26 MiB   50 GiB  0.07  1.00   60      up
 4   nvme  0.04880   1.00000   50 GiB   37 MiB   11 MiB   0 B   26 MiB   50 GiB  0.07  1.00   64      up
                       TOTAL  200 GiB  147 MiB   39 MiB   0 B  107 MiB  200 GiB  0.07
MIN/MAX VAR: 0.96/1.05  STDDEV: 0.00

# After resize
bash-4.4$ ceph osd df
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP    META     AVAIL    %USE   VAR   PGS  STATUS
 2   nvme  0.04880   1.00000   50 GiB   39 MiB  7.9 MiB     0 B   31 MiB   50 GiB   0.08  0.00   57      up
 0   nvme  0.04880   1.00000  150 GiB  100 GiB   10 MiB  12 KiB  6.2 MiB   50 GiB  66.68  2.00   62      up
 1   nvme  0.04880   1.00000   50 GiB   37 MiB   11 MiB     0 B   26 MiB   50 GiB   0.07  0.00   60      up
 4   nvme  0.04880   1.00000   50 GiB   37 MiB   11 MiB     0 B   26 MiB   50 GiB   0.07  0.00   64      up
                       TOTAL  300 GiB  100 GiB   40 MiB  13 KiB   89 MiB  200 GiB  33.38
MIN/MAX VAR: 0.00/2.00  STDDEV: 33.30

This issue is very repeatable for us. It first happened on a larger cluster similar to this one, where the jump in reported used space pushed the OSD into the `nearfull` state. I have seen this reported to Rook as well; in that case the user worked around it by creating new OSDs (as I have done), but the underlying problem remains and is serious for us.
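For anyone hitting this outside Rook: as far as I can tell, the `expand-bluefs` container just invokes ceph-bluestore-tool against the OSD data directory, so the expansion step can be reproduced by hand roughly like this (the path matches our setup; adjust per OSD):

# Run after growing the underlying device; prints the per-device
# expansion summary shown in the log above
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0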

Also, while this is being investigated, is there some sort of workaround that we could use short of creating new OSDs?


Related issues 2 (1 open, 1 closed)

Copied to bluestore - Backport #64091: reef: ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use (Resolved)
Copied to bluestore - Backport #64092: quincy: ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use (In Progress)
Actions #1

Updated by Igor Fedotov 4 months ago

  • Project changed from Ceph to bluestore
  • Category deleted (ceph cli)
Actions #2

Updated by Igor Fedotov 4 months ago

  • Status changed from New to Triaged
  • Backport set to quincy, reef

The issue is caused by the allocmap not being updated after device expansion. This is specific to the NCB mode of operation, which persists the allocmap on graceful OSD shutdown and rebuilds it after a non-graceful one.
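Assuming the bluestore_allocation_from_file option is what gates NCB mode (the exact option name here is an assumption), an affected OSD can be identified by querying its config:

# "true" means the allocmap lives in a file and is persisted only on a
# clean shutdown, i.e. NCB mode; option name is an assumption here
ceph config get osd.0 bluestore_allocation_from_file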

Actions #3

Updated by Igor Fedotov 4 months ago

Also, while this is being investigated, is there some sort of workaround that we could use short of creating new OSDs?

Once the bluefs-bdev-expand command has been applied, start the 'expanded' OSD and kill the relevant process with 'kill -9'. Then start the OSD again. This will trigger an allocmap rebuild (which might take a few minutes depending on volume utilization and disk performance).

After such a rebuild the actual space utilization should be recovered.
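On a plain (non-containerized) deployment, the workaround translates to roughly the following sketch; the OSD id, unit name, and pgrep pattern are illustrative:

# 1. Expand BlueFS onto the grown device (already done by expand-bluefs in Rook)
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0
# 2. Start the expanded OSD, then kill it uncleanly so the shutdown is non-graceful
systemctl start ceph-osd@0
kill -9 "$(pgrep -f 'ceph-osd .*--id 0')"
# 3. Start the OSD again; the allocmap rebuild happens during startup
systemctl start ceph-osd@0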

Actions #4

Updated by Igor Fedotov 4 months ago

  • Subject changed from ceph-bluestore-tool bluefs-bdev-expand not calculating OSD allocated size correctly to ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use
Actions #5

Updated by Igor Fedotov 4 months ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 54990
Actions #6

Updated by Gary Ritzer 4 months ago

Igor Fedotov wrote:

Also, while this is being investigated, is there some sort of workaround that we could use short of creating new OSDs?

Once the bluefs-bdev-expand command has been applied, start the 'expanded' OSD and kill the relevant process with 'kill -9'. Then start the OSD again. This will trigger an allocmap rebuild (which might take a few minutes depending on volume utilization and disk performance).

After such a rebuild the actual space utilization should be recovered.

Thank you, Igor. Because I am running in Kubernetes, the kill command has no effect, but killing the Docker container for the OSD from the Docker host does. Adding this note in case anyone else hits this.
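For anyone else on Kubernetes, the sequence that worked for me looks roughly like this when run on the node hosting the OSD pod (the name filter is illustrative):

# Find the OSD container and kill it without a graceful stop; Kubernetes
# restarts the pod and the OSD rebuilds its allocmap on the next startup
docker ps --filter "name=osd" --format "{{.ID}} {{.Names}}"
docker kill <container-id>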

Actions #7

Updated by Igor Fedotov 3 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Backport Bot 3 months ago

  • Copied to Backport #64091: reef: ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use added
Actions #9

Updated by Backport Bot 3 months ago

  • Copied to Backport #64092: quincy: ceph-bluestore-tool bluefs-bdev-expand doesn't adjust OSD free space when NCB mode is in use added
Actions #10

Updated by Backport Bot 3 months ago

  • Tags set to backport_processed
Actions #11

Updated by Gary Ritzer about 1 month ago

Hi. I have not seen a Quincy release that contains this fix; is there an ETA for that?

Actions #12

Updated by Igor Fedotov about 1 month ago

Gary Ritzer wrote:

Hi. I have not seen a Quincy release that contains this fix; is there an ETA for that?

Hi Gary,

The Quincy backport PR is pending QA; see https://github.com/ceph/ceph/pull/55776.
Hopefully it will be included in the next minor Quincy release.
