Bug #18768

rbd rm on empty volumes 2/3 sec per volume

Added by Ben England about 7 years ago. Updated almost 6 years ago.

Status:
Closed
Priority:
Low
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The "rbd rm" command slows to about 2/3 second per volume when 10000 RBD volumes are being deleted, while "rbd create" stays below 150 msec. These are EMPTY volumes, so why is removal so slow?

This may impact use of RBD volumes with Kubernetes containers, where there could be a LOT of RBD volumes.

RHCS 2.1 ~= Ceph Jewel
RPMs end with:
*-10.2.3-13.el7cp.x86_64
RHEL 7.3
kernel 3.10.0-514.el7.x86_64

in https://s3.amazonaws.com/ben-england/ceph/
rbd-create-delete-scaling-2017-02-01.png - the data
mk-rbd-vols.sh - script to create N RBD volumes in a pool
rm-rbd-vols.sh - script to delete N RBD volumes in a pool

  # bash ./mk-rbd-vols.sh 100 my-storage-pool
  # bash ./rm-rbd-vols.sh 100 my-storage-pool

Note that the time it takes to delete is increasing sharply from 1000 to 10000 RBD volumes, suggesting that it is no longer scalable when you reach this level.

It's annoying that RBD tries to provide a progress indicator for deletes by default. I would think this would be something you would do only if the user specifically requested it. What do incremental percentages mean for an empty volume anyway? Suggest that rbd either drop the progress indicator or only output an incremental percentage if the removal takes longer than 3-5 seconds, and then only update it every 3-5 seconds.

History

#1 Updated by Jason Dillaman about 7 years ago

  • Status changed from New to Need More Info
  • Priority changed from Normal to Low

@Ben:

(1) "These are EMPTY volumes": while they are technically empty, when you create 10G images w/o the object map feature enabled, "rbd rm" needs to issue deletes against 2560+ (potential) objects.
(2) "It's annoying that RBD tries to provide a progress indicator for deletes by default": see point (1) above for the rationale of why it's enabled by default. Feel free to add the "--no-progress" option if you don't want feedback.
(3) Can you provide timings for each invocation of "rbd" so that there is more clarity into your dataset?
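
The object count in point (1) follows from dividing the image size by the object size. A minimal sketch of that arithmetic, assuming the default RBD object size of 4 MiB (the thread does not state the object size actually used); the function name is hypothetical:

```shell
# Number of backing objects an RBD image of a given size can span.
# Arguments: image size in GiB, object size in MiB (assumed default: 4 MiB).
objects_for_image() {
    local size_gib=$1 object_mib=${2:-4}
    echo $(( size_gib * 1024 / object_mib ))
}

objects_for_image 10    # prints 2560
```

Without an object map, "rbd rm" has no record of which of these objects exist, so it has to issue a delete for every potential object.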

#2 Updated by Nathan Cutler about 7 years ago

  • Target version deleted (v11.2.1)

#3 Updated by Ben England about 7 years ago

While --no-progress worked, it didn't help. And when I tried to follow your suggestion:

  # rbd create --size 10G --image-feature layering --image-feature object-map ben/v0015
    librbd: cannot use object map without exclusive lock

I read a little more and found out that this doesn't work unless you have exclusive-lock feature, so I tried that and it succeeded, but then I could not map the device:

  # rbd create --size 10G --image-feature layering --image-feature object-map --image-feature exclusive-lock ben/v0016
  # rbd map ben/v0016
    rbd: sysfs write failed
    RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
    In some cases useful info is found in syslog - try "dmesg | tail" or so.
    rbd: map failed: (6) No such device or address

So I want these RBD devices to be mapped with kernel RBD because they are intended to be used by Linux containers. How can I do what you suggest?

#4 Updated by Jason Dillaman about 7 years ago

@Ben: I wasn't suggesting that "--no-progress" would improve speed, I was responding to your strong opinions.

For the short term, you are correct about which features krbd supports. Longer term, (1) krbd will get support for object map (it already has support for exclusive lock), and (2) it appears that Kubernetes and other container management systems are moving towards qemu-nbd / qemu-tcmu implementations, which would eliminate the use of krbd in favor of librbd.

#5 Updated by Ben England about 7 years ago

But you were totally correct about the object-map feature: it cuts the time of removal from approx 0.4 sec to approx 0.12 sec, or ~3.3x faster. At this point, removes and creates take roughly the same amount of time, which is what I'd expect.

In this short 100-volume test, I just added "time" in front of the rbd command, logged that to a file, extracted the elapsed time in seconds, and calculated statistics from that.
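
A timing loop of this kind might look like the sketch below. It is not the script used for these results: the function name, the `RBD` override variable, and the `v%04d` naming pattern (inferred from image names such as ben/v0015) are assumptions for illustration. It relies on bash's `TIMEFORMAT` variable to print only the elapsed seconds.

```shell
#!/bin/bash
# Hypothetical sketch: time each "rbd rm" and print one elapsed time
# (in seconds) per line. RBD can be overridden for a dry run without a cluster.
RBD=${RBD:-rbd}

time_rbd_rm() {
    local count=$1 pool=$2 i
    local TIMEFORMAT=%R    # make the bash "time" keyword print elapsed seconds only
    for (( i = 0; i < count; i++ )); do
        # "time" reports on the group's stderr; the command's own output is discarded
        { time "$RBD" rm --no-progress "$(printf '%s/v%04d' "$pool" "$i")" >/dev/null 2>&1; } 2>&1
    done
}
```

For example, `time_rbd_rm 100 ben > /tmp/run2` would leave one elapsed time per line in the log file, ready to be summarized.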

With --object-map:

[root@gprfs033-10ge ~]# ./extract-time.sh /tmp/run2 | ./statistics
100 = number of samples
0.128330 = sample mean
0.005879 = standard deviation
0.124000 = min value
0.129000 = 80th percentile
0.131000 = 90th percentile
0.167000 = max value

Without --object-map:

[root@gprfs033-10ge ~]# ./extract-time.sh /tmp/run3 | ./statistics
100 = number of samples
0.394690 = sample mean
0.004950 = standard deviation
0.388000 = min value
0.397000 = 80th percentile
0.398000 = 90th percentile
0.425000 = max value
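
The ./statistics helper is not included in this report. A minimal stand-in, assuming one elapsed time per line on stdin and a sample (n - 1) standard deviation, could be sketched as follows; the function name is hypothetical and percentiles are omitted:

```shell
# Hypothetical stand-in for ./statistics: reads one number per line on stdin
# and prints count, mean, sample standard deviation, min and max.
summarize_times() {
    sort -n | awk '
        { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
        END {
            if (NR == 0) exit 1
            mean = sum / NR
            # sample variance (n - 1 denominator); clamp rounding noise at zero
            var = (NR > 1) ? (sumsq - NR * mean * mean) / (NR - 1) : 0
            if (var < 0) var = 0
            printf "%d = number of samples\n", NR
            printf "%.6f = sample mean\n", mean
            printf "%.6f = standard deviation\n", sqrt(var)
            printf "%.6f = min value\n", v[1]
            printf "%.6f = max value\n", v[NR]
        }'
}
```

Because the input is sorted first, the first and last array entries are the minimum and maximum.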

#6 Updated by Ben England almost 6 years ago

This issue was submitted against kernel RBD in Ceph Jewel, but the kernel RBD implementation has changed. The object map feature improved performance on RBD volume delete to be on par with RBD volume create. I think we have enough progress on this issue to close it.

#7 Updated by Jason Dillaman almost 6 years ago

  • Status changed from Need More Info to Closed
