Bug #51463

blocked requests while stopping/starting OSDs

Added by Manuel Lausch over 2 years ago. Updated almost 2 years ago.

Severity: 3 - minor
We run into a lot of slow requests (IO blocked for several seconds) while stopping or starting one or more OSDs. With Nautilus this wasn't an issue at all.
We set the slow op warning to 5 seconds because our application (which uses native librados) has a timeout of 6 seconds.

I could narrow it down to the newly introduced read leases:

For testing I lowered the mentioned option "osd_pool_default_read_lease_ratio" from its default of 0.8 to 0.2, which obviously resolves the issue. But I don't know whether lowering it has other implications.
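
A rough sketch of why 0.2 might hide the warning. It assumes the read lease is osd_heartbeat_grace multiplied by osd_pool_default_read_lease_ratio, and that osd_heartbeat_grace is at a default of 20 seconds; neither is stated in this report. The helper names read_lease and exceeds are made up for illustration.

```shell
#!/bin/sh
# Sketch only. Assumption: read lease = osd_heartbeat_grace * osd_pool_default_read_lease_ratio,
# with osd_heartbeat_grace = 20 s (assumed default, not taken from this report).

read_lease() {
    # $1 = heartbeat grace in seconds, $2 = read lease ratio
    awk -v grace="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", grace * ratio }'
}

exceeds() {
    # Exit status 0 if the lease ($1) is longer than the threshold ($2).
    awk -v lease="$1" -v limit="$2" 'BEGIN { exit !(lease > limit) }'
}

# Compare both ratios against the 5 s slow op warning used here.
for ratio in 0.8 0.2; do
    lease=$(read_lease 20 "$ratio")
    if exceeds "$lease" 5; then
        echo "ratio=$ratio lease=${lease}s: can outlast the 5 s slow op warning"
    else
        echo "ratio=$ratio lease=${lease}s: stays below the 5 s slow op warning"
    fi
done
```

Under these assumptions the default ratio gives a lease well above both the 5 second warning and the 6 second client timeout, while 0.2 gives one just below the warning threshold. That would match the observation, but it does not say whether the shorter lease is safe.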

I also wonder whether these read leases could or should be invalidated on OSD up/down events.

I could reproduce this issue on a quite small test cluster with 3 nodes of 5 OSDs each.
I tested it with Ceph Octopus 15.2.13.


#1 Updated by Josh Durgin over 2 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)

#2 Updated by Josh Durgin over 2 years ago

  • Priority changed from Normal to High

#3 Updated by Manuel Lausch over 2 years ago

This is still an issue, in the newest Pacific release (16.2.5) as well.

The developer documentation mentioned above talks about preventing reads from stray OSDs. I'm aware that this could be an issue while stopping OSDs, but I don't understand why I get laggy PGs when starting OSDs.

Please have a look at this issue. This is a showstopper that prevents us from upgrading our clusters from Luminous to Octopus or Pacific.

#4 Updated by Neha Ojha over 2 years ago

Is it possible for you to share your test reproducer with us? It would be great if we could run it against a vstart cluster.

#5 Updated by Manuel Lausch over 2 years ago


Simple cluster with 5 nodes, 125 OSDs in total.
One pool, replicated, size 3, min_size 1.

At least this in ceph.conf:
osd op complaint time = 5

Run some benchmark, for example:
rados bench -p testbench 100 write --no-cleanup

Now stop and start some OSDs.
In my case I see slow op alerts in the ceph -s output, and in ceph.log as well, after stopping and after starting some OSDs. Requests are blocked for up to about 10 to 15 seconds.
The client (rados bench in this case) notices these slow operations as well.
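
The steps above could be scripted roughly as follows. This is a hypothetical sketch, not the script that was actually used: the OSD id, the sleep durations, and the systemd unit name ceph-osd@<id> are assumptions that depend on the deployment. It only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical reproducer sketch. OSD_ID, the sleep durations, and the
# systemd unit name ceph-osd@<id> are assumptions; adjust to the deployment.
POOL="${POOL:-testbench}"
OSD_ID="${OSD_ID:-0}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the commands

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

reproduce() {
    # Write load in the background, as in the steps above.
    run rados bench -p "$POOL" 100 write --no-cleanup &
    bench_pid=$!
    run sleep 20
    # Stop one OSD, then look for slow op warnings.
    run systemctl stop "ceph-osd@$OSD_ID"
    run sleep 30
    run ceph -s
    # Start it again and check once more.
    run systemctl start "ceph-osd@$OSD_ID"
    run sleep 30
    run ceph -s
    wait "$bench_pid"
}

reproduce
```

Running it with DRY_RUN=0 on a test cluster executes the commands for real; the slow op alerts should then show up in the two ceph -s outputs and in ceph.log.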

Please let me know if you need further information.

#6 Updated by Sage Weil over 2 years ago

  • Status changed from New to Need More Info

I easily reproduced this with 'osd fast shutdown = false' (vstart default), but was unable to do so with 'osd fast shutdown = true' (the normal default).

#7 Updated by Manuel Lausch over 2 years ago

Hi Sage,

I tested it with fast shutdown enabled (the default) and disabled. In both cases I got slow ops (longer than 5 seconds) after stopping one OSD.

Attached are three ceph.log snippets.

With fast shutdown enabled (the default), it took about 2 seconds until the first "immediately failed" message appeared after stopping one OSD. On bigger clusters this took somewhat longer, which is why I disabled fast shutdown by default.

With fast shutdown disabled, I get the down message immediately and the cluster begins peering. Then the slow op messages begin.

After starting the OSD again I got slow ops in both cases, too.

#8 Updated by Maximilian Stinsky about 2 years ago

I think we hit the same issue while upgrading our Nautilus cluster to Pacific.
While I did not hit this when testing the upgrade in our lab environment, we saw a lot of slow requests and laggy PGs after we upgraded the first couple of OSD hosts in our production cluster.
Now that the upgrade is finished I can start, stop, and restart OSDs without any issue, but during the upgrade we had a slight impact because of the slow requests and laggy PGs.

#9 Updated by Radoslaw Zarzynski almost 2 years ago

We suspect this ticket is actually a duplicate of
If somebody could test the fix, it would be really helpful.

#11 Updated by Radoslaw Zarzynski almost 2 years ago

  • Status changed from Need More Info to Resolved

