Bug #51463
closed
blocked requests while stopping/starting OSDs
Added by Manuel Lausch almost 3 years ago.
Updated about 2 years ago.
Description
Hi,
We run into a lot of slow requests (I/O blocked for several seconds) while stopping or starting one or more OSDs. With Nautilus this wasn't an issue at all.
We set the slow op warning to 5 seconds because our application (which uses librados natively) has a timeout of 6 seconds.
I could narrow it down to the newly introduced read leases: https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/
For testing I set the mentioned option "osd_pool_default_read_lease_ratio" from its default of 0.8 to 0.2, which obviously resolves the issue. But I don't know whether lowering it has other implications.
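For reference, a sketch of how the ratio can be lowered for such a test (assuming the mon config database is used; with the default osd_heartbeat_grace of 20 seconds, a ratio of 0.8 corresponds to a lease of roughly 16 seconds and 0.2 to roughly 4 seconds, if I read the stale_read doc correctly):
ceph config set osd osd_pool_default_read_lease_ratio 0.2
# or persistently in ceph.conf under [osd]:
# osd_pool_default_read_lease_ratio = 0.2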
I also wonder whether these read leases could or should be invalidated on OSD up/down events.
I could reproduce this issue with a quite small test cluster of 3 nodes with 5 OSDs each.
I tested it with Ceph Octopus 15.2.13.
Manuel
- Project changed from Ceph to RADOS
- Category deleted (OSD)
- Priority changed from Normal to High
This is still an issue in the newest Pacific release (16.2.5) as well.
The developer documentation mentioned above talks about preventing reads from stray OSDs. I'm aware that this could be an issue while stopping OSDs, but I don't understand why I get laggy PGs when starting OSDs.
Please have a look at this issue. This is a showstopper which prevents us from upgrading our clusters from Luminous to Octopus or Pacific.
Is it possible for you to share your test reproducer with us? It would be great if we could run it against a vstart cluster.
Sure.
Simple cluster with 5 nodes, 125 OSDs in total.
One pool, replicated, size 3, min_size 1.
At least this in the ceph.conf:
osd op complaint time = 5
Run some benchmark, for example:
rados bench -p testbench 100 write --no-cleanup
And now stop/start some OSDs (a full command sketch is below).
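A rough end-to-end sketch of the reproducer (pool name "testbench" is from the bench command above; the PG count and OSD id are only placeholders; assumes systemd-managed OSDs):
ceph osd pool create testbench 128
ceph osd pool set testbench size 3
ceph osd pool set testbench min_size 1
rados bench -p testbench 100 write --no-cleanup &
systemctl stop ceph-osd@3       # watch ceph -s / ceph.log for slow op alerts
sleep 30
systemctl start ceph-osd@3      # slow ops show up again when the OSD rejoins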
In my case I see slow op alerts in the ceph -s output, and in the ceph.log as well, after stopping and starting some OSDs. Requests are blocked for up to about 10-15 seconds.
The client (rados bench in this case) notices these slow operations as well.
Please let me know if you need further information.
- Status changed from New to Need More Info
I easily reproduced this with 'osd fast shutdown = false' (vstart default), but was unable to do so with 'osd fast shutdown = true' (the normal default).
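For anyone repeating this comparison, a sketch of how the flag can be toggled (assuming the option name from the comment above; value changes take effect on the next OSD shutdown):
ceph config set osd osd_fast_shutdown false
# or in ceph.conf:
# [osd]
# osd fast shutdown = false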
Hi Sage,
I tested it with fast shutdown enabled (default) and disabled. In both cases I got slow ops (longer than 5 seconds) after stopping one OSD.
Attached are three ceph.log snippets.
With fast shutdown enabled (the default) it took about 2 seconds until the first "immediately failed" message appeared after stopping one OSD. On bigger clusters this takes some more time, which is why we disable fast shutdown by default.
With fast shutdown disabled I get the down message immediately and the cluster begins peering. Then the slow op messages start.
After starting the OSD again I got slow ops in both cases, too.
I think we hit the same issue while upgrading our Nautilus cluster to Pacific.
While I did not hit this when testing the upgrade in our lab environment, we saw a lot of slow requests and laggy PGs after we upgraded the first couple of OSD hosts in our production cluster.
After we finished the upgrade I can start/stop/restart OSDs without any issue, but while doing the upgrade we had a slight impact because of the slow requests and laggy PGs.
- Status changed from Need More Info to Resolved