Bug #51463
closedblocked requests while stopping/starting OSDs
0%
Description
Hi,
we run into a lot of slow requests. (IO blocked for several seconds) while stopping or starting one or more OSDs. With Nautilus this wasn't a issue at all.
We set the slow op warning to 5 seconds because our application (which uses librados native) has a timeout of 6 seconds.
I could drill it down to this new introduced read leases: https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/
For testing I set the mentioned option "osd_pool_default_read_lease_ratio" from default 0.8 to 0.2 which obviously resolve this issue. But I don't know if there are other implications in setting this down.
As well I wonder if this read leases could or should not be invalidated in case of osd up/down events.
I could reproduce this issue with a quite small test clusters with 3 Nodes รก 5 OSDs
I tested it with ceph octopus 15.2.13
Manuel
Files
Updated by Josh Durgin almost 3 years ago
- Project changed from Ceph to RADOS
- Category deleted (
OSD)
Updated by Manuel Lausch over 2 years ago
This is still a issue- In the newest Pacific release (16.2.5) as well
The developer documentation mentioned above talks about preventing reads from stray OSDs. I'am aware that this could be a issue while stopping OSDs. But I don't understand why I get laggy PGs wen starting OSDs.
please have a look at this issue.This is a showstopper which prevents us from upgrade our clusters from luminous to octopus ore pacific.
Updated by Neha Ojha over 2 years ago
Is it possible for you to share your test reproducer with us? It would be great if we could run it against a vstart cluster.
Updated by Manuel Lausch over 2 years ago
Sure.
Simple cluster with 5 nodes 125 OSDs in total
one pool replicated size 3, min_size 1
at least this in the ceph.conf
osd op complaint time = 5
running some benchmark. for example
rados bench -p testbench 100 write --no-cleanup
And now stop / start some OSDs.
In my case I will see in ceph -s output, and in the ceph.log also, some slow op alerts after stopping and starting some osds. Up to about 10 - 15 seconds blocked.
The client (ceph bench in this case) recognizes this slow operations as well.
please let me know if you need further information
Updated by Sage Weil over 2 years ago
- Status changed from New to Need More Info
I easily reproduced this with 'osd fast shutdown = false' (vstart default), but was unable to do so with 'osd fast shutdown = true' (the normal default).
Updated by Manuel Lausch over 2 years ago
- File disabled_fastshutdown_stop_osd-ceph.log disabled_fastshutdown_stop_osd-ceph.log added
- File default_stop_osd-ceph.log default_stop_osd-ceph.log added
- File default_start_osd-ceph.log default_start_osd-ceph.log added
Hi Sage,
I tested it with fast shutdown enabled (default) and disabled. In both cases I got slow ops (longer than 5 seconds) after stopping one OSDs.
attached three ceph.log snippets.
with enabled fast shutdown (default) it took about 2 seconds until the first "immediately failed" message appeared after stopping one OSD. On bigger clusters this took some more time which is why I disabled the fast shutdown per default.
with disabled fast_shutdown I get immediately the down message and the cluster begins its peering. And then the slow op messages begin.
After starting the OSD again I got in both cases slow ops too
Updated by Maximilian Stinsky about 2 years ago
I think we hit the same issue while upgrading our nautilus cluster to pacific.
While I did not hit this when testing the upgrade in our lab environment, we saw a low of slow requests and laggy pg's after we upgraded the first couple of osd hosts in our production cluster.
After we finished the upgrade I can start/stop/restart osd's without any issue but while doing the upgrade we had slight impact because of the slow requests and laggy pg's.
Updated by Radoslaw Zarzynski about 2 years ago
We suspect this ticker is actually a duplicate of https://tracker.ceph.com/issues/53327.
If somebody could test the fix, it would really helpful.
Updated by Manuel Lausch about 2 years ago
yes. This is fixed with this two tasks:
https://tracker.ceph.com/issues/53327
https://tracker.ceph.com/issues/53326
Updated by Radoslaw Zarzynski about 2 years ago
- Status changed from Need More Info to Resolved