blocked requests while stopping/starting OSDs
We run into a lot of slow requests (I/O blocked for several seconds) while stopping or starting one or more OSDs. With Nautilus this wasn't an issue at all.
We set the slow op warning threshold to 5 seconds because our application (which uses librados natively) has a timeout of 6 seconds.
I was able to track it down to the newly introduced read leases: https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/
For testing I set the mentioned option "osd_pool_default_read_lease_ratio" from its default of 0.8 down to 0.2, which resolves this issue. But I don't know whether lowering it has other implications.
I also wonder whether these read leases could or should be invalidated on OSD up/down events.
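For reference, the runtime change can be sketched like this (the option name is taken from the text above; the relationship to osd_heartbeat_grace reflects my reading of the stale-read documentation, so treat the numbers as an assumption):

```shell
# Sketch: lower the read lease ratio cluster-wide, not a tuning recommendation.
# The read lease length is derived from the heartbeat grace period, so with
# the default osd_heartbeat_grace of 20s:
#   0.8 * 20s = 16s lease (default)
#   0.2 * 20s =  4s lease
ceph config set osd osd_pool_default_read_lease_ratio 0.2

# Verify the value an OSD actually sees (osd.0 is just an example id):
ceph config get osd.0 osd_pool_default_read_lease_ratio
```

A shorter lease shortens how long a PG can stay laggy after a peer stops responding, but presumably also increases lease-renewal traffic, which is exactly the unknown implication mentioned above.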
I could reproduce this issue on a quite small test cluster with 3 nodes of 5 OSDs each.
I tested it with Ceph Octopus 15.2.13.
#3 Updated by Manuel Lausch over 2 years ago
This is still an issue in the newest Pacific release (16.2.5) as well.
The developer documentation mentioned above talks about preventing reads from stray OSDs. I am aware that this could be an issue while stopping OSDs, but I don't understand why I get laggy PGs when starting OSDs.
Please have a look at this issue. This is a showstopper which prevents us from upgrading our clusters from Luminous to Octopus or Pacific.
#5 Updated by Manuel Lausch over 2 years ago
Simple cluster with 5 nodes and 125 OSDs in total
one replicated pool with size 3, min_size 1
at least this in the ceph.conf:
osd op complaint time = 5
Run some benchmark, for example:
rados bench -p testbench 100 write --no-cleanup
Now stop/start some OSDs.
In my case I see slow op alerts in the ceph -s output, and in the ceph.log as well, after stopping and starting some OSDs; requests are blocked for up to about 10-15 seconds.
The client (rados bench in this case) notices these slow operations as well.
Please let me know if you need further information.
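The steps above could be scripted roughly like this (pool name, PG count, and OSD id are just examples; the systemd unit name assumes a standard package install):

```shell
# Create a 3x replicated test pool with min_size 1 (64 PGs is an example).
ceph osd pool create testbench 64 64
ceph osd pool set testbench size 3
ceph osd pool set testbench min_size 1

# Run a write benchmark for 100 seconds in the background.
rados bench -p testbench 100 write --no-cleanup &

# While the benchmark runs, stop and later restart one OSD.
sudo systemctl stop ceph-osd@0
sleep 30
sudo systemctl start ceph-osd@0

# Watch for slow op / laggy PG warnings while the OSD goes down and comes back.
ceph -s
```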
#7 Updated by Manuel Lausch over 2 years ago
- File disabled_fastshutdown_stop_osd-ceph.log added
- File default_stop_osd-ceph.log added
- File default_start_osd-ceph.log added
I tested it with fast shutdown enabled (the default) and disabled. In both cases I got slow ops (longer than 5 seconds) after stopping one OSD.
Attached are three ceph.log snippets.
With fast shutdown enabled (the default) it took about 2 seconds after stopping one OSD until the first "immediately failed" message appeared. On bigger clusters this takes some more time, which is why we disable fast shutdown by default.
With fast shutdown disabled I get the down message immediately and the cluster begins peering. Then the slow op messages begin.
After starting the OSD again I got slow ops in both cases too.
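For completeness, fast shutdown was toggled via the osd_fast_shutdown option, e.g. (a sketch; the option exists in Octopus and Pacific, whether an OSD restart is needed for it to take effect is an assumption worth checking):

```shell
# Disable fast shutdown so a stopping OSD announces itself down cleanly
# instead of simply dropping its connections.
ceph config set osd osd_fast_shutdown false

# Check the effective value on one daemon (osd.0 is just an example id):
ceph config get osd.0 osd_fast_shutdown
```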
#8 Updated by Maximilian Stinsky about 2 years ago
I think we hit the same issue while upgrading our Nautilus cluster to Pacific.
While I did not hit this when testing the upgrade in our lab environment, we saw a lot of slow requests and laggy PGs after we upgraded the first couple of OSD hosts in our production cluster.
After we finished the upgrade I can start/stop/restart OSDs without any issue, but during the upgrade we had a slight impact because of the slow requests and laggy PGs.