Bug #51463

blocked requests while stopping/starting OSDs

Added by Manuel Lausch over 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%

Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

We run into a lot of slow requests (IO blocked for several seconds) while stopping or starting one or more OSDs. With Nautilus this wasn't an issue at all.
We set the slow op warning to 5 seconds because our application (which uses librados natively) has a timeout of 6 seconds.

I could narrow it down to the newly introduced read leases: https://docs.ceph.com/en/latest/dev/osd_internals/stale_read/

For testing I lowered the mentioned option "osd_pool_default_read_lease_ratio" from its default of 0.8 to 0.2, which apparently resolves the issue. But I don't know whether lowering it has other implications.
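
A rough sanity check of why the lower ratio helps. This is only a sketch: it assumes the read lease interval is osd_heartbeat_grace multiplied by the read lease ratio, and that osd_heartbeat_grace is at its default of 20 seconds. The helper name read_lease_interval is made up for illustration.

```shell
# Rough estimate only: assumes lease = osd_heartbeat_grace * read_lease_ratio
# and osd_heartbeat_grace at its default of 20 seconds.
read_lease_interval() {
    # $1 = osd_heartbeat_grace in seconds, $2 = read lease ratio
    awk -v grace="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", grace * ratio }'
}

read_lease_interval 20 0.8   # default ratio
read_lease_interval 20 0.2   # ratio used in my test
```

Under these assumptions the lease shrinks from 16.0 to 4.0 seconds, which moves it from above our 5-second slow op warning (and 6-second client timeout) to below it.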

I also wonder whether these read leases could or should be invalidated on OSD up/down events.

I could reproduce this issue with a fairly small test cluster of 3 nodes with 5 OSDs each.
I tested it with Ceph Octopus 15.2.13.

Manuel

disabled_fastshutdown_stop_osd-ceph.log (23.9 KB) Manuel Lausch, 11/03/2021 09:25 AM

default_stop_osd-ceph.log (23.3 KB) Manuel Lausch, 11/03/2021 09:25 AM

default_start_osd-ceph.log (12.7 KB) Manuel Lausch, 11/03/2021 09:25 AM

History

#1 Updated by Josh Durgin over 2 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (OSD)

#2 Updated by Josh Durgin over 2 years ago

  • Priority changed from Normal to High

#3 Updated by Manuel Lausch over 2 years ago

This is still an issue, in the newest Pacific release (16.2.5) as well.

The developer documentation mentioned above talks about preventing reads from stray OSDs. I am aware that this could be an issue while stopping OSDs, but I don't understand why I get laggy PGs when starting OSDs.

Please have a look at this issue. This is a showstopper that prevents us from upgrading our clusters from Luminous to Octopus or Pacific.

#4 Updated by Neha Ojha over 2 years ago

Is it possible for you to share your test reproducer with us? It would be great if we could run it against a vstart cluster.

#5 Updated by Manuel Lausch over 2 years ago

Sure.

A simple cluster with 5 nodes and 125 OSDs in total,
with one replicated pool (size 3, min_size 1).

At least this in ceph.conf:
osd op complaint time = 5

Run some benchmark, for example:
rados bench -p testbench 100 write --no-cleanup

And now stop / start some OSDs.
In my case I see slow op alerts in the ceph -s output, and also in ceph.log, after stopping and starting some OSDs. Requests are blocked for up to about 10 to 15 seconds.
The client (rados bench in this case) sees these slow operations as well.
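
The steps above could be scripted roughly as follows. This is a sketch, not the exact procedure I ran: it assumes the pool testbench already exists, that OSDs run as systemd units named ceph-osd@<id> (a non-containerized deployment), and the OSD id and sleep durations are arbitrary. The run helper and the DRY_RUN variable are made up for illustration; by default the script only prints the commands.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them against a test cluster.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    osd_id="${1:-0}"      # hypothetical OSD id to stop and start
    pool="testbench"      # pool name from the benchmark command above

    # Pool settings as described: replicated, size 3, min_size 1.
    run ceph osd pool set "$pool" size 3
    run ceph osd pool set "$pool" min_size 1

    # Client load in the background.
    run rados bench -p "$pool" 100 write --no-cleanup &
    bench_pid=$!

    run sleep 20                                # let the benchmark ramp up
    run systemctl stop "ceph-osd@${osd_id}"     # assumed unit name
    run sleep 30                                # window to observe slow ops
    run ceph health detail
    run systemctl start "ceph-osd@${osd_id}"
    run sleep 30
    run ceph health detail

    wait "$bench_pid"
}
```

Calling `reproduce 3` prints the command sequence for OSD 3; `DRY_RUN=0 reproduce 3` would execute it, with osd op complaint time = 5 set as shown above so that the slow op alerts appear.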

Please let me know if you need further information.

#6 Updated by Sage Weil over 2 years ago

  • Status changed from New to Need More Info

I easily reproduced this with 'osd fast shutdown = false' (vstart default), but was unable to do so with 'osd fast shutdown = true' (the normal default).

#7 Updated by Manuel Lausch over 2 years ago

Hi Sage,

I tested it with fast shutdown enabled (default) and disabled. In both cases I got slow ops (longer than 5 seconds) after stopping one OSD.

I attached three ceph.log snippets.

With fast shutdown enabled (default), it took about 2 seconds until the first "immediately failed" message appeared after stopping one OSD. On bigger clusters this took somewhat longer, which is why I disabled fast shutdown by default.

With fast shutdown disabled, I get the down message immediately and the cluster begins peering. Then the slow op messages begin.

After starting the OSD again I got slow ops in both cases, too.

#8 Updated by Maximilian Stinsky about 2 years ago

I think we hit the same issue while upgrading our Nautilus cluster to Pacific.
While I did not hit this when testing the upgrade in our lab environment, we saw a lot of slow requests and laggy PGs after we upgraded the first couple of OSD hosts in our production cluster.
After we finished the upgrade I can start/stop/restart OSDs without any issue, but during the upgrade we had a slight impact because of the slow requests and laggy PGs.

#9 Updated by Radoslaw Zarzynski almost 2 years ago

We suspect this ticket is actually a duplicate of https://tracker.ceph.com/issues/53327.
If somebody could test the fix, it would be really helpful.

#11 Updated by Radoslaw Zarzynski almost 2 years ago

  • Status changed from Need More Info to Resolved
