Bug #5401

closed

cuttlefish osd recovery slow

Added by Stefan Priebe almost 11 years ago. Updated almost 11 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: cuttlefish
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

While peering is fine now (Bug #5232) with the latest upstream/cuttlefish, even without wip_cuttlefish_compact_on_startup, the recovery itself is not.

While the OSD is recovering, I'm still seeing a lot of slow requests and stuck I/O on the virtual machines. This also happens at night, when there is no real load on the Ceph storage (ceph -w shows only 60-200 ops).

I also tested these settings:
osd recovery max active = 2
osd_recovery_op_priority = 1
osd_min_pg_log_entries = 300

but they do not help either. All OSDs have their journal on SSDs.
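
The three settings above are written in two spellings (spaces and underscores), which Ceph treats as the same option name. As a minimal sketch, assuming they were set in the [osd] section of ceph.conf (the report does not say whether they were placed there or injected at runtime), the following Python snippet normalizes the names to one spelling and prints the fragment. The helper names normalize and render_osd_section are hypothetical and exist only for this illustration.

```python
# Settings quoted in the report, spelled exactly as the reporter wrote them.
tested = {
    "osd recovery max active": 2,
    "osd_recovery_op_priority": 1,
    "osd_min_pg_log_entries": 300,
}


def normalize(name: str) -> str:
    """Return the option name in lower case with words joined by underscores."""
    words = name.lower().replace("-", " ").replace("_", " ").split()
    return "_".join(words)


def render_osd_section(settings: dict) -> str:
    """Render the settings as an [osd] section (assumed placement, see above)."""
    lines = ["[osd]"]
    for name, value in settings.items():
        lines.append(f"{normalize(name)} = {value}")
    return "\n".join(lines)


print(render_osd_section(tested))
```

Writing all names in one spelling makes it easier to compare the tested values against the running configuration of an OSD.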

Just a very small portion of the OSD log:
2013-06-20 00:22:04.762379 7f7216fb8700 0 log [WRN] : 6 slow requests, 6 included below; oldest blocked for > 30.646148 secs
2013-06-20 00:22:04.762389 7f7216fb8700 0 log [WRN] : slow request 30.646148 seconds old, received at 2013-06-20 00:21:34.115930: osd_op(client.12181171.0:501117 rbd_data.63aa616b8b4567.00000000000097b1 [write 655360~4096] 4.f700780f snapc 603e=[] e95953) v4 currently commit sent
2013-06-20 00:22:04.762392 7f7216fb8700 0 log [WRN] : slow request 30.646131 seconds old, received at 2013-06-20 00:21:34.115947: osd_op(client.12181171.0:501118 rbd_data.63aa616b8b4567.00000000000097b1 [write 663552~4096] 4.f700780f snapc 603e=[] e95953) v4 currently commit sent
2013-06-20 00:22:04.762394 7f7216fb8700 0 log [WRN] : slow request 30.646117 seconds old, received at 2013-06-20 00:21:34.115961: osd_op(client.12181171.0:501119 rbd_data.63aa616b8b4567.00000000000097b1 [write 671744~4096] 4.f700780f snapc 603e=[] e95953) v4 currently commit sent
2013-06-20 00:22:04.762396 7f7216fb8700 0 log [WRN] : slow request 30.646104 seconds old, received at 2013-06-20 00:21:34.115974: osd_op(client.12181171.0:501120 rbd_data.63aa616b8b4567.00000000000097b1 [write 692224~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:04.762400 7f7216fb8700 0 log [WRN] : slow request 30.646079 seconds old, received at 2013-06-20 00:21:34.115999: osd_op(client.12181171.0:501121 rbd_data.63aa616b8b4567.00000000000097b1 [write 708608~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:05.762538 7f7216fb8700 0 log [WRN] : 10 slow requests, 6 included below; oldest blocked for > 31.646542 secs
2013-06-20 00:22:05.762544 7f7216fb8700 0 log [WRN] : slow request 31.646476 seconds old, received at 2013-06-20 00:21:34.116013: osd_op(client.12181171.0:501122 rbd_data.63aa616b8b4567.00000000000097b1 [write 716800~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:05.762546 7f7216fb8700 0 log [WRN] : slow request 31.646458 seconds old, received at 2013-06-20 00:21:34.116031: osd_op(client.12181171.0:501123 rbd_data.63aa616b8b4567.00000000000097b1 [write 724992~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:05.762548 7f7216fb8700 0 log [WRN] : slow request 31.646439 seconds old, received at 2013-06-20 00:21:34.116050: osd_op(client.12181171.0:501124 rbd_data.63aa616b8b4567.00000000000097b1 [write 733184~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:05.762551 7f7216fb8700 0 log [WRN] : slow request 31.646410 seconds old, received at 2013-06-20 00:21:34.116079: osd_op(client.12181171.0:501125 rbd_data.63aa616b8b4567.00000000000097b1 [write 741376~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:05.762553 7f7216fb8700 0 log [WRN] : slow request 31.646398 seconds old, received at 2013-06-20 00:21:34.116091: osd_op(client.12181171.0:501126 rbd_data.63aa616b8b4567.00000000000097b1 [write 749568~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:06.762687 7f7216fb8700 0 log [WRN] : 14 slow requests, 6 included below; oldest blocked for > 32.646670 secs
2013-06-20 00:22:06.762693 7f7216fb8700 0 log [WRN] : slow request 32.646530 seconds old, received at 2013-06-20 00:21:34.116101: osd_op(client.12181171.0:501127 rbd_data.63aa616b8b4567.00000000000097b1 [write 757760~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
2013-06-20 00:22:06.762696 7f7216fb8700 0 log [WRN] : slow request 32.646521 seconds old, received at 2013-06-20 00:21:34.116110: osd_op(client.12181171.0:501128 rbd_data.63aa616b8b4567.00000000000097b1 [write 765952~4096] 4.f700780f snapc 603e=[] e95953) v4 currently no flag points reached
