Bug #50903

closed

ceph_objectstore_tool: Slow ops reported during the test.

Added by Sridhar Seshasayee almost 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Actions #1

Updated by Sridhar Seshasayee almost 3 years ago

  • Subject changed from :$ to ceph_objectstore_tool: Slow ops reported during the test.

JobId:
/a/sseshasa-2021-05-17_11:08:21-rados-wip-sseshasa-testing-2021-05-17-1504-distro-basic-smithi/6118306

Observations:
ceph_objectstore_tool actually passed. The test failed because slow ops were reported in the cluster log during the course of the test, as part of the “Checking cluster for badness” task.

failure_reason: '"2021-05-17T12:22:57.121856+0000 osd.2 (osd.2) 1 : cluster [WRN]
  slow request osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected
  e36) initiated 2021-05-17T12:22:26.217784+0000 currently delayed" in cluster log'
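
As an aside, here is a minimal sketch of the “Checking cluster for badness” idea, assuming a plain-text cluster log; the helper name and path are hypothetical and this is not the actual teuthology check. It simply scans the cluster log for [WRN] slow request entries like the one above, any of which is enough to fail the run.

import re

# Hypothetical helper: flag a run as bad if the cluster log contains any
# "[WRN] ... slow request" entries (an illustration only, not the real
# teuthology "Checking cluster for badness" implementation).
SLOW_RE = re.compile(r'\[WRN\].*\bslow request\b')

def find_slow_requests(cluster_log_path):
    """Return all slow-request warning lines found in the cluster log."""
    with open(cluster_log_path) as f:
        return [line.rstrip() for line in f if SLOW_RE.search(line)]

# Usage (path is hypothetical):
# hits = find_slow_requests('ceph.log')
# if hits:
#     print('failure_reason:', hits[0])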

The above op was re-queued multiple times because the concerned PG, 2.1, was not in the active state. The
PG was handling peering events until epoch 40, at which point the op completed.

The grepped log lines below show the op in question:

2021-05-17T12:22:26.216+0000 7f1a9011c700 15 osd.2 36 enqueue_op 0x5640eab429a0 prio 63 type 42 cost 16 latency 0.000019 epoch 36 osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected e36) v8
2021-05-17T12:22:26.216+0000 7f1a6737a700 10 osd.2 36 dequeue_op 0x5640eab429a0 prio 63 cost 16 latency 0.000071 osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected e36) v8 pg pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=35) [2,3] r=0 lpr=35 pi=[17,35)/1 crt=0'0 mlcod 0'0 peering mbc={}]
2021-05-17T12:22:26.216+0000 7f1a6737a700 10 osd.2 pg_epoch: 36 pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=35) [2,3] r=0 lpr=35 pi=[17,35)/1 crt=0'0 mlcod 0'0 peering mbc={}] can_handle_while_inactive: 0x5640eab429a0
2021-05-17T12:22:26.216+0000 7f1a6737a700 10 osd.2 36 dequeue_op 0x5640eab429a0 finish
2021-05-17T12:22:59.401+0000 7f1a6737a700 20 osd.2 pg_epoch: 38 pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=38) [2] r=0 lpr=38 pi=[17,38)/1 crt=0'0 mlcod 0'0 unknown mbc={}] requeue_op 0x5640eab429a0
2021-05-17T12:22:59.401+0000 7f1a6b382700 10 osd.2 38 dequeue_op 0x5640eab429a0 prio 63 cost 16 latency 33.184278 osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected e36) v8 pg pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=38) [2] r=0 lpr=38 pi=[17,38)/1 crt=0'0 mlcod 0'0 peering mbc={}]
2021-05-17T12:22:59.401+0000 7f1a6b382700 10 osd.2 pg_epoch: 38 pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=38) [2] r=0 lpr=38 pi=[17,38)/1 crt=0'0 mlcod 0'0 peering mbc={}] can_handle_while_inactive: 0x5640eab429a0
2021-05-17T12:22:59.401+0000 7f1a6b382700 10 osd.2 38 dequeue_op 0x5640eab429a0 finish
2021-05-17T12:23:00.407+0000 7f1a6b382700 20 osd.2 pg_epoch: 39 pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 unknown mbc={}] requeue_op 0x5640eab429a0
2021-05-17T12:23:00.407+0000 7f1a6b382700 10 osd.2 39 dequeue_op 0x5640eab429a0 prio 63 cost 16 latency 34.190553 osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected e36) v8 pg pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 peering mbc={}]
2021-05-17T12:23:00.407+0000 7f1a6b382700 10 osd.2 pg_epoch: 39 pg[2.1( empty local-lis/les=17/18 n=0 ec=17/17 lis/c=17/17 les/c/f=18/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 peering mbc={}] can_handle_while_inactive: 0x5640eab429a0
2021-05-17T12:23:00.407+0000 7f1a6b382700 10 osd.2 39 dequeue_op 0x5640eab429a0 finish
2021-05-17T12:23:01.407+0000 7f1a6737a700 20 osd.2 pg_epoch: 40 pg[2.1( empty local-lis/les=39/40 n=0 ec=17/17 lis/c=39/17 les/c/f=40/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 active mbc={}] requeue_op 0x5640eab429a0
2021-05-17T12:23:01.408+0000 7f1a6737a700 10 osd.2 40 dequeue_op 0x5640eab429a0 prio 63 cost 16 latency 35.190918 osd_op(client.4143.0:4 2.1 2.3e6a8311 (undecoded) ondisk+retry+read+known_if_redirected e36) v8 pg pg[2.1( empty local-lis/les=39/40 n=0 ec=17/17 lis/c=39/17 les/c/f=40/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 active mbc={}]
2021-05-17T12:23:01.408+0000 7f1a6737a700 10 osd.2 pg_epoch: 40 pg[2.1( empty local-lis/les=39/40 n=0 ec=17/17 lis/c=39/17 les/c/f=40/18/0 sis=39) [2,3] r=0 lpr=39 pi=[17,39)/1 crt=0'0 mlcod 0'0 active mbc={}] _handle_message: 0x5640eab429a0
2021-05-17T12:23:01.408+0000 7f1a6737a700 10 osd.2 40 dequeue_op 0x5640eab429a0 finish
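
For reference, a small sketch that follows a single op through an OSD log by its pointer and prints each dequeue_op latency; the helper and log path are hypothetical, but the regex mirrors the dequeue_op lines quoted above and makes the requeue-until-active pattern easy to see.

import re

# Hypothetical helper: print (timestamp, epoch, latency) for every dequeue of
# a given op pointer, matching dequeue_op lines like the ones quoted above.
DEQUEUE_RE = re.compile(
    r'^(?P<ts>\S+)\s+\S+\s+\d+\s+osd\.\d+\s+(?P<epoch>\d+)\s+dequeue_op\s+'
    r'(?P<ptr>0x[0-9a-f]+)\s+prio\s+\d+\s+cost\s+\d+\s+latency\s+(?P<lat>[\d.]+)'
)

def trace_op(osd_log_path, op_ptr):
    """Show how long the op had been queued each time it was dequeued."""
    with open(osd_log_path) as f:
        for line in f:
            m = DEQUEUE_RE.match(line)
            if m and m.group('ptr') == op_ptr:
                print(m.group('ts'), 'epoch', m.group('epoch'),
                      'latency', m.group('lat'))

# Usage (log path is hypothetical):
# trace_op('osd.2.log', '0x5640eab429a0')
# -> latencies of 0.000071, 33.184278, 34.190553 and 35.190918 for the op above.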
Actions #2

Updated by Sridhar Seshasayee almost 3 years ago

  • Project changed from Ceph to RADOS
  • ceph-qa-suite rados added
Actions #3

Updated by Sridhar Seshasayee almost 3 years ago

  • Assignee set to Sridhar Seshasayee

This issue is related to the changes currently under review: https://github.com/ceph/ceph/pull/41308

The ceph_objectstore_tool test uses filestore, and as part of the "Test pg import" test all the OSDs are restarted.
Due to the "osd bench" run introduced into the OSD initialization sequence by the above PR, the osd bench test took roughly 21 to 31 secs per OSD to complete with the filestore backend (about 31 secs for osd.0), as seen below:

osd.0
-----

2021-05-17T12:20:54.319+0000 7f353887d0c0  1 osd.0 24  bench count 12288000 bsize 4 KiB
...
2021-05-17T12:21:25.614+0000 7f353887d0c0  1 osd.0 24 maybe_override_max_osd_capacity_for_qos osd bench result - bandwidth (MiB/sec): 134.632 iops: 34465.816 elapsed_sec: 0.087

osd.1
-----

2021-05-17T12:21:30.496+0000 7f840dad40c0  1 osd.1 24  bench count 12288000 bsize 4 KiB
...
2021-05-17T12:21:56.826+0000 7f840dad40c0  1 osd.1 24 maybe_override_max_osd_capacity_for_qos osd bench result - bandwidth (MiB/sec): 122.683 iops: 31406.912 elapsed_sec: 0.096

osd.2
-----

2021-05-17T12:22:01.394+0000 7f1a9718f0c0  1 osd.2 24  bench count 12288000 bsize 4 KiB
...
2021-05-17T12:22:22.784+0000 7f1a9718f0c0  1 osd.2 24 maybe_override_max_osd_capacity_for_qos osd bench result - bandwidth (MiB/sec): 117.199 iops: 30003.054 elapsed_sec: 0.100

osd.3
------

2021-05-17T12:22:27.397+0000 7f8da34ec0c0  1 osd.3 24  bench count 12288000 bsize 4 KiB
...
2021-05-17T12:22:58.705+0000 7f8da34ec0c0  1 osd.3 24 maybe_override_max_osd_capacity_for_qos osd bench result - bandwidth (MiB/sec): 133.168 iops: 34090.917 elapsed_sec: 0.088

This was observed for all the OSDs in question. The restart of osd.3 coincided with the arrival of the osd_op.
Since osd.3 didn't finish initializing until 2021-05-17T12:22:58, peering was delayed and therefore the osd_op
0x5640eab429a0 was delayed as well. Due to this, a slow op was recorded in the cluster log, which ultimately caused the
test to be marked as failed.

The above slowness due to the osd bench test is not observed with the bluestore backend (on SSDs).
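
To quantify the above, a quick sketch (the helper and log paths are hypothetical) that measures the wall-clock gap between the "bench count" line and the "osd bench result" line in an OSD log, i.e. the durations quoted above:

import re
from datetime import datetime

# Hypothetical helper: compute how long osd bench kept the OSD busy during
# init by diffing the timestamps of the bench start and bench result lines.
TS_RE = re.compile(r'^(\S+)\s')

def parse_ts(line):
    # Log timestamps look like 2021-05-17T12:20:54.319+0000
    return datetime.strptime(TS_RE.match(line).group(1),
                             '%Y-%m-%dT%H:%M:%S.%f%z')

def bench_duration(osd_log_path):
    """Return the elapsed seconds between bench start and bench result."""
    start = end = None
    with open(osd_log_path) as f:
        for line in f:
            if ' bench count ' in line and start is None:
                start = parse_ts(line)
            elif 'osd bench result' in line:
                end = parse_ts(line)
    return (end - start).total_seconds() if start and end else None

# Usage (log path is hypothetical); for the osd.0 excerpt above this comes
# out to roughly 31 seconds:
# print(bench_duration('osd.0.log'))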

Solutions:

1. Short-term: Limit the number of objects written by the osd bench test to 100. This improves the osd init time and avoids
the issue altogether.
2. Long-term: Store the osd bench result in the config DB. On subsequent restarts of an osd, use the historical data
if present in the config DB and skip running osd bench (see the sketch after this list).
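
A rough Python sketch of the long-term idea (the real change would live in the C++ OSD init path; the cache file here is just a stand-in for the config DB, and all names are hypothetical): run osd bench only when no previously stored result exists for the OSD.

import json
import os

CACHE_PATH = 'osd_bench_cache.json'  # hypothetical stand-in for the config DB

def _load_cache():
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f)
    return {}

def get_osd_capacity(osd_id, run_bench):
    """Reuse a stored bench result for osd_id; run the bench only once."""
    cache = _load_cache()
    key = 'osd.%d' % osd_id
    if key in cache:
        return cache[key]          # historical data present: skip osd bench
    iops = run_bench()             # expensive on filestore/HDD during init
    cache[key] = iops
    with open(CACHE_PATH, 'w') as f:
        json.dump(cache, f)
    return iops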

Actions #4

Updated by Sridhar Seshasayee almost 3 years ago

  • Status changed from New to Closed

Closing this since the issue was hit during teuthology testing of my PR: https://github.com/ceph/ceph/pull/41308. The issue was related to the changes made in that PR, which has since been modified to prevent this issue from occurring.
