Project

General

Profile

Actions

Bug #64869

open

rados/thrash: slow reservation response from 1 (115547ms) in cluster log

Added by Sridhar Seshasayee 2 months ago. Updated about 2 months ago.

Status:
Pending Backport
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
backport_processed
Backport:
squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
rados
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587833

The cluster log warning is raised since scrub reservation exceeded the default timeout. I didn't
look into the specifics further. Should this be added to the ignorelist?

Logs:

2024-03-10T02:31:32.736+0000 7f70cdae0640 20 osd.6 op_wq(4) _process 4.c to_process <OpSchedulerItem(4.c PGOpItem(op=MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3) class_id 2 prio 127 cost 0 e270)> waiting <> waiting_peering {}
2024-03-10T02:31:32.736+0000 7f70cdae0640 20 osd.6 op_wq(4) _process OpSchedulerItem(4.c PGOpItem(op=MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3) class_id 2 prio 127 cost 0 e270) pg 0x5635ab21c000
2024-03-10T02:31:32.736+0000 7f70cdae0640 10 osd.6 330 dequeue_op MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3 prio 127 cost 0 latency 0.402707 MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3 pg pg[4.c( v 143'128 (136'125,143'128] local-lis/les=259/260 n=4 ec=24/24 lis/c=259/259 les/c/f=260/260/0 sis=259) [6,3,1] r=0 lpr=259 crt=143'128 lcod 143'127 mlcod 0'0 active+clean+scrubbing [ 4.c:  ] ]
2024-03-10T02:31:32.740+0000 7f70cdae0640 10 osd.6 330 maybe_share_map: con v1:172.21.15.67:6809/4149191103 our osdmap epoch of 330 is not newer than session's projected_epoch of 330
2024-03-10T02:31:32.740+0000 7f70cdae0640 10 osd.6 pg_epoch: 330 pg[4.c( v 143'128 (136'125,143'128] local-lis/les=259/260 n=4 ec=24/24 lis/c=259/259 les/c/f=260/260/0 sis=259) [6,3,1] r=0 lpr=259 crt=143'128 lcod 143'127 mlcod 0'0 active+clean+scrubbing [ 4.c:  ] ] _handle_message: MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3
2024-03-10T02:31:32.740+0000 7f70cdae0640 10 osd.6 pg_epoch: 330 pg[4.c( v 143'128 (136'125,143'128] local-lis/les=259/260 n=4 ec=24/24 lis/c=259/259 les/c/f=260/260/0 sis=259) [6,3,1] r=0 lpr=259 crt=143'128 lcod 143'127 mlcod 0'0 active+clean+scrubbing [ 4.c:  ] ] scrubber<Session/ReservingReplicas>: handle_scrub_reserve_msgs: MOSDScrubReserve(4.c,GRANT,e:270,from:1,reservation_nonce:0) v3
2024-03-10T02:31:32.740+0000 7f70cdae0640 20  scrubberFSM  event: --vvvv---- ReplicaGrant
2024-03-10T02:31:32.740+0000 7f70cdae0640 10 osd.6 pg_epoch: 330 pg[4.c( v 143'128 (136'125,143'128] local-lis/les=259/260 n=4 ec=24/24 lis/c=259/259 les/c/f=260/260/0 sis=259) [6,3,1] r=0 lpr=259 crt=143'128 lcod 143'127 mlcod 0'0 active+clean+scrubbing [ 4.c:  ] ] scrubber<Session/ReservingReplicas>: FSM: ReservingReplicas::react(const ReplicaGrant&)
2024-03-10T02:31:32.740+0000 7f70cdae0640  0 log_channel(cluster) log [WRN] : slow reservation response from 1 (115547ms)

Related issues 1 (0 open1 closed)

Copied to RADOS - Backport #65072: squid: rados/thrash: slow reservation response from 1 (115547ms) in cluster logResolvedRonen FriedmanActions
Actions

Also available in: Atom PDF