Bug #65545
Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request
% Done: 0%
Source:
Tags: backport_processed
Backport: squid
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): quiesce
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Reported by the QE team at https://bugzilla.redhat.com/show_bug.cgi?id=2275459
2024-04-17T07:26:33.666+0000 7fa0a7c0b640 10 quiesce.mgr.44189 <sanitize_roots> Normalized root '/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626' to 'file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626'
...
2024-04-17T07:26:33.666+0000 7fa0a840c640 10 quiesce.mds.0 <operator()> submit_request: value:file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626
2024-04-17T07:26:33.667+0000 7fa0a840c640 10 quiesce.agt <agent_thread_main> got request handle < mds.0:3431> for 'file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626'
...
2024-04-17T07:26:33.669+0000 7fa0a840c640 10 quiesce.mds.0 <operator()> submit_request: value:file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626
...
2024-04-17T07:26:33.670+0000 7fa0a840c640 10 quiesce.agt <agent_thread_main> got request handle < mds.0:3437> for 'file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626'
...
2024-04-17T07:26:33.674+0000 7fa0a7c0b640 5 quiesce.mgr.44189 <leader_upkeep_set> [cg_test1_p00@106,file:/volumes/_nogroup/sv_def_6/c37e6c79-a83a-4b1d-96e8-16584f440626] reported by at least one peer as: QS_FAILED (6)
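This problem is caused by a race condition that appears when multiple db updates are posted to the agent in rapid succession.
When new roots have begun processing but have not yet been added to the currently tracked set, there is a window in which the next update carrying the same roots treats them as new and submits them to the MDCache again under the same quiesce request. In the log above, the same root receives two request handles, mds.0:3431 and mds.0:3437.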
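
The window described above can be modelled with a small, deterministic Python sketch. This is a toy model, not Ceph's C++ agent code: the class ToyAgent, its methods, the track_pending flag and the example root path are all hypothetical names introduced for illustration. The guard shown (also consulting roots that are in flight) is one plausible way to close the window and is not necessarily what the linked pull request does.

```python
class ToyAgent:
    """Hypothetical model of an agent that receives db updates (not Ceph code)."""

    def __init__(self, track_pending):
        self.track_pending = track_pending
        self.current = set()   # roots already recorded in the tracked set
        self.pending = set()   # roots that began processing but are not recorded yet
        self.submitted = []    # every simulated submit_request, in order

    def db_update(self, roots):
        """Handle one db update: submit every root that looks new."""
        if self.track_pending:
            known = self.current | self.pending   # guarded: in-flight roots count
        else:
            known = set(self.current)             # racy: in-flight roots are invisible
        for root in roots:
            if root not in known:
                self.submitted.append(root)
                self.pending.add(root)
                known.add(root)

    def agent_step(self):
        """Record roots that began processing in the tracked set."""
        self.current |= self.pending
        self.pending.clear()


def run(track_pending):
    root = "file:/volumes/_nogroup/example_root"  # hypothetical root
    agent = ToyAgent(track_pending)
    agent.db_update([root])   # first update starts processing the root
    agent.db_update([root])   # second update arrives before agent_step runs
    agent.agent_step()
    return agent.submitted


racy = run(track_pending=False)
guarded = run(track_pending=True)
print(len(racy), len(guarded))
```

In the racy run the second update does not see the root in the tracked set and submits it again, which mirrors the two submit_request lines in the log. In the guarded run the root is submitted once.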
Updated by Leonid Usov 13 days ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 56956
Updated by Leonid Usov 12 days ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot 12 days ago
- Copied to Backport #65570: squid: Quiesce may fail randomly with EBADF due to the same root submitted to the MDCache multiple times under the same quiesce request added