Bug #65669
openQuiesceDB responds with a misleading error to a quiesce-await of a terminated set.
Status: Fix Under Review
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport: squid
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS, quiesce
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This design decision turns out to be counterintuitive when encountered in the wild.
Here the --await was sent with a delay and must have found the set already timed out. However, the return code suggests that the set was still active and only reached its timeout after the await command was received:
2024-04-25T20:36:15.770 DEBUG:tasks.quiescer.fs.[cephfs]:Running ceph command: 'tell mds.24479 quiesce db --set-id d960ac51 --await-for 42.0'
...
2024-04-25T20:36:46.709 ERROR:tasks.quiescer.fs.[cephfs]:Couldn't quiesce root with rc: 110 (ETIMEDOUT), stdout: {
  "epoch": 33,
  "leader": 24479,
  "set_version": 6,
  "sets": {
    "d960ac51": {
      "version": 6,
      "age_ref": 86.5,
      "state": {
        "name": "TIMEDOUT",
        "age": 0.0
      },
      "timeout": 60.0,
      "expiration": 86.2,
      "members": {
        "file:/": {
          "excluded": false,
          "state": {
            "name": "QUIESCING",
            "age": 60.0
          }
        }
      }
    }
  }
}
It would be less surprising to receive an EPERM in this case, which would indicate that the calling side has a wrong idea of the current state of the set.
Additionally, this would be consistent with how `--release --await` behaves: it returns EPERM for a `QS_EXPIRED` set, while a pending release-await that began with `QS_RELEASING` reports ETIMEDOUT if the set fails to release before it expires.
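The proposed rule can be sketched as a small decision function. This is only an illustration of the behavior requested above, not the QuiesceDB implementation: the helper name `await_rc`, the short state names, and the choice of which states count as terminal are assumptions made for the example. Outcomes other than the two discussed here are collapsed into a success code for brevity.

```python
import errno

# State names follow this report (TIMEDOUT, EXPIRED, QUIESCING, RELEASING).
# Treating exactly these two as "terminated" is an assumption of the sketch.
TERMINAL_STATES = {"TIMEDOUT", "EXPIRED"}


def await_rc(state_on_arrival: str, state_on_completion: str) -> int:
    """Hypothetical helper: choose the return code for an await request.

    state_on_arrival    -- set state when the await command was received
    state_on_completion -- set state when the await finished
    """
    if state_on_arrival in TERMINAL_STATES:
        # The set was already terminated when the await arrived:
        # the caller's view of the set is stale, so report EPERM.
        return errno.EPERM
    if state_on_completion in TERMINAL_STATES:
        # The set was active on arrival and terminated while the
        # await was pending: this is a genuine timeout.
        return errno.ETIMEDOUT
    # Simplification: every other outcome is treated as success here.
    return 0
```

With this rule, the await in the log above would return EPERM if the set was already TIMEDOUT on arrival, and ETIMEDOUT would be reserved for a set that times out while the await is pending.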
Updated by Leonid Usov 9 days ago
- Status changed from New to In Progress
- Pull request ID set to 57099
Updated by Leonid Usov 9 days ago
- Status changed from In Progress to Fix Under Review
Updated by Patrick Donnelly 4 days ago
- Category set to Correctness/Safety
- Target version set to v20.0.0
- Source set to Development