Bug #10258
ceph health reporting blocked op indefinitely
Status: Duplicate
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
On the performance test cluster, after creating an EC pool, ceph health still reports a blocked op many hours after pool creation has finished, even though the cluster is otherwise idle.
perf@magna012:~$ ceph --version
ceph version 0.80.7-141-gcb2c83b (cb2c83b2f216e503f7a52115f775bda1dbfe0c6a)
perf@magna012:~$ ceph health detail
HEALTH_WARN 1 requests are blocked > 32 sec; 1 osds have slow requests
1 ops are blocked > 32.768 sec
1 ops are blocked > 32.768 sec on osd.7
1 osds have slow requests
However, osd.7's admin socket shows no blocked ops:
perf@magna012:~$ sudo ceph daemon osd.7 dump_ops_in_flight
{ "num_ops": 0, "ops": []}
No ops are blocked on any other OSD either. Sam indicated he thinks the mon may not be clearing the state properly. Unfortunately, CBT currently waits until all slow operations complete before starting tests, so this stale warning periodically breaks the nightly performance tests (the issue does not occur on every run).
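The cross-check described above (comparing what the mon reports against what each OSD's admin socket reports) could be automated roughly as follows. This is only a sketch, not CBT's actual code: the function names `osds_reported_blocked` and `stale_warnings` are hypothetical, the regular expression assumes the per-OSD line format shown in the `ceph health detail` output in this report, and the inputs are assumed to be the captured text of the two commands.

```python
import json
import re

# Matches per-OSD lines such as "1 ops are blocked > 32.768 sec on osd.7".
# Assumption: the line format is the one shown in this report (0.80.x).
BLOCKED_RE = re.compile(r"(\d+) ops are blocked > ([\d.]+) sec on osd\.(\d+)")


def osds_reported_blocked(health_detail):
    """Return {osd_id: blocked_op_count} parsed from `ceph health detail` text."""
    reported = {}
    for count, _secs, osd_id in BLOCKED_RE.findall(health_detail):
        reported[int(osd_id)] = reported.get(int(osd_id), 0) + int(count)
    return reported


def stale_warnings(health_detail, ops_in_flight_by_osd):
    """Return OSD ids the mon reports as blocked although the OSD's own
    dump_ops_in_flight output (a JSON string) shows zero ops in flight."""
    stale = []
    for osd_id in sorted(osds_reported_blocked(health_detail)):
        raw = ops_in_flight_by_osd.get(osd_id)
        if raw is None:
            continue  # no admin socket output captured for this OSD
        if json.loads(raw)["num_ops"] == 0:
            stale.append(osd_id)
    return stale


# Sample data copied from the outputs in this report.
health = """HEALTH_WARN 1 requests are blocked > 32 sec; 1 osds have slow requests
1 ops are blocked > 32.768 sec
1 ops are blocked > 32.768 sec on osd.7
1 osds have slow requests"""
dumps = {7: '{ "num_ops": 0, "ops": []}'}

print(stale_warnings(health, dumps))
```

A test harness could use a check like this to tell a stale warning from a real slow request before deciding whether to keep waiting. Whether that is an appropriate workaround depends on the root cause in the mon.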
Related issues