Bug #3658
[closed] osd/mon: stops processing pg stat messages
% Done: 0%
Source: Q/A
Description
nuke-on-error: true
overrides:
  ceph:
    branch: next
    conf:
      client:
        debug ms: 1
        log max new: 1
        rbd cache: true
      global:
        ms inject socket failures: 5000
      osd:
        debug ms: 1
        debug osd: 20
    fs: ext4
    log-whitelist:
    - slow request
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 500
Looks like the osd is sending stat messages, but at some point the mon stops receiving and acking them. The mon eventually marks the osd down (after the 900s timeout).
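To make the failure mode concrete, here is a minimal sketch of the behavior described above: if stat reports from one osd stop being processed, that osd ages out after the timeout even though it is still sending. This is not Ceph code. `StatTracker`, `handle_pg_stats`, and `osds_to_mark_down` are hypothetical names for illustration, and only the 900s value comes from this report.

```python
REPORT_TIMEOUT = 900.0  # seconds; the timeout mentioned in this report


class StatTracker:
    """Hypothetical stand-in for the mon's bookkeeping; not Ceph code."""

    def __init__(self, timeout=REPORT_TIMEOUT):
        self.timeout = timeout
        self.last_report = {}  # osd id -> time of last processed stat message

    def handle_pg_stats(self, osd, now):
        # Record the report; returning True models sending the ack.
        self.last_report[osd] = now
        return True

    def osds_to_mark_down(self, now):
        # Any osd whose last processed report is older than the timeout.
        return sorted(osd for osd, t in self.last_report.items()
                      if now - t > self.timeout)


tracker = StatTracker()
tracker.handle_pg_stats(0, now=0.0)
tracker.handle_pg_stats(1, now=0.0)
# osd.1 keeps getting processed; osd.0's later messages never are.
tracker.handle_pg_stats(1, now=600.0)
stale = tracker.osds_to_mark_down(now=901.0)
```

In this model only osd 0 ends up in `stale`, which matches the symptom: the osd is alive and sending, but from the mon's side it looks silent.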
Updated by Sage Weil over 11 years ago
- Status changed from 12 to Resolved
pretty sure this was caused by the log bug and 'log max new = 1', fixed by 50914e7a429acddb981bc3344f51a793280704e6.
still saw this on masseffect, tracking that in #3661