Bug #803
Closed: mds assert failed replaying journal after respawn
% Done: 0%
Description
osdc/Journaler.h:225: FAILED assert(readonly || state == STATE_READHEAD)
I created a dir with about 500,000 files, then ran "ls" in it. The process ran for many minutes, with the mds process at 100% cpu (and osd processes quite busy too iirc). Whilst it was running, I used injectargs to experiment with a few debug levels, at which point the mds crashed and now won't start back up.
Judging from the mds logs, it was detected as being laggy (possibly due to my messing with debug settings?), got set as down, respawned, but failed to replay the journal (logs attached).
No other processes crashed or were OOM-killed (all 4 osds and 3 mons stayed up).
It's a build from the master branch, commit da6966958471db1dbf20f30e467221338b2b2e7d.
Possibly related to #777.
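For anyone trying to reproduce the load, here is a minimal sketch of the first two steps only: filling one directory with many files and then listing it. The function names, the file naming scheme, and the use of Python are assumptions for illustration; the original files were not necessarily created this way. The injectargs step and the mds respawn are not covered.

```python
import os
import time


def populate(directory, count):
    """Create `count` empty files in `directory` (hypothetical naming scheme)."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        # Open and immediately close: one file create per iteration.
        with open(os.path.join(directory, f"f{i:07d}"), "w"):
            pass


def timed_listing(directory):
    """Enumerate `directory` like a plain `ls`; return (entry count, seconds)."""
    start = time.monotonic()
    entries = 0
    with os.scandir(directory) as it:
        for _ in it:
            entries += 1
    return entries, time.monotonic() - start
```

To approximate the report, call `populate(path, 500000)` with `path` pointing at a directory on the mounted filesystem, then `timed_listing(path)` while watching the mds process. Try a much smaller count first to check the setup.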
Files