Actions
Bug #451
closedmds: replay error
% Done:
0%
Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2010-10-01 03:30:59.816017 7f991dcf7710 mds0.journaler try_read_entry at 822192580 reading 822192580~4 (have 160437) 2010-10-01 03:30:59.816022 7f991dcf7710 mds0.journaler try_read_entry got 0 len entry at offset 822192580 2010-10-01 03:30:59.816026 7f991dcf7710 mds0.log _replay journaler got error -22, aborting 2010-10-01 03:30:59.816030 7f991dcf7710 mds0.log _replay_thread kicking waiters 2010-10-01 03:30:59.816034 7f991dcf7710 mds0.18 boot_start encountered an error, failing 2010-10-01 03:30:59.816038 7f991dcf7710 mds0.18 suicide. wanted up:replay, now down:dne 2010-10-01 03:30:59.816047 7f991dcf7710 mds0.cache WARNING: mdcache shutdown with non-empty cache 2010-10-01 03:30:59.816052 7f991dcf7710 mds0.cache show_subtrees 2010-10-01 03:30:59.816059 7f991dcf7710 mds0.cache |__ 0 auth [dir 1 / [2,head] auth v=327474 cv=0/0 dir_auth=0 state=1610612736 f(v5 m2010-09-30 10:42:48.462366 5=0+5) n(v5922 rc2010-10-01 02:32:18.156019 b129532745664 784=23+761)/n(v5922 rc2010-10-01 02:32:18.156019 b129521095616 784=23+761) hs=1+0,ss=0+0 dirty=1 | child subtree dirty 0x7f98e8006ae0] 2010-10-01 03:30:59.816078 7f991dcf7710 mds0.cache |__ 0 auth [dir 100 ~mds0/ [2,head] auth v=795 cv=0/0 dir_auth=0 state=1610612736 f(v0 2=1+1) n(v54 rc2010-10-01 02:07:56.696257 b1883380490265 372=365+7) hs=1+0,ss=0+0 dirty=1 | child subtree dirty 0x7f98e80070f0] 2010-10-01 03:30:59.816094 7f991dcf7710 mds0.log _replay_thread finish 2010-10-01 03:30:59.816129 7f992d903710 mds0.18 handle_mds_beacon up:replay seq 3 rtt 7.483578 2010-10-01 03:30:59.816571 7f9930405720 7f9930405720 stopped.
Henry, can you post the full mds log somewhere? I'm curious what the mds log/jouranl bounds are in the header, and how close we are to that.
IIRC at one point I was concerned about the journal replay's ability to detect a incomplete write at the end of a journal when an entry crosses an object boundary. Not sure if that's what's going on here...
Actions