Bug #3261

mds crashes in EMetaBlob::replay

Added by Tobias Florek almost 8 years ago. Updated over 7 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature:

Description

while testing cephfs using the debian wheezy packages on a fairly large volume (2TB), i ran into random, unreproducible client stalls. now the mds starts but crashes shortly after.

i am willing to debug this further, as ceph was promising before.

ceph-mds.a.log (1.13 MB) Tobias Florek, 10/03/2012 03:46 PM

ceph-mon.a.log (4.11 KB) Tobias Florek, 10/03/2012 03:46 PM

ceph-osd.0.log (4.13 KB) Tobias Florek, 10/03/2012 03:46 PM

ceph-osd.1.log (4.13 KB) Tobias Florek, 10/03/2012 03:46 PM

ceph-mds.a.log.bz2 (7.56 MB) Tobias Florek, 10/03/2012 03:57 PM

History

#1 Updated by Sage Weil almost 8 years ago

  • Status changed from New to Need More Info

can you put 'debug mds = 20' in the ceph.conf, restart ceph-mds, and then attach the resulting log (assuming it crashes again)?

thanks!
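
As a minimal sketch of the change requested above: the thread only names the option, so placing it in the [mds] section is an assumption here, not something the report confirms.

```ini
; ceph.conf fragment (sketch; the [mds] section placement is assumed, not stated in the thread)
[mds]
    ; raise MDS subsystem logging to the level requested above
    debug mds = 20
```

Note that `debug ms` is a separate option that controls messenger (network) logging, so it does not produce the MDS-level detail asked for here. The ceph-mds daemon has to be restarted for the setting in the file to take effect, as the request says.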

#2 Updated by Sage Weil almost 8 years ago

  • Project changed from Ceph to fs

#3 Updated by Tobias Florek almost 8 years ago

aww. i had debug ms = 20 in my ceph.conf. sorry.

the new one is attached

#4 Updated by Sage Weil over 7 years ago

This looks like a problem with what's in the journal, but so much MDS code has changed since then that I don't think we can make sense of this report. Are you in a position to retest against latest master?

#5 Updated by Tobias Florek over 7 years ago

should i test the same btrfs volume with a new ceph? if so, i might get to it in the next month. please close with insufficient data. i will reopen when i find the time.

unfortunately i don't have the hardware around anymore to replicate the whole test.

#6 Updated by Sage Weil over 7 years ago

  • Status changed from Need More Info to Rejected

Understood. I'm sorry we weren't able to dig in when it happened. When you do get around to retesting, we should be in a better position to follow up.

Thanks!
