Bug #434

closed

mds: clustered mds pjd failures

Added by Sage Weil over 13 years ago. Updated over 13 years ago.

Status: Resolved
Priority: Normal
Category: -
% Done: 0%

Description


../pjd-fstest-20080816/tests/chmod/03.......ok                               
../pjd-fstest-20080816/tests/chmod/04.......ok                               
../pjd-fstest-20080816/tests/chmod/05.......FAILED tests 5-6, 10-11          
        Failed 4/14 tests, 71.43% okay
../pjd-fstest-20080816/tests/chmod/06.......ok                               
../pjd-fstest-20080816/tests/chmod/07.......FAILED tests 5-6, 8, 11          
        Failed 4/14 tests, 71.43% okay
../pjd-fstest-20080816/tests/chmod/08.......ok                               
../pjd-fstest-20080816/tests/chmod/09.......ok                               
../pjd-fstest-20080816/tests/chmod/10.......ok                               
../pjd-fstest-20080816/tests/chmod/11.......ok                               
../pjd-fstest-20080816/tests/chown/00.......FAILED test 96                   
        Failed 1/171 tests, 99.42% okay
../pjd-fstest-20080816/tests/chown/01.......ok                               
../pjd-fstest-20080816/tests/chown/02.......ok                               
../pjd-fstest-20080816/tests/chown/03.......ok                               
../pjd-fstest-20080816/tests/chown/04.......ok                               
../pjd-fstest-20080816/tests/chown/05.......FAILED tests 5-6, 10-12          
        Failed 5/15 tests, 66.67% okay
../pjd-fstest-20080816/tests/chown/06.......ok                               
../pjd-fstest-20080816/tests/chown/07.......ok                               
../pjd-fstest-20080816/tests/chown/08.......ok                               

and from another run
../pjd-fstest-20080816/tests/chmod/04.......ok                               
../pjd-fstest-20080816/tests/chmod/05.......FAILED tests 5-6, 10-11          
        Failed 4/14 tests, 71.43% okay
../pjd-fstest-20080816/tests/chmod/06.......ok                               
../pjd-fstest-20080816/tests/chmod/07.......FAILED tests 5-6, 8, 11          
        Failed 4/14 tests, 71.43% okay
../pjd-fstest-20080816/tests/chmod/08.......ok                               
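
To compare runs like the two above, the failing test numbers can be pulled out of the harness output with a short script. This is only a sketch: `failed_tests` and `expand` are hypothetical helper names, and the regular expression assumes the exact line layout shown in this ticket.

```python
import re

# Matches lines such as:
#   ../pjd-fstest-20080816/tests/chmod/05.......FAILED tests 5-6, 10-11
# Assumption: the line format is exactly what is pasted in this ticket.
FAILED_RE = re.compile(r"tests/(\w+)/(\d+)\.+FAILED tests? ([\d,\s-]+)")


def expand(spec):
    """Expand a spec like '5-6, 10-11' into [5, 6, 10, 11]."""
    numbers = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            numbers.extend(range(int(lo), int(hi) + 1))
        else:
            numbers.append(int(part))
    return numbers


def failed_tests(log_text):
    """Map 'chmod/05'-style names to the list of failed test numbers."""
    result = {}
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            name = f"{match.group(1)}/{match.group(2)}"
            result[name] = expand(match.group(3))
    return result
```

Running `failed_tests` on each run and comparing the resulting dictionaries shows whether the same subtests fail every time.
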
Actions #1

Updated by Greg Farnum over 13 years ago

  • Status changed from New to In Progress
  • Assignee changed from Sage Weil to Greg Farnum

Looking at this now.

Actions #2

Updated by Greg Farnum over 13 years ago

To reproduce, you need to turn on MDS thrashing (mds thrash exports = 1 in ceph.conf).
However, I have yet to see these errors myself, because so far I've been hitting different crashes and assert failures, which I've been fixing as I go.
I think I'm nearing the end, though: #472 is popping up a lot, and it's getting harder to observe other crashes.
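
For reference, the thrashing setting could look like the fragment embedded below. This is a sketch: placing the option in an `[mds]` section is an assumption, since the comment only says it goes in ceph.conf. The Python wrapper just checks that the fragment parses as INI-style text.

```python
import configparser

# Assumption: the option lives under [mds]; the ticket does not name a section.
CEPH_CONF_FRAGMENT = """
[mds]
mds thrash exports = 1
"""

parser = configparser.ConfigParser()
parser.read_string(CEPH_CONF_FRAGMENT)

# Read the value back to confirm the fragment is well-formed.
thrash = parser.getint("mds", "mds thrash exports")
```
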

Actions #3

Updated by Sage Weil over 13 years ago

  • Target version changed from v0.22 to v0.23
Actions #4

Updated by Greg Farnum over 13 years ago

  • Assignee changed from Greg Farnum to Sage Weil

Sage has taken over the clustered MDS stuff for now, so here's the bug!

Actions #5

Updated by Sage Weil over 13 years ago

Just saw this again:

../pjd-fstest-20080816/tests/chmod/06.......ok                               
../pjd-fstest-20080816/tests/chmod/07.......FAILED tests 5-6, 8, 11          
        Failed 4/14 tests, 71.43% okay
../pjd-fstest-20080816/tests/chmod/08.......ok                               
../pjd-fstest-20080816/tests/chmod/09.......ok                               
../pjd-fstest-20080816/tests/chmod/10.......ok                               
../pjd-fstest-20080816/tests/chmod/11.......ok                               
../pjd-fstest-20080816/tests/chown/00.......FAILED test 97                   
        Failed 1/171 tests, 99.42% okay
../pjd-fstest-20080816/tests/chown/01.......ok                               
../pjd-fstest-20080816/tests/chown/02.......ok                               

and a bit later,
../pjd-fstest-20080816/tests/chown/05.......FAILED tests 5-6, 10-12          
        Failed 5/15 tests, 66.67% okay

Actions #6

Updated by Sage Weil over 13 years ago

  • Status changed from In Progress to Resolved

This was a kclient (kernel client) problem caused by bad uid/gid in resent requests. Fixed by commit:cb4276cca4695670916a82e359f2e3776f0a9138
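
As an illustration of how a resent request can end up with the wrong uid/gid, here is a toy model. It is not the kernel client code, and the mechanism is an assumption: it supposes the credentials were read from the running context at send time instead of being stored in the request when it was created. All names (`Creds`, `Request`, `build_message_buggy`, `build_message_fixed`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Creds:
    uid: int
    gid: int


@dataclass
class Request:
    op: str
    creds: Creds  # captured once, when the request is created


def build_message_buggy(req, current):
    # Assumed buggy pattern: credentials come from whatever context
    # happens to be running when the message is (re)built.
    return {"op": req.op, "uid": current.uid, "gid": current.gid}


def build_message_fixed(req, current):
    # Assumed fixed pattern: always use the credentials stored in the request,
    # so a resend carries the same uid/gid as the original send.
    return {"op": req.op, "uid": req.creds.uid, "gid": req.creds.gid}


caller = Creds(uid=65534, gid=65534)  # unprivileged user issuing the chmod
worker = Creds(uid=0, gid=0)          # hypothetical context doing the resend

req = Request(op="chmod", creds=caller)
```

In this model, a permission test that expects chmod to be refused for the unprivileged caller would pass on the first send and fail on a resend built with `build_message_buggy`, which is consistent with failures that only show up when requests are resent during MDS thrashing.
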

Actions #7

Updated by Sage Weil over 13 years ago

  • Project changed from Ceph to Linux kernel client
  • Category deleted (1)
  • Target version deleted (v0.23)
Actions #8

Updated by Sage Weil over 13 years ago

  • Target version set to v2.6.37
  • Estimated time set to 1:00 h
Actions #9

Updated by Sage Weil over 13 years ago

A few more fixes here for the inode update version check and mtime.
