Tasks #1002

closed

Ceph - Bug #910: Multi-MDS Ceph does not pass fsstress

Assert failure in Locker::handle_file_lock

Added by Greg Farnum about 13 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Tags:
Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:

Description

mds/Locker.cc: In function 'void Locker::handle_file_lock(ScatterLock*, MLock*)', in thread '0x7fd697866710'
mds/Locker.cc: 3972: FAILED assert(lock->get_state() == 1 || lock->get_state() == 2 || lock->get_state() == 17)
 ceph version 0.25-706-g26f84f4 (commit:26f84f4b310e3045f1de80b9c788081e46f16c71)

Running this in my wip_fsstress branch.

Actions #1

Updated by Greg Farnum about 13 years ago

  • Status changed from New to In Progress

Looks like there's a problem because the inode in question is being renamed across MDSes (big shocker, I know!). The new auth sends an MLock to the old auth, which hasn't finished its rename and is still in AMBIGAUTH.
I added a check when handling mix messages to wait if the inode is not single-auth, but I'm not sure whether the check should cover more kinds of lock messages or be more narrowly tailored. I'll see if that helps, and consult with Sage.
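For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the kind of check described above: defer a mix message while the target inode's authority is ambiguous, and redeliver it once authority is single again. Every name in it (`ToyInode`, `ToyLocker`, `ToyLockMessage`, `handle_mix_message`) is hypothetical and made up for illustration. It is not the Ceph MDS code and not the change that was actually committed.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an inode whose authority may be ambiguous
// while a cross-MDS rename is still in flight.
struct ToyInode {
    bool ambiguous_auth = false;
    std::vector<std::function<void()>> waiters;  // run once auth is single again

    bool is_ambiguous_auth() const { return ambiguous_auth; }

    void add_waiter(std::function<void()> cb) { waiters.push_back(std::move(cb)); }

    // Mark authority as resolved and replay everything that was deferred.
    void clear_ambiguous_auth() {
        ambiguous_auth = false;
        auto pending = std::move(waiters);
        waiters.clear();
        for (auto& cb : pending) cb();
    }
};

// Hypothetical lock message; 'action' is an opaque placeholder value.
struct ToyLockMessage {
    int action;
};

struct ToyLocker {
    int handled = 0;   // messages processed immediately or after replay
    int deferred = 0;  // messages parked because auth was ambiguous

    void handle_mix_message(ToyInode& in, ToyLockMessage m) {
        if (in.is_ambiguous_auth()) {
            // Not single-auth yet: park the message instead of acting on a
            // lock state that may still change when the rename completes.
            ++deferred;
            in.add_waiter([this, &in, m] { handle_mix_message(in, m); });
            return;
        }
        ++handled;  // the real state handling would go here
    }
};
```

In this toy model, a message delivered while `ambiguous_auth` is true is counted as deferred and is only handled after `clear_ambiguous_auth()` runs. Whether deferring on the receiving side is the right place for such a check is exactly the open question raised in this comment.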

Actions #2

Updated by Greg Farnum about 13 years ago

Yeah, Sage said this looks like the new auth is sending messages before it should, probably due to some kind of twiddle thing or something. Backed out my change and will attempt to diagnose more.

Actions #3

Updated by Greg Farnum about 13 years ago

  • Status changed from In Progress to 7

I suspect this got fixed up by some of the other changes, but want to leave it open for a little longer just to make sure it doesn't pop up again.

Actions #4

Updated by Greg Farnum about 13 years ago

  • Status changed from 7 to Resolved

Actions #5

Updated by Greg Farnum about 13 years ago

  • Status changed from Resolved to In Progress

This popped up again. It appears to be the result of a newly imported inode going from sync to mix, which sends a message to the old auth while that MDS is still in the AMBIGAUTH state. The old auth then does an eval-gather and sends out its own lock mix messages.

Actions #6

Updated by Greg Farnum about 13 years ago

  • Status changed from In Progress to 7

Okay, this was the same problem as commit:a028c8954ca240ec9a12682678aaee02eb507ae3.

Actions #7

Updated by Greg Farnum almost 13 years ago

  • Status changed from 7 to Resolved

Actions #8

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Target version deleted (v0.28)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
