Bug #10335

closed

MDS: disallow flush_path and related commands if not active

Added by Greg Farnum over 9 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2014-12-15 17:57:05.542533 7f0b8a061700  1 mds.-1.0 asok_command: flush_path (starting...)
2014-12-15 17:57:05.542597 7f0b8a061700 10 mds.-1.cache flush_dentry /scrub/test/path/i/dont/exist
2014-12-15 17:57:05.542625 7f0b8a061700  7 mds.-1.cache request_start_internal request(mds.?:1) op 5379
2014-12-15 17:57:05.542651 7f0b8a061700 10 mds.-1.server rdlock_path_pin_ref request(mds.?:1) #1/scrub/test/path/i/dont/exist
2014-12-15 17:57:05.542679 7f0b8a061700  7 mds.-1.cache traverse: opening base ino 1 snap head
2014-12-15 17:57:05.542686 7f0b8a061700 10 mds.-1.server FAIL on ESTALE but attempting recovery
2014-12-15 17:57:05.542705 7f0b8a061700  5 mds.-1.cache find_ino_peers 1 hint -1
2014-12-15 17:57:05.542714 7f0b8a061700 10 mds.-1.cache _do_find_ino_peer 1 1 active 0 all 0 checked -1
2014-12-15 17:57:05.542747 7f0b8a061700  1 -- 10.214.134.124:6808/6632 --> mds.0 10.214.135.134:6808/5316 -- mdsmap(e 7) v1 -- ?+0 0x3fc2fc0
2014-12-15 17:57:05.542951 7f0b8a061700  1 -- 10.214.134.124:6808/6632 --> mds.0 10.214.135.134:6808/5316 -- findino(1 1) v1 -- ?+0 0x4000c40

I've seen this once before in a branch test; I don't remember if it was merged as-is or not. :(
In any case, for some reason we're getting ESTALE on the root (huh!??!) and then spinning off an attempt to look it up, which fails and doesn't wake up the callback. Or perhaps it never finishes? Either way, this should hopefully just be an exercise in tracing the code logic, although the ESTALE itself doesn't make much sense...

Actions #1

Updated by Greg Farnum over 9 years ago

  • Subject changed from MDS: flush_path failing on ESTALE to MDS: disallow flush_path and related commands if not active
  • Status changed from New to In Progress

Oh duh, right, this is us hitting the standby MDS with a flush request. (Thus the mds.-1 up there.) I guess the admin socket command should check for that, but it seemed non-trivial to solve this in ceph-qa-suite so that only the active MDS receives requests. (We always direct the flush request to mds.a, but sometimes mds.s-a is the active one for whatever reason. See #10361.)
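
A minimal sketch of the kind of check described above, written in Python for illustration only. It is not the actual MDS code or the fix that was merged: `MdsDaemon`, `handle_asok_command`, `NotActiveError`, and `ACTIVE_ONLY_COMMANDS` are hypothetical names, and only `flush_path` is listed because the report does not enumerate the related commands.

```python
from dataclasses import dataclass

RANK_NONE = -1  # assumption: a daemon that logs as "mds.-1" holds no rank

# Hypothetical list of admin socket commands that only make sense on an
# active MDS. Only flush_path is named in this report.
ACTIVE_ONLY_COMMANDS = {"flush_path"}


class NotActiveError(RuntimeError):
    """Raised when an active-only command reaches a non-active daemon."""


@dataclass
class MdsDaemon:
    # Hypothetical stand-in for a running MDS daemon.
    name: str
    rank: int = RANK_NONE
    active: bool = False

    def is_active(self) -> bool:
        # A daemon needs both a rank and the active state to serve the command.
        return self.active and self.rank != RANK_NONE


def handle_asok_command(daemon: MdsDaemon, command: str, path: str) -> str:
    # Reject early instead of starting an internal request that can never
    # complete on a standby.
    if command in ACTIVE_ONLY_COMMANDS and not daemon.is_active():
        raise NotActiveError(
            f"mds.{daemon.name} is not active; refusing {command}"
        )
    return f"{command} {path}: accepted"
```

With this guard, a request sent to a standby fails immediately with a clear error rather than hanging, so a test that always targets mds.a would see the problem right away when mds.s-a happens to be the active daemon.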

Actions #2

Updated by Greg Farnum over 9 years ago

  • Status changed from In Progress to Fix Under Review

Actions #3

Updated by Zheng Yan over 9 years ago

  • Status changed from Fix Under Review to Resolved

Actions #4

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added