Bug #10335

MDS: disallow flush_path and related commands if not active

Added by Greg Farnum almost 9 years ago. Updated about 7 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2014-12-15 17:57:05.542533 7f0b8a061700  1 mds.-1.0 asok_command: flush_path (starting...)
2014-12-15 17:57:05.542597 7f0b8a061700 10 mds.-1.cache flush_dentry /scrub/test/path/i/dont/exist
2014-12-15 17:57:05.542625 7f0b8a061700  7 mds.-1.cache request_start_internal request(mds.?:1) op 5379
2014-12-15 17:57:05.542651 7f0b8a061700 10 mds.-1.server rdlock_path_pin_ref request(mds.?:1) #1/scrub/test/path/i/dont/exist
2014-12-15 17:57:05.542679 7f0b8a061700  7 mds.-1.cache traverse: opening base ino 1 snap head
2014-12-15 17:57:05.542686 7f0b8a061700 10 mds.-1.server FAIL on ESTALE but attempting recovery
2014-12-15 17:57:05.542705 7f0b8a061700  5 mds.-1.cache find_ino_peers 1 hint -1
2014-12-15 17:57:05.542714 7f0b8a061700 10 mds.-1.cache _do_find_ino_peer 1 1 active 0 all 0 checked -1
2014-12-15 17:57:05.542747 7f0b8a061700  1 -- 10.214.134.124:6808/6632 --> mds.0 10.214.135.134:6808/5316 -- mdsmap(e 7) v1 -- ?+0 0x3fc2fc0
2014-12-15 17:57:05.542951 7f0b8a061700  1 -- 10.214.134.124:6808/6632 --> mds.0 10.214.135.134:6808/5316 -- findino(1 1) v1 -- ?+0 0x4000c40

I've seen this once before in a branch test; I don't remember whether that branch was merged as-is. :(
In any case, we're for some reason getting ESTALE on the root (huh!??!) and then spinning off an attempt to look it up (the find_ino_peers/findino lines above), which either fails without waking up the callback or never finishes. This should hopefully just be an exercise in tracing the code logic, although the ESTALE doesn't make much sense...

History

#1 Updated by Greg Farnum almost 9 years ago

  • Subject changed from MDS: flush_path failing on ESTALE to MDS: disallow flush_path and related commands if not active
  • Status changed from New to In Progress

Oh duh, right, this is us hitting the standby MDS with a flush request (thus the mds.-1 up there). I guess the admin socket command should check for that, since it seemed non-trivial to solve this in ceph-qa-suite so that only the active MDS receives requests. (We always direct the flush request to mds.a, but sometimes mds.s-a is the active one for whatever reason. See #10361.)
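
The kind of check described above could look roughly like the following. This is a toy Python model for illustration, not the actual MDS code: MDSState, ACTIVE_ONLY_COMMANDS and handle_asok_command are hypothetical names, and the real set of "related commands" is not listed in this ticket.

```python
from enum import Enum


class MDSState(Enum):
    # Hypothetical subset of MDS daemon states, for illustration only.
    STANDBY = "standby"
    REPLAY = "replay"
    ACTIVE = "active"


# Commands assumed to need an active rank. "flush_path" is the command
# from the log above; the real list of related commands may differ.
ACTIVE_ONLY_COMMANDS = {"flush_path"}


def handle_asok_command(state, rank, command, run):
    """Run `command` via `run()` unless it needs an active MDS and this one is not.

    Returns (ok, result_or_error_message).
    """
    if command in ACTIVE_ONLY_COMMANDS and state is not MDSState.ACTIVE:
        # Fail fast instead of starting an internal request that may
        # never complete on a daemon without a rank (the mds.-1 case).
        return (False, f"mds.{rank} is {state.value}, not active; refusing {command}")
    return (True, run())
```

With this sketch, a standby daemon (rank -1) rejects flush_path immediately with an error message, while commands outside the assumed set still run in any state.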

#2 Updated by Greg Farnum almost 9 years ago

  • Status changed from In Progress to Fix Under Review

#3 Updated by Zheng Yan almost 9 years ago

  • Status changed from Fix Under Review to Resolved

#4 Updated by Greg Farnum about 7 years ago

  • Component(FS) MDS added
