Bug #13166

closed

MDS: standby-replay does not change client_incarnation properly

Added by Greg Farnum over 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2015-09-17 17:40:57.876236 7fbfb374f700 10 mds.b-s-a handle_mds_map: handling map as rank 0
2015-09-17 17:40:57.876251 7fbfb374f700  1 mds.0.0 handle_mds_map i am now mds.4109.0replaying mds.0.0
2015-09-17 17:40:57.876254 7fbfb374f700  1 mds.0.0 handle_mds_map state change up:boot --> up:standby-replay
2015-09-17 17:40:57.876261 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby -> up:standby-replay
2015-09-17 17:40:57.876265 7fbfb374f700  1 mds.0.0 replay_start
2015-09-17 17:41:23.463233 7fbfb374f700  5 mds.b-s-a handle_mds_map epoch 8 from mon.2
2015-09-17 17:41:23.463265 7fbfb374f700 10 mds.b-s-a      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table}
2015-09-17 17:41:23.463274 7fbfb374f700 10 mds.b-s-a  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
2015-09-17 17:41:23.463280 7fbfb374f700 10 mds.b-s-a  peer mds gid 4116 removed from map
2015-09-17 17:41:23.463284 7fbfb374f700  1 -- 10.214.132.17:6812/28040 mark_down 10.214.134.104:6812/17763 -- pipe dne
2015-09-17 17:41:23.463293 7fbfb374f700 10 mds.b-s-a map says i am 10.214.132.17:6812/28040 mds.0.2 state up:replay
2015-09-17 17:41:23.463300 7fbfb374f700 10 mds.b-s-a handle_mds_map: handling map as rank 0
2015-09-17 17:41:23.463303 7fbfb374f700  1 mds.0.0 handle_mds_map i am now mds.0.0
2015-09-17 17:41:23.463306 7fbfb374f700  1 mds.0.0 handle_mds_map state change up:standby-replay --> up:replay
2015-09-17 17:41:23.463316 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby-replay -> up:replay
2015-09-17 17:41:23.463324 7fbfb374f700 10 mds.0.0 Monitor activated us! Deactivating replay loop

I think we must have broken this handling when splitting up MDS into Rank and Daemon.
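
The mismatch shows up in the last map update in the log: the map line reports this daemon as mds.0.2, while the daemon goes on logging as mds.0.0. Here is a minimal Python sketch for pulling those two values out of a log and comparing them. It assumes the name reads as `mds.<rank>.<incarnation>` and that the two message shapes match the excerpt above. The helper `find_incarnation_mismatch` and both regexes are hypothetical, not part of Ceph.

```python
import re

# Patterns are written against the two message shapes in the excerpt above;
# they are assumptions about the log format, not a Ceph-provided parser.
MAP_SAYS = re.compile(r"map says i am \S+ mds\.(\d+)\.(\d+) state (\S+)")
I_AM_NOW = re.compile(r"handle_mds_map i am now mds\.(\d+)\.(\d+)\s*$")


def find_incarnation_mismatch(lines):
    """Return (map_inc, daemon_inc) pairs where the daemon's incarnation
    differs from the one in the most recent 'map says i am' line."""
    mismatches = []
    map_inc = None
    for line in lines:
        m = MAP_SAYS.search(line)
        if m:
            map_inc = int(m.group(2))
            continue
        m = I_AM_NOW.search(line)
        if m and map_inc is not None:
            daemon_inc = int(m.group(2))
            if daemon_inc != map_inc:
                mismatches.append((map_inc, daemon_inc))
    return mismatches


log = [
    "2015-09-17 17:41:23.463293 7fbfb374f700 10 mds.b-s-a map says i am "
    "10.214.132.17:6812/28040 mds.0.2 state up:replay",
    "2015-09-17 17:41:23.463303 7fbfb374f700  1 mds.0.0 handle_mds_map "
    "i am now mds.0.0",
]
print(find_incarnation_mismatch(log))  # [(2, 0)]
```

The earlier standby-replay line ("i am now mds.4109.0replaying mds.0.0") is deliberately not matched, since only the takeover step is of interest here.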

Actions #1

Updated by Greg Farnum over 8 years ago

  • Subject changed from mds: damaged journal to MDS: standby-replay does not change client_incarnation properly
  • Description updated (diff)
  • Category set to 47
  • Status changed from New to 12
Actions #2

Updated by Greg Farnum over 8 years ago

Hmm, I don't think we should actually be doing operations as mds.0.0 when we're a standby for the real mds.0.0 either! That is liable to confuse things as well.

Actions #3

Updated by Greg Farnum over 8 years ago

Obvious fix is to have MDSRank check the incarnation and update, but I want us to look more deeply at how the replaying works. I think we used to have different IDs entirely when replaying, rather than sending stuff off while pretending to be the active MDS. :/
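
The "check the incarnation and update" idea can be illustrated with a toy model in Python. This is only a sketch of the control flow and not the actual MDSRank code. `MapEntry`, `RankModel` and `handle_map` are hypothetical names, and the incarnation values are taken from the log in the description.

```python
from dataclasses import dataclass


@dataclass
class MapEntry:
    """Hypothetical stand-in for what the MDS map says about this daemon."""
    rank: int
    incarnation: int
    state: str


class RankModel:
    """Toy model of a rank that tracks the incarnation it reports to peers."""

    def __init__(self):
        self.rank = 0
        self.incarnation = 0
        self.state = "up:boot"

    def handle_map(self, entry: MapEntry) -> bool:
        """Apply a new map entry; return True if the incarnation changed."""
        changed = entry.incarnation != self.incarnation
        if changed:
            # The proposed step: adopt the map's incarnation instead of
            # keeping the value picked up when the daemon first started.
            self.incarnation = entry.incarnation
        self.rank = entry.rank
        self.state = entry.state
        return changed

    def name(self) -> str:
        return f"mds.{self.rank}.{self.incarnation}"


r = RankModel()
r.handle_map(MapEntry(0, 0, "up:standby-replay"))
r.handle_map(MapEntry(0, 2, "up:replay"))
print(r.name())  # mds.0.2
```

Without the update inside `handle_map`, the model would keep reporting mds.0.0 after the takeover, which is the behaviour seen in the log.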

Actions #4

Updated by Greg Farnum over 8 years ago

If we need more logs, I copied the standby MDS log to ubuntu-2015-09-17_16:55:52-fs-greg-fs-testing---basic-multi/1061724/ceph-mds.b-s-a.log.

Actions #5

Updated by Zheng Yan over 8 years ago

  • Status changed from 12 to Fix Under Review
Actions #6

Updated by Zheng Yan over 8 years ago

  • Status changed from Fix Under Review to 7
Actions #7

Updated by Greg Farnum over 8 years ago

Zheng, can you dig up a firefly test run and make sure the behavior of standby-replay daemons there is the same as it is with this branch? (In particular, the rank and invocation it's telling OSDs it is.)

Actions #8

Updated by Zheng Yan over 8 years ago

On firefly, the standby-replay MDS also uses 0 as client_incarnation; its ID is MDS.x.0.

Actions #9

Updated by Zheng Yan over 8 years ago

  • Status changed from 7 to Resolved
Actions #10

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added