Bug #13166

MDS: standby-replay does not change client_incarnation properly

Added by Greg Farnum almost 4 years ago. Updated about 3 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
Start date:
09/18/2015
Due date:
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:

Description

2015-09-17 17:40:57.876236 7fbfb374f700 10 mds.b-s-a handle_mds_map: handling map as rank 0
2015-09-17 17:40:57.876251 7fbfb374f700  1 mds.0.0 handle_mds_map i am now mds.4109.0replaying mds.0.0
2015-09-17 17:40:57.876254 7fbfb374f700  1 mds.0.0 handle_mds_map state change up:boot --> up:standby-replay
2015-09-17 17:40:57.876261 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby -> up:standby-replay
2015-09-17 17:40:57.876265 7fbfb374f700  1 mds.0.0 replay_start
2015-09-17 17:41:23.463233 7fbfb374f700  5 mds.b-s-a handle_mds_map epoch 8 from mon.2
2015-09-17 17:41:23.463265 7fbfb374f700 10 mds.b-s-a      my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table}
2015-09-17 17:41:23.463274 7fbfb374f700 10 mds.b-s-a  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table}
2015-09-17 17:41:23.463280 7fbfb374f700 10 mds.b-s-a  peer mds gid 4116 removed from map
2015-09-17 17:41:23.463284 7fbfb374f700  1 -- 10.214.132.17:6812/28040 mark_down 10.214.134.104:6812/17763 -- pipe dne
2015-09-17 17:41:23.463293 7fbfb374f700 10 mds.b-s-a map says i am 10.214.132.17:6812/28040 mds.0.2 state up:replay
2015-09-17 17:41:23.463300 7fbfb374f700 10 mds.b-s-a handle_mds_map: handling map as rank 0
2015-09-17 17:41:23.463303 7fbfb374f700  1 mds.0.0 handle_mds_map i am now mds.0.0
2015-09-17 17:41:23.463306 7fbfb374f700  1 mds.0.0 handle_mds_map state change up:standby-replay --> up:replay
2015-09-17 17:41:23.463316 7fbfb374f700 10 mds.beacon.b-s-a set_want_state: up:standby-replay -> up:replay
2015-09-17 17:41:23.463324 7fbfb374f700 10 mds.0.0 Monitor activated us! Deactivating replay loop

I think we must have broken this handling when splitting up MDS into Rank and Daemon.
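
The symptom in the log above is that the map names this daemon mds.0.2 ("map says i am ... mds.0.2 state up:replay") while the daemon keeps logging as mds.0.0. A minimal sketch of checking for that mismatch, assuming names of the form `mds.<rank>.<incarnation>`. `MdsName`, `parse_mds_name`, and `incarnation_is_stale` are hypothetical helpers written for illustration; they are not Ceph APIs.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical helper type: rank and incarnation parsed out of a log
// prefix such as "mds.0.2". Not part of Ceph.
struct MdsName {
  int rank;
  int incarnation;
};

// Parse "mds.<rank>.<incarnation>"; returns nullopt for any other shape
// (for example the daemon name "mds.b-s-a").
std::optional<MdsName> parse_mds_name(const std::string& s) {
  MdsName n{};
  int consumed = 0;
  if (std::sscanf(s.c_str(), "mds.%d.%d%n", &n.rank, &n.incarnation, &consumed) == 2 &&
      consumed == static_cast<int>(s.size())) {
    return n;
  }
  return std::nullopt;
}

// True when both names refer to the same rank but the incarnation the
// daemon logs under differs from the one the map assigned.
bool incarnation_is_stale(const std::string& logged_as, const std::string& map_says) {
  auto mine = parse_mds_name(logged_as);
  auto theirs = parse_mds_name(map_says);
  return mine && theirs && mine->rank == theirs->rank &&
         mine->incarnation != theirs->incarnation;
}
```

With the two names from the log, `incarnation_is_stale("mds.0.0", "mds.0.2")` reports a mismatch.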

Associated revisions

Revision e65fb1ba (diff)
Added by Yan, Zheng almost 4 years ago

mds: adjust MDSRank::incarnation according to mdsmap

When a standby-replay MDS replaces a failed MDS, we need to update its
incarnation.

Fixes: #13166
Signed-off-by: Yan, Zheng <>

History

#1 Updated by Greg Farnum almost 4 years ago

  • Subject changed from mds: damaged journal to MDS: standby-replay does not change client_incarnation properly
  • Description updated (diff)
  • Category set to 47
  • Status changed from New to Verified

#2 Updated by Greg Farnum almost 4 years ago

Hmm, I don't think we should actually be doing operations as mds.0.0 when we're a standby for the real mds.0.0 either! That is liable to confuse things as well.

#3 Updated by Greg Farnum almost 4 years ago

Obvious fix is to have MDSRank check the incarnation and update, but I want us to look more deeply at how the replaying works. I think we used to have different IDs entirely when replaying, rather than sending stuff off while pretending to be the active MDS. :/
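
A rough sketch of that "obvious fix", using toy stand-in types rather than the real MDSRank and MDSMap classes. `ToyMdsMap`, `ToyRank`, `incarnation_of`, and `handle_map` are all hypothetical names, and this is not the patch from revision e65fb1ba; it only illustrates re-reading the incarnation from each new map instead of keeping the value adopted during standby-replay.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for the real Ceph types; only the fields needed
// to illustrate the idea are modelled here.
using Gid = unsigned long long;

struct MapEntry {
  int rank;
  int incarnation;
};

struct ToyMdsMap {
  std::map<Gid, MapEntry> daemons;

  // Incarnation the map assigns to the daemon with this gid, or -1 if
  // the gid is not in the map.
  int incarnation_of(Gid gid) const {
    auto it = daemons.find(gid);
    return it == daemons.end() ? -1 : it->second.incarnation;
  }
};

struct ToyRank {
  Gid gid;
  int rank;
  int incarnation = 0;  // value in use while following as standby-replay

  // On every new map, compare our cached incarnation with the one the
  // map assigns to our gid and adopt the map's value if they differ.
  void handle_map(const ToyMdsMap& m) {
    int inc = m.incarnation_of(gid);
    assert(inc >= 0 && "sketch only handles maps in which we hold a rank");
    if (inc != incarnation) {
      incarnation = inc;
    }
  }
};
```

Using the values from the log (gid 4109, rank 0, map incarnation 2), a `ToyRank` that starts at incarnation 0 ends up at 2 after `handle_map`. This does not address the separate concern about a standby acting under the active daemon's identity while replaying.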

#4 Updated by Greg Farnum almost 4 years ago

If we need more logs, I copied the standby MDS log to ubuntu-2015-09-17_16:55:52-fs-greg-fs-testing---basic-multi/1061724/ceph-mds.b-s-a.log.

#5 Updated by Zheng Yan almost 4 years ago

  • Status changed from Verified to Need Review

#6 Updated by Zheng Yan almost 4 years ago

  • Status changed from Need Review to Testing

#7 Updated by Greg Farnum almost 4 years ago

Zheng, can you dig up a firefly test run and make sure the behavior of standby-replay daemons there is the same as it is with this branch? (In particular, the rank and incarnation it's telling OSDs it is.)

#8 Updated by Zheng Yan almost 4 years ago

On firefly, the standby-replay MDS also uses 0 as client_incarnation; its ID is MDS.x.0.

#9 Updated by Zheng Yan almost 4 years ago

  • Status changed from Testing to Resolved

#10 Updated by Greg Farnum about 3 years ago

  • Component(FS) MDS added
