Bug #7485
Killing MDS during 'creating' breaks subsequent startup (no snaptable)
Status: Closed
Description
Pretty easy to reproduce: start the MDS for the first time on a fresh cluster (I'm using vstart here), ctrl-c it promptly, then try to start it up again; it hits an assertion while reading the snaptable. Sure enough, rados ls confirms the object isn't there.
-50> 2014-02-20 06:17:44.869883 7fefe834e700 1 mds.0.7 handle_mds_map i am now mds.0.7
-49> 2014-02-20 06:17:44.869886 7fefe834e700 1 mds.0.7 handle_mds_map state change up:standby --> up:replay
-48> 2014-02-20 06:17:44.869895 7fefe834e700 1 mds.0.7 replay_start
-47> 2014-02-20 06:17:44.869906 7fefe834e700 7 mds.0.cache set_recovery_set
-46> 2014-02-20 06:17:44.869917 7fefe834e700 1 mds.0.7 recovery set is
-45> 2014-02-20 06:17:44.869919 7fefe834e700 1 mds.0.7 need osdmap epoch 15, have 14
-44> 2014-02-20 06:17:44.869921 7fefe834e700 1 mds.0.7 waiting for osdmap 15 (which blacklists prior instance)
-43> 2014-02-20 06:17:44.869937 7fefe834e700 10 monclient: renew_subs
-42> 2014-02-20 06:17:44.869946 7fefe834e700 10 monclient: _send_mon_message to mon.a at 172.16.79.181:6789/0
-41> 2014-02-20 06:17:44.869959 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6789/0 -- mon_subscribe({mdsmap=18+,monmap=2+,osdmap=15}) v2 -- ?+0 0x3d338c0 con 0x3d2d580
-40> 2014-02-20 06:17:44.870848 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== mon.0 172.16.79.181:6789/0 15 ==== osd_map(15..15 src has 1..15) v3 ==== 342+0+0 (3715003977 0 0) 0x3d3f480 con 0x3d2d580
-39> 2014-02-20 06:17:44.871028 7fefe834e700 2 mds.0.7 boot_start 1: opening inotable
-38> 2014-02-20 06:17:44.871086 7fefe834e700 10 mds.0.inotable: load
-37> 2014-02-20 06:17:44.871318 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6805/6488 -- osd_op(mds.0.7:1 mds0_inotable [read 0~0] 1.b852b893 ack+read e15) v4 -- ?+0 0x3d15d80 con 0x3d2dc60
-36> 2014-02-20 06:17:44.871387 7fefe834e700 2 mds.0.7 boot_start 1: opening sessionmap
-35> 2014-02-20 06:17:44.871395 7fefe834e700 10 mds.0.sessionmap load
-34> 2014-02-20 06:17:44.871437 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6805/6488 -- osd_op(mds.0.7:2 mds0_sessionmap [read 0~0] 1.3270c60b ack+read e15) v4 -- ?+0 0x3d15b40 con 0x3d2dc60
-33> 2014-02-20 06:17:44.871489 7fefe834e700 2 mds.0.7 boot_start 1: opening anchor table
-32> 2014-02-20 06:17:44.871493 7fefe834e700 10 mds.0.anchortable: load
-31> 2014-02-20 06:17:44.871521 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6805/6488 -- osd_op(mds.0.7:3 mds_anchortable [read 0~0] 1.a977f6a7 ack+read e15) v4 -- ?+0 0x3d3fb40 con 0x3d2dc60
-30> 2014-02-20 06:17:44.871532 7fefe834e700 2 mds.0.7 boot_start 1: opening snap table
-29> 2014-02-20 06:17:44.871535 7fefe834e700 10 mds.0.snaptable: load
-28> 2014-02-20 06:17:44.871687 7fefe5247700 10 mds.0.7 MDS::ms_get_authorizer type=osd
-27> 2014-02-20 06:17:44.871691 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6810/6696 -- osd_op(mds.0.7:4 mds_snaptable [read 0~0] 1.d90270ad ack+read e15) v4 -- ?+0 0x3d3f900 con 0x3d2db00
-26> 2014-02-20 06:17:44.871708 7fefe834e700 2 mds.0.7 boot_start 1: opening mds log
-25> 2014-02-20 06:17:44.871710 7fefe834e700 5 mds.0.log open discovering log bounds
-24> 2014-02-20 06:17:44.871726 7fefe834e700 1 mds.0.journaler(ro) recover start
-23> 2014-02-20 06:17:44.871730 7fefe834e700 1 mds.0.journaler(ro) read_head
-22> 2014-02-20 06:17:44.871765 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6805/6488 -- osd_op(mds.0.7:5 200.00000000 [read 0~0] 1.844f3494 ack+read e15) v4 -- ?+0 0x3d3f6c0 con 0x3d2dc60
-21> 2014-02-20 06:17:44.871858 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== mon.0 172.16.79.181:6789/0 16 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (3705050423 0 0) 0x3d46380 con 0x3d2d580
-20> 2014-02-20 06:17:44.871875 7fefe834e700 10 monclient: handle_subscribe_ack sent 2014-02-20 06:17:44.869940 renew after 2014-02-20 06:20:14.869940
-19> 2014-02-20 06:17:44.872070 7fefe5146700 10 mds.0.7 MDS::ms_get_authorizer type=osd
-18> 2014-02-20 06:17:44.872293 7fefe834e700 5 mds.0.7 ms_handle_connect on 172.16.79.181:6805/6488
-17> 2014-02-20 06:17:44.872501 7fefe834e700 5 mds.0.7 ms_handle_connect on 172.16.79.181:6810/6696
-16> 2014-02-20 06:17:44.873581 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== osd.1 172.16.79.181:6805/6488 1 ==== osd_op_reply(1 mds0_inotable [read 0~0] v0'0 uv2 ondisk = 0) v6 ==== 180+0+34 (1740633793 0 1054821296) 0x3d44780 con 0x3d2dc60
-15> 2014-02-20 06:17:44.873668 7fefe834e700 10 mds.0.inotable: load_2 got 34 bytes
-14> 2014-02-20 06:17:44.873673 7fefe834e700 10 mds.0.inotable: load_2 loaded v0
-13> 2014-02-20 06:17:44.874193 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== osd.1 172.16.79.181:6805/6488 2 ==== osd_op_reply(3 mds_anchortable [read 0~0] v0'0 uv5 ondisk = 0) v6 ==== 182+0+34 (271931579 0 2191498079) 0x3d44c80 con 0x3d2dc60
-12> 2014-02-20 06:17:44.874241 7fefe834e700 10 mds.0.anchortable: load_2 got 34 bytes
-11> 2014-02-20 06:17:44.874245 7fefe834e700 10 mds.0.anchortable: load_2 loaded v0
-10> 2014-02-20 06:17:44.874269 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== osd.1 172.16.79.181:6805/6488 3 ==== osd_op_reply(2 mds0_sessionmap [read 0~0] v0'0 uv3 ondisk = 0) v6 ==== 182+0+22 (2821256041 0 3628653333) 0x3d44a00 con 0x3d2dc60
-9> 2014-02-20 06:17:44.874290 7fefe834e700 10 mds.0.sessionmap dump
-8> 2014-02-20 06:17:44.874295 7fefe834e700 10 mds.0.sessionmap _load_finish v 0, 0 sessions, 22 bytes
-7> 2014-02-20 06:17:44.874296 7fefe834e700 10 mds.0.sessionmap dump
-6> 2014-02-20 06:17:44.874639 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== osd.1 172.16.79.181:6805/6488 4 ==== osd_op_reply(5 200.00000000 [read 0~0] v0'0 uv1 ondisk = 0) v6 ==== 179+0+84 (2808334475 0 2960925567) 0x3d44780 con 0x3d2dc60
-5> 2014-02-20 06:17:44.874712 7fefe834e700 1 mds.0.journaler(ro) _finish_read_head loghead(trim 4194304, expire 4194304, write 4194304). probing for end of log (from 4194304)...
-4> 2014-02-20 06:17:44.874720 7fefe834e700 1 mds.0.journaler(ro) probing for end of the log
-3> 2014-02-20 06:17:44.874808 7fefe834e700 1 -- 172.16.79.181:6815/7059 --> 172.16.79.181:6805/6488 -- osd_op(mds.0.7:6 200.00000001 [stat] 1.6e5f474 ack+read+rwordered e15) v4 -- ?+0 0x3d3f480 con 0x3d2dc60
-2> 2014-02-20 06:17:44.875610 7fefe834e700 1 -- 172.16.79.181:6815/7059 <== osd.2 172.16.79.181:6810/6696 1 ==== osd_op_reply(4 mds_snaptable [read 0~0] v0'0 uv0 ack = -2 (No such file or directory)) v6 ==== 180+0+0 (2740985549 0 0) 0x3d44c80 con 0x3d2db00
-1> 2014-02-20 06:17:44.875689 7fefe834e700 10 mds.0.snaptable: load_2 could not read table: -2
0> 2014-02-20 06:17:44.883167 7fefe834e700 -1 mds/MDSTable.cc: In function 'void MDSTable::load_2(int, ceph::bufferlist&, Context*)' thread 7fefe834e700 time 2014-02-20 06:17:44.875695
mds/MDSTable.cc: 152: FAILED assert(r >= 0)
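The crash point is the table-load completion handler: MDSTable.cc line 152 asserts r >= 0, so any read error, including -2 (ENOENT) for a table object that was never written, aborts the daemon rather than being treated as "table not yet created". A minimal Python sketch of that control flow (hypothetical, loosely modeled on MDSTable::load_2; not the real implementation):

```python
import errno

def load_2(r, data):
    # Sketch of the completion callback: the real code asserts on any
    # negative return, so a missing table object (-ENOENT) is fatal.
    assert r >= 0, "mds/MDSTable.cc: 152: FAILED assert(r >= 0) (r=%d)" % r
    return data

# The snaptable read comes back -ENOENT because creation was interrupted
# before mds_snaptable was ever written:
err = None
try:
    load_2(-errno.ENOENT, b"")
except AssertionError as e:
    err = e
print(err)
```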
Log from the initial startup that was killed attached.
Updated by John Spray about 10 years ago
MDS -1 gid 1 starts in BOOTING, sends a beacon
MON prepare_beacon records its existence and puts it into state STANDBY rank -1
MDS -1 gid 1 goes into STANDBY
MON tick notices that in_mds < max_mds and selects gid 1 to expand the cluster, puts it into state CREATING and assigns rank 0
MDS 0 gid 1 goes into CREATING, starts creation
< terminate gid 1 here >
MON notices gid 1 has gone laggy
MDS -1 gid 2 starts in BOOTING, sends a beacon
MON prepare_beacon notices gid 2 has same name as gid 1, calls fail_mds_gid(1). rank 0 is now in the 'failed' list, gid 1 is erased from pending_mdsmap.info. At this point we've lost all memory that we were ever in the CREATING state.
MON prepare_beacon continues, puts gid 2 into state STANDBY
MDS -1 gid 2 goes into STANDBY
MON tick notices that failed is not empty, picks gid 2 as a replacement for rank 0. Sets state to REPLAY.
If we had disabled enforce_unique_name, or if the new gid had come from a different MDS daemon, then we would have hit an earlier laggy check in tick(), which does correctly carry forward the CREATING state from the gid it's replacing.
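The sequence above can be modeled with a toy simulation. This is a hypothetical, heavily simplified sketch (the function names follow the real MDSMonitor code, but the data structures are invented for illustration); it shows how the fail_mds_gid path in prepare_beacon discards the old gid's record, so tick() later assigns REPLAY where CREATING was needed:

```python
class MDSMap:
    def __init__(self):
        self.info = {}       # gid -> {"name", "rank", "state"}
        self.failed = set()  # ranks awaiting a replacement

    def fail_mds_gid(self, gid):
        # Erases the gid entirely: any state it held (e.g. CREATING) is lost;
        # only the bare rank number lands in the failed list.
        rank = self.info[gid]["rank"]
        if rank >= 0:
            self.failed.add(rank)
        del self.info[gid]

def prepare_beacon(pending, gid, name):
    # Unique-name enforcement: a new gid beaconing with an existing name
    # fails the old gid before registering the new one as a standby.
    for old_gid, info in list(pending.info.items()):
        if info["name"] == name and old_gid != gid:
            pending.fail_mds_gid(old_gid)  # CREATING state vanishes here
    pending.info[gid] = {"name": name, "rank": -1, "state": "STANDBY"}

def tick(pending):
    # Replacement path: hands a failed rank to a standby, but (before the
    # fix) unconditionally in state REPLAY -- it cannot know the rank was
    # still mid-creation.
    for rank in sorted(pending.failed):
        for info in pending.info.values():
            if info["state"] == "STANDBY":
                info.update(rank=rank, state="REPLAY")
                pending.failed.discard(rank)
                break

# Replay the timeline from the analysis:
m = MDSMap()
prepare_beacon(m, gid=1, name="a")            # gid 1 goes STANDBY
m.info[1].update(rank=0, state="CREATING")    # monitor starts creation
# <gid 1 killed; the restarted daemon beacons as gid 2 with the same name>
prepare_beacon(m, gid=2, name="a")            # fail_mds_gid(1) runs
tick(m)
print(m.info[2]["state"])  # REPLAY: rank 0 will replay a journal never created
```

Running it ends with gid 2 in REPLAY for rank 0, which is exactly the broken handoff: replay then trips over the objects (snaptable, journal) that the interrupted CREATING pass never wrote.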
Updated by John Spray about 10 years ago
- Status changed from New to Fix Under Review
- Assignee set to John Spray
Updated by Greg Farnum about 10 years ago
- Status changed from Fix Under Review to Resolved
We merged this to master in commit:9a040bfd46d141712c32aaa0fa8fc5de93336306, but I guess we missed closing out the ticket.