When an MDS boots, it sends an mdsbeacon to the mon so that it gets added to the fsmap. But the monitor rejected the beacon because it considered the MDS incompatible with the fsmap:
2019-08-27T07:26:33.741+0000 7f7e93435700 5 mon.a@0(leader).mds e3 preprocess_beacon mdsbeacon(4302/a up:boot seq 1 v0) v7 from mds.? [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] compat={},rocompat={},incompat={}
2019-08-27T07:26:33.741+0000 7f7e93435700 1 mon.a@0(leader).mds e3 mds mds.? [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] can't write to fsmap compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
2019-08-27T07:26:33.741+0000 7f7e93435700 10 mon.a@0(leader) e1 no_reply to mds.? v2:172.21.4.106:6810/3881629364 mdsbeacon(4302/a up:boot seq 1 v0) v7
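To make the rejection concrete, here is a minimal sketch of the kind of check that would produce it. It assumes the rule is simply "the MDS must advertise every incompat feature the fsmap requires"; this is a model for illustration, not Ceph's actual CompatSet code, and the names can_write and missing_features are made up. The feature ids are copied from the log lines above.

```python
# Hypothetical model of the compat check; not Ceph's actual CompatSet code.
# Feature ids and names are copied from the "can't write to fsmap" log line.
FSMAP_INCOMPAT = {
    1: "base v0.20",
    2: "client writeable ranges",
    3: "default file layouts on dirs",
    4: "dir inode in separate object",
    5: "mds uses versioned encoding",
    6: "dirfrag is stored in omap",
    8: "no anchor table",
    9: "file layout v2",
    10: "snaprealm v2",
}


def can_write(daemon_incompat, map_incompat):
    """Assumed rule: the daemon must advertise every incompat feature of the map."""
    return set(map_incompat) <= set(daemon_incompat)


def missing_features(daemon_incompat, map_incompat):
    """Feature ids required by the map but not advertised by the daemon."""
    return sorted(set(map_incompat) - set(daemon_incompat))


# The beacon in the log carried incompat={} (an empty set).
beacon_incompat = {}
print(can_write(beacon_incompat, FSMAP_INCOMPAT))
print(missing_features(beacon_incompat, FSMAP_INCOMPAT))
```

Under that assumption, a beacon with an empty incompat set can never pass against this fsmap, which matches the rejection logged by the mon.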
So even though the MDS managed to get added to the daemon registry on the mgr, it could not update its status: when the mgr gets an update from an MDS, it queries the mon to see whether that MDS is in the fsmap, and the mon replied that the MDS did not exist:
2019-08-27T07:26:33.744+0000 7f7e93435700 1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] <== mgr.4111 172.21.4.106:0/6595 77 ==== mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 ==== 80+0+0 (crc 0 0 0) 0x55678618da00 con 0x556785de7400
2019-08-27T07:26:33.744+0000 7f7e93435700 20 mon.a@0(leader) e1 _ms_dispatch existing session 0x556785e47500 for mgr.4111
2019-08-27T07:26:33.744+0000 7f7e93435700 20 mon.a@0(leader) e1 caps allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700 0 mon.a@0(leader) e1 handle_command mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1
2019-08-27T07:26:33.744+0000 7f7e93435700 20 is_capable service=mds command=mds metadata read addr 172.21.4.106:0/6595 on cap allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700 20 allow so far , doing grant allow profile mgr
2019-08-27T07:26:33.744+0000 7f7e93435700 20 match
2019-08-27T07:26:33.744+0000 7f7e93435700 10 mon.a@0(leader) e1 _allowed_command capable
2019-08-27T07:26:33.744+0000 7f7e93435700 0 log_channel(audit) log [DBG] : from='mgr.4111 172.21.4.106:0/6595' entity='mgr.x' cmd=[{"prefix": "mds metadata", "who": "a"}]: dispatch
2019-08-27T07:26:33.744+0000 7f7e93435700 1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] --> [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] -- log(1 entries from seq 120 at 2019-08-27T07:26:33.745580+0000) v1 -- 0x556785f44240 con 0x5567850f1400
2019-08-27T07:26:33.745+0000 7f7e93435700 10 mon.a@0(leader).paxosservice(mdsmap 1..3) dispatch 0x55678618da00 mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 from mgr.4111 172.21.4.106:0/6595 con 0x556785de7400
2019-08-27T07:26:33.745+0000 7f7e93435700 5 mon.a@0(leader).paxos(paxos active c 1..71) is_readable = 1 - now=2019-08-27T07:26:33.745641+0000 lease_expire=2019-08-27T07:26:38.569920+0000 has v0 lc 71
2019-08-27T07:26:33.745+0000 7f7e93435700 10 mon.a@0(leader).mds e3 preprocess_query mon_command({"prefix": "mds metadata", "who": "a"} v 0) v1 from mgr.4111 172.21.4.106:0/6595
2019-08-27T07:26:33.745+0000 7f7e93435700 1 mon.a@0(leader).mds e3 all = 0
2019-08-27T07:26:33.745+0000 7f7e93435700 2 mon.a@0(leader) e1 send_reply 0x556785f5fc70 0x55678618d800 mon_command_ack([{"prefix": "mds metadata", "who": "a"}]=-22 MDS named 'a' does not exist, or is not up v3) v1
2019-08-27T07:26:33.745+0000 7f7e93435700 1 -- [v2:172.21.4.106:3300/0,v1:172.21.4.106:6789/0] --> 172.21.4.106:0/6595 -- mon_command_ack([{"prefix": "mds metadata", "who": "a"}]=-22 MDS named 'a' does not exist, or is not up v3) v1 -- 0x55678618d800 con 0x556785de7400
That is why the mgr failed to return the MDS's daemon status, even though the corresponding MDS had already sent it a status report:
2019-08-27T07:26:33.744+0000 7f1493af3700 10 mgr.server handle_open from 0x55a183210000 mds,a
2019-08-27T07:26:33.744+0000 7f1493af3700 1 -- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] --> [v2:172.21.4.106:6810/3881629364,v1:172.21.4.106:6812/3881629364] -- mgrconfigure(period=5, threshold=5) v3 -- 0x55a1830e6e00 con 0x55a183210000
2019-08-27T07:26:33.744+0000 7f14b0e45700 1 --2- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] >> [v2:172.21.4.106:6811/1098026670,v1:172.21.4.106:6813/1098026670] conn(0x55a182f3d800 0x55a18329e000 crc :-1 s=READY pgs=4 cs=0 l=1 rx=0 tx=0).ready entity=mds.? client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2019-08-27T07:26:33.744+0000 7f1493af3700 1 -- [v2:172.21.4.106:6800/6595,v1:172.21.4.106:6801/6595] <== mds.? v2:172.21.4.106:6810/3881629364 2 ==== mgrreport(mds.a +0-0 packed 6) v8 ==== 41+0+0 (crc 0 0 0) 0x55a18058f180 con 0x55a183210000
2019-08-27T07:26:33.744+0000 7f1493af3700 10 mgr.server handle_report from 0x55a183210000 mds,a
2019-08-27T07:26:33.744+0000 7f1493af3700 5 mgr.server handle_report rejecting report from mds,a, since we do not have its metadata now.
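The sequence in the mon and mgr logs can be modelled roughly as below. This is only a toy sketch of the observed behaviour: the Mon and Mgr classes and their methods are hypothetical names chosen to mirror the log messages, not real Ceph interfaces, and the metadata dict in the usage part is made up.

```python
# Hypothetical model of the sequence seen in the logs; not Ceph code.
EINVAL = 22


class Mon:
    def __init__(self):
        # MDS name -> metadata, only for daemons accepted into the fsmap.
        self.fsmap = {}

    def mds_metadata(self, who):
        """Model of the "mds metadata" command: -EINVAL if the MDS is unknown."""
        if who not in self.fsmap:
            return -EINVAL, None
        return 0, self.fsmap[who]


class Mgr:
    def __init__(self, mon):
        self.mon = mon
        self.metadata = {}

    def handle_open(self, name):
        """On a new MDS connection, ask the mon for the daemon's metadata."""
        rc, md = self.mon.mds_metadata(name)
        if rc == 0:
            self.metadata[name] = md
        return rc

    def handle_report(self, name):
        """Accept a report only if the daemon's metadata is already known."""
        return name in self.metadata


mon = Mon()
mgr = Mgr(mon)
print(mgr.handle_open("a"))    # the mon never added mds.a to the fsmap
print(mgr.handle_report("a"))  # so the report is rejected
```

In this model the report keeps getting rejected until the mon accepts the beacon and the MDS shows up in the fsmap, which is consistent with the mgr log above.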
Probably this is a bug on the MDS side?
See /a/kchai-2019-08-27_06:58:14-rados-wip-kefu-testing-2019-08-27-1029-distro-basic-mira/4256553