Bug #3590
closedmds segfault at MonClient::wait_auth_rotating during upgrade
0%
Description
Upgrade from ceph argonaut v0.48.2 to v0.55
Test steps:
1. Running a local cluster burnupi06[mds], burnupi07[osds], burnupi08[mon] on argonaut 0.48.2 [ceph.conf has no mention about ceph auth, so by default it was cephx disabled]
2. upgrading the mds node to v0.55 and restarting mds daemon segfaults at MonClient::wait_auth_rotating
ceph version 0.55-159-gefc6614 (efc66148837d614dc9489e3675aaa05bb2d9b9c2)
1: /usr/bin/ceph-mds() [0x815a9a]
2: (()+0xfcb0) [0x7fac57cf1cb0]
3: (MonClient::wait_auth_rotating(double)+0xb6) [0x73b066]
4: (MDS::init(int)+0x1433) [0x4ca033]
5: (main()+0xfed) [0x4a9c4d]
6: (__libc_start_main()+0xed) [0x7fac563cd76d]
7: /usr/bin/ceph-mds() [0x4abda9]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ceph.conf
[osd]
osd journal size = 1000
filestore_xattr_use_omap = true
[osd.0]
host = burnupi07
[osd.1]
host = burnupi07
[mon.a]
host = burnupi08
mon addr = 10.214.134.36:6789
[mds.a]
host = burnupi06
3. Tried to set ceph auth by including the line "auth client required = cephx,none" on mds node and restarted mds but segfaults again.
However, continued upgrading osd and mon nodes to v0.55 with "auth client required = cephx,none" set in ceph.conf and it is working fine
Updated by Tamilarasi muthamizhan over 11 years ago
I restarted the running cluster after removing the entry "auth client required=cephx,none" in ceph.conf and it is still working fine.
So, the issue is not with the auth setting in ceph.conf , it is about the order in which the nodes have to be upgraded.
Updated by Greg Farnum over 11 years ago
Can you reproduce this with debug ms = 10, debug auth = 10, debug mds = 10 on?
Updated by Anonymous over 11 years ago
Not from my end. I'm not working at Inktank, today. I'll be in tomorrow and I hope to make progress on it then.
Updated by Tamilarasi muthamizhan over 11 years ago
I am planning to reproduce it with debug on and show it to Greg
Updated by Yehuda Sadeh over 11 years ago
- Assignee changed from Anonymous to Yehuda Sadeh
Updated by Tamilarasi muthamizhan over 11 years ago
I was able to reproduce this issue on burnupi06[running mds], burnupi07[osds], burnupi08[mon] cluster with debugs on.
Leaving the cluster nodes as it is for anyone who wants to take a look at it.
pasting below the mds log:
ubuntu@burnupi06:/etc/ceph$ tail -f /var/log/ceph/ceph-mds.a.log 2012-12-10 19:42:37.986712 7f070b5ab700 10 -- 10.214.133.8:6800/54651 reaper done 2012-12-10 19:42:37.986722 7f070b5ab700 10 -- 10.214.133.8:6800/54651 reaper_entry done 2012-12-10 19:42:37.986745 7f070f52b780 10 -- 10.214.133.8:6800/54651 wait: closing pipes 2012-12-10 19:42:37.986752 7f070f52b780 10 -- 10.214.133.8:6800/54651 reaper 2012-12-10 19:42:37.986756 7f070f52b780 10 -- 10.214.133.8:6800/54651 reaper done 2012-12-10 19:42:37.986757 7f070f52b780 10 -- 10.214.133.8:6800/54651 wait: waiting for pipes to close 2012-12-10 19:42:37.986760 7f070f52b780 10 -- 10.214.133.8:6800/54651 >> :/0 pipe(0x31a5000 sd=-1 pgs=0 cs=0 l=0).discard_queue 2012-12-10 19:42:37.986766 7f070f52b780 10 -- 10.214.133.8:6800/54651 wait: done. 2012-12-10 19:42:37.986768 7f070f52b780 1 -- 10.214.133.8:6800/54651 shutdown complete. 2012-12-10 19:42:37.986809 7f070f52b780 0 stopped. 2012-12-10 19:45:48.224683 7f66ce07f780 1 -- 0.0.0.0:6800/55867 messenger.start 2012-12-10 19:45:48.224722 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper_entry start 2012-12-10 19:45:48.224741 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper 2012-12-10 19:45:48.224745 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper done 2012-12-10 19:45:48.224804 7f66ce07f780 5 adding auth protocol: cephx 2012-12-10 19:45:48.224810 7f66ce07f780 5 adding auth protocol: cephx 2012-12-10 19:45:48.224870 7f66ce07f780 10 mds.-1.0 168 MDSCacheObject 2012-12-10 19:45:48.224885 7f66ce07f780 10 mds.-1.0 2144 CInode 2012-12-10 19:45:48.224886 7f66ce07f780 10 mds.-1.0 16 elist<>::item *7=112 2012-12-10 19:45:48.224887 7f66ce07f780 10 mds.-1.0 360 inode_t 2012-12-10 19:45:48.224889 7f66ce07f780 10 mds.-1.0 56 nest_info_t 2012-12-10 19:45:48.224890 7f66ce07f780 10 mds.-1.0 32 frag_info_t 2012-12-10 19:45:48.224895 7f66ce07f780 10 mds.-1.0 40 SimpleLock *5=200 2012-12-10 19:45:48.224896 7f66ce07f780 10 mds.-1.0 48 ScatterLock *3=144 2012-12-10 19:45:48.224897 7f66ce07f780 10 mds.-1.0 472 CDentry 2012-12-10 19:45:48.224898 7f66ce07f780 10 mds.-1.0 16 elist<>::item 2012-12-10 19:45:48.224899 7f66ce07f780 10 mds.-1.0 40 SimpleLock 2012-12-10 19:45:48.224900 7f66ce07f780 10 mds.-1.0 1016 CDir 2012-12-10 19:45:48.224901 7f66ce07f780 10 mds.-1.0 16 elist<>::item *2=32 2012-12-10 19:45:48.224902 7f66ce07f780 10 mds.-1.0 192 fnode_t 2012-12-10 19:45:48.224903 7f66ce07f780 10 mds.-1.0 56 nest_info_t *2 2012-12-10 19:45:48.224904 7f66ce07f780 10 mds.-1.0 32 frag_info_t *2 2012-12-10 19:45:48.224906 7f66ce07f780 10 mds.-1.0 168 Capability 2012-12-10 19:45:48.224906 7f66ce07f780 10 mds.-1.0 32 xlist<>::item *2=64 2012-12-10 19:45:48.224909 7f66ce07f780 10 -- 0.0.0.0:6800/55867 ready 0.0.0.0:6800/55867 2012-12-10 19:45:48.224936 7f66ce07f780 1 accepter.accepter.start 2012-12-10 19:45:48.224962 7f66c90fd700 10 accepter.accepter starting 2012-12-10 19:45:48.225133 7f66ce07f780 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mds/ceph-a/keyring 2012-12-10 19:45:48.225188 7f66ce07f780 5 adding auth protocol: cephx 2012-12-10 19:45:48.225207 7f66ce07f780 10 -- 0.0.0.0:6800/55867 connect_rank to 10.214.134.36:6789/0, creating pipe and registering 2012-12-10 19:45:48.225239 7f66ce07f780 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).register_pipe 2012-12-10 19:45:48.225256 7f66ce07f780 10 -- 0.0.0.0:6800/55867 get_connection mon.0 10.214.134.36:6789/0 new 0x2e806c0 2012-12-10 19:45:48.225245 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).writer: state = connecting policy.server=0 2012-12-10 19:45:48.225266 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).connect 0 2012-12-10 19:45:48.225282 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :0 pgs=0 cs=0 l=1).connecting to 10.214.134.36:6789/0 2012-12-10 19:45:48.225307 7f66ce07f780 1 -- 0.0.0.0:6800/55867 --> 10.214.134.36:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 0x2e87400 con 0x2e95580 2012-12-10 19:45:48.225699 7f66ce07b700 1 -- 10.214.133.8:6800/55867 learned my addr 10.214.133.8:6800/55867 2012-12-10 19:45:48.225720 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=0 cs=0 l=1).connect sent my addr 10.214.133.8:6800/55867 2012-12-10 19:45:48.225731 7f66ce07b700 10 mds.-1.0 MDS::ms_get_authorizer type=mon 2012-12-10 19:45:48.225735 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=0 cs=0 l=1).connect sending gseq=1 cseq=0 proto=15 2012-12-10 19:45:48.225919 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).connect success 1, lossy = 1, features 327679 2012-12-10 19:45:48.225978 7f66c98fe700 0 mds.-1.0 ms_handle_connect on 10.214.134.36:6789/0 2012-12-10 19:45:48.225998 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 2012-12-10 19:45:48.226032 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 2012-12-10 19:45:48.226308 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).reader wants 24 from dispatch throttler 0/104857600 2012-12-10 19:45:48.226329 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).aborted = 0 2012-12-10 19:45:48.226345 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).No session security set 2012-12-10 19:45:48.226353 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).reader got message 1 0x2e87600 auth_reply(proto 0 -95 Operation not supported) v1 2012-12-10 19:45:48.226381 7f66c98fe700 1 -- 10.214.133.8:6800/55867 <== mon.0 10.214.134.36:6789/0 1 ==== auth_reply(proto 0 -95 Operation not supported) v1 ==== 24+0+0 (3632875112 0 0) 0x2e87600 con 0x2e95580 2012-12-10 19:45:48.226385 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 2012-12-10 19:45:48.226402 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).write_ack 1 2012-12-10 19:45:48.226407 7f66c98fe700 10 -- 10.214.133.8:6800/55867 dispatch_throttle_release 24 to dispatch throttler 24/104857600 2012-12-10 19:45:48.226415 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 2012-12-10 19:45:48.226970 7f66ce07f780 -1 *** Caught signal (Segmentation fault) ** in thread 7f66ce07f780 ceph version 0.55-192-ge4d0aea (e4d0aeace187afee6313d803511ba595e91dd7ed) 1: /usr/bin/ceph-mds() [0x816dea] 2: (()+0xfcb0) [0x7f66cdc62cb0] 3: (MonClient::wait_auth_rotating(double)+0xb6) [0x73c216] 4: (MDS::init(int)+0x1433) [0x4c9fa3] 5: (main()+0xfed) [0x4a9b9d] 6: (__libc_start_main()+0xed) [0x7f66cc33e76d] 7: /usr/bin/ceph-mds() [0x4abcf9] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- begin dump of recent events --- -87> 2012-12-10 19:45:48.222426 7f66ce07f780 5 asok(0x2e81000) register_command perfcounters_dump hook 0x2e76010 -86> 2012-12-10 19:45:48.222449 7f66ce07f780 5 asok(0x2e81000) register_command 1 hook 0x2e76010 -85> 2012-12-10 19:45:48.222453 7f66ce07f780 5 asok(0x2e81000) register_command perf dump hook 0x2e76010 -84> 2012-12-10 19:45:48.222458 7f66ce07f780 5 asok(0x2e81000) register_command perfcounters_schema hook 0x2e76010 -83> 2012-12-10 19:45:48.222462 7f66ce07f780 5 asok(0x2e81000) register_command 2 hook 0x2e76010 -82> 2012-12-10 19:45:48.222464 7f66ce07f780 5 asok(0x2e81000) register_command perf schema hook 0x2e76010 -81> 2012-12-10 19:45:48.222468 7f66ce07f780 5 asok(0x2e81000) register_command config show hook 0x2e76010 -80> 2012-12-10 19:45:48.222472 7f66ce07f780 5 asok(0x2e81000) register_command config set hook 0x2e76010 -79> 2012-12-10 19:45:48.222475 7f66ce07f780 5 asok(0x2e81000) register_command log flush hook 0x2e76010 -78> 2012-12-10 19:45:48.222479 7f66ce07f780 5 asok(0x2e81000) register_command log dump hook 0x2e76010 -77> 2012-12-10 19:45:48.222483 7f66ce07f780 5 asok(0x2e81000) register_command log reopen hook 0x2e76010 -76> 2012-12-10 19:45:48.223534 7f66ce07f780 0 ceph version 0.55-192-ge4d0aea (e4d0aeace187afee6313d803511ba595e91dd7ed), process ceph-mds, pid 55867 -75> 2012-12-10 19:45:48.223914 7f66ce07f780 10 -- :/0 rank.bind :/0 -74> 2012-12-10 19:45:48.223921 7f66ce07f780 10 accepter.accepter.bind -73> 2012-12-10 19:45:48.223935 7f66ce07f780 10 accepter.accepter.bind bound on random port 0.0.0.0:6800/0 -72> 2012-12-10 19:45:48.223945 7f66ce07f780 10 accepter.accepter.bind bound to 0.0.0.0:6800/0 -71> 2012-12-10 19:45:48.223955 7f66ce07f780 1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6800/55867 need_addr=1 -70> 2012-12-10 19:45:48.224368 7f66ce07f780 1 finished global_init_daemonize -69> 2012-12-10 19:45:48.224524 7f66ce07f780 5 asok(0x2e81000) init /var/run/ceph/ceph-mds.a.asok -68> 2012-12-10 19:45:48.224540 7f66ce07f780 5 asok(0x2e81000) bind_and_listen /var/run/ceph/ceph-mds.a.asok -67> 2012-12-10 19:45:48.224572 7f66ce07f780 5 asok(0x2e81000) register_command 0 hook 0x2e75038 -66> 2012-12-10 19:45:48.224579 7f66ce07f780 5 asok(0x2e81000) register_command version hook 0x2e75038 -65> 2012-12-10 19:45:48.224585 7f66ce07f780 5 asok(0x2e81000) register_command git_version hook 0x2e75038 -64> 2012-12-10 19:45:48.224590 7f66ce07f780 5 asok(0x2e81000) register_command help hook 0x2e76050 -63> 2012-12-10 19:45:48.224631 7f66ce07f780 10 monclient(hunting): build_initial_monmap -62> 2012-12-10 19:45:48.224620 7f66cb101700 5 asok(0x2e81000) entry start -61> 2012-12-10 19:45:48.224683 7f66ce07f780 1 -- 0.0.0.0:6800/55867 messenger.start -60> 2012-12-10 19:45:48.224722 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper_entry start -59> 2012-12-10 19:45:48.224741 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper -58> 2012-12-10 19:45:48.224745 7f66ca900700 10 -- 0.0.0.0:6800/55867 reaper done -57> 2012-12-10 19:45:48.224804 7f66ce07f780 5 adding auth protocol: cephx -56> 2012-12-10 19:45:48.224810 7f66ce07f780 5 adding auth protocol: cephx -55> 2012-12-10 19:45:48.224870 7f66ce07f780 10 mds.-1.0 168 MDSCacheObject -54> 2012-12-10 19:45:48.224885 7f66ce07f780 10 mds.-1.0 2144 CInode -53> 2012-12-10 19:45:48.224886 7f66ce07f780 10 mds.-1.0 16 elist<>::item *7=112 -52> 2012-12-10 19:45:48.224887 7f66ce07f780 10 mds.-1.0 360 inode_t -51> 2012-12-10 19:45:48.224889 7f66ce07f780 10 mds.-1.0 56 nest_info_t -50> 2012-12-10 19:45:48.224890 7f66ce07f780 10 mds.-1.0 32 frag_info_t -49> 2012-12-10 19:45:48.224895 7f66ce07f780 10 mds.-1.0 40 SimpleLock *5=200 -48> 2012-12-10 19:45:48.224896 7f66ce07f780 10 mds.-1.0 48 ScatterLock *3=144 -47> 2012-12-10 19:45:48.224897 7f66ce07f780 10 mds.-1.0 472 CDentry -46> 2012-12-10 19:45:48.224898 7f66ce07f780 10 mds.-1.0 16 elist<>::item -45> 2012-12-10 19:45:48.224899 7f66ce07f780 10 mds.-1.0 40 SimpleLock -44> 2012-12-10 19:45:48.224900 7f66ce07f780 10 mds.-1.0 1016 CDir -43> 2012-12-10 19:45:48.224901 7f66ce07f780 10 mds.-1.0 16 elist<>::item *2=32 -42> 2012-12-10 19:45:48.224902 7f66ce07f780 10 mds.-1.0 192 fnode_t -41> 2012-12-10 19:45:48.224903 7f66ce07f780 10 mds.-1.0 56 nest_info_t *2 -40> 2012-12-10 19:45:48.224904 7f66ce07f780 10 mds.-1.0 32 frag_info_t *2 -39> 2012-12-10 19:45:48.224906 7f66ce07f780 10 mds.-1.0 168 Capability -38> 2012-12-10 19:45:48.224906 7f66ce07f780 10 mds.-1.0 32 xlist<>::item *2=64 -37> 2012-12-10 19:45:48.224909 7f66ce07f780 10 -- 0.0.0.0:6800/55867 ready 0.0.0.0:6800/55867 -36> 2012-12-10 19:45:48.224936 7f66ce07f780 1 accepter.accepter.start -35> 2012-12-10 19:45:48.224959 7f66ce07f780 10 monclient(hunting): init -34> 2012-12-10 19:45:48.224962 7f66c90fd700 10 accepter.accepter starting -33> 2012-12-10 19:45:48.225133 7f66ce07f780 2 auth: KeyRing::load: loaded key file /var/lib/ceph/mds/ceph-a/keyring -32> 2012-12-10 19:45:48.225188 7f66ce07f780 5 adding auth protocol: cephx -31> 2012-12-10 19:45:48.225192 7f66ce07f780 10 monclient(hunting): auth_supported 2 method cephx -30> 2012-12-10 19:45:48.225201 7f66ce07f780 10 monclient(hunting): _reopen_session -29> 2012-12-10 19:45:48.225207 7f66ce07f780 10 -- 0.0.0.0:6800/55867 connect_rank to 10.214.134.36:6789/0, creating pipe and registering -28> 2012-12-10 19:45:48.225239 7f66ce07f780 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).register_pipe -27> 2012-12-10 19:45:48.225256 7f66ce07f780 10 -- 0.0.0.0:6800/55867 get_connection mon.0 10.214.134.36:6789/0 new 0x2e806c0 -26> 2012-12-10 19:45:48.225245 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).writer: state = connecting policy.server=0 -25> 2012-12-10 19:45:48.225266 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=-1 :0 pgs=0 cs=0 l=1).connect 0 -24> 2012-12-10 19:45:48.225282 7f66ce07b700 10 -- 0.0.0.0:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :0 pgs=0 cs=0 l=1).connecting to 10.214.134.36:6789/0 -23> 2012-12-10 19:45:48.225281 7f66ce07f780 10 monclient(hunting): _pick_new_mon picked mon.a con 0x2e95580 addr 10.214.134.36:6789/0 -22> 2012-12-10 19:45:48.225302 7f66ce07f780 10 monclient(hunting): _send_mon_message to mon.a at 10.214.134.36:6789/0 -21> 2012-12-10 19:45:48.225307 7f66ce07f780 1 -- 0.0.0.0:6800/55867 --> 10.214.134.36:6789/0 -- auth(proto 0 26 bytes epoch 0) v1 -- ?+0 0x2e87400 con 0x2e95580 -20> 2012-12-10 19:45:48.225320 7f66ce07f780 10 monclient(hunting): renew_subs -19> 2012-12-10 19:45:48.225699 7f66ce07b700 1 -- 10.214.133.8:6800/55867 learned my addr 10.214.133.8:6800/55867 -18> 2012-12-10 19:45:48.225720 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=0 cs=0 l=1).connect sent my addr 10.214.133.8:6800/55867 -17> 2012-12-10 19:45:48.225731 7f66ce07b700 10 mds.-1.0 MDS::ms_get_authorizer type=mon -16> 2012-12-10 19:45:48.225735 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=0 cs=0 l=1).connect sending gseq=1 cseq=0 proto=15 -15> 2012-12-10 19:45:48.225919 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).connect success 1, lossy = 1, features 327679 -14> 2012-12-10 19:45:48.225978 7f66c98fe700 0 mds.-1.0 ms_handle_connect on 10.214.134.36:6789/0 -13> 2012-12-10 19:45:48.225998 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 -12> 2012-12-10 19:45:48.226032 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 -11> 2012-12-10 19:45:48.226308 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).reader wants 24 from dispatch throttler 0/104857600 -10> 2012-12-10 19:45:48.226329 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).aborted = 0 -9> 2012-12-10 19:45:48.226345 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).No session security set -8> 2012-12-10 19:45:48.226353 7f66c78fa700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).reader got message 1 0x2e87600 auth_reply(proto 0 -95 Operation not supported) v1 -7> 2012-12-10 19:45:48.226381 7f66c98fe700 1 -- 10.214.133.8:6800/55867 <== mon.0 10.214.134.36:6789/0 1 ==== auth_reply(proto 0 -95 Operation not supported) v1 ==== 24+0+0 (3632875112 0 0) 0x2e87600 con 0x2e95580 -6> 2012-12-10 19:45:48.226385 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 -5> 2012-12-10 19:45:48.226401 7f66c98fe700 10 monclient(hunting): no handler for protocol 0 -4> 2012-12-10 19:45:48.226403 7f66c98fe700 10 monclient(hunting): none of our auth protocols are supported by the server -3> 2012-12-10 19:45:48.226402 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).write_ack 1 -2> 2012-12-10 19:45:48.226407 7f66c98fe700 10 -- 10.214.133.8:6800/55867 dispatch_throttle_release 24 to dispatch throttler 24/104857600 -1> 2012-12-10 19:45:48.226415 7f66ce07b700 10 -- 10.214.133.8:6800/55867 >> 10.214.134.36:6789/0 pipe(0x2e806c0 sd=16 :57301 pgs=7 cs=1 l=1).writer: state = open policy.server=0 0> 2012-12-10 19:45:48.226970 7f66ce07f780 -1 *** Caught signal (Segmentation fault) ** in thread 7f66ce07f780 ceph version 0.55-192-ge4d0aea (e4d0aeace187afee6313d803511ba595e91dd7ed) 1: /usr/bin/ceph-mds() [0x816dea] 2: (()+0xfcb0) [0x7f66cdc62cb0] 3: (MonClient::wait_auth_rotating(double)+0xb6) [0x73c216] 4: (MDS::init(int)+0x1433) [0x4c9fa3] 5: (main()+0xfed) [0x4a9b9d] 6: (__libc_start_main()+0xed) [0x7f66cc33e76d] 7: /usr/bin/ceph-mds() [0x4abcf9] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. --- logging levels --- 0/ 5 none 0/ 1 lockdep 0/ 1 context 1/ 1 crush 10/10 mds 1/ 5 mds_balancer 1/ 5 mds_locker 1/ 5 mds_log 1/ 5 mds_log_expire 1/ 5 mds_migrator 0/ 1 buffer 0/ 1 timer 0/ 1 filer 0/ 1 striper 0/ 1 objecter 0/ 5 rados 0/ 5 rbd 0/ 5 journaler 0/ 5 objectcacher 0/ 5 client 0/ 5 osd 0/ 5 optracker 0/ 5 objclass 1/ 3 filestore 1/ 3 journal 10/10 ms 1/ 5 mon 0/10 monc 0/ 5 paxos 0/ 5 tp 10/10 auth 1/ 5 crypto 1/ 1 finisher 1/ 5 heartbeatmap 1/ 5 perfcounter 1/ 5 rgw 1/ 5 hadoop 1/ 5 javaclient 1/ 5 asok 1/ 1 throttle -2/-2 (syslog threshold) -1/-1 (stderr threshold) max_recent 100000 max_new 1000 log_file /var/log/ceph/ceph-mds.a.log --- end dump of recent events --- ubuntu@burnupi06:~$ sudo service ceph start === mds.a === Starting Ceph mds.a on burnupi06... starting mds.a at :/0 ubuntu@burnupi06:~$ sudo service ceph status === mds.a === mds.a: not running.
Updated by Yehuda Sadeh over 11 years ago
mon is running argonaut, mds is running 0.55+. The problem is that mds doesn't handle gracefully a case where mon rejects the connection due to auth protocol mismatch. This happens due to changes in the default selected auth protocol.
Updated by Yehuda Sadeh over 11 years ago
- Status changed from New to Resolved
Fixed, 29307d3b32e523da7f25a6c3bc904749790b1e18. Fix would just make it exit gracefully.