Actions
Bug #540
closedCephxClientHandler::handle_response
% Done:
0%
Spent time:
Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Saw this crash today after upgrading to the latest unstable:
Core was generated by `/usr/bin/cosd -i 5 -c /etc/ceph/ceph.conf'. Program terminated with signal 6, Aborted. #0 0x00007f9bd0120a75 in raise () from /lib/libc.so.6 (gdb) bt #0 0x00007f9bd0120a75 in raise () from /lib/libc.so.6 #1 0x00007f9bd0124545 in abort () from /lib/libc.so.6 #2 0x00007f9bd09d68e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6 #3 0x00007f9bd09d4d16 in ?? () from /usr/lib/libstdc++.so.6 #4 0x00007f9bd09d4d43 in std::terminate() () from /usr/lib/libstdc++.so.6 #5 0x00007f9bd09d4e3e in __cxa_throw () from /usr/lib/libstdc++.so.6 #6 0x00000000005c8a98 in ceph::__ceph_assert_fail (assertion=0x60b537 "0", file=<value optimized out>, line=162, func=<value optimized out>) at common/assert.cc:30 #7 0x00000000005e432c in CephxClientHandler::handle_response (this=<value optimized out>, ret=-792682944, indata=...) at auth/cephx/CephxClientHandler.cc:162 #8 0x00000000005ce27b in MonClient::handle_auth (this=0x7ffff01d6420, m=0x2255400) at mon/MonClient.cc:375 #9 0x00000000005d0733 in MonClient::ms_dispatch (this=0x7ffff01d6420, m=0x2255400) at mon/MonClient.cc:227 #10 0x0000000000463b39 in Messenger::ms_deliver_dispatch (this=0x11b7000) at msg/Messenger.h:97 #11 SimpleMessenger::dispatch_entry (this=0x11b7000) at msg/SimpleMessenger.cc:342 #12 0x000000000045abec in SimpleMessenger::DispatchThread::entry (this=0x11b7488) at msg/SimpleMessenger.h:570 #13 0x000000000046f1ba in Thread::_entry_func (arg=0x6ec2) at ./common/Thread.h:39 #14 0x00007f9bd121b9ca in start_thread () from /lib/libpthread.so.0 #15 0x00007f9bd01d370d in clone () from /lib/libc.so.6 #16 0x0000000000000000 in ?? ()
Last loglines:
2010-11-03 10:09:22.780031 7f9bc348f710 -- [2001:16f8:10:2::c3c3:3b6a]:6801/28353 >> [2001:16f8:10:2::c3c3:ab76]:6801/14744 pipe(0x26d7780 sd=12 pgs=167 cs=2 l=0).connect claims to be [2001:16f8:10:2::c3c3:ab76]:6801/17673 not [2001:16f8:10:2::c3c3:ab76]:6801/14744 - wrong node! 2010-11-03 10:09:23.553928 7f9bc3f9a710 -- [2001:16f8:10:2::c3c3:3b6a]:6801/28353 >> [2001:16f8:10:2::c3c3:2e56]:6801/21758 pipe(0x26a4a00 sd=12 pgs=201 cs=2 l=0).connect claims to be [2001:16f8:10:2::c3c3:2e56]:6801/3071 not [2001:16f8:10:2::c3c3:2e56]:6801/21758 - wrong node! 2010-11-03 10:09:26.016180 7f9bc94b0710 cephx client: handle_response ret = 0 2010-11-03 10:09:26.016245 7f9bc94b0710 cephx client: got initial server challenge 15029606718942791501 2010-11-03 10:09:26.016260 7f9bc94b0710 cephx client: build_request 2010-11-03 10:09:26.016270 7f9bc94b0710 cephx client: validate_tickets: want=37 need=32 have=5 2010-11-03 10:09:26.016287 7f9bc94b0710 cephx: validate_tickets want 37 have 5 need 32 2010-11-03 10:09:26.016298 7f9bc94b0710 cephx client: want=37 need=32 have=5 2010-11-03 10:09:26.016373 7f9bc94b0710 cephx client: old ticket len=96 2010-11-03 10:09:26.016386 7f9bc94b0710 cephx client: get auth session key: client_challenge 5687951232298511269 2010-11-03 10:09:26.998599 7f9bc7cad710 -- [2001:16f8:10:2::c3c3:3b6a]:6800/28353 >> [2001:16f8:10:2::c3c3:3f9b]:6789/0 pipe(0x369e000 sd=-1 pgs=0 cs=0 l=0).fault first fault 2010-11-03 10:09:35.966178 7f9bc94b0710 cephx client: handle_response ret = 0 2010-11-03 10:09:35.966245 7f9bc94b0710 cephx client: got initial server challenge 282574488338433 2010-11-03 10:09:35.966259 7f9bc94b0710 cephx client: build_request 2010-11-03 10:09:35.966268 7f9bc94b0710 cephx client: validate_tickets: want=37 need=32 have=5 2010-11-03 10:09:35.966286 7f9bc94b0710 cephx: validate_tickets want 37 have 5 need 32 2010-11-03 10:09:35.966297 7f9bc94b0710 cephx client: want=37 need=32 have=5 2010-11-03 10:09:35.966370 7f9bc94b0710 cephx client: old ticket len=96 2010-11-03 10:09:35.966382 7f9bc94b0710 cephx client: get auth session key: client_challenge 10140645574015473838 2010-11-03 10:09:35.966405 7f9bc94b0710 cephx client: handle_response ret = 0 2010-11-03 10:09:35.966416 7f9bc94b0710 cephx client: unknown request_type 27649 auth/cephx/CephxClientHandler.cc: In function 'virtual int CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)': auth/cephx/CephxClientHandler.cc:162: FAILED assert(0) ceph version 0.23~rc (commit:e304a2451a80f11117cb01031734e72077c88ce0) 1: (MonClient::handle_auth(MAuthReply*)+0x7b) [0x5ce27b] 2: (MonClient::ms_dispatch(Message*)+0x143) [0x5d0733] 3: (SimpleMessenger::dispatch_entry()+0x749) [0x463b39] 4: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45abec] 5: (Thread::_entry_func(void*)+0xa) [0x46f1ba] 6: (()+0x69ca) [0x7f9bd121b9ca] 7: (clone()+0x6d) [0x7f9bd01d370d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. *** Caught signal (ABRT) *** ceph version 0.23~rc (commit:e304a2451a80f11117cb01031734e72077c88ce0) 1: (sigabrt_handler(int)+0x7d) [0x5db9fd] 2: (()+0x33af0) [0x7f9bd0120af0] 3: (gsignal()+0x35) [0x7f9bd0120a75] 4: (abort()+0x180) [0x7f9bd01245c0] 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f9bd09d68e5] 6: (()+0xcad16) [0x7f9bd09d4d16] 7: (()+0xcad43) [0x7f9bd09d4d43] 8: (()+0xcae3e) [0x7f9bd09d4e3e] 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x448) [0x5c8a98] 10: (CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)+0x13c) [0x5e432c] 11: (MonClient::handle_auth(MAuthReply*)+0x7b) [0x5ce27b] 12: (MonClient::ms_dispatch(Message*)+0x143) [0x5d0733] 13: (SimpleMessenger::dispatch_entry()+0x749) [0x463b39] 14: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45abec] 15: (Thread::_entry_func(void*)+0xa) [0x46f1ba] 16: (()+0x69ca) [0x7f9bd121b9ca] 17: (clone()+0x6d) [0x7f9bd01d370d]
I've used cdebugpack to gather all the relevant information, it has been uploaded to logger.ceph.widodh.nl:/srv/ceph/issues/osd_crash_cephx_client_handler
Restarted the OSD, it didn't crash again afterwards.
Actions