Project

General

Profile

Actions

Bug #1151

closed

OSD: CephxClientHandler::handle_response

Added by Wido den Hollander almost 13 years ago. Updated almost 13 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
cephx
Target version:
% Done:

0%

Spent time:
Source:
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I just saw a bunch of OSD's going down with:

(gdb) bt
#0  0x00007f609f8e97bb in raise () from /lib/libpthread.so.0
#1  0x000000000063d443 in reraise_fatal (signum=4000) at common/signal.cc:61
#2  0x000000000063e55b in handle_fatal_signal (signum=6) at common/signal.cc:108
#3  <signal handler called>
#4  0x00007f609e4b9a75 in raise () from /lib/libc.so.6
#5  0x00007f609e4bd5c0 in abort () from /lib/libc.so.6
#6  0x00007f609ed6f8e5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#7  0x00007f609ed6dd16 in ?? () from /usr/lib/libstdc++.so.6
#8  0x00007f609ed6dd43 in std::terminate() () from /usr/lib/libstdc++.so.6
#9  0x00007f609ed6de3e in __cxa_throw () from /usr/lib/libstdc++.so.6
#10 0x00000000006077e1 in ceph::__ceph_assert_fail (assertion=<value optimized out>, file=<value optimized out>, line=<value optimized out>, 
    func=0x6c29a0 "virtual int CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)") at common/assert.cc:50
#11 0x0000000000657b09 in CephxClientHandler::handle_response (this=0x91b630, ret=-1627771328, indata=...) at auth/cephx/CephxClientHandler.cc:162
#12 0x000000000062fa58 in MonClient::handle_auth (this=0x7fff1a1b5fe0, m=0x6793200) at mon/MonClient.cc:438
#13 0x0000000000630ebc in MonClient::ms_dispatch (this=0x7fff1a1b5fe0, m=0x6793200) at mon/MonClient.cc:272
#14 0x00000000006245a3 in Messenger::ms_deliver_dispatch (this=0x1cfb000) at msg/Messenger.h:98
#15 SimpleMessenger::dispatch_entry (this=0x1cfb000) at msg/SimpleMessenger.cc:353
#16 0x000000000048fd9c in SimpleMessenger::DispatchThread::entry (this=0x1cfb488) at msg/SimpleMessenger.h:544
#17 0x00007f609f8e09ca in start_thread () from /lib/libpthread.so.0
#18 0x00007f609e56c70d in clone () from /lib/libc.so.6
#19 0x0000000000000000 in ?? ()

The logging was on:

debug osd = 20
debug ms = 1

My logs showed:

Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_boot
Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming cluster_addr ip matches client_addr
Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming hb_addr ip matches cluster_addr
Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918  client_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999, cluster_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6807/3999, hb addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6808/3999
Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_alive up_thru currently 41870 want 0
Jun  8 18:39:13 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_pg_stats
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6092a56700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 mark_down 0x1c941dc0 -- 0x2016da00
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6092a56700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 --> [2a00:f10:113:1:230:48ff:fed3:b086]:6789/0 -- auth(proto 0 26 bytes) v1 -- ?+0 0x6793600 con 0x1c941c80
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6092154700 osd8 41918 OSD::ms_get_authorizer type=mon
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918 ms_handle_connect on mon
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_boot
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming cluster_addr ip matches client_addr
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming hb_addr ip matches cluster_addr
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918  client_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999, cluster_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6807/3999, hb addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6808/3999
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_alive up_thru currently 41870 want 0
Jun  8 18:39:16 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_pg_stats
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6092a56700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 mark_down 0x1c941c80 -- 0x2016d280
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 <== mon0 [2a00:f10:113:1:230:48ff:fed3:b086]:6789/0 1 ==== auth_reply(proto 2 0 Success) v1 ==== 33+0+0 (3727614903 0 0) 0x6793800 con 0x1c941c80
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6092a56700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 --> [2a00:f10:113:1:230:48ff:fed3:b086]:6789/0 -- auth(proto 0 26 bytes) v1 -- ?+0 0x6793200 con 0x1c941b40
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 --> [2a00:f10:113:1:230:48ff:fed3:b086]:6789/0 -- auth(proto 2 32 bytes) v1 -- ?+0 0x6793800 con 0x1c941b40
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6092053700 osd8 41918 OSD::ms_get_authorizer type=mon
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918 ms_handle_connect on mon
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_boot
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming cluster_addr ip matches client_addr
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918  assuming hb_addr ip matches cluster_addr
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918  client_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999, cluster_addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6807/3999, hb addr [2a00:f10:113:1:225:90ff:fe33:49f2]:6808/3999
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_alive up_thru currently 41870 want 0
Jun  8 18:39:19 atom2 osd.8[4000]: 7f6094259700 osd8 41918 send_pg_stats
Jun  8 18:39:21 atom2 osd.8[4000]: 7f6094259700 -- [2a00:f10:113:1:225:90ff:fe33:49f2]:6806/3999 <== mon0 [2a00:f10:113:1:230:48ff:fed3:b086]:6789/0 1 ==== auth_reply(proto 2 0 Success) v1 ==== 33+0+0 (3639190872 0 0) 0x6793200 con 0x1c941b40
Jun  8 18:39:21 atom2 osd.8[4000]: 7f6094259700 cephx client:  unknown request_type 64001
Jun  8 18:39:21 atom2 osd.8[4000]: auth/cephx/CephxClientHandler.cc: In function 'virtual int CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)', in thread '0x7f6094259700'#012auth/cephx/CephxClientHandler.cc: 162: FAILED assert(0)
Jun  8 18:39:21 atom2 osd.8[4000]:  ceph version 0.28.2-260-ge2c808a (commit:e2c808aea97ced6e9a55b143116b66d344f72c0b)#012 1: (CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)+0x369) [0x657b09]#012 2: (MonClient::handle_auth(MAuthReply*)+0xb8) [0x62fa58]#012 3: (MonClient::ms_dispatch(Message*)+0x26c) [0x630ebc]#012 4: (SimpleMessenger::dispatch_entry()+0x893) [0x6245a3]#012 5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x48fd9c]#012 6: (()+0x69ca) [0x7f609f8e09ca]#012 7: (clone()+0x6d) [0x7f609e56c70d]
Jun  8 18:39:21 atom2 osd.8[4000]:  ceph version 0.28.2-260-ge2c808a (commit:e2c808aea97ced6e9a55b143116b66d344f72c0b)#012 1: (CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)+0x369) [0x657b09]#012 2: (MonClient::handle_auth(MAuthReply*)+0xb8) [0x62fa58]#012 3: (MonClient::ms_dispatch(Message*)+0x26c) [0x630ebc]#012 4: (SimpleMessenger::dispatch_entry()+0x893) [0x6245a3]#012 5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x48fd9c]#012 6: (()+0x69ca) [0x7f609f8e09ca]#012 7: (clone()+0x6d) [0x7f609e56c70d]
Jun  8 18:39:21 atom2 osd.8[4000]: *** Caught signal (Aborted) **#012 in thread 0x7f6094259700
Jun  8 18:39:21 atom2 osd.8[4000]:  ceph version 0.28.2-260-ge2c808a (commit:e2c808aea97ced6e9a55b143116b66d344f72c0b)#012 1: /usr/bin/cosd() [0x63e33e]#012 2: (()+0xf8f0) [0x7f609f8e98f0]#012 3: (gsignal()+0x35) [0x7f609e4b9a75]#012 4: (abort()+0x180) [0x7f609e4bd5c0]#012 5: (__gnu_cxx::__verbose_terminate_handler()+0x115) [0x7f609ed6f8e5]#012 6: (()+0xcad16) [0x7f609ed6dd16]#012 7: (()+0xcad43) [0x7f609ed6dd43]#012 8: (()+0xcae3e) [0x7f609ed6de3e]#012 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x371) [0x6077e1]#012 10: (CephxClientHandler::handle_response(int, ceph::buffer::list::iterator&)+0x369) [0x657b09]#012 11: (MonClient::handle_auth(MAuthReply*)+0xb8) [0x62fa58]#012 12: (MonClient::ms_dispatch(Message*)+0x26c) [0x630ebc]#012 13: (SimpleMessenger::dispatch_entry()+0x893) [0x6245a3]#012 14: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x48fd9c]#012 15: (()+0x69ca) [0x7f609f8e09ca]#012 16: (clone()+0x6d) [0x7f609e56c70d]

I'm still seeing my monitor eating more and more memory and going OOM, this might be related to it?

Actions

Also available in: Atom PDF