Bug #16266

Ceph Status - Segmentation Fault

Added by Mathias Buresch 8 months ago. Updated 3 months ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
Start date:
06/13/2016
Due date:
% Done:

0%

Source:
Tags:
Backport:
jewel,hammer
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
hammer
Needs Doc:
No

Description

The command "ceph -s" just gives me "Segmentation Fault". Here is the debugging output:

(gdb) run /usr/bin/ceph status --debug-monc=20 --debug-ms=20 --debug-rados=20 --debug-auth=20
Starting program: /usr/bin/python /usr/bin/ceph status --debug-monc=20 --debug-ms=20 --debug-rados=20 --debug-auth=20
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff10f5700 (LWP 19281)]
[New Thread 0x7ffff08f4700 (LWP 19282)]
[Thread 0x7ffff10f5700 (LWP 19281) exited]
[New Thread 0x7ffff10f5700 (LWP 19283)]
[Thread 0x7ffff10f5700 (LWP 19283) exited]
[New Thread 0x7ffff10f5700 (LWP 19284)]
[Thread 0x7ffff10f5700 (LWP 19284) exited]
[New Thread 0x7ffff10f5700 (LWP 19285)]
[Thread 0x7ffff10f5700 (LWP 19285) exited]
[New Thread 0x7ffff10f5700 (LWP 19286)]
[Thread 0x7ffff10f5700 (LWP 19286) exited]
[New Thread 0x7ffff10f5700 (LWP 19287)]
[Thread 0x7ffff10f5700 (LWP 19287) exited]
[New Thread 0x7ffff10f5700 (LWP 19288)]
[New Thread 0x7fffeb885700 (LWP 19289)]
2016-06-13 18:23:11.009358 7ffff10f5700 10 monclient(hunting): build_initial_monmap
2016-06-13 18:23:11.009453 7ffff10f5700 1 librados: starting msgr at :/0
2016-06-13 18:23:11.009459 7ffff10f5700 1 librados: starting objecter
[New Thread 0x7fffeb084700 (LWP 19290)]
2016-06-13 18:23:11.010474 7ffff10f5700 10 -- :/0 ready :/0
[New Thread 0x7fffea883700 (LWP 19291)]
[New Thread 0x7fffea082700 (LWP 19292)]
2016-06-13 18:23:11.012538 7ffff10f5700 1 -- :/0 messenger.start
[New Thread 0x7fffe9881700 (LWP 19293)]
2016-06-13 18:23:11.013537 7ffff10f5700 1 librados: setting wanted keys
2016-06-13 18:23:11.013544 7ffff10f5700 1 librados: calling monclient init
2016-06-13 18:23:11.013545 7ffff10f5700 10 monclient(hunting): init
2016-06-13 18:23:11.013560 7ffff10f5700 5 adding auth protocol: cephx
2016-06-13 18:23:11.013562 7ffff10f5700 10 monclient(hunting): auth_supported 2 method cephx
2016-06-13 18:23:11.013539 7fffe9881700 10 -- :/3675815490 reaper_entry start
2016-06-13 18:23:11.013565 7fffe9881700 10 -- :/3675815490 reaper
2016-06-13 18:23:11.013567 7fffe9881700 10 -- :/3675815490 reaper done
2016-06-13 18:23:11.013703 7ffff10f5700 2 auth: KeyRing::load: loaded key file /etc/ceph/ceph.client.admin.keyring
[New Thread 0x7fffe9080700 (LWP 19294)]
[New Thread 0x7fffe887f700 (LWP 19295)]
2016-06-13 18:23:11.015700 7ffff10f5700 10 monclient(hunting): _reopen_session rank 1 name
2016-06-13 18:23:11.015709 7ffff10f5700 10 -- :/3675815490 connect_rank to 62.176.141.181:6789/0, creating pipe and registering
[New Thread 0x7fffe3fff700 (LWP 19296)]
2016-06-13 18:23:11.016719 7ffff10f5700 10 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).register_pipe
2016-06-13 18:23:11.016735 7ffff10f5700 10 -- :/3675815490 get_connection mon.0 62.176.141.181:6789/0 new 0x7fffec064010
2016-06-13 18:23:11.016720 7fffe3fff700 10 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).writer: state = connecting policy.server=0
2016-06-13 18:23:11.016746 7fffe3fff700 10 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=-1 :0 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect 0
2016-06-13 18:23:11.016758 7ffff10f5700 10 monclient(hunting): picked mon.pix01 con 0x7fffec05aa30 addr 62.176.141.181:6789/0
2016-06-13 18:23:11.016762 7fffe3fff700 10 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connecting to 62.176.141.181:6789/0
2016-06-13 18:23:11.016767 7ffff10f5700 20 -- :/3675815490 send_keepalive con 0x7fffec05aa30, have pipe.
2016-06-13 18:23:11.016787 7ffff10f5700 10 monclient(hunting): _send_mon_message to mon.pix01 at 62.176.141.181:6789/0
2016-06-13 18:23:11.016793 7ffff10f5700 1 -- :/3675815490 --> 62.176.141.181:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- ?+0 0x7fffec060450 con 0x7fffec05aa30
2016-06-13 18:23:11.016800 7ffff10f5700 20 -- :/3675815490 submit_message auth(proto 0 30 bytes epoch 0) v1 remote, 62.176.141.181:6789/0, have pipe.
2016-06-13 18:23:11.016806 7ffff10f5700 10 monclient(hunting): renew_subs
2016-06-13 18:23:11.016811 7ffff10f5700 10 monclient(hunting): authenticate will time out at 2016-06-13 18:28:11.016810
2016-06-13 18:23:11.016969 7fffe3fff700 20 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect read peer addr 62.176.141.181:6789/0 on socket 3
2016-06-13 18:23:11.016983 7fffe3fff700 20 -- :/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect peer addr for me is 62.176.141.181:47428/0
2016-06-13 18:23:11.016988 7fffe3fff700 1 -- 62.176.141.181:0/3675815490 learned my addr 62.176.141.181:0/3675815490
2016-06-13 18:23:11.017014 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect sent my addr 62.176.141.181:0/3675815490
2016-06-13 18:23:11.017022 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect sending gseq=1 cseq=0 proto=15
2016-06-13 18:23:11.017033 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect wrote (self +) cseq, waiting for reply
2016-06-13 18:23:11.017069 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=1 pgs=0 cs=0 l=1 c=0x7fffec05aa30).connect got reply tag 1 connect_seq 1 global_seq 795910 proto 15 flags 1 features 55169095435288575
2016-06-13 18:23:11.017078 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).connect success 1, lossy = 1, features 55169095435288575
2016-06-13 18:23:11.017087 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).connect starting reader
[New Thread 0x7fffe3efe700 (LWP 19299)]
2016-06-13 18:23:11.018149 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer: state = open policy.server=0
2016-06-13 18:23:11.018166 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).write_keepalive2 14 2016-06-13 18:23:11.018165
2016-06-13 18:23:11.018169 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader reading tag...
2016-06-13 18:23:11.018192 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer encoding 1 features 55169095435288575 0x7fffec060450 auth(proto 0 30 bytes epoch 0) v1
2016-06-13 18:23:11.018216 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer no session security
2016-06-13 18:23:11.018221 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer sending 1 0x7fffec060450
2016-06-13 18:23:11.018239 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer: state = open policy.server=0
2016-06-13 18:23:11.018244 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer sleeping
2016-06-13 18:23:11.018242 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got KEEPALIVE_ACK
2016-06-13 18:23:11.018252 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader reading tag...
2016-06-13 18:23:11.018349 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got ACK
2016-06-13 18:23:11.018359 7fffe3efe700 15 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got ack seq 1
2016-06-13 18:23:11.018363 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader reading tag...
2016-06-13 18:23:11.018370 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got MSG
2016-06-13 18:23:11.018373 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got envelope type=4 src mon.0 front=340 data=0 off 0
2016-06-13 18:23:11.018381 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader wants 340 from dispatch throttler 0/104857600
2016-06-13 18:23:11.018387 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got front 340
2016-06-13 18:23:11.018393 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).aborted = 0
2016-06-13 18:23:11.018401 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got 340 + 0 + 0 byte message
2016-06-13 18:23:11.018420 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).No session security set
2016-06-13 18:23:11.018429 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got message 1 0x7fffd0001cb0 mon_map magic: 0 v1
2016-06-13 18:23:11.018442 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 queue 0x7fffd0001cb0 prio 196
2016-06-13 18:23:11.018451 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader reading tag...
2016-06-13 18:23:11.018453 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer: state = open policy.server=0
2016-06-13 18:23:11.018460 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).write_ack 1
2016-06-13 18:23:11.018476 7fffe3fff700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer: state = open policy.server=0
2016-06-13 18:23:11.018473 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got MSG
2016-06-13 18:23:11.018481 7fffe3fff700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).writer sleeping
2016-06-13 18:23:11.018482 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got envelope type=18 src mon.0 front=33 data=0 off 0
2016-06-13 18:23:11.018487 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader wants 33 from dispatch throttler 340/104857600
2016-06-13 18:23:11.018492 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got front 33
2016-06-13 18:23:11.018499 7fffe3efe700 10 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).aborted = 0
2016-06-13 18:23:11.018480 7fffea883700 1 -- 62.176.141.181:0/3675815490 <== mon.0 62.176.141.181:6789/0 1 ==== mon_map magic: 0 v1 ==== 340+0+0 (3213884171 0 0) 0x7fffd0001cb0 con 0x7fffec05aa30
2016-06-13 18:23:11.018503 7fffe3efe700 20 -- 62.176.141.181:0/3675815490 >> 62.176.141.181:6789/0 pipe(0x7fffec064010 sd=3 :47428 s=2 pgs=795910 cs=1 l=1 c=0x7fffec05aa30).reader got 33 + 0 + 0 byte message
2016-06-13 18:23:11.018507 7fffea883700 10
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffea883700 (LWP 19291)]
0x00007ffff3141a57 in encrypt (cct=<optimized out>, error=0x7fffea882280, out=..., in=..., this=0x7fffea882470)
at auth/cephx/../Crypto.h:110
110 auth/cephx/../Crypto.h: No such file or directory.
(gdb) bt
#0 0x00007ffff3141a57 in encrypt (cct=<optimized out>, error=0x7fffea882280, out=..., in=..., this=0x7fffea882470)
at auth/cephx/../Crypto.h:110
#1 encode_encrypt_enc_bl<CephXChallengeBlob> (cct=<optimized out>, error="", out=..., key=..., t=<synthetic pointer>)
at auth/cephx/CephxProtocol.h:464
#2 encode_encrypt<CephXChallengeBlob> (cct=<optimized out>, error="", out=..., key=..., t=<synthetic pointer>)
at auth/cephx/CephxProtocol.h:489
#3 cephx_calc_client_server_challenge (cct=<optimized out>, secret=..., server_challenge=2963837093789507174,
client_challenge=4406578846186549183, key=key@entry=0x7fffea8824a8, ret="") at auth/cephx/CephxProtocol.cc:36
#4 0x00007ffff313aff4 in CephxClientHandler::build_request (this=0x7fffd4001520, bl=...) at auth/cephx/CephxClientHandler.cc:53
#5 0x00007ffff2fe4a79 in MonClient::handle_auth (this=this@entry=0x7fffec006b70, m=m@entry=0x7fffd0002ee0) at mon/MonClient.cc:510
#6 0x00007ffff2fe6507 in MonClient::ms_dispatch (this=0x7fffec006b70, m=0x7fffd0002ee0) at mon/MonClient.cc:277
#7 0x00007ffff30d5dc9 in ms_deliver_dispatch (m=0x7fffd0002ee0, this=0x7fffec055410) at ./msg/Messenger.h:582
#8 DispatchQueue::entry (this=0x7fffec0555d8) at msg/simple/DispatchQueue.cc:185
#9 0x00007ffff31023bd in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/simple/DispatchQueue.h:103
#10 0x00007ffff7bc4182 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007ffff78f147d in clone () from /lib/x86_64-linux-gnu/libc.so.6

ceph.client.admin.keyring (113 Bytes) Mathias Buresch, 06/14/2016 09:05 AM


Related issues

Copied to Backport #17345: jewel: Ceph Status - Segmentation Fault Resolved
Copied to Backport #17346: hammer: Ceph Status - Segmentation Fault Resolved

History

#1 Updated by Brad Hubbard 8 months ago

  • Assignee set to Brad Hubbard
  • Priority changed from Normal to High

Hi Mathias,

I believe I've seen this issue and am working on a patch for it.

Could you upload /etc/ceph/ceph.client.admin.keyring since it looks like one of the keys in that keyring has caused the issue?

#2 Updated by Brad Hubbard 8 months ago

This issue can be seen due to http://tracker.ceph.com/issues/2904 or some other event leading to a bad key.
Here's a simple reproducer.

$ OSD=3 MON=1 ./vstart.sh -n -x -l
$ ./ceph-authtool /tmp/keyring --create-keyring --name=mon. --add-key= --cap mon 'allow *'
$ cat /tmp/keyring
[mon.]
key = AAAAAAAAAAAAAAAA
caps mon = "allow *"
$ ./ceph --cluster=ceph -c ceph.conf --name=mon. --keyring=/tmp/keyring auth get-or-create client.admin mon 'allow *' osd 'allow *' mds 'allow *'

I guess there are many "ceph" commands that could be substituted above.

This produces the following crash.

(gdb) bt
#0 0x00007fffec1ffb0a in CryptoKey::encrypt (cct=<optimized out>, error=0x7fffdeaa5490, out=..., in=..., this=0x7fffdeaa5460) at auth/Crypto.h:110
#1 encode_encrypt_enc_bl<CephXChallengeBlob> (cct=<optimized out>, error="", out=..., key=..., t=<synthetic pointer>) at auth/cephx/CephxProtocol.h:464
#2 encode_encrypt<CephXChallengeBlob> (cct=<optimized out>, error="", out=..., key=..., t=<synthetic pointer>) at auth/cephx/CephxProtocol.h:489
#3 cephx_calc_client_server_challenge (cct=<optimized out>, secret=..., server_challenge=5130027892042384807, client_challenge=2771181419658874561, key=key@entry=0x7fffdeaa54b8, error="") at auth/cephx/CephxProtocol.cc:35
#4 0x00007fffec1f930a in CephxClientHandler::build_request (this=0x7fffd4001150, bl=...) at auth/cephx/CephxClientHandler.cc:53
#5 0x00007fffec076892 in MonClient::handle_auth (this=this@entry=0x7fffe0007b20, m=m@entry=0x7fffd0001080) at mon/MonClient.cc:526
#6 0x00007fffec076ffb in MonClient::ms_dispatch (this=0x7fffe0007b20, m=0x7fffd0001080) at mon/MonClient.cc:287
#7 0x00007fffec18f5f6 in Messenger::ms_deliver_dispatch (m=0x7fffd0001080, this=0x7fffe0055290) at msg/Messenger.h:584
#8 DispatchQueue::entry (this=0x7fffe0055488) at msg/simple/DispatchQueue.cc:185
#9 0x00007fffec1bb88d in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/simple/DispatchQueue.h:103
#10 0x00007ffff77f158a in start_thread () from /lib64/libpthread.so.0
#11 0x00007ffff6e165cd in clone () from /lib64/libc.so.6

A related crash is seen when modifying the existing key for a mon and trying
to start it.

(gdb) bt
#0 0x00000000012eb8aa in CryptoKey::encrypt(CephContext*, ceph::buffer::list const&, ceph::buffer::list&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) const ()
#1 0x0000000001784067 in void encode_encrypt_enc_bl<CephXServiceTicketInfo>(CephContext*, CephXServiceTicketInfo const&, CryptoKey const&, ceph::buffer::list&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) ()
#2 0x000000000177ee22 in cephx_build_service_ticket_blob(CephContext*, CephXSessionAuthInfo&, CephXTicketBlob&) ()
#3 0x0000000001269b3f in Monitor::ms_get_authorizer(int, AuthAuthorizer**, bool) ()
#4 0x00000000016414b8 in Messenger::ms_deliver_get_authorizer(int, bool) ()
#5 0x000000000163d063 in SimpleMessenger::get_authorizer(int, bool) ()
#6 0x00000000017e787d in Pipe::connect() ()
#7 0x00000000017f10fc in Pipe::writer() ()
#8 0x00000000017f87a6 in Pipe::Writer::entry() ()
#9 0x0000000001721b73 in Thread::entry_wrapper() ()
#10 0x0000000001721aa8 in Thread::_entry_func(void*) ()
#11 0x00007ffff504360a in start_thread (arg=0x7fffea8a0700) at pthread_create.c:334
#12 0x00007ffff34d659d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

In both cases we see a null secret.

(gdb) f 4
#4 0x00007fffec1f930a in CephxClientHandler::build_request (this=0x7fffd4001150, bl=...) at auth/cephx/CephxClientHandler.cc:53
53 req.client_challenge, &req.key, error);
(gdb) p secret.secret
$1 = {_raw = 0x7fffe005d280, _off = 0, _len = 0}
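
The empty secret can also be seen by decoding the key directly. A cephx key is the base64 encoding of a CryptoKey; assuming the on-disk layout is a little-endian 16-bit cipher type, an 8-byte creation timestamp (u32 seconds + u32 nanoseconds), and a 16-bit secret length, the all-"A" key decodes to twelve zero bytes, i.e. type 0 and a zero-length secret, consistent with the `_len = 0` shown above:

```python
import base64
import struct

# "AAAAAAAAAAAAAAAA" is 16 base64 characters -> 12 raw bytes, all zero.
raw = base64.b64decode("AAAAAAAAAAAAAAAA")
print(len(raw))  # 12

# Assumed CryptoKey layout (little-endian):
#   u16 type, u32 created_sec, u32 created_nsec, u16 secret_len
ktype, sec, nsec, secret_len = struct.unpack("<HIIH", raw)
print(ktype, secret_len)  # 0 0 -> cipher type "none", empty secret
```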

I have patches for these but need to clean up some test case failures before submitting a PR.

#3 Updated by Mathias Buresch 8 months ago

Hi Brad,

here is the keyring.

#4 Updated by Brad Hubbard 8 months ago

Hi Mathias,

That key ("AAAAAAAAAAAAAAAA") is invalid and is what is causing the crash, see http://tracker.ceph.com/issues/2904.

You can fix that keyring file by generating a new key.

$ ceph-authtool /etc/ceph/ceph.client.admin.keyring --name=client.admin -u 0 --gen-key --cap mds 'allow *' --cap mon 'allow *' --cap osd 'allow *'
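
Until patched binaries are available, a keyring key can be sanity-checked before use. The helper below is a hypothetical illustration, assuming a cephx key is the base64 encoding of a 16-bit cipher type, an 8-byte timestamp, a 16-bit secret length, and then the secret bytes; it rejects the degenerate all-zero key that triggers this crash:

```python
import base64
import struct

def key_looks_valid(b64key: str) -> bool:
    """Hypothetical sanity check, assuming the key is base64 of:
    u16 type + u32 sec + u32 nsec + u16 secret_len + secret bytes."""
    try:
        raw = base64.b64decode(b64key, validate=True)
    except Exception:
        return False
    if len(raw) < 12:
        return False
    ktype, _sec, _nsec, slen = struct.unpack("<HIIH", raw[:12])
    # A usable key needs a real cipher type and a non-empty secret.
    return ktype != 0 and slen > 0 and len(raw) == 12 + slen

print(key_looks_valid("AAAAAAAAAAAAAAAA"))  # False: zero type, empty secret
```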

#5 Updated by Brad Hubbard 8 months ago

  • Status changed from New to In Progress

#6 Updated by Mathias Buresch 8 months ago

Just creating a new keyring with the command you supplied didn't work. Maybe I am missing something.

I don't really have time to debug it now, but I will have a look at it later.
Anyway, thanks for your help so far! You do great work! :)

#7 Updated by Brad Hubbard 7 months ago

  • Status changed from In Progress to Resolved

#8 Updated by Patrick Donnelly 5 months ago

Ugh, I just found this bug in my testing with ceph-ansible on v10.2.2. Can we get this backported to jewel?

#9 Updated by Nathan Cutler 5 months ago

  • Tracker changed from Support to Bug
  • Status changed from Resolved to Pending Backport
  • Backport set to jewel,hammer

#10 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #17345: jewel: Ceph Status - Segmentation Fault added

#11 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #17346: hammer: Ceph Status - Segmentation Fault added

#12 Updated by Nathan Cutler 3 months ago

  • Status changed from Pending Backport to Resolved
  • Needs Doc set to No
