Bug #22096

Authentication failed, did you specify a mgr ID with a valid keyring?

Added by Matthew Richardson about 1 year ago. Updated 12 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: ceph-mgr
Target version: -
Start date: 11/09/2017
Due date:
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

I have a test cluster which has been running fine with jewel, which I am trying to upgrade to luminous. (12.2.1, using ceph-deploy 1.5.39)

I followed the upgrade steps given here: http://ceph.com/releases/v12-2-0-luminous-released/#upgrading and the mons and osds seem OK (apart from the obvious HEALTH_WARN no active mgr).

I then ran 'ceph-deploy mgr create $HOST' which completed without error.

However, starting ceph-mgr (either directly or through systemd) results in the error:

"Authentication failed, did you specify a mgr ID with a valid keyring?"

If I disable cephx (auth_cluster_required=none), everything works fine.

ceph auth list contains the following:

installed auth entries:

mds.$HOST
    key: [REDACTED]==
    caps: [mds] allow
    caps: [mgr] allow profile mds
    caps: [mon] allow profile mds
    caps: [osd] allow rwx
client.admin
    key: [REDACTED]==
    caps: [mds] allow *
    caps: [mgr] allow *
    caps: [mon] allow *
    caps: [osd] allow *
client.bootstrap-mds
    key: [REDACTED]==
    caps: [mgr] allow r
    caps: [mon] allow profile bootstrap-mds
client.bootstrap-mgr
    key: [REDACTED]==
    caps: [mds] allow *
    caps: [mgr] allow *
    caps: [mon] allow profile mgr
    caps: [osd] allow *
client.bootstrap-osd
    key: [REDACTED]==
    caps: [mgr] allow r
    caps: [mon] allow profile bootstrap-osd
client.bootstrap-rgw
    key: [REDACTED]==
    caps: [mgr] allow r
    caps: [mon] allow profile bootstrap-rgw
mgr.$HOST
    key: [REDACTED]==
    caps: [mds] allow *
    caps: [mon] allow profile mgr
    caps: [osd] allow *

Additionally, the keyrings (owned and readable by ceph user) contain keys which match those given above:
/var/lib/ceph/bootstrap-mgr/ceph.keyring

[client.bootstrap-mgr]
        key = [REDACTED]==

/var/lib/ceph/mgr/ceph-$HOST/keyring

[mgr.$HOST]
        key = [REDACTED]==

Please let me know if there is any other debugging information I can provide.


Related issues

Copied to mgr - Backport #22811: luminous: Authentication failed, did you specify a mgr ID with a valid keyring? Resolved

History

#1 Updated by John Spray about 1 year ago

I don't see anything wrong with that key information.

You could use "strace ceph-mgr -i $HOST -d" to try to see what happens when it tries to open its key file.
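
One way to act on this suggestion is to save the trace to a file (for example with strace's `-o` option) and scan it for attempts to open a keyring. The Python sketch below assumes strace's usual `syscall(args) = result` line format. The helper name `keyring_opens` and the sample lines are hypothetical and are not output from this cluster.

```python
import re

# Matches open()/openat() calls; captures the first quoted path and the return value.
OPEN_RE = re.compile(r'\b(?:open|openat)\(.*?"([^"]+)".*\)\s+=\s+(-?\d+)')


def keyring_opens(trace_lines):
    """Return (path, result) for each open/openat call whose path mentions 'keyring'."""
    hits = []
    for line in trace_lines:
        m = OPEN_RE.search(line)
        if m and "keyring" in m.group(1):
            hits.append((m.group(1), int(m.group(2))))
    return hits


# Made-up sample lines in strace's usual format (not taken from the report).
sample = [
    'openat(AT_FDCWD, "/etc/ceph/ceph.conf", O_RDONLY) = 3',
    'openat(AT_FDCWD, "/var/lib/ceph/mgr/ceph-myhost/keyring", O_RDONLY)'
    ' = -1 ENOENT (No such file or directory)',
]
print(keyring_opens(sample))
```

An empty list would suggest the process never tried to open a keyring at all, which points away from file permissions and towards configuration.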

#2 Updated by Matthew Richardson about 1 year ago

I've just run ceph-mgr with strace and it never tries to access or open any keyring files - nothing in /var/lib/ceph at all.

#3 Updated by John Spray about 1 year ago

Pretty odd. I'd look at your ceph.conf to see if there's anything bogus there.

#4 Updated by Matthew Richardson about 1 year ago

ceph.conf - I've tried it with and without the [mgr] section - makes no difference.

[global]
auth_cluster_required=cephx
auth_service_required=none
auth_client_required=none
fsid=17c914ee-d91e-4ce3-a40c-3848bb7aaaaa
public_network=192.168.0.0/24
[mon.$HOST]
host=$HOST
mon_addr=192.168.0.1:6789
[mds.$HOST]
host=$HOST
[mgr.$HOST]
host=$HOST
[osd.0]
host=$HOST
[osd.1]
host=$HOST
[osd.2]
host=$HOST

#5 Updated by Matthew Richardson about 1 year ago

Some more debugging info, in case it's useful (please let me know if there's any other info I can provide to help solve this).

Running ceph-mon and ceph-mgr with '--debug_ms 5':

ceph-mgr -i w3572 --setuser ceph --setgroup ceph -d --debug_ms 5
2017-11-15 15:31:00.792720 7f73f65ea480  0 set uid:gid to 167:167 (ceph:ceph)
2017-11-15 15:31:00.792737 7f73f65ea480  0 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process (unknown), pid 25085
2017-11-15 15:31:00.794252 7f73f65ea480  0 pidfile_write: ignore empty --pid-file
2017-11-15 15:31:00.796675 7f73f06ce700  2 Event(0x55f8f275d480 nevent=5000 time_id=1).set_owner idx=0 owner=140135931635456
2017-11-15 15:31:00.796718 7f73efecd700  2 Event(0x55f8f275d880 nevent=5000 time_id=1).set_owner idx=1 owner=140135923242752
2017-11-15 15:31:00.796753 7f73ef6cc700  2 Event(0x55f8f275dc80 nevent=5000 time_id=1).set_owner idx=2 owner=140135914850048
2017-11-15 15:31:00.797003 7f73f65ea480  1  Processor -- start
2017-11-15 15:31:00.797058 7f73f65ea480  1 -- - start start
2017-11-15 15:31:00.797336 7f73f65ea480  1 -- - --> 192.168.0.1:6789/0 -- auth(proto 0 30 bytes epoch 0) v1 -- 0x55f8f2780f00 con 0
2017-11-15 15:31:00.797527 7f73efecd700  1 -- 192.168.0.1:0/1195569141 learned_addr learned my addr 192.168.0.1:0/1195569141
2017-11-15 15:31:00.797664 7f73efecd700  2 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:6789/0 conn(0x55f8f29a1800 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
2017-11-15 15:31:00.797976 7f73efecd700  5 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:6789/0 conn(0x55f8f29a1800 :-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=3 cs=1 l=1). rx mon.0 seq 1 0x55f8f2780f00 auth_reply(proto 0 -95 (95) Operation not supported) v1
2017-11-15 15:31:00.798049 7f73ed6c8700  1 -- 192.168.0.1:0/1195569141 <== mon.0 192.168.0.1:6789/0 1 ==== auth_reply(proto 0 -95 (95) Operation not supported) v1 ==== 24+0+0 (3632875112 0 0) 0x55f8f2780f00 con 0x55f8f29a1800
2017-11-15 15:31:00.798103 7f73ed6c8700  1 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:6789/0 conn(0x55f8f29a1800 :-1 s=STATE_OPEN pgs=3 cs=1 l=1).mark_down
2017-11-15 15:31:00.798115 7f73ed6c8700  2 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:6789/0 conn(0x55f8f29a1800 :-1 s=STATE_OPEN pgs=3 cs=1 l=1)._stop
2017-11-15 15:31:00.798151 7f73f65ea480 -1 mgr init Authentication failed, did you specify a mgr ID with a valid keyring?
2017-11-15 15:31:00.798215 7f73f65ea480  1 -- 192.168.0.1:0/1195569141 shutdown_connections 
2017-11-15 15:31:00.798224 7f73f65ea480  5 -- 192.168.0.1:0/1195569141 shutdown_connections mark down 192.168.0.1:6789/0 0x55f8f29a1800
2017-11-15 15:31:00.798232 7f73f65ea480  5 -- 192.168.0.1:0/1195569141 shutdown_connections delete 0x55f8f29a1800
2017-11-15 15:31:00.798307 7f73f65ea480  1 -- 192.168.0.1:0/1195569141 shutdown_connections 
Error in initialization: (95) Operation not supported
2017-11-15 15:31:00.798359 7f73f65ea480  1 -- 192.168.0.1:0/1195569141 wait complete.
2017-11-15 15:31:00.798537 7f73f65ea480  1 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:0/1195569141 conn(0x55f8f29a0000 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down
2017-11-15 15:31:00.798550 7f73f65ea480  2 -- 192.168.0.1:0/1195569141 >> 192.168.0.1:0/1195569141 conn(0x55f8f29a0000 :-1 s=STATE_NONE pgs=0 cs=0 l=0)._stop

2017-11-15 15:31:16.253758 7f2090d69700  1 -- 192.168.0.1:6789/0 >> - conn(0x560910b35800 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=28 -
2017-11-15 15:31:16.253903 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=1 cs=1 l=1).handle_connect_msg accept write reply msg done
2017-11-15 15:31:16.253962 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=1 cs=1 l=1)._process_connection accept get newly_acked_seq 0
2017-11-15 15:31:16.254041 7f2090d69700  5 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx client.? seq 1 0x56091067c300 auth(proto 0 30 bytes epoch 0) v1
2017-11-15 15:31:16.254090 7f208c948700  1 -- 192.168.0.1:6789/0 <== client.? 192.168.0.1:0/3338990731 1 ==== auth(proto 0 30 bytes epoch 0) v1 ==== 60+0+0 (3702222748 0 0) 0x56091067c300 con 0x560910b35800
2017-11-15 15:31:16.254149 7f208c948700  1 mon.w3572@0(leader).auth v1897 client did not provide supported auth type
2017-11-15 15:31:16.254170 7f208c948700  1 -- 192.168.0.1:6789/0 --> 192.168.0.1:0/3338990731 -- auth_reply(proto 0 -95 (95) Operation not supported) v1 -- 0x56091067c080 con 0
2017-11-15 15:31:16.254453 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).read_bulk peer close file descriptor 28
2017-11-15 15:31:16.254472 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).read_until read failed
2017-11-15 15:31:16.254481 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).process read tag failed
2017-11-15 15:31:16.254489 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).fault on lossy channel, failing
2017-11-15 15:31:16.254508 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1)._stop

#6 Updated by Matthew Richardson about 1 year ago

Apologies, I pasted mismatched mgr and mon examples. The correct mon example:

2017-11-15 15:31:16.253758 7f2090d69700  1 -- 192.168.0.1:6789/0 >> - conn(0x560910b35800 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection sd=28 -
2017-11-15 15:31:16.253903 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=1 cs=1 l=1).handle_connect_msg accept write reply msg done
2017-11-15 15:31:16.253962 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_ACCEPTING_WAIT_SEQ pgs=1 cs=1 l=1)._process_connection accept get newly_acked_seq 0
2017-11-15 15:31:16.254041 7f2090d69700  5 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=1 cs=1 l=1). rx client.? seq 1 0x56091067c300 auth(proto 0 30 bytes epoch 0) v1
2017-11-15 15:31:16.254090 7f208c948700  1 -- 192.168.0.1:6789/0 <== client.? 192.168.0.1:0/3338990731 1 ==== auth(proto 0 30 bytes epoch 0) v1 ==== 60+0+0 (3702222748 0 0) 0x56091067c300 con 0x560910b35800
2017-11-15 15:31:16.254149 7f208c948700  1 mon.w3572@0(leader).auth v1897 client did not provide supported auth type
2017-11-15 15:31:16.254170 7f208c948700  1 -- 192.168.0.1:6789/0 --> 192.168.0.1:0/3338990731 -- auth_reply(proto 0 -95 (95) Operation not supported) v1 -- 0x56091067c080 con 0
2017-11-15 15:31:16.254453 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).read_bulk peer close file descriptor 28
2017-11-15 15:31:16.254472 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).read_until read failed
2017-11-15 15:31:16.254481 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).process read tag failed
2017-11-15 15:31:16.254489 7f2090d69700  1 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1).fault on lossy channel, failing
2017-11-15 15:31:16.254508 7f2090d69700  2 -- 192.168.0.1:6789/0 >> 192.168.0.1:0/3338990731 conn(0x560910b35800 :6789 s=STATE_OPEN pgs=1 cs=1 l=1)._stop
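
The lines that matter in these traces are the `auth_reply` messages, which carry the result code the monitor sent back. A minimal Python sketch for extracting that code, assuming the `auth_reply(proto P R (N) text)` layout seen in the logs above; the function name `parse_auth_reply` is hypothetical:

```python
import re

# proto number, signed result code, errno in parentheses, then the error text.
AUTH_REPLY_RE = re.compile(r"auth_reply\(proto (\d+) (-?\d+) \((\d+)\) ([^)]*)\)")


def parse_auth_reply(line):
    """Return the proto, result code and message of an auth_reply line, or None."""
    m = AUTH_REPLY_RE.search(line)
    if m is None:
        return None
    return {
        "proto": int(m.group(1)),
        "result": int(m.group(2)),
        "message": m.group(4),
    }


# Line copied from the mon log in this report.
line = (
    "2017-11-15 15:31:16.254170 7f208c948700  1 -- 192.168.0.1:6789/0 --> "
    "192.168.0.1:0/3338990731 -- auth_reply(proto 0 -95 (95) Operation not supported) "
    "v1 -- 0x56091067c080 con 0"
)
print(parse_auth_reply(line))
```

Filtering a long debug log down to replies with a non-zero result makes the rejection easier to spot next to the monitor's "client did not provide supported auth type" message.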

#7 Updated by Javier Castillo about 1 year ago

I'm also facing this issue after upgrading from jewel to 12.2.0.

When I try to run ceph-mgr I get:
ceph-mgr -c /etc/ceph/ceph.conf -i mgr.10.20.0.11 -f -d --debug_ms 3 2>&1
2017-11-21 15:36:34.129679 7f1135c10500 0 ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process (unknown), pid 754
2017-11-21 15:36:34.129816 7f1135c10500 0 pidfile_write: ignore empty --pid-file
2017-11-21 15:36:34.133037 7f1130788700 2 Event(0x55a1929642c0 nevent=5000 time_id=1).set_owner idx=0 owner=139711804376832
2017-11-21 15:36:34.133051 7f112ff87700 2 Event(0x55a192964740 nevent=5000 time_id=1).set_owner idx=1 owner=139711795984128
2017-11-21 15:36:34.133134 7f112f786700 2 Event(0x55a192964980 nevent=5000 time_id=1).set_owner idx=2 owner=139711787591424
2017-11-21 15:36:34.133582 7f1135c10500 1 Processor -- start
2017-11-21 15:36:34.133734 7f1135c10500 1 -- - start start
2017-11-21 15:36:34.134446 7f1135c10500 1 -- - --> 10.20.0.12:6789/0 -- auth(proto 0 39 bytes epoch 0) v1 -- 0x55a192abca00 con 0
2017-11-21 15:36:34.134464 7f1135c10500 1 -- - --> 10.20.0.13:6789/0 -- auth(proto 0 39 bytes epoch 0) v1 -- 0x55a192abcc80 con 0
2017-11-21 15:36:34.135480 7f112ff87700 1 -- 10.20.0.11:0/2400010690 learned_addr learned my addr 10.20.0.11:0/2400010690
2017-11-21 15:36:34.136083 7f112f786700 2 -- 10.20.0.11:0/2400010690 >> 10.20.0.12:6789/0 conn(0x55a192bec000 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
2017-11-21 15:36:34.136177 7f112ff87700 2 -- 10.20.0.11:0/2400010690 >> 10.20.0.13:6789/0 conn(0x55a192942800 :-1 s=STATE_CONNECTING_WAIT_ACK_SEQ pgs=0 cs=0 l=0)._process_connection got newly_acked_seq 0 vs out_seq 0
2017-11-21 15:36:34.137840 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.1 10.20.0.12:6789/0 1 ==== mon_map magic: 0 v1 ==== 442+0+0 (2997829210 0 0) 0x55a192abca00 con 0x55a192bec000
2017-11-21 15:36:34.137931 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.1 10.20.0.12:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (1226643662 0 0) 0x55a192abcf00 con 0x55a192bec000
2017-11-21 15:36:34.138045 7f112d782700 1 -- 10.20.0.11:0/2400010690 --> 10.20.0.12:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x55a192abca00 con 0
2017-11-21 15:36:34.140289 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.1 10.20.0.12:6789/0 3 ==== auth_reply(proto 2 22 (22) Invalid argument) v1 ==== 24+0+0 (3049609040 0 0) 0x55a192abca00 con 0x55a192bec000
2017-11-21 15:36:34.140317 7f112d782700 1 -- 10.20.0.11:0/2400010690 >> 10.20.0.12:6789/0 conn(0x55a192bec000 :-1 s=STATE_OPEN pgs=475 cs=1 l=1).mark_down
2017-11-21 15:36:34.140323 7f112d782700 2 -- 10.20.0.11:0/2400010690 >> 10.20.0.12:6789/0 conn(0x55a192bec000 :-1 s=STATE_OPEN pgs=475 cs=1 l=1)._stop
2017-11-21 15:36:34.141633 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.2 10.20.0.13:6789/0 1 ==== mon_map magic: 0 v1 ==== 442+0+0 (2997829210 0 0) 0x55a192abcc80 con 0x55a192942800
2017-11-21 15:36:34.141680 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.2 10.20.0.13:6789/0 2 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 33+0+0 (2983870075 0 0) 0x55a192abcf00 con 0x55a192942800
2017-11-21 15:36:34.141722 7f112d782700 1 -- 10.20.0.11:0/2400010690 --> 10.20.0.13:6789/0 -- auth(proto 2 2 bytes epoch 0) v1 -- 0x55a192abcc80 con 0
2017-11-21 15:36:34.142215 7f112d782700 1 -- 10.20.0.11:0/2400010690 <== mon.2 10.20.0.13:6789/0 3 ==== auth_reply(proto 2 22 (22) Invalid argument) v1 ==== 24+0+0 (4055120821 0 0) 0x55a192abcc80 con 0x55a192942800
2017-11-21 15:36:34.142244 7f112d782700 1 -- 10.20.0.11:0/2400010690 >> 10.20.0.13:6789/0 conn(0x55a192942800 :-1 s=STATE_OPEN pgs=496 cs=1 l=1).mark_down
2017-11-21 15:36:34.142249 7f112d782700 2 -- 10.20.0.11:0/2400010690 >> 10.20.0.13:6789/0 conn(0x55a192942800 :-1 s=STATE_OPEN pgs=496 cs=1 l=1)._stop
2017-11-21 15:36:34.142314 7f1135c10500 1 mgr init Authentication failed, did you specify a mgr ID with a valid keyring?
2017-11-21 15:36:34.143647 7f1135c10500 1 -- 10.20.0.11:0/2400010690 shutdown_connections
2017-11-21 15:36:34.143871 7f1135c10500 1 -- 10.20.0.11:0/2400010690 shutdown_connections
Error in initialization: 2017-11-21 15:36:34.143953 7f1135c10500 1 -- 10.20.0.11:0/2400010690 wait complete.
(22) Invalid argument
2017-11-21 15:36:34.144408 7f1135c10500 1 -- 10.20.0.11:0/2400010690 >> 10.20.0.11:0/2400010690 conn(0x55a19293f800 :-1 s=STATE_NONE pgs=0 cs=0 l=0).mark_down
2017-11-21 15:36:34.144427 7f1135c10500 2 -- 10.20.0.11:0/2400010690 >> 10.20.0.11:0/2400010690 conn(0x55a19293f800 :-1 s=STATE_NONE pgs=0 cs=0 l=0)._stop

and on the other ceph-mon I get this log:
mon.10.20.0.12@1(peon).auth v15 caught error when trying to handle auth request, probably malformed request

#8 Updated by Shane Voss 12 months ago

I have also had this exact problem.
We have upgraded our MONs and have them running, but cannot get MGR to start.
We keep getting:
mgr init Authentication failed, did you specify a mgr ID with a valid keyring?

#9 Updated by John Spray 12 months ago

Looking a few posts back, I see this:

auth_cluster_required=cephx
auth_service_required=none
auth_client_required=none

Because ceph-mgr includes a monitor client, it picks up the auth_client_required setting when talking to the monitors. The monitors, however, apply auth_cluster_required to their mgr peer, and reject it for trying to use "none" auth.

You can work around this by setting "auth_client_required=cephx" in a [mgr] section.
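
The mismatch described here can be checked mechanically. A rough Python sketch, assuming ceph.conf parses as plain INI with underscore-style `key=value` lines like the file posted earlier in this thread, and modelling only a `[mgr]` section overriding `[global]`. The function name `mgr_auth_mismatch` and the fallback value `cephx` are assumptions for illustration; this is not how Ceph itself resolves options.

```python
import configparser


def mgr_auth_mismatch(conf_text):
    """Return True if the mons require cephx from the mgr but the mgr would not offer it."""
    cp = configparser.ConfigParser(interpolation=None, strict=False)
    cp.read_string(conf_text)
    glob = cp["global"] if cp.has_section("global") else {}
    mgr = cp["mgr"] if cp.has_section("mgr") else {}
    # What the monitors demand from cluster daemons such as the mgr.
    # The "cephx" fallback is an assumed default for this sketch.
    cluster = glob.get("auth_cluster_required", "cephx")
    # What the mgr's embedded monitor client will use; [mgr] overrides [global].
    client = mgr.get("auth_client_required", glob.get("auth_client_required", "cephx"))
    return "cephx" in cluster and "cephx" not in client


# Settings from the ceph.conf in this report, without and with the workaround.
before = """
[global]
auth_cluster_required=cephx
auth_service_required=none
auth_client_required=none
"""
after = before + """
[mgr]
auth_client_required=cephx
"""
print(mgr_auth_mismatch(before))
print(mgr_auth_mismatch(after))
```

The first call reports a mismatch for the original settings, and the second reports none once the `[mgr]` override is present.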

#10 Updated by John Spray 12 months ago

  • Status changed from New to Need Review
  • Backport set to luminous

#11 Updated by Shane Voss 12 months ago

Fantastic! I added the following to my ceph.conf:

[mgr]
auth_client_required=cephx

And that has resolved the issue.

#12 Updated by Kefu Chai 12 months ago

  • Assignee set to John Spray
  • Target version deleted (v12.2.2)
  • Affected Versions v12.2.2 added

#13 Updated by Matthew Richardson 12 months ago

Having made this change, I can confirm that the mgr now connects correctly. However, my colleague Bruce noticed a side effect: client commands now show the following error messages before reporting the expected output:

~$ ceph health
2018-01-23 11:59:54.475483 7f51abfff700  0 -- 192.168.140.2:0/2588795769 >> 192.168.140.1:6805/24684 conn(0x7f5194001c90 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
2018-01-23 11:59:54.476310 7f51abfff700  0 -- 192.168.140.2:0/2588795769 >> 192.168.140.1:6805/24684 conn(0x7f5194001c90 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
2018-01-23 11:59:54.680145 7f51abfff700  0 -- 192.168.140.2:0/2588795769 >> 192.168.140.1:6805/24684 conn(0x7f5194001c90 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
2018-01-23 11:59:54.680623 7f51abfff700  0 -- 192.168.140.2:0/2588795769 >> 192.168.140.1:6805/24684 conn(0x7f5194001c90 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER
HEALTH_OK

#14 Updated by Shane Voss 12 months ago

Since making the change above, I get error messages whenever I run /usr/bin/ceph (even just ceph --help).

2018-01-23 11:58:07.671538 7f4c4687b700 0 -- 192.168.34.121:0/3376126226 >> 192.168.34.55:6812/14185 conn(0x7f4c300025b0 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER

There are usually two or four of them. The IP address to the left of >> is always the one running ceph.
The IP address on the right is sometimes the same, and sometimes the address of the current mgr daemon.

On jewel clients the message never appears.
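
To see which peers are rejecting the client, the addresses can be pulled out of these messages. A small Python sketch, assuming the `-- local >> peer conn(...)` layout of the lines above; the function name `badauthorizer_peers` is hypothetical:

```python
import re
from collections import Counter

# Captures the local address and the peer address around ">>".
ADDR_RE = re.compile(r"-- (\S+) >> (\S+) conn\(")


def badauthorizer_peers(lines):
    """Count BADAUTHORIZER messages per peer address (the address right of '>>')."""
    peers = Counter()
    for line in lines:
        if "BADAUTHORIZER" not in line:
            continue
        m = ADDR_RE.search(line)
        if m:
            peers[m.group(2)] += 1
    return peers


# First line copied from the output in this comment; the second is ordinary command output.
log = [
    "2018-01-23 11:58:07.671538 7f4c4687b700 0 -- 192.168.34.121:0/3376126226 >> "
    "192.168.34.55:6812/14185 conn(0x7f4c300025b0 :-1 "
    "s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1)"
    ".handle_connect_reply connect got BADAUTHORIZER",
    "HEALTH_OK",
]
print(badauthorizer_peers(log))
```

Comparing the counted peer addresses with the address of the active mgr daemon would show whether the messages come from the mgr connection.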

#15 Updated by John Spray 12 months ago

Ah -- turns out the mgr code never thought about the possibility that people would be using a ceph CLI with cephx turned off. I'll fix that, but at the same time I'll also suggest that you reconsider disabling the crypto on your admin CLI client!

#16 Updated by Kefu Chai 12 months ago

  • Status changed from Need Review to Pending Backport

#17 Updated by Nathan Cutler 12 months ago

  • Copied to Backport #22811: luminous: Authentication failed, did you specify a mgr ID with a valid keyring? added

#18 Updated by John Spray 12 months ago

  • Status changed from Pending Backport to Resolved
