Bug #37835
closed
ceph-mgr 13.2.4 fails to start with "auth_reply(proto 2 -22 (22) Invalid argument)"
Description
I have a luminous cluster that I am attempting to upgrade to mimic. Prior to the upgrade, the mgrs were connecting just fine. Following the upgrade guide, I upgraded the monitors to mimic first. That upgrade went fine. Next I attempted to upgrade the mgrs. Unfortunately, the upgraded mgrs fail to connect.
I started with 13.2.2 and have attempted versions up to 13.2.4. There has been no change in behavior between versions.
I've attached the output of /usr/bin/ceph-mgr --cluster ceph --id 8 -d --debug_ms 20 and what I believe is the relevant portion of the monitor log (collected with debug_ms = 10/10).
I have been working on this problem on the ceph-users list (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032099.html) but have been unable to make progress.
Updated by Randall Smith about 5 years ago
I'm still having this issue. Is there anything else that is needed to help troubleshoot this?
Updated by Sebastian Wagner almost 5 years ago
Not sure if ceph-mgr is really the best project to track this issue. Is this a core problem that just happens to affect the mgr?
Updated by Randall Smith almost 5 years ago
mgr is the only service that I am having problems with. Every other service is working. That doesn't preclude a core issue, of course.
Updated by Randall Smith almost 5 years ago
I finally found the problem and the fix. I have a keyring set in the [global] section of ceph.conf. ceph-mgr was trying to use that instead of the default in /var/lib/ceph/mgr/$cluster-$id/keyring. (I think this behavior changed with mimic but I didn't trace it down in the code.)
The fix was to set the keyring path in a [mgr] section in ceph.conf. Once that was done, the mgr started and authenticated just fine.
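For anyone hitting the same error, the change might look roughly like the sketch below. The [global] keyring path is a hypothetical placeholder, since the actual value was not posted; the [mgr] path is the default location mentioned above.

```ini
[global]
# Hypothetical path, for illustration only. A keyring set here was being
# picked up by ceph-mgr instead of its own default keyring.
keyring = /etc/ceph/$cluster.client.admin.keyring

[mgr]
# Point the mgr explicitly at its own keyring (the default location).
keyring = /var/lib/ceph/mgr/$cluster-$id/keyring
```

After editing ceph.conf, restart the mgr daemon so that it reads the new setting.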
This can be closed.