Bug #8614
closed
OSD keyring shifted
Description
Hello,
Env:
1 host = mon, mds, admin for ceph-deploy
4 hosts = osd servers
I was troubleshooting a down OSD on one of my OSD servers and had to reboot the host to fix the down OSD/drive. When the host came back from the reboot, the OSDs' keyrings had shifted by an increment of 1. The keys no longer match the output of <ceph auth export>, so all the OSDs on this host are down and unable to authenticate with the mon server.
[jlu@gfsnode1 ~]$ sudo service ceph start
Password:
=== osd.34 ===
2014-06-16 14:00:12.630262 7f36171a9700 0 librados: osd.34 authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError
failed: 'timeout 30 /usr/bin/ceph -c /etc/ceph/ceph.conf --name=osd.34 --keyring=/var/lib/ceph/osd/ceph-34/keyring osd crush create-or-move -- 34 2.73 host=gfsnode1 root=default'
[jlu@gfsnode1 ~]$
[jlu@gfsnode1 ceph-34]$ cd ../ceph-33
[jlu@gfsnode1 ceph-33]$ pwd
/var/lib/ceph/osd/ceph-33
[jlu@gfsnode1 ceph-33]$ cat whoami
34
[jlu@gfsnode1 ceph-33]$ cd ../ceph-34
[jlu@gfsnode1 ceph-34]$ pwd
/var/lib/ceph/osd/ceph-34
[jlu@gfsnode1 ceph-34]$ cat whoami
35
[jlu@gfsnode1 ceph-34]$
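To check every OSD directory on the host in one pass, rather than one `cat whoami` at a time, something like the sketch below might help. It assumes the default data path /var/lib/ceph/osd seen above and that each data directory is named ceph-<id>; `find_whoami_mismatches` is a hypothetical helper name, not a Ceph tool.

```python
import os
import re


def find_whoami_mismatches(osd_root="/var/lib/ceph/osd"):
    """Return (directory_id, whoami_id) pairs where the two ids differ."""
    mismatches = []
    for name in sorted(os.listdir(osd_root)):
        match = re.fullmatch(r"ceph-(\d+)", name)
        if not match:
            continue  # not an OSD data directory
        whoami_path = os.path.join(osd_root, name, "whoami")
        try:
            with open(whoami_path) as fh:
                whoami = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # unmounted, unreadable, or empty data dir: skip it
        dir_id = int(match.group(1))
        if dir_id != whoami:
            mismatches.append((dir_id, whoami))
    return mismatches
```

Calling `find_whoami_mismatches()` on the affected host and printing the result would list each directory whose `whoami` disagrees with its name, which makes it easy to see whether the shift is uniform across all OSDs.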
I also compared the keys with the output of <ceph auth export> on the mon node; the [osd_id] sections and keys do not match up.
From mon node:
osd.33
key: AQDdPXVTUP0OAxAAptykWQeOWrSwg+DIMwRCwA==
caps: [mon] allow profile osd
caps: [osd] allow *
From osd node:
[jlu@gfsnode1 ceph-33]$ pwd
/var/lib/ceph/osd/ceph-33
[jlu@gfsnode1 ceph-33]$ cat keyring
[jlu@gfsnode1 ceph-33]$ sudo !!
sudo cat keyring
[osd.34]
key = AQAwPnVT6G7fBRAA86D4FuxN0U8uKXk0brPbCQ==
[jlu@gfsnode1 ceph-33]$
----
From mon node:
osd.34
key: AQAwPnVT6G7fBRAA86D4FuxN0U8uKXk0brPbCQ==
caps: [mon] allow profile osd
caps: [osd] allow *
From osd node:
[jlu@gfsnode1 ceph-34]$ pwd
/var/lib/ceph/osd/ceph-34
[jlu@gfsnode1 ceph-34]$ cat keyring
[jlu@gfsnode1 ceph-34]$ sudo !!
sudo cat keyring
[osd.35]
key = AQBbPnVTmG4BLxAA6UV6XHbZepXUEXB6VJQzEA==
[jlu@gfsnode1 ceph-34]$
All 11 OSDs shifted by an increment of one. Very odd.
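The comparison above can be sketched in code. In the pasted output, the key stored in ceph-33 is identical to the key the mon holds for osd.34, so the keys themselves look valid and only sit in the neighbouring directory. The sketch below checks for exactly that case. It assumes the mon-side keys have already been collected into a dict by hand or by some other means; `parse_keyring` and `check_keyring` are hypothetical helpers, not part of Ceph.

```python
import re


def parse_keyring(text):
    """Parse keyring text ("[entity]" then "key = ...") into {entity: key}."""
    keys = {}
    entity = None
    for line in text.splitlines():
        line = line.strip()
        section = re.fullmatch(r"\[(.+)\]", line)
        if section:
            entity = section.group(1)
            continue
        field = re.fullmatch(r"key\s*=\s*(\S+)", line)
        if field and entity is not None:
            keys[entity] = field.group(1)
    return keys


def check_keyring(dir_id, keyring_text, mon_keys):
    """Describe how the keyring found in ceph-<dir_id> relates to the mon's keys."""
    expected_entity = f"osd.{dir_id}"
    local = parse_keyring(keyring_text)
    if expected_entity in local:
        if local[expected_entity] == mon_keys.get(expected_entity):
            return "ok"
        return "key mismatch"
    # The keyring names a different entity: see if its key is valid for that one.
    for entity, key in local.items():
        if mon_keys.get(entity) == key:
            return f"holds valid key for {entity}"
    return "unknown key"


# Values copied from the output pasted in this report.
mon_keys = {
    "osd.33": "AQDdPXVTUP0OAxAAptykWQeOWrSwg+DIMwRCwA==",
    "osd.34": "AQAwPnVT6G7fBRAA86D4FuxN0U8uKXk0brPbCQ==",
}
keyring_33 = "[osd.34]\nkey = AQAwPnVT6G7fBRAA86D4FuxN0U8uKXk0brPbCQ==\n"
result_33 = check_keyring(33, keyring_33, mon_keys)
```

A result of "holds valid key for osd.N" for every directory would suggest that the data directories are mounted under the wrong names rather than that the keys were corrupted, though that is only one possible reading of the output.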
Thanks,
Jimmy