Bug #3591
closed
auth: could not find secret_id=0
Added by Yehuda Sadeh over 11 years ago.
Updated over 11 years ago.
Description
A user reported that, when using kernel rbd, the following periodically appears in the client log:
libceph: osd5 xxx.xxx.xxx.xxx:6809 socket closed
and
libceph: osd5 xxx.xxx.xxx.xxx:6809 connect authorization failure
and on the osd side:
ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700 0 auth: could not find secret_id=0
ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700 0 cephx: verify_authorizer could not get service secret for service osd secret_id=0
- Status changed from New to 12
- Priority changed from Normal to High
Have seen this pop up in several places.
I bet we can find it with 'debug auth = 0/20' (so that everything is logged in memory) and then changing the message above to also assert.
Did you see it happening with anything other than the kernel clients?
That I can't remember, but that message should never come up on the server unless it is failing to rotate its keys properly or the startup sequence is wrong.
Oh, I fixed this on the osd recently: commit eee098222343cec0e7db2c5d4067050847da5481, which went into 0.55. Were they running something older?
They were running 0.55. I don't think it's failure to rotate keys, as the secret_id was consistently 0. I'd expect a higher number for a rotation failure.
It happened periodically, which I guess correlated with the client reconnecting to the osd. It could be an issue with the data the client sends to the osd when it reconnects, or a problem with the way the osd handles these reconnections. In any case, apart from these messages the system was functioning well and there was no other noticeable symptom (retry fixed the issue?).
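One way to check the observation that secret_id is consistently 0 is to tally the ids from the osd log. The following is only a minimal sketch, not part of Ceph: the function name tally_secret_ids and the regular expression are assumptions made for illustration, and the sample lines are the two osd messages quoted in the description.

```python
import re
from collections import Counter

# Matches the osd-side "auth:" message quoted in the description.
# The pattern is an assumption based on that single log line.
SECRET_ID_RE = re.compile(r"auth: could not find secret_id=(\d+)")


def tally_secret_ids(lines):
    """Count how often each secret_id appears in 'could not find' messages."""
    counts = Counter()
    for line in lines:
        match = SECRET_ID_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The two osd log lines from the description; only the first one
# contains the "auth: could not find" message.
sample = [
    "ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700 0 "
    "auth: could not find secret_id=0",
    "ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700 0 "
    "cephx: verify_authorizer could not get service secret "
    "for service osd secret_id=0",
]

if __name__ == "__main__":
    for secret_id, count in sorted(tally_secret_ids(sample).items()):
        print(f"secret_id={secret_id}: {count} occurrence(s)")
```

If every failure in a real log reports secret_id=0, that would be consistent with the point above that this does not look like a failure to rotate keys.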
- Status changed from 12 to Resolved
- Assignee set to Sage Weil
Resolved by Sage's fix above.
- Status changed from Resolved to Closed