Bug #3591

closed

auth: could not find secret_id=0

Added by Yehuda Sadeh over 11 years ago. Updated over 11 years ago.

Status: Closed
Priority: High
Category: -
Target version: -
% Done: 0%
Source: Community (user)

Description

A user reported that, when using kernel rbd, the following messages periodically appear in the client log:

libceph: osd5 xxx.xxx.xxx.xxx:6809 socket closed

and
libceph: osd5 xxx.xxx.xxx.xxx:6809 connect authorization failure

and on the osd side:
ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700  0 auth: could not find secret_id=0
ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=0
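
To check whether the missing secret_id is always the same value, the OSD-side messages can be tallied from a log. The following is a minimal sketch, assuming the message format quoted above; the regex and the function name `count_missing_secret_ids` are illustrative and not part of Ceph.

```python
import re
from collections import Counter

# Matches only the "auth: could not find secret_id=N" message quoted above.
# The pattern is an illustration, not an official log format definition.
SECRET_RE = re.compile(r"auth: could not find secret_id=(\d+)")


def count_missing_secret_ids(lines):
    """Return a Counter mapping secret_id -> number of 'could not find' messages."""
    counts = Counter()
    for line in lines:
        match = SECRET_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


sample = [
    "ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700  0 auth: could not find secret_id=0",
    "ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700  0 cephx: verify_authorizer "
    "could not get service secret for service osd secret_id=0",
]

# Only the first sample line matches; the cephx line uses different wording.
print(count_missing_secret_ids(sample))
```

A tally dominated by a single id (here 0) points at something other than ordinary key expiry, which would be expected to show changing ids over time.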


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #3563: osd crashed with error "auth: could not find secret_id=2" (Closed, Sage Weil, 11/30/2012)

Actions #1

Updated by Sage Weil over 11 years ago

  • Status changed from New to 12
  • Priority changed from Normal to High

Have seen this pop up in several places.

I bet we can find it with 'debug auth = 0/20' (so that everything is logged in memory) and then changing the error path above to also assert.
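
For reference, the setting could be placed in ceph.conf roughly as follows. This is a sketch only: putting it under `[osd]` is an assumption about where the messages should be captured. The first number is the level written to the log file and the second is the level kept in the in-memory log.

```ini
[osd]
    debug auth = 0/20
```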

Actions #2

Updated by Yehuda Sadeh over 11 years ago

Did you see it happening with anything other than the kernel clients?

Actions #3

Updated by Sage Weil over 11 years ago

That I can't remember, but that message should never come up on the server unless it is failing to rotate its keys properly, or unless the startup sequence is wrong.

Oh, I fixed this on the osd recently: eee098222343cec0e7db2c5d4067050847da5481, which went into 0.55. Were they running something older?

Actions #4

Updated by Yehuda Sadeh over 11 years ago

They were running 0.55. I don't think it's a failure to rotate keys, as the secret_id was consistently 0. I'd expect a higher number for a rotation failure.
It happened periodically, which I guess was correlated with client reconnection to the osd. It could be an issue with the data the client sends to the osd (when it reconnects), or a problem with the way the osd handles these reconnections. In any case, other than these messages the system was functioning well and there was no other noticeable symptom (retry fixed the issue?).
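
The reasoning about secret_id=0 versus a rotation failure can be illustrated with a toy model. This is a sketch under stated assumptions and not Ceph's implementation: the class `RotatingKeys`, the window of 3 retained keys, and ids starting at 1 are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class RotatingKeys:
    """Toy model of a rotating service-key store (NOT Ceph's implementation)."""

    def __init__(self, keep=3):
        self.keep = keep           # assumed number of recent keys that stay valid
        self.next_id = 1           # assumed: ids start at 1, so 0 is never issued
        self.keys = OrderedDict()  # secret_id -> key material, oldest first

    def rotate(self):
        """Issue a new key and expire the oldest once more than `keep` exist."""
        self.keys[self.next_id] = f"key-{self.next_id}"
        self.next_id += 1
        while len(self.keys) > self.keep:
            self.keys.popitem(last=False)

    def lookup(self, secret_id):
        """Return the key, or None (the 'could not find secret_id' case)."""
        return self.keys.get(secret_id)


store = RotatingKeys()
for _ in range(5):
    store.rotate()  # ids 1..5 issued; only 3, 4, 5 remain valid

stale = store.lookup(2)  # None: a real id that has been rotated out
never = store.lookup(0)  # None: an id this model never issued
```

In this model a rotation failure surfaces as a lookup for a non-zero id that has expired, whereas a lookup for 0 suggests the requester never presented a valid id in the first place, which matches the suspicion about what the client sends on reconnect.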

Actions #5

Updated by Ian Colle over 11 years ago

  • Status changed from 12 to Resolved
  • Assignee set to Sage Weil

Resolved by Sage's fix above.

Actions #6

Updated by Ian Colle over 11 years ago

  • Status changed from Resolved to Closed