Bug #3591: auth: could not find secret_id=0 - Ceph - Ceph

Added by Yehuda Sadeh over 11 years ago. Updated over 11 years ago.

Status: Closed
Priority: High
Assignee: Sage Weil
Source: Community (user)

Description

A user reported that, when using kernel rbd, the following messages periodically appear in the client log:

libceph: osd5 xxx.xxx.xxx.xxx:6809 socket closed

and

libceph: osd5 xxx.xxx.xxx.xxx:6809 connect authorization failure

and on the osd side:

ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700  0 auth: could not find secret_id=0
ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700  0 cephx: verify_authorizer could not get service secret for service osd secret_id=0
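
To check whether the failing secret_id is always the same value, the OSD-side messages can be tallied from a log. The following is a minimal sketch, assuming the message format quoted above; the function name `count_missing_secret_ids` and the regular expression are illustrative and not part of Ceph.

```python
import re
from collections import Counter

# Matches the OSD-side "auth:" message quoted in the report.
SECRET_RE = re.compile(r"auth: could not find secret_id=(\d+)")


def count_missing_secret_ids(lines):
    """Return a Counter mapping secret_id -> number of 'could not find' messages."""
    counts = Counter()
    for line in lines:
        match = SECRET_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# The two OSD log lines from the report; only the first is an "auth:" message.
sample = [
    "ceph-osd: 2012-12-07 19:53:40.856596 7f6f894de700  0 auth: could not find secret_id=0",
    "ceph-osd: 2012-12-07 19:53:40.856600 7f6f894de700  0 cephx: verify_authorizer "
    "could not get service secret for service osd secret_id=0",
]

print(count_missing_secret_ids(sample))
```

Feeding it a whole OSD log (for example, the lines of an open file object) would show whether any secret_id other than 0 ever appears.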

Updated by Sage Weil over 11 years ago

Status changed from New to 12
Priority changed from Normal to High

Have seen this pop up in several places.

I bet we can find it with 'debug auth = 0/20' (so that it is all logged in memory) and then changing the above message to also assert.
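
The idea behind a setting like '0/20' is that the first number limits what is written to the log file, while the second limits what is kept in an in-memory buffer that can be dumped when an assert fires. The following is a toy model of that behavior, not Ceph's logging code; the class `TwoLevelLog`, its fields, and the first sample message are hypothetical.

```python
from collections import deque


class TwoLevelLog:
    """Toy model of a 'log level / memory level' pair such as 0/20."""

    def __init__(self, log_level, memory_level, capacity=1000):
        self.log_level = log_level        # entries at or below this go to the file
        self.memory_level = memory_level  # entries at or below this stay in memory
        self.recent = deque(maxlen=capacity)  # bounded in-memory buffer
        self.written = []                 # stand-in for the on-disk log file

    def log(self, level, message):
        if level <= self.memory_level:
            self.recent.append((level, message))
        if level <= self.log_level:
            self.written.append(message)

    def dump_recent(self):
        """Return everything still held in memory, e.g. when an assert fires."""
        return [message for _, message in self.recent]


log = TwoLevelLog(log_level=0, memory_level=20)
log.log(10, "hypothetical verbose auth detail")  # kept in memory only
log.log(0, "auth: could not find secret_id=0")   # written and kept in memory

print(log.written)
print(log.dump_recent())
```

With this split, the log file stays quiet during normal operation, but the verbose auth history leading up to the failure is still available at the moment the assert triggers.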

Updated by Yehuda Sadeh over 11 years ago

Did you see it happening with anything other than the kernel clients?

Updated by Sage Weil over 11 years ago

That I can't remember, but that message should never come up on the server unless it is failing to rotate its keys properly or the startup sequence is wrong.

Oh, I fixed this on the OSD recently: commit eee098222343cec0e7db2c5d4067050847da5481, which went into 0.55. Were they running something older?

Updated by Yehuda Sadeh over 11 years ago

They were running 0.55. I don't think it's a failure to rotate keys, as the secret_id was consistently 0. I'd expect a higher number for a rotation failure.
It happened periodically, which I guess was correlated with client reconnection to the OSD. It could be an issue with the data the client sends to the OSD when it reconnects, or a problem with the way the OSD handles these reconnections. In any case, other than these messages the system was functioning well and there was no other noticeable symptom (retry fixed the issue?).

Updated by Ian Colle over 11 years ago

Status changed from 12 to Resolved
Assignee set to Sage Weil

Resolved by Sage's fix above.

Updated by Ian Colle over 11 years ago

Status changed from Resolved to Closed
