Bug #6605

mon: remove full osd state on "osd rm"

Added by Greg Farnum almost 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
High
Category:
Monitor
Target version:
Start date:
10/21/2013
Due date:
% Done:
0%

Source:
Community (dev)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

It turns out that "osd rm" does not eliminate the osd's auth keys from the monitor. That's liable to cause issues. Correct the oversight and check for any other osd state that might be left around.

This is particularly unfortunate because it looks like preprocess_boot will not block "old" instances of the OSD from joining and booting the "new" one out. :/

Associated revisions

Revision e02740ac (diff)
Added by Joao Eduardo Luis almost 6 years ago

mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record

Fixes: #6605

Signed-off-by: Joao Eduardo Luis <>

Revision be6267f5
Added by João Eduardo Luís almost 6 years ago

Merge pull request #788 from ceph/wip-6605

mon: OSDMonitor: only allow an osd to boot iff it has the fsid on record

Fixes: #6605

Reviewed-by: Sage Weil <>

History

#1 Updated by Greg Farnum almost 6 years ago

We should also be checking that the osd matches what we expect based on more than the ID. I think we have enough information in the monitor and boot message to do that...

#2 Updated by Samuel Just almost 6 years ago

  • Assignee set to Joao Eduardo Luis

#3 Updated by Joao Eduardo Luis almost 6 years ago

  • Status changed from New to Need Review

wip-6605, pull request 787, e02740ac5da7c9f5e4c1fdd603918e56c05123de

Greg Farnum wrote:

It turns out that "osd rm" does not eliminate the osd's auth keys from the monitor. That's liable to cause issues. Correct the oversight and check for any other osd state that might be left around.

This was not fixed. Fixing it would require us to seamlessly pack a change to the AuthMonitor state into the OSDMonitor's paxos proposal. Although that would be possible with a little effort, it would be a new feature rather than a bug fix. Proposing the two changes at different times isn't an option either, as we would lose the expected atomicity. Instead we leave to the user, as we currently do, the responsibility of removing the osd key from the keyring upon osd removal.
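Since the key removal stays a manual step, the two operations a user would run can be sketched as below. This is only an illustration, assuming the standard `ceph` CLI subcommands `osd rm` and `auth del`; the helper name `osd_removal_commands` is hypothetical, and the sketch only builds the command lines without running them.

```python
from typing import List


def osd_removal_commands(osd_id: int) -> List[List[str]]:
    """Build the two commands for removing an OSD and its auth key.

    Hypothetical helper for illustration only: it returns argument
    lists and does not execute anything.
    """
    if osd_id < 0:
        raise ValueError("osd id must be non-negative")
    return [
        # Remove the OSD from the osdmap (handled by the OSDMonitor).
        ["ceph", "osd", "rm", str(osd_id)],
        # Remove the OSD's key (handled by the AuthMonitor). The monitor
        # does not do this as part of "osd rm", so it is a separate step.
        ["ceph", "auth", "del", f"osd.{osd_id}"],
    ]


if __name__ == "__main__":
    for cmd in osd_removal_commands(3):
        print(" ".join(cmd))
```

Because these are two separate commands, they are not atomic with respect to each other, which is the limitation described above.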

Greg Farnum wrote:

This is particularly unfortunate because it looks like preprocess_boot will not block "old" instances of the OSD from joining and booting the "new" one out. :/

This was indeed a bug, although unrelated to the above. It is fixed by wip-6605: we now let an osd boot only under one of the following circumstances:

  • osd exists and its fsid is the same as the one we have on record (in the osdmap)
  • osd does not exist and the fsid on record is nil
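The two conditions above can be sketched as a small predicate. This is not the OSDMonitor code from wip-6605, only a minimal model of the rule as stated here; `OsdMapSketch`, `may_boot` and `NIL_FSID` are hypothetical names, and it assumes the fsid carried in the boot message is compared against the one stored in the osdmap.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Set

# Assumption: a "nil" fsid is modelled as the all-zero UUID.
NIL_FSID = uuid.UUID(int=0)


@dataclass
class OsdMapSketch:
    """Hypothetical stand-in for the parts of the osdmap the check needs."""
    existing: Set[int] = field(default_factory=set)
    fsid_on_record: Dict[int, uuid.UUID] = field(default_factory=dict)

    def recorded_fsid(self, osd_id: int) -> uuid.UUID:
        # An id with no fsid on record is treated as having the nil fsid.
        return self.fsid_on_record.get(osd_id, NIL_FSID)


def may_boot(osdmap: OsdMapSketch, osd_id: int, boot_fsid: uuid.UUID) -> bool:
    """Return True iff the osd is allowed to boot under the two rules."""
    recorded = osdmap.recorded_fsid(osd_id)
    if osd_id in osdmap.existing:
        # Rule 1: the osd exists and its fsid matches the one on record.
        return recorded == boot_fsid
    # Rule 2: the osd does not exist and the fsid on record is nil.
    return recorded == NIL_FSID
```

Under this model, an "old" instance that presents a different fsid for an existing id is refused instead of booting the "new" one out.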

#4 Updated by Joao Eduardo Luis almost 6 years ago

  • Status changed from Need Review to Resolved

merged into master
