Bug #3510

closed

messenger doesn't fill in nonce if port is specified

Added by Greg Farnum over 11 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A user came into IRC this morning having trouble getting rebooted OSDs back into the cluster. Sam tracked it down to the daemons all having a nonce of 0 at all times, which meant that the monitor thought the freshly booted OSDs were in fact old, long-running daemons.
This turned out to be because the messenger doesn't fill in the nonce if the port to bind to is specified (see Accepter.cc:140). I can't for the life of me trace how or why this is the case, so I assume it's a historical accident due to our mess of messenger startup.
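
To make the failure mode concrete, here is a minimal sketch of why a nonce that is always 0 makes a rebooted daemon look like the old one. Everything in it is an assumption for illustration: `Addr`, `same_instance`, `bind_buggy`, and the fallback port are hypothetical names and values, not the code in Accepter.cc. It also assumes the nonce is filled in on the path where no port is requested.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a messenger address.
// This is NOT Ceph's real address type.
struct Addr {
    std::uint32_t ip;
    std::uint16_t port;
    std::uint64_t nonce;  // meant to differ between successive runs of a daemon
};

// Two addresses name the same daemon instance only if all three fields match.
inline bool same_instance(const Addr& a, const Addr& b) {
    return a.ip == b.ip && a.port == b.port && a.nonce == b.nonce;
}

// Mimics the behaviour described in this report: the nonce is only
// recorded when the caller did not ask for a specific port.
inline Addr bind_buggy(std::uint32_t ip, std::uint16_t requested_port,
                       std::uint64_t fresh_nonce) {
    assert(fresh_nonce != 0);  // callers are expected to supply a real nonce
    Addr a{ip, 0, 0};
    if (requested_port != 0) {
        a.port = requested_port;  // nonce is left at 0 on this path
    } else {
        a.port = 6800;            // hypothetical auto-selected port
        a.nonce = fresh_nonce;
    }
    return a;
}
```

With a fixed port, `bind_buggy(ip, 6801, 1111)` and `bind_buggy(ip, 6801, 2222)` compare equal under `same_instance`, so a peer comparing addresses cannot tell a restart from the previous run.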

I pushed a branch, wip-set-nonce-on-bind, which fills in the nonce in that case too. However, I didn't want to merge it, for two reasons:
1) I can't figure out how this bug started happening, so I don't want to violate any unknown requirements without looking into it more.
2) It means the monitors will start getting their nonce set, which hasn't happened for literally years. No idea if that will cause trouble.
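
As a rough illustration of the trade-off in the two points above, the sketch below records the nonce regardless of whether a port was requested. It is not the contents of wip-set-nonce-on-bind; `PeerAddr`, `bind_with_nonce`, and the fallback port are hypothetical. It also shows one way point 2 could conceivably matter: an address stored earlier with nonce 0 would stop comparing equal to the same daemon's address once the nonce is set. Whether anything in the monitors actually depends on that is not established here.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical simplified address; field names are illustrative only.
struct PeerAddr {
    std::uint32_t ip;
    std::uint16_t port;
    std::uint64_t nonce;

    bool operator==(const PeerAddr& o) const {
        return std::tie(ip, port, nonce) == std::tie(o.ip, o.port, o.nonce);
    }
};

// Always records the nonce, whether or not a specific port was requested.
inline PeerAddr bind_with_nonce(std::uint32_t ip, std::uint16_t requested_port,
                                std::uint64_t nonce) {
    assert(nonce != 0);  // a zero nonce would bring the original problem back
    std::uint16_t port = requested_port;
    if (port == 0) {
        port = 6800;     // hypothetical auto-selected port
    }
    return PeerAddr{ip, port, nonce};
}
```

Two successive binds on the same fixed port with different nonces now produce distinct addresses, which is the desired behaviour for OSDs. A daemon on a well-known port that was previously recorded with nonce 0 would no longer match that record exactly.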

Actions #1

Updated by Sage Weil over 11 years ago

  • Status changed from New to Resolved
Actions #2

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)
  • Target version deleted (v0.55a)