Bug #372

closed

2-Monitor election fight

Added by Greg Farnum over 13 years ago. Updated over 13 years ago.

Status: Rejected
Priority: Normal
Category: Monitor
% Done: 0%

Description

I was running tests on a local branch with a 2-monitor setup and eventually reached a state where no system work was completing, because the losing monitor kept calling a new election. Sage thinks there might be a problem with the check monitors make that the winner has a lower ID than they do, since we changed monitors to have names.

Logs are in the failure dir, so find out if that's what happened and fix it.
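
To make the suspected failure mode concrete, here is a toy model of it. Everything in it is an assumption for illustration (the function names, the ID tables, and the acceptance rule as phrased above); it is not the monitor's actual election code. It only shows that if the loser's view of the IDs disagrees with the winner's, the loser never accepts the result and keeps calling elections.

```python
# Toy model only: the names, IDs, and acceptance rule are assumptions
# for illustration, not Ceph's monitor election code.

def accepts_winner(my_id, winner_id):
    """A monitor accepts a winner only if the winner's ID is lower than its own."""
    return winner_id < my_id


def election_settles(ids_seen_by_loser, loser, winner):
    """Return True if the loser, using its own view of the IDs, accepts the winner."""
    return accepts_winner(ids_seen_by_loser[loser], ids_seen_by_loser[winner])


# Consistent view: "a" holds ID 0 and wins; "b" holds ID 1 and accepts.
consistent = {"a": 0, "b": 1}

# Inconsistent view (hypothetical): the loser believes it holds the lower ID,
# so it rejects the winner and would call a new election.
inconsistent = {"a": 1, "b": 0}

settled = election_settles(consistent, loser="b", winner="a")
contested = election_settles(inconsistent, loser="b", winner="a")
```

With the consistent table the election settles; with the inconsistent one it does not, which is the loop described above.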

Actions #1

Updated by Greg Farnum over 13 years ago

  • Status changed from New to In Progress

It looks like it's a legitimate election: in all the instances I've checked, the lease on the pgmap is timing out (i.e., it is not getting a lease extension), so the monitor calls for a new election. Not sure yet why the PGMap isn't getting leases when the other monitored things are.
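
The behaviour described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the monitor's real lease code: the `Lease` class, the `leases_needing_election` helper, the 5-second duration, and the `osdmap` entry standing in for "the other monitored things" are all hypothetical. Time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Hypothetical lease on one monitored map; valid until expires_at."""
    expires_at: float = 0.0

    def extend(self, now, duration):
        # A lease extension pushes the expiry out from the current time.
        self.expires_at = now + duration

    def expired(self, now):
        return now >= self.expires_at


def leases_needing_election(leases, now):
    """Return the names of leases that have timed out at time `now`."""
    return sorted(name for name, lease in leases.items() if lease.expired(now))


# Both maps start with a 5-second lease at t=0 (duration is an assumption).
leases = {"pgmap": Lease(), "osdmap": Lease()}
for lease in leases.values():
    lease.extend(now=0.0, duration=5.0)

# At t=4 the osdmap lease is extended; the pgmap extension never arrives.
leases["osdmap"].extend(now=4.0, duration=5.0)

# At t=6 only the pgmap lease has timed out, which in this model is
# what would make the monitor call a new election.
timed_out = leases_needing_election(leases, now=6.0)
```

In this model a single starved lease is enough to trigger an election even though every other lease is healthy, which matches what the logs show.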

Actions #2

Updated by Greg Farnum over 13 years ago

  • Status changed from In Progress to Rejected

All right, looking through all the logs, I think this was just a resource contention issue from running too many high-activity daemons on my server.

Actions #3

Updated by Yehuda Sadeh over 13 years ago

Are the logs useful enough that a typical user could reach such a conclusion? Maybe we need some kind of warning mechanism that triggers when the server is overloaded, so that users can easily discover why their cmon hasn't come up.
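
One possible shape for such a warning, as a minimal sketch only: compare the 1-minute load average against the CPU count and emit a message when it is exceeded. The function names, the threshold of 1.0 per CPU, and the message text are all assumptions, and nothing like this is claimed to exist in cmon. `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def overload_warning(load1, cpus, threshold=1.0):
    """Return a warning string if load per CPU exceeds threshold, else None.

    The threshold and wording are illustrative assumptions.
    """
    per_cpu = load1 / cpus
    if per_cpu > threshold:
        return (
            f"WARNING: 1-minute load {load1:.2f} on {cpus} CPU(s) exceeds "
            f"{threshold:.2f} per CPU; daemons may miss timeouts"
        )
    return None


def check_host():
    """Check the local host (Unix only, because of os.getloadavg)."""
    load1, _, _ = os.getloadavg()
    return overload_warning(load1, os.cpu_count() or 1)
```

A daemon could call something like `check_host()` periodically and write the result to its log, so that an overloaded test box shows up next to the election messages rather than having to be inferred from them.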
