Bug #5203

closed

mon: backup monmap for sync appears to drop correct monitor names?

Added by Joao Eduardo Luis almost 11 years ago. Updated almost 11 years ago.

Status: Resolved
Priority: High
Category: Monitor
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Came across this one while debugging one of saaby's mon crashes.

Apparently, saaby (@ #ceph) recreated a monitor using the monmap obtained from his cluster (which had a formed quorum). That monitor then proceeded to sync and backed up a monmap, as planned.

All hell then broke loose when the monitor was restarted, as the backed-up monmap appears to have the wrong monitor names:


// Obtained from broken mon store

$ ceph-monstore-tool --mon-store-path . --key mon_sync:latest_monmap get-val --out /tmp/mon_sync.monmap
2013-05-30 15:44:52.178388 7f768d504780 -1 did not load config file, using default settings.
obtaining (mon_sync,latest_monmap)

$ monmaptool --print /tmp/mon_sync.monmap 
monmaptool: monmap file /tmp/mon_sync.monmap
epoch 3
fsid ab804c03-24c1-4532-9fad-f7c1a2606aa5
last_changed 2013-05-21 16:15:04.234470
created 0.000000
0: 10.81.16.11:6789/0 mon.0
1: 10.81.30.11:6789/0 mon.1
2: 10.83.27.11:6789/0 mon.2

Note how the monitor names in the backup monmap are mon.0, mon.1 and mon.2, which appear to follow rank. Instead, they should have been as follows:


// Obtained from a healthier, earlier version of the store

$ ceph-monstore-tool --mon-store-path . getmonmap --out /tmp/mon_sync.monmap.02
2013-05-30 15:47:20.007503 7f2d41a0f780 -1 did not load config file, using default settings.

$ monmaptool --print /tmp/mon_sync.monmap.02
monmaptool: monmap file /tmp/mon_sync.monmap.02
epoch 3
fsid ab804c03-24c1-4532-9fad-f7c1a2606aa5
last_changed 2013-05-21 16:15:04.234470
created 0.000000
0: 10.81.16.11:6789/0 mon.ceph1-cph1c16-mon1
1: 10.81.30.11:6789/0 mon.ceph1-cph1f11-mon1
2: 10.83.27.11:6789/0 mon.ceph1-cph2i11-mon1

These were the names that were supposed to be on the monmap.

Note that the epoch, fsid and last_changed timestamps match in both maps, though, so only the names differ.
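The comparison above can also be scripted. What follows is only a rough sketch: it parses the text printed by `monmaptool --print` (as pasted in this report) and flags monitors whose name is just their rank, or whose name differs between two maps at the same address. The helper names (`parse_monmap_print`, `rank_named`, `name_mismatches`) are made up for illustration and are not part of any Ceph tool, and the line format is assumed to be exactly what is shown in the two dumps above.

```python
import re

# Assumed monitor line format, taken from the dumps in this report,
# e.g. "0: 10.81.16.11:6789/0 mon.0"
MON_LINE = re.compile(r"^(\d+): (\S+) mon\.(\S+)$")


def parse_monmap_print(text):
    """Return {rank: (addr, name)} parsed from `monmaptool --print` text."""
    mons = {}
    for line in text.splitlines():
        m = MON_LINE.match(line.strip())
        if m:
            mons[int(m.group(1))] = (m.group(2), m.group(3))
    return mons


def rank_named(mons):
    """Ranks whose monitor name is simply the rank number (the symptom here)."""
    return sorted(r for r, (_, name) in mons.items() if name == str(r))


def name_mismatches(good, suspect):
    """List (rank, addr, good_name, suspect_name) where addr matches but name differs."""
    out = []
    for rank, (addr, name) in sorted(good.items()):
        if rank in suspect:
            s_addr, s_name = suspect[rank]
            if s_addr == addr and s_name != name:
                out.append((rank, addr, name, s_name))
    return out


# Monitor lines copied from the two dumps in this report.
BACKUP = """\
epoch 3
0: 10.81.16.11:6789/0 mon.0
1: 10.81.30.11:6789/0 mon.1
2: 10.83.27.11:6789/0 mon.2
"""

GOOD = """\
epoch 3
0: 10.81.16.11:6789/0 mon.ceph1-cph1c16-mon1
1: 10.81.30.11:6789/0 mon.ceph1-cph1f11-mon1
2: 10.83.27.11:6789/0 mon.ceph1-cph2i11-mon1
"""

if __name__ == "__main__":
    backup = parse_monmap_print(BACKUP)
    good = parse_monmap_print(GOOD)
    print("rank-named monitors in backup:", rank_named(backup))
    for rank, addr, want, got in name_mismatches(good, backup):
        print(f"rank {rank} ({addr}): expected mon.{want}, found mon.{got}")
```

A check like this only confirms the symptom (names replaced by ranks while addresses stay the same); it says nothing about where in the sync path the names get lost.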
