Bug #24524
closed
Newly added OSDs do not start in Mimic
Description
Hi!
In my test cluster running Ceph Mimic (installed from scratch, not upgraded from Luminous), newly added OSDs fail to start with the following error:
-8> 2018-06-14 12:51:04.616 7f58cf8a0700 3 osd.3 0 handle_osd_map epochs [533,533], i have 0, src has [533,1182]
-7> 2018-06-14 12:51:04.616 7f58cf8a0700 -1 osd.3 0 failed to load OSD map for epoch 532, got 0 bytes
From reading the code at https://github.com/ceph/ceph/blob/master/src/osd/OSD.cc#L7330, I suspect the following: when a newly added OSD loads the osdmaps from oldest to newest (533 to 1182 in my case), for each Nth map it first loads the (N-1)th map and compares the two. For the oldest map there is no (N-1)th map (epoch 532 in the log above), so the OSD dies with 'assertion failed'.
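To make the suspicion concrete, here is a small standalone model of that lookup. It is only a sketch, not the Ceph code: `MapStore`, `find_missing_predecessor`, and the local `epoch_t` typedef are hypothetical stand-ins, and the store is assumed to hold exactly the epochs the OSD was given.

```cpp
#include <map>
#include <string>

// Local stand-ins for illustration only (not the Ceph types).
typedef unsigned epoch_t;
typedef std::map<epoch_t, std::string> MapStore;  // epoch -> encoded map

// Walks the epochs first..last the way the description suspects the OSD does:
// before handling the first epoch N it looks up epoch N-1 once.
// Returns true and sets *missing if that lookup finds nothing.
bool find_missing_predecessor(const MapStore& store, epoch_t first,
                              epoch_t last, epoch_t* missing) {
  bool have_lastmap = false;
  for (epoch_t e = first; e <= last; ++e) {
    if (!have_lastmap) {
      if (store.find(e - 1) == store.end()) {
        *missing = e - 1;  // the predecessor was never stored locally
        return true;
      }
      have_lastmap = true;
    }
    // Later epochs would be compared against the map handled just before.
  }
  return false;
}
```

With a store holding epochs 533 to 1182, calling this for the range 533 to 1182 reports epoch 532 as missing, which matches the "failed to load OSD map for epoch 532" line in the log.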
I'm trying to fix it like this:
diff --git a/src/osd/OSD.cc b/src/osd/OSD.cc
index a6cf188..b026401 100644
--- a/src/osd/OSD.cc
+++ b/src/osd/OSD.cc
@@ -7357,6 +7357,9 @@ void OSD::handle_osd_map(MOSDMap *m)
   // check for deleted pools
   OSDMapRef lastmap;
   for (auto& i : added_maps) {
+    if (i.first <= first) {
+      continue;
+    }
     if (!lastmap) {
       lastmap = get_map(i.first - 1);
     }
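A minimal standalone model of what the added check is meant to change. This is a sketch under assumptions, not the Ceph code: `MapStore`, `failed_lookups`, and the local `epoch_t` typedef are hypothetical, `first` is assumed to be the first epoch carried by the incoming message, and the maps from the message are assumed to be readable by the time the loop runs.

```cpp
#include <map>
#include <string>
#include <vector>

// Local stand-ins for illustration only (not the Ceph types).
typedef unsigned epoch_t;
typedef std::map<epoch_t, std::string> MapStore;  // epoch -> encoded map

// Models the "check for deleted pools" loop over epochs first..last and
// returns the predecessor epochs it tried to load but could not find.
// With guard == true, epochs <= first are skipped, as in the patch.
std::vector<epoch_t> failed_lookups(const MapStore& store, epoch_t first,
                                    epoch_t last, bool guard) {
  std::vector<epoch_t> failed;
  bool have_lastmap = false;
  for (epoch_t e = first; e <= last; ++e) {
    if (guard && e <= first) {
      continue;  // the check added by the patch
    }
    if (!have_lastmap) {
      if (store.count(e - 1) == 0) {
        failed.push_back(e - 1);  // the real OSD aborts here
      }
      have_lastmap = true;  // the predecessor is looked up only once
    }
  }
  return failed;
}
```

In this model, a store holding epochs 533 to 1182 yields one failed lookup (epoch 532) without the guard and none with it, because the first lookup then targets epoch 533, which is present.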
Please tell me if I'm correct and, if so, push this fix to your repository :)