Bug #18266
closed
MOSDPGTrim wrongly decoded during backfill
Description
I have pg 3.0 distributed on [1,2,4], backfilling to osd.5.
During the backfill, I saw osd.1 receive an MOSDPGTrim from osd.5,
but osd.1 could not find the PG in its local pg_map, which confused me.
In the osd.5 log the spg_t is 3.0s0, but on osd.1 the spg_t is 3.0.
Here is the log:
2016-12-16 15:28:08.125028 7fafe03e7700 1 -- 192.168.0.253:6803/2224051 --> 192.168.0.253:6808/2222060 -- pg_trim(3.0s0 to 61'321 e61) v1 -- ?+0 0x43af200 con 0x43ba3c0
2016-12-16 15:28:08.125101 7f5c3fa1e700 1 -- 192.168.0.253:6808/2222060 <== osd.5 192.168.0.253:6803/2224051 151 ==== pg_trim(3.0 to 61'321 e61) v1 ==== 34+0+0 (3465528982 0 0) 0x4e62a00 con 0x5151080
2016-12-16 15:28:08.125116 7f5c3fa1e700 10 osd.1 61 do_waiters -- start
2016-12-16 15:28:08.125118 7f5c3fa1e700 10 osd.1 61 do_waiters -- finish
2016-12-16 15:28:08.125119 7f5c3fa1e700 20 osd.1 61 _dispatch 0x4e62a00 pg_trim(3.0 to 61'321 e61) v1
2016-12-16 15:28:08.125148 7f5c3fa1e700 7 osd.1 61 handle_pg_trim pg_trim(3.0 to 61'321 e61) v1 from osd.5
2016-12-16 15:28:08.125150 7f5c3fa1e700 15 osd.1 61 require_same_or_newer_map 61 (i am 61) 0x4e62a00
2016-12-16 15:28:08.125158 7f5c3fa1e700 10 osd.1 61 don't have pg 3.0
2016-12-16 15:28:08.125167 7f5c3fa1e700 10 osd.1 61 do_waiters -- start
2016-12-16 15:28:08.125168 7f5c3fa1e700 10 osd.1 61 do_waiters -- finish
The decode_payload function in MOSDPGTrim.h only decodes the shard when the header version is at least 2:
  void decode_payload() {
    bufferlist::iterator p = payload.begin();
    ::decode(epoch, p);
    ::decode(pgid.pgid, p);
    ::decode(trim_to, p);
    if (header.version >= 2)
      ::decode(pgid.shard, p);
    else
      pgid.shard = shard_id_t::NO_SHARD;
  }
But the constructor does not pass HEAD_VERSION and COMPAT_VERSION to the Message base class, which matches the "v1" shown in the log above:
  MOSDPGTrim(version_t mv, spg_t p, eversion_t tt) :
    Message(MSG_OSD_PG_TRIM),
    epoch(mv), pgid(p), trim_to(tt) { }
Maybe that is the reason?
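
To check that this would explain the log, here is a small standalone sketch of the suspected mechanism. It is not Ceph code: SpgId, Wire, encode_trim, decode_trim and the stamp_version flag are hypothetical stand-ins, the byte layout is invented, and I am assuming the sender always writes the shard into the payload. The only part taken from the snippet above is the "header version >= 2" gate on decoding the shard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the Ceph types; the layout here is invented.
constexpr int8_t NO_SHARD = -1;      // plays the role of shard_id_t::NO_SHARD
constexpr uint16_t HEAD_VERSION = 2; // version that introduced the shard field

struct SpgId {
  uint32_t pg;   // stands in for pgid.pgid
  int8_t shard;  // stands in for pgid.shard
};

struct Wire {
  uint16_t header_version = 1;  // default when the constructor forwards nothing
  std::vector<uint8_t> payload;
};

template <typename T>
void put(std::vector<uint8_t>& buf, T v) {
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&v);
  buf.insert(buf.end(), p, p + sizeof(T));
}

template <typename T>
T get(const std::vector<uint8_t>& buf, std::size_t& off) {
  T v;
  std::memcpy(&v, buf.data() + off, sizeof(T));
  off += sizeof(T);
  return v;
}

// Sender side. stamp_version models whether the constructor forwarded
// HEAD_VERSION to the base class; the shard is written either way (assumption).
Wire encode_trim(uint32_t epoch, SpgId id, bool stamp_version) {
  Wire w;
  if (stamp_version)
    w.header_version = HEAD_VERSION;
  put(w.payload, epoch);
  put(w.payload, id.pg);
  put(w.payload, id.shard);
  return w;
}

// Receiver side: same gate as decode_payload() quoted in this report.
SpgId decode_trim(const Wire& w) {
  std::size_t off = 0;
  (void)get<uint32_t>(w.payload, off);  // epoch, unused in this sketch
  SpgId id;
  id.pg = get<uint32_t>(w.payload, off);
  if (w.header_version >= 2)
    id.shard = get<int8_t>(w.payload, off);
  else
    id.shard = NO_SHARD;  // shard byte is present but ignored
  return id;
}

// Usage: the same message decoded with and without the version stamp.
void demo() {
  SpgId sent{3, 0};  // like "3.0s0" on osd.5
  assert(decode_trim(encode_trim(61, sent, false)).shard == NO_SHARD);
  assert(decode_trim(encode_trim(61, sent, true)).shard == 0);
}
```

In this model the receiver ends up with a shardless PG id whenever the header still says version 1, which is what the osd.1 log shows ("pg_trim(3.0 ...) v1", then "don't have pg 3.0"). If the real class defines HEAD_VERSION and COMPAT_VERSION, passing them to the Message constructor here would presumably be the fix, but I have not verified that against the tree.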