Bug #2307: OSD & Monitor disagree on the contents of pg_temp
Status: Closed

Description
See: http://marc.info/?t=133352732900001&r=1&w=2
It seems that (for example) pg 0.138 is in pg_temp, but the OSD can't find it when it goes looking. I obtained the maps from both, and their contents agree when printed out, but mapping the PG via --test-map-pg does not produce the pg_temp mapping. After a lot of looking, it turns out that the map has a pg_num of 8, so the placement seed is getting inappropriately truncated (at least in the osdmaptool, and presumably on the OSD).
I suspect this is an encode/decode issue, but don't know for sure.
Updated by Sage Weil about 12 years ago
It looks to me like the 'data' pool (0) was deleted, and then a new one (vmimages) was created, but somehow it was assigned an old pool id (0) instead of a new one. Or a bug made us replace data with vmimages... that's probably more likely!
Updated by Sage Weil about 12 years ago
nine:2307 03:56 PM $ osdmaptool osdmap_full/5754 -p | grep ^pool
pool 0 'data' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 320 pgp_num 320 lpg_num 2 lpgp_num 2 last_change 1 owner 0 crash_replay_interval 60
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 320 pgp_num 320 lpg_num 2 lpgp_num 2 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 320 pgp_num 320 lpg_num 2 lpgp_num 2 last_change 1 owner 0
nine:2307 03:56 PM $ osdmaptool osdmap_full/5755 -p | grep ^pool
pool 0 'vmimages' rep size 2 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 lpg_num 0 lpgp_num 0 last_change 5755 owner 18446744073709551615
pool 1 'metadata' rep size 2 crush_ruleset 1 object_hash rjenkins pg_num 320 pgp_num 320 lpg_num 2 lpgp_num 2 last_change 1 owner 0
pool 2 'rbd' rep size 2 crush_ruleset 2 object_hash rjenkins pg_num 320 pgp_num 320 lpg_num 2 lpgp_num 2 last_change 1 owner 0
Updated by Sage Weil about 12 years ago
at some point the osdmap pool_max got set to -1.
nine:2307 04:15 PM $ ~/src/ceph/src/ceph-dencoder type OSDMap import osdmap_full/50 decode dump_json | grep max
"pool_max": 2,
"max_osd": 5,
nine:2307 04:15 PM $ ~/src/ceph/src/ceph-dencoder type OSDMap import osdmap_full/5528 decode dump_json | grep max
"pool_max": -1,
"max_osd": 5,
Updated by Greg Farnum about 12 years ago
I'm confused how you're getting that pool_max printout — I don't see it at all when I run that command with a ceph-dencoder from latest master?
But given what I see from how the Incrementals change, that is indeed the problem...if only we can track down how it happened.
(I'm currently circling around the decoders and the fact that Incremental::new_pool_max is an int64_t, whereas pool_max is an int32_t, and the decoders are treating old versions of those as __u32...but I can't actually find a way in which it's broken, even if it is horrible.)
Updated by Sage Weil about 12 years ago
Greg Farnum wrote:
I'm confused how you're getting that pool_max printout — I don't see it at all when I run that command with a ceph-dencoder from latest master?
But given what I see from how the Incrementals change, that is indeed the problem...if only we can track down how it happened.
(I'm currently circling around the decoders and the fact that Incremental::new_pool_max is an int64_t, whereas pool_max is an int32_t, and the decoders are treating old versions of those as __u32...but I can't actually find a way in which it's broken, even if it is horrible.)
i had to fix the OSDMap::dump() method; i'll push that shortly.
and yeah, it looks like i goofed the 32->64 bit pool conversion and didn't actually change pool_max to an int64_t. i want to go back and check the original commits to make sure that's the case before fixing it. (not that 64-bit pool ids are all that useful!)
i'm hoping it won't be hard to scour OSDMap.cc for places where pool_max is assigned a new value and infer what went wrong... there can't be too many places.
Updated by Sage Weil almost 12 years ago
Pushed a workaround that will repair osdmaps that saw your corruption: eea982e56739a7a91ca907ccc5c5ec1f78d9460d.
Updated by Greg Farnum almost 12 years ago
- Status changed from In Progress to 7
And I gave him a patched monitor so he could set pg_num, which should fix it. Waiting to hear back, and will apply that patch as well assuming it works.
Updated by Greg Farnum almost 12 years ago
- Status changed from 7 to Resolved
Just changing the pg_num and pgp_num did fix it up, so with the osdmap workaround we should be all good now.