Bug #19744
mon: pg_create not re-sent after primary osd changed
Status: Closed
Description
This happened during an upgrade test, but I think the issue has nothing to do with upgrades.
pg_create arrives while the osd is starting (a pool-create vs. osd-restart race! this is probably why the upgrade suite triggered it):
2017-04-20 20:35:45.300933 7fef1a23e700 1 -- 172.21.15.67:6811/13837 <== mon.0 172.21.15.47:6789/0 10 ==== osd_pg_create(e6 3.2:6 3.4:6 3.7:6) v3 ==== 235+0+0 (1372823504 0 0) 0x7fef36d83200 con 0x7fef36c4a000
2017-04-20 20:35:45.300946 7fef1a23e700 20 osd.1 5 OSD::ms_dispatch: osd_pg_create(e6 3.2:6 3.4:6 3.7:6) v3
2017-04-20 20:35:45.300950 7fef1a23e700 10 osd.1 5 do_waiters -- start
2017-04-20 20:35:45.300952 7fef1a23e700 10 osd.1 5 do_waiters -- finish
2017-04-20 20:35:45.300954 7fef1a23e700 20 osd.1 5 _dispatch 0x7fef36d83200 osd_pg_create(e6 3.2:6 3.4:6 3.7:6) v3
2017-04-20 20:35:45.300969 7fef1a23e700 10 osd.1 5 handle_pg_create osd_pg_create(e6 3.2:6 3.4:6 3.7:6) v3
2017-04-20 20:35:45.300974 7fef1a23e700 15 osd.1 5 require_same_or_newer_map 6 (i am 5) 0x7fef36d83200
2017-04-20 20:35:45.300976 7fef1a23e700 7 osd.1 5 waiting for newer map epoch 6 > my 5 with 0x7fef36d83200
the osd goes active...
2017-04-20 20:35:47.446319 7fef1f17e700 1 osd.1 8 state: booting -> active
and then we reprocess the message:
2017-04-20 20:35:47.456488 7fef1923c700 10 osd.1 8 handle_pg_create osd_pg_create(e6 3.2:6 3.4:6 3.7:6) v3
2017-04-20 20:35:47.456494 7fef1923c700 15 osd.1 8 require_same_or_newer_map 6 (i am 8) 0x7fef36d83200
2017-04-20 20:35:47.456496 7fef1923c700 7 osd.1 8 from pre-up epoch 6 < 8
Updated by Sage Weil about 7 years ago
I think the require_self_aliveness() check can be skipped in the pg_create case. That's the simplest fix.
Alternatively, subscriptions could be reset when the osd goes active, but I don't think the mon protocol lets you do that. :(
Updated by Sage Weil about 7 years ago
Actually, wait: I think the mon should recalculate the pg mapping when the osd goes "up" in epoch 8 and resend the create message?
2017-04-20 20:35:47.291125 7fe759cb4700 10 mon.b@0(leader).osd e7 encode_pending e 8
2017-04-20 20:35:47.291129 7fe759cb4700 1 mon.b@0(leader).osd e7 encode_pending skipping prime_pg_temp; mapping job 0x7fe76ca716c0 is prior epoch 7
2017-04-20 20:35:47.291163 7fe759cb4700 2 mon.b@0(leader).osd e7 osd.1 UP 172.21.15.67:6811/13837
2017-04-20 20:35:47.291174 7fe759cb4700 2 mon.b@0(leader).osd e7 osd.2 UP 172.21.15.47:6800/2023
2017-04-20 20:35:47.291177 7fe759cb4700 2 mon.b@0(leader).osd e7 osd.3 UP 172.21.15.47:6806/2088
2017-04-20 20:35:47.291188 7fe759cb4700 10 mon.b@0(leader).osd e7 encode_pending encoding without feature SERVER_LUMINOUS
2017-04-20 20:35:47.291191 7fe759cb4700 10 mon.b@0(leader).osd e7 encode_pending encoding without feature SERVER_KRAKEN | MSG_ADDR2
2017-04-20 20:35:47.291192 7fe759cb4700 10 mon.b@0(leader).osd e7 encode_pending encoding full map with 575862587619852283
2017-04-20 20:35:47.291259 7fe759cb4700 20 mon.b@0(leader).osd e7 full_crc 2470623810 inc_crc 77687141
2017-04-20 20:35:47.291282 7fe759cb4700 10 mon.b@0(leader).osd e7 scan_for_creating_pgs no change in pool 0 replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 1 flags hashpspool stripe_width 0
2017-04-20 20:35:47.291292 7fe759cb4700 10 mon.b@0(leader).osd e7 scan_for_creating_pgs no change in pool 1 replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 3 flags hashpspool stripe_width 0
2017-04-20 20:35:47.291296 7fe759cb4700 10 mon.b@0(leader).osd e7 scan_for_creating_pgs no change in pool 2 replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 5 flags hashpspool crash_replay_interval 5 stripe_width 0
2017-04-20 20:35:47.291300 7fe759cb4700 10 mon.b@0(leader).osd e7 scan_for_creating_pgs no change in pool 3 replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 6 flags hashpspool stripe_width 0
2017-04-20 20:35:47.291304 7fe759cb4700 7 mon.b@0(leader).osd e7 encode_pending in the middle of upgrading, trimming pending creating_pgs using pgmap ...
2017-04-20 20:35:47.435749 7fe75c4b9700 10 update_creating_pgs to 8 pgs, osdmap epoch 8
2017-04-20 20:35:47.435763 7fe75c4b9700 20 update_creating_pgs 3.0 acting_primary: 0 -> 3 acting: [0] -> [3,0] up_primary: 0 -> 3 up: [0] -> [3,0]
2017-04-20 20:35:47.435772 7fe75c4b9700 20 update_creating_pgs 3.1 acting_primary: 0 -> 0 acting: [0] -> [0,2] up_primary: 0 -> 0 up: [0] -> [0,2]
2017-04-20 20:35:47.435781 7fe75c4b9700 20 update_creating_pgs 3.2 acting_primary: -1 -> 1 acting: [] -> [1,3] up_primary: -1 -> 1 up: [] -> [1,3]
2017-04-20 20:35:47.435789 7fe75c4b9700 20 update_creating_pgs 3.3 acting_primary: 0 -> 0 acting: [0] -> [0,2] up_primary: 0 -> 0 up: [0] -> [0,2]
2017-04-20 20:35:47.435799 7fe75c4b9700 20 update_creating_pgs 3.4 acting_primary: -1 -> 1 acting: [] -> [1,2] up_primary: -1 -> 1 up: [] -> [1,2]
2017-04-20 20:35:47.435804 7fe75c4b9700 20 update_creating_pgs 3.5 acting_primary: 0 -> 2 acting: [0] -> [2,0] up_primary: 0 -> 2 up: [0] -> [2,0]
2017-04-20 20:35:47.435810 7fe75c4b9700 20 update_creating_pgs 3.6 acting_primary: -1 -> 2 acting: [] -> [2,1] up_primary: -1 -> 2 up: [] -> [2,1]
2017-04-20 20:35:47.435818 7fe75c4b9700 20 update_creating_pgs 3.7 acting_primary: -1 -> 1 acting: [] -> [1,3] up_primary: -1 -> 1 up: [] -> [1,3]
2017-04-20 20:35:47.435824 7fe75c4b9700 10 update_creating_pgs 8 pgs changed primary
... but nothing is sent.
/a/sage-2017-04-20_18:45:07-upgrade:jewel-x-master---basic-smithi/1049789
Updated by Greg Farnum about 7 years ago
This is definitely on the mon. If the OSD ignored the epoch mappings, it could create a PG that was already created by somebody else in epoch 7!
Updated by Kefu Chai about 7 years ago
- Subject changed from osd: pg_create ignored to mon: pg_create not re-sent after primary osd changed
Updated by Kefu Chai about 7 years ago
- Status changed from New to Fix Under Review
Updated by Sage Weil almost 7 years ago
- Status changed from Fix Under Review to Resolved