Bug #10347
duplicate OSD in acting set (closed)
% Done: 0%
Source: other
Severity: 3 - minor
Description
In the map-for-loicd osdmap extracted from a ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578) cluster, OSD 7 appears twice in the acting set. This happened after changing pg_num from 12 to 128 on an erasure coded pool with k=7, m=2.
./osdmaptool --test-map-pg 19.dd /tmp/map-for-loicd
./osdmaptool: osdmap file '/tmp/map-for-loicd'
 parsed '19.dd' -> 19.dd
19.dd raw ([2,1,6,0,5,8,2147483647,7,4], p2) up ([2,1,6,0,5,8,2147483647,7,4], p2) acting ([2,1,6,0,5,8,7,7,4], p2)
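A repeated OSD like this can be spotted mechanically. The following is a minimal Python sketch, not part of Ceph: the helper name `duplicate_osds` and the constant name `ITEM_NONE` are hypothetical. It takes an acting set as printed above and reports every OSD that occupies more than one position, skipping 2147483647, which is how an unfilled slot is printed in the raw and up sets.

```python
from collections import defaultdict

# 2147483647 (0x7fffffff) is how an unfilled slot appears in the output above.
ITEM_NONE = 2147483647


def duplicate_osds(acting):
    """Return {osd: [positions]} for every OSD listed more than once."""
    positions = defaultdict(list)
    for index, osd in enumerate(acting):
        if osd != ITEM_NONE:
            positions[osd].append(index)
    return {osd: idx for osd, idx in positions.items() if len(idx) > 1}


# Sets copied from the osdmaptool output for PG 19.dd.
up = [2, 1, 6, 0, 5, 8, 2147483647, 7, 4]
acting = [2, 1, 6, 0, 5, 8, 7, 7, 4]

print(duplicate_osds(up))      # {}
print(duplicate_osds(acting))  # {7: [6, 7]}
```

Only the acting set reports a duplicate: OSD 7 fills position 6, the slot that is empty in the up set, in addition to its own position 7.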
Updated by Loïc Dachary over 9 years ago
./osdmaptool --print /tmp/map-for-loicd
./osdmaptool: osdmap file '/tmp/map-for-loicd'
epoch 5541
fsid eb26d697-03f5-4122-9f81-8c08ec680fe4
created 2014-03-19 12:32:51.564046
modified 2014-12-17 11:57:59.766630
flags

pool 0 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 crash_replay_interval 45 min_read_recency_for_promote 1 stripe_width 0
pool 1 'metadata' replicated size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 min_read_recency_for_promote 1 stripe_width 0
pool 17 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 5313 flags hashpspool stripe_width 0
pool 19 'ECpool7et2' erasure size 9 min_size 7 crush_ruleset 8 object_hash rjenkins pg_num 256 pgp_num 256 last_change 5337 lfor 5326 flags hashpspool tiers 20 read_tier 20 write_tier 20 stripe_width 4256
pool 20 'cacheFor7et2' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 5349 flags hashpspool,incomplete_clones tier_of 19 cache_mode writeback target_bytes 25000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0

max_osd 9
osd.0 up in weight 1 up_from 5523 up_thru 5540 down_at 5519 last_clean_interval [3593,5518) 172.20.107.161:6800/3669 172.20.113.161:6800/3669 172.20.113.161:6801/3669 172.20.107.161:6801/3669 exists,up ebcc7f5a-d77c-4e43-a69e-acbeebe55338
osd.1 up in weight 1 up_from 5509 up_thru 5540 down_at 5507 last_clean_interval [5311,5506) 172.20.106.161:6800/9232 172.20.112.161:6800/9232 172.20.112.161:6801/9232 172.20.106.161:6801/9232 exists,up 93561c85-1ce7-4465-8da9-da8493cb323a
osd.2 up in weight 1 up_from 5497 up_thru 5540 down_at 5495 last_clean_interval [3580,5495) 172.20.108.161:6800/2824 172.20.114.161:6800/2824 172.20.114.161:6801/2824 172.20.108.161:6801/2824 exists,up e5451dc2-d294-416b-9d4a-7ffa453cb985
osd.3 up in weight 1 up_from 5526 up_thru 5540 down_at 5521 last_clean_interval [4859,5520) 172.20.107.162:6800/13703 172.20.113.162:6800/13703 172.20.113.162:6801/13703 172.20.107.162:6801/13703 exists,up b7d1efb1-0dc0-4ab5-bcbd-b5cfc444c4c0
osd.4 up in weight 1 up_from 5513 up_thru 5540 down_at 5511 last_clean_interval [3577,5510) 172.20.106.162:6800/22491 172.20.112.162:6801/22491 172.20.112.162:6802/22491 172.20.106.162:6801/22491 exists,up 9ddb6c30-2a54-4278-88d4-25c4668fc072
osd.5 up in weight 1 up_from 5503 up_thru 5540 down_at 5499 last_clean_interval [4686,5498) 172.20.108.162:6800/32616 172.20.114.162:6800/32616 172.20.114.162:6801/32616 172.20.108.162:6801/32616 exists,up 8dca3ace-0d45-4406-8b39-ecadbd6940b3
osd.6 up in weight 1 up_from 5528 up_thru 5540 down_at 5525 last_clean_interval [5002,5524) 172.20.107.163:6800/13204 172.20.113.163:6801/13204 172.20.113.163:6802/13204 172.20.107.163:6801/13204 exists,up 519f9b0d-71e8-49d1-b271-5473c0871587
osd.7 up in weight 1 up_from 5540 up_thru 5540 down_at 5538 last_clean_interval [5536,5537) 172.20.106.163:6800/15101 172.20.112.163:6800/15101 172.20.112.163:6801/15101 172.20.106.163:6801/15101 exists,up c7a5850d-fccb-482d-ada7-84075ab5968a
osd.8 up in weight 1 up_from 5505 up_thru 5540 down_at 5501 last_clean_interval [3587,5500) 172.20.108.163:6800/8386 172.20.114.163:6800/8386 172.20.114.163:6802/8386 172.20.108.163:6801/8386 exists,up ba4c2135-e859-43ee-a8bb-7f8d9874ed53

pg_temp 19.68 [2,7,8,5,1,0,6,4,1]
pg_temp 19.9a [1,0,1,2,6,8,7,4,3]
pg_temp 19.a8 [2,6,8,3,1,4,7,7,0]
pg_temp 19.c1 [2,0,5,8,1,8,6,7,3]
pg_temp 19.dd [2,1,6,0,5,8,7,7,4]
Updated by Loïc Dachary over 9 years ago
- Status changed from 12 to Need More Info
Increasing choose tries as follows will ask CRUSH to try harder to find OSDs to map to the PG.
rule ECpool7et2 {
    ruleset 8
    type erasure
    min_size 3
    max_size 20
    step set_chooseleaf_tries 5
    step set_choose_tries 200
    step take default
    step choose indep 0 type osd
    step emit
}
Before changing this, the mapping sometimes fails:
$ ./osdmaptool --export-crush /tmp/crush /tmp/map-for-loicd
./osdmaptool: osdmap file '/tmp/map-for-loicd'
./osdmaptool: exported crush map to /tmp/crush
$ ./crushtool -i /tmp/crush --test --show-bad-mappings --rule 8 --num-rep 9 --min-x 1 --max-x 128
bad mapping rule 8 x 43 num_rep 9 result [3,2,7,1,2147483647,8,5,6,0]
bad mapping rule 8 x 79 num_rep 9 result [6,0,2,1,4,7,2147483647,5,8]
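When the tested range is larger, the `bad mapping` lines can be summarized with a short script. This is a sketch under stated assumptions: the regular expression is derived only from the two lines shown above, and `parse_bad_mappings` and `ITEM_NONE` are hypothetical names, not part of crushtool.

```python
import re

# 2147483647 marks a slot that CRUSH could not fill, as in the output above.
ITEM_NONE = 2147483647

# Pattern inferred from the "bad mapping" lines printed by --show-bad-mappings.
BAD_MAPPING = re.compile(
    r"bad mapping rule (\d+) x (\d+) num_rep (\d+) result \[([\d,]*)\]"
)


def parse_bad_mappings(text):
    """Return one dict per bad mapping, with the positions left unfilled."""
    results = []
    for match in BAD_MAPPING.finditer(text):
        rule, x, num_rep = (int(g) for g in match.group(1, 2, 3))
        osds = [int(v) for v in match.group(4).split(",") if v]
        holes = [i for i, osd in enumerate(osds) if osd == ITEM_NONE]
        results.append({"rule": rule, "x": x, "num_rep": num_rep, "holes": holes})
    return results


output = """\
bad mapping rule 8 x 43 num_rep 9 result [3,2,7,1,2147483647,8,5,6,0]
bad mapping rule 8 x 79 num_rep 9 result [6,0,2,1,4,7,2147483647,5,8]
"""

for entry in parse_bad_mappings(output):
    print(entry)
```

For the two lines above, this reports one unfilled position each: index 4 for x 43 and index 6 for x 79.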
After decompiling the CRUSH map, setting choose_tries to 200, and compiling it again, testing the mapping over 1 million values does not show any mapping failure:
$ crushtool -i /tmp/crushfixed --test --show-bad-mappings --rule 8 --num-rep 9 --min-x 1 --max-x $((1024 * 1024))
Updated by Loïc Dachary over 9 years ago
- Status changed from Need More Info to In Progress
ceph-mon-lmb-E-1:~# ceph pg dump | grep 19.dd
dumped all in format plain
19.dd 222 0 0 444 0 924292480 117 117 active+remapped 2014-12-17 11:57:59.926205 5346'3398 5541:6735 [2,1,6,0,5,8,2147483647,7,4] 2 [2,1,6,0,5,8,7,7,4] 2 0'0 2014-12-16 10:20:27.249480 0'0 2014-12-16 10:20:27.249480
ceph-mon-lmb-E-1:~# ceph --version
ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
ceph-mon-lmb-E-1:~# ceph pg 19.dd query
{ "state": "active+remapped", "snap_trimq": "[]", "epoch": 5541, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "actingbackfill": [ "0(3)", "1(1)", "2(0)", "4(8)", "5(4)", "6(2)", "7(6)", "7(7)", "8(5)"], "info": { "pgid": "19.dds0", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3398", "reported_seq": "6735", "reported_epoch": "5541", "state": "active+remapped", "last_fresh": "2014-12-17 11:57:59.926977", "last_change": "2014-12-17 11:57:59.926205", "last_active": "2014-12-17 11:57:59.926977", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-17 11:57:59.926977", "last_undegraded": "2014-12-17 11:57:59.926977", "last_fullsized": "2014-12-17 11:57:59.926977", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5541, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp":
"2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 117, "ondisk_log_size": 117, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1998, "num_objects_missing_on_primary": 0, "num_objects_degraded": 0, "num_objects_misplaced": 444, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, "peer_info": [ { "peer": "0(3)", "pgid": "19.dds3", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 7728, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": 
"2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "1(1)", "pgid": "19.dds1", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, 
"same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3398", "reported_seq": "6401", "reported_epoch": "5496", "state": "active+undersized+degraded+remapped", "last_fresh": "2014-12-16 21:01:03.365617", "last_change": "2014-12-16 21:01:03.363852", "last_active": "2014-12-16 21:01:03.365617", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 21:01:03.365617", "last_undegraded": "2014-12-16 21:01:02.654014", "last_fullsized": "2014-12-16 21:01:02.654014", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5496, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 117, "ondisk_log_size": 117, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1998, "num_objects_missing_on_primary": 0, "num_objects_degraded": 0, "num_objects_misplaced": 666, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": 
"0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "4(8)", "pgid": "19.dds8", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, 
"num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "5(4)", "pgid": "19.dds4", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", 
"last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "6(2)", "pgid": "19.dds2", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", 
"last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "7(6)", "pgid": "19.dds6", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 7728, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, 
"last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", 
"current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "7(7)", "pgid": "19.dds7", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 6870, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, "ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, 
"num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "8(5)", "pgid": "19.dds5", "last_update": "5346'3398", "last_complete": "5346'3398", "log_tail": "5346'3281", "last_user_version": 7728, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 5320, "last_epoch_started": 5541, "last_epoch_clean": 5541, "last_epoch_split": 0, "same_up_since": 5540, "same_interval_since": 5540, "same_primary_since": 5497, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "5346'3397", "reported_seq": "6392", "reported_epoch": "5346", "state": "active+remapped", "last_fresh": "2014-12-16 17:58:41.455840", "last_change": "2014-12-16 17:44:21.999907", "last_active": "2014-12-16 17:58:41.455840", "last_clean": "2014-12-16 17:43:39.416657", "last_became_active": "0.000000", "last_unstale": "2014-12-16 17:58:41.455840", "last_undegraded": "2014-12-16 17:58:41.455840", "last_fullsized": "2014-12-16 17:58:41.455840", "mapping_epoch": 5538, "log_start": "5346'3281", "ondisk_log_start": "5346'3281", "created": 5320, "last_epoch_clean": 5342, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-12-16 10:20:27.249480", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-12-16 10:20:27.249480", "last_clean_scrub_stamp": "0.000000", "log_size": 116, 
"ondisk_log_size": 116, "stats_invalid": "1", "stat_sum": { "num_bytes": 924292480, "num_objects": 222, "num_object_clones": 0, "num_object_copies": 1989, "num_objects_missing_on_primary": 0, "num_objects_degraded": 952, "num_objects_misplaced": 442, "num_objects_unfound": 0, "num_objects_dirty": 222, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 222, "num_write_kb": 902629, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 2, 1, 6, 0, 5, 8, 2147483647, 7, 4], "acting": [ 2, 1, 6, 0, 5, 8, 7, 7, 4], "blocked_by": [], "up_primary": 2, "acting_primary": 2}, "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 5541, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}], "recovery_state": [ { "name": "Started\/Primary\/Active", "enter_time": "2014-12-17 11:57:59.853375", "might_have_unfound": [], "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "0\/\/0\/\/-1", "backfill_info": { "begin": "0\/\/0\/\/-1", "end": "0\/\/0\/\/-1", "objects": []}, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "recovery_ops": [], "read_ops": []}}, "scrub": { "scrubber.epoch_start": "0", "scrubber.active": 0, "scrubber.block_writes": 0, "scrubber.waiting_on": 0, "scrubber.waiting_on_whom": []}}, { "name": "Started", "enter_time": "2014-12-17 11:57:58.771996"}],
Updated by Loïc Dachary over 9 years ago
- Status changed from In Progress to Rejected
<sjusthm> loicd: an appropriately odd crush map could result in an acting set with a repeated OSD, but only for an EC pool
<sjusthm> crush failed to fill in the acting set spot
<sjusthm> and the primary found an acceptable replacement
<sjusthm> which happened to be on the same osd as another shard
<sjusthm> so the new primary happened to find an existing shard which would fit, and requested a pg temp
<sjusthm> which happened to have a duplicate
<sjusthm> it's not really a duplicate, it just means that the osd actually has two copies of the pg
<sjusthm> one for each shard
<sjusthm> remember: with an EC pool, the different positions in the acting set actually have different data
<sjusthm> loicd: that's why most of the OSD internals deal with PG's as pg_shard_t (pair<pg_t, shard_id_t>) so that an OSD can have two copies of the same pg and have it work
<sjusthm> loicd: it's also why ghobjects exist: operations on the filestore need the shard_id_t to avoid different pg shards on the same osd clobbering each other's objects
<sjusthm> by contrast, any replicated pg has the same value for the shard: NO_SHARD
<sjusthm> so regardless of the acting set position, it maps to the same pg
<sjusthm> shard
<sjusthm> and there can only be one
<sjusthm> imagine an EC pg with acting set [0,1,2]
<sjusthm> which then changes to [2,1,0]
<sjusthm> osd 2 must create a pg temp with [0,1,2] until the new shard on 2 is backfilled
<sjusthm> then the temp becomes [2,1,2]
<sjusthm> and then [2,1,0]
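The distinction drawn in this discussion can be illustrated with a toy model. This is a sketch in Python, not Ceph code: `pg_instances` is a hypothetical helper, and the `NO_SHARD` value used here is only a placeholder for the single shard value that all replicated PGs share. It assumes that in an EC pool the shard of an entry is its position in the acting set, which matches the `7(6)` and `7(7)` entries in the `actingbackfill` list of the query output.

```python
# Placeholder for the one shard value every replicated PG uses (assumption).
NO_SHARD = -1


def pg_instances(acting, erasure_coded):
    """Return the set of (osd, shard) pairs implied by an acting set.

    In an EC pool each position holds different data, so the shard is the
    position. In a replicated pool every position maps to the same shard.
    """
    return {
        (osd, position if erasure_coded else NO_SHARD)
        for position, osd in enumerate(acting)
    }


acting = [2, 1, 6, 0, 5, 8, 7, 7, 4]  # acting set of PG 19.dd

ec = pg_instances(acting, erasure_coded=True)
replicated = pg_instances(acting, erasure_coded=False)

print(len(ec))          # 9
print(len(replicated))  # 8
print(sorted(pair for pair in ec if pair[0] == 7))  # [(7, 6), (7, 7)]
```

With shards taken into account, the EC acting set still describes nine distinct PG instances, two of which live on OSD 7. The same list in a replicated pool would collapse to eight, which is why a repeated OSD only makes sense for an EC pool.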