Bug #39249: Some PGs stuck in active+remapped state
Status: Closed
Description
Sometimes my PGs get stuck in this state. When I stop the primary OSD containing such a PG, it becomes `active+undersized+degraded` and does not get remapped even when I start that OSD again.
How do I debug that? I have plenty of space on the other OSDs. Restarting all OSDs does not help.
```
$ ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-1 14.15028 - 14 TiB 5.6 TiB 8.1 TiB 40.80 1.00 - root default
-2 3.19478 - 3.2 TiB 1.3 TiB 1.9 TiB 41.32 1.01 - host node1
6 blue_ssd 0.45599 1.00000 467 GiB 202 GiB 265 GiB 43.26 1.06 256 osd.6
1 prod 1.82419 0.93387 1.8 TiB 764 GiB 1.1 TiB 40.90 1.00 223 osd.1
2 prod 0.91460 0.79158 937 GiB 386 GiB 551 GiB 41.19 1.01 107 osd.2
-3 2.28519 - 2.3 TiB 1000 GiB 1.3 TiB 42.72 1.05 - host node2
0 blue_ssd 0.45599 1.00000 467 GiB 202 GiB 265 GiB 43.28 1.06 256 osd.0
3 prod 0.91460 0.83400 937 GiB 396 GiB 541 GiB 42.29 1.04 104 osd.3
4 prod 0.91460 0.72214 937 GiB 402 GiB 535 GiB 42.88 1.05 119 osd.4
-4 2.28996 - 1.8 TiB 826 GiB 1.0 TiB 44.05 1.08 - host node3
7 blue_ssd 0.45599 1.00000 467 GiB 202 GiB 265 GiB 43.26 1.06 256 osd.7
11 prod 0.45969 0 0 B 0 B 0 B 0 0 0 osd.11
13 prod 0.45969 0.84837 471 GiB 216 GiB 255 GiB 45.86 1.12 57 osd.13
14 prod 0.91460 0.65007 937 GiB 408 GiB 529 GiB 43.53 1.07 97 osd.14
-9 3.63689 - 3.6 TiB 1.4 TiB 2.3 TiB 37.66 0.92 - host node4
5 prod 0.90919 1.00000 931 GiB 350 GiB 581 GiB 37.58 0.92 97 osd.5
9 prod 1.81850 1.00000 1.8 TiB 745 GiB 1.1 TiB 40.00 0.98 207 osd.9
10 prod 0.90919 1.00000 931 GiB 308 GiB 623 GiB 33.04 0.81 92 osd.10
-16 2.74347 - 2.7 TiB 1.1 TiB 1.6 TiB 40.57 0.99 - host node5
8 prod 0.91449 0.94768 936 GiB 387 GiB 549 GiB 41.36 1.01 120 osd.8
12 prod 0.91449 0.84109 936 GiB 377 GiB 559 GiB 40.28 0.99 91 osd.12
16 prod 0.91449 0.70984 936 GiB 375 GiB 561 GiB 40.07 0.98 93 osd.16
TOTAL 14 TiB 5.6 TiB 8.6 TiB 40.80
```
So, my question is: how do I debug such cases? My crushmap does not contain anything special (like upmaps), except for the two device classes defined (prod and blue_ssd).
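For cases like this, a few standard Ceph CLI commands usually narrow things down. This is a generic sketch; the pool and object names are placeholders:

```shell
# List PGs that are not active+clean, with their up/acting sets
ceph pg dump_stuck unclean

# Full peering state of one suspect PG
# (look at "up", "acting" and "blocked_by")
ceph pg 19.21 query

# Ask CRUSH where it would map an object of a given pool right now
ceph osd map <poolname> <objectname>

# Check which CRUSH tunables profile is active
ceph osd crush show-tunables
```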
Updated by Марк Коренберг about 5 years ago
```
$ ceph pg dump | egrep 'PG|unders'
dumped all
PG_STAT OBJECTS MISSING_ON_PRIMARY DEGRADED MISPLACED UNFOUND BYTES LOG DISK_LOG STATE STATE_STAMP VERSION REPORTED UP UP_PRIMARY ACTING ACTING_PRIMARY LAST_SCRUB SCRUB_STAMP LAST_DEEP_SCRUB DEEP_SCRUB_STAMP SNAPTRIMQ_LEN
19.21 676 0 676 0 0 2530544738 3029 3029 active+undersized+degraded 2019-04-11 17:16:02.726121 63928'17602712 63928:17911658 [5,12] 5 [5,12] 5 63838'17595232 2019-04-11 01:04:54.802666 63702'17549735 2019-04-06 01:04:51.953306 0
OSD_STAT USED AVAIL TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM
```
Updated by Марк Коренберг about 5 years ago
```
$ ceph -f json pg dump | jq '.pg_stats[] | select(.state | contains("unders"))'
dumped all
{
  "pgid": "19.21",
  "version": "63928'17602797",
  "reported_seq": "17911743",
  "reported_epoch": "63928",
  "state": "active+undersized+degraded",
  "last_fresh": "2019-04-11 17:34:03.450789",
  "last_change": "2019-04-11 17:16:02.726121",
  "last_active": "2019-04-11 17:34:03.450789",
  "last_peered": "2019-04-11 17:34:03.450789",
  "last_clean": "2019-04-11 17:15:08.576010",
  "last_became_active": "2019-04-11 17:16:02.726121",
  "last_became_peered": "2019-04-11 17:16:02.726121",
  "last_unstale": "2019-04-11 17:34:03.450789",
  "last_undegraded": "2019-04-11 17:16:02.724250",
  "last_fullsized": "2019-04-11 17:16:02.724138",
  "mapping_epoch": 63926,
  "log_start": "63838'17599783",
  "ondisk_log_start": "63838'17599783",
  "created": 1173,
  "last_epoch_clean": 63841,
  "parent": "0.0",
  "parent_split_bits": 0,
  "last_scrub": "63838'17595232",
  "last_scrub_stamp": "2019-04-11 01:04:54.802666",
  "last_deep_scrub": "63702'17549735",
  "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306",
  "last_clean_scrub_stamp": "2019-04-11 01:04:54.802666",
  "log_size": 3014,
  "ondisk_log_size": 3014,
  "stats_invalid": false,
  "dirty_stats_invalid": false,
  "omap_stats_invalid": false,
  "hitset_stats_invalid": false,
  "hitset_bytes_stats_invalid": false,
  "pin_stats_invalid": false,
  "manifest_stats_invalid": true,
  "snaptrimq_len": 0,
  "stat_sum": { ... },
  "up": [ 5, 12 ],
  "acting": [ 5, 12 ],
  "blocked_by": [],
  "up_primary": 5,
  "acting_primary": 5,
  "purged_snaps": []
}
```
Updated by Марк Коренберг about 5 years ago
OSD.11 previously took part in this PG; I don't know whether as primary or not. The bug appeared after I ran `ceph osd out osd.11`.
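Whether marking osd.11 out can leave CRUSH unable to find a full set of replicas can be checked offline with `crushtool`. A sketch, assuming the pool uses CRUSH rule 0 with size 3 (substitute the actual rule id and replica count):

```shell
# Extract the compiled CRUSH map from the cluster
ceph osd getcrushmap -o crushmap.bin

# Simulate placements for rule 0 with 3 replicas and report inputs
# for which CRUSH returned fewer OSDs than requested
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings
```

If bad mappings show up here, CRUSH itself cannot satisfy the rule with the current weights, which matches the undersized/remapped symptom.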
Updated by Марк Коренберг about 5 years ago
```
$ ceph pg 19.21 query
{
"state": "active+clean+remapped", "snap_trimq": "[]", "snap_trimq_len": 0, "epoch": 65828,
"up": [ 5, 16 ],
"acting": [ 5, 16, 12 ],
"acting_recovery_backfill": [ "5", "12", "16" ],
"info": {
  "pgid": "19.21", "last_update": "65828'17613229", "last_complete": "65828'17613229", "log_tail": "65765'17610184", "last_user_version": 17613229, "last_backfill": "MAX", "last_backfill_bitwise": 1, "purged_snaps": [],
  "history": { "epoch_created": 1173, "epoch_pool_created": 1173, "last_epoch_started": 65728, "last_interval_started": 65727, "last_epoch_clean": 65728, "last_interval_clean": 65727, "last_epoch_split": 0, "last_epoch_marked_full": 0, "same_up_since": 65726, "same_interval_since": 65727, "same_primary_since": 65705, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815" },
  "stats": { "version": "65828'17613229", "reported_seq": "17930530", "reported_epoch": "65828", "state": "active+clean+remapped", "last_fresh": "2019-04-12 18:04:12.211201", "last_change": "2019-04-12 14:23:34.812265", "last_active": "2019-04-12 18:04:12.211201", "last_peered": "2019-04-12 18:04:12.211201", "last_clean": "2019-04-12 18:04:12.211201", "last_became_active": "2019-04-12 12:57:16.825414", "last_became_peered": "2019-04-12 12:57:16.825414", "last_unstale": "2019-04-12 18:04:12.211201", "last_undegraded": "2019-04-12 18:04:12.211201", "last_fullsized": "2019-04-12 18:04:12.211201", "mapping_epoch": 65727, "log_start": "65765'17610184", "ondisk_log_start": "65765'17610184", "created": 1173, "last_epoch_clean": 65728, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815", "log_size": 3045, "ondisk_log_size": 3045, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": true, "snaptrimq_len": 0,
    "stat_sum": { "num_bytes": 2525160546, "num_objects": 653, "num_object_clones": 16, "num_object_copies": 1959, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 653, "num_objects_unfound": 0, "num_objects_dirty": 653, "num_whiteouts": 0, "num_read": 4669894, "num_read_kb": 184550843, "num_write": 17585691, "num_write_kb": 1432285508, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 62562, "num_bytes_recovered": 248121752746, "num_keys_recovered": 14, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0 },
    "up": [ 5, 16 ], "acting": [ 5, 16, 12 ], "blocked_by": [], "up_primary": 5, "acting_primary": 5, "purged_snaps": [] },
  "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 65728, "hit_set_history": { "current_last_update": "0'0", "history": [] } },
"peer_info": [
  { "peer": "12", "pgid": "19.21", "last_update": "65828'17613229", "last_complete": "65828'17613229", "log_tail": "65342'17606783", "last_user_version": 17609878, "last_backfill": "MAX", "last_backfill_bitwise": 1, "purged_snaps": [],
    "history": { "epoch_created": 1173, "epoch_pool_created": 1173, "last_epoch_started": 65728, "last_interval_started": 65727, "last_epoch_clean": 65728, "last_interval_clean": 65727, "last_epoch_split": 0, "last_epoch_marked_full": 0, "same_up_since": 65726, "same_interval_since": 65727, "same_primary_since": 65705, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815" },
    "stats": { "version": "65712'17609876", "reported_seq": "17923240", "reported_epoch": "65712", "state": "active+clean+remapped", "last_fresh": "2019-04-12 12:39:04.048144", "last_change": "2019-04-12 12:27:14.334773", "last_active": "2019-04-12 12:39:04.048144", "last_peered": "2019-04-12 12:39:04.048144", "last_clean": "2019-04-12 12:39:04.048144", "last_became_active": "2019-04-12 12:27:14.334462", "last_became_peered": "2019-04-12 12:27:14.334462", "last_unstale": "2019-04-12 12:39:04.048144", "last_undegraded": "2019-04-12 12:39:04.048144", "last_fullsized": "2019-04-12 12:39:04.048144", "mapping_epoch": 65727, "log_start": "65342'17606783", "ondisk_log_start": "65342'17606783", "created": 1173, "last_epoch_clean": 65706, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815", "log_size": 3093, "ondisk_log_size": 3093, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": true, "snaptrimq_len": 0,
      "stat_sum": { "num_bytes": 2537348194, "num_objects": 679, "num_object_clones": 42, "num_object_copies": 2037, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 679, "num_objects_unfound": 0, "num_objects_dirty": 679, "num_whiteouts": 0, "num_read": 4665795, "num_read_kb": 184471580, "num_write": 17582383, "num_write_kb": 1431930472, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 62562, "num_bytes_recovered": 248121752746, "num_keys_recovered": 14, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0 },
      "up": [ 5, 16 ], "acting": [ 5, 16, 12 ], "blocked_by": [], "up_primary": 5, "acting_primary": 5, "purged_snaps": [] },
    "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 65728, "hit_set_history": { "current_last_update": "0'0", "history": [] } },
  { "peer": "16", "pgid": "19.21", "last_update": "65828'17613229", "last_complete": "65828'17613229", "log_tail": "65342'17606783", "last_user_version": 17609878, "last_backfill": "MAX", "last_backfill_bitwise": 1, "purged_snaps": [],
    "history": { "epoch_created": 1173, "epoch_pool_created": 1173, "last_epoch_started": 65728, "last_interval_started": 65727, "last_epoch_clean": 65728, "last_interval_clean": 65727, "last_epoch_split": 0, "last_epoch_marked_full": 0, "same_up_since": 65726, "same_interval_since": 65727, "same_primary_since": 65705, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815" },
    "stats": { "version": "65712'17609876", "reported_seq": "17923240", "reported_epoch": "65712", "state": "active+clean+remapped", "last_fresh": "2019-04-12 12:39:04.048144", "last_change": "2019-04-12 12:27:14.334773", "last_active": "2019-04-12 12:39:04.048144", "last_peered": "2019-04-12 12:39:04.048144", "last_clean": "2019-04-12 12:39:04.048144", "last_became_active": "2019-04-12 12:27:14.334462", "last_became_peered": "2019-04-12 12:27:14.334462", "last_unstale": "2019-04-12 12:39:04.048144", "last_undegraded": "2019-04-12 12:39:04.048144", "last_fullsized": "2019-04-12 12:39:04.048144", "mapping_epoch": 65727, "log_start": "65342'17606783", "ondisk_log_start": "65342'17606783", "created": 1173, "last_epoch_clean": 65706, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "65342'17606121", "last_scrub_stamp": "2019-04-12 02:18:09.539815", "last_deep_scrub": "63702'17549735", "last_deep_scrub_stamp": "2019-04-06 01:04:51.953306", "last_clean_scrub_stamp": "2019-04-12 02:18:09.539815", "log_size": 3093, "ondisk_log_size": 3093, "stats_invalid": false, "dirty_stats_invalid": false, "omap_stats_invalid": false, "hitset_stats_invalid": false, "hitset_bytes_stats_invalid": false, "pin_stats_invalid": false, "manifest_stats_invalid": true, "snaptrimq_len": 0,
      "stat_sum": { "num_bytes": 2537348194, "num_objects": 679, "num_object_clones": 42, "num_object_copies": 2037, "num_objects_missing_on_primary": 0, "num_objects_missing": 0, "num_objects_degraded": 0, "num_objects_misplaced": 679, "num_objects_unfound": 0, "num_objects_dirty": 679, "num_whiteouts": 0, "num_read": 4665795, "num_read_kb": 184471580, "num_write": 17582383, "num_write_kb": 1431930472, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 62562, "num_bytes_recovered": 248121752746, "num_keys_recovered": 14, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0, "num_flush": 0, "num_flush_kb": 0, "num_evict": 0, "num_evict_kb": 0, "num_promote": 0, "num_flush_mode_high": 0, "num_flush_mode_low": 0, "num_evict_mode_some": 0, "num_evict_mode_full": 0, "num_objects_pinned": 0, "num_legacy_snapsets": 0, "num_large_omap_objects": 0, "num_objects_manifest": 0 },
      "up": [ 5, 16 ], "acting": [ 5, 16, 12 ], "blocked_by": [], "up_primary": 5, "acting_primary": 5, "purged_snaps": [] },
    "empty": 0, "dne": 0, "incomplete": 0, "last_epoch_started": 65728, "hit_set_history": { "current_last_update": "0'0", "history": [] } } ],
"recovery_state": [
  { "name": "Started/Primary/Active", "enter_time": "2019-04-12 12:57:16.816173", "might_have_unfound": [],
    "recovery_progress": { "backfill_targets": [], "waiting_on_backfill": [], "last_backfill_started": "MIN", "backfill_info": { "begin": "MIN", "end": "MIN", "objects": [] }, "peer_backfill_info": [], "backfills_in_flight": [], "recovering": [], "pg_backend": { "pull_from_peer": [], "pushing": [] } },
    "scrub": { "scrubber.epoch_start": "65107", "scrubber.active": false, "scrubber.state": "INACTIVE", "scrubber.start": "MIN", "scrubber.end": "MIN", "scrubber.max_end": "MIN", "scrubber.subset_last_update": "0'0", "scrubber.deep": false, "scrubber.waiting_on_whom": [] } },
  { "name": "Started", "enter_time": "2019-04-12 12:57:15.795368" } ],
"agent_state": {}
}
```
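Note that in the query output `up` is [5,16] while `acting` is [5,16,12]: CRUSH only finds two OSDs for the `up` set, so the PG keeps osd.12 in the acting set to preserve three copies, which is exactly the `active+clean+remapped` state. All PGs in this situation can be listed in one pass (a sketch; on some Ceph versions the stats live under `.pg_map.pg_stats` rather than `.pg_stats`):

```shell
# PGs whose CRUSH mapping (up) differs from the set actually serving I/O (acting)
ceph -f json pg dump | \
  jq -r '.pg_stats[] | select(.up != .acting) | "\(.pgid) up=\(.up) acting=\(.acting)"'
```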
Updated by Nathan Cutler about 5 years ago
@Mark: Which version of Mimic are you running?
Updated by Jake Grimmett about 5 years ago
We have a Mimic 13.2.5 cluster with a similar looking problem:
After replacing a failing OSD, the cluster mostly healed, then got stuck at 0.006% misplaced:
```
osd: 454 osds: 454 up, 454 in; 8 remapped pgs
pgs: 378896/6223874662 objects misplaced (0.006%)
     8200 active+clean
     16   active+clean+scrubbing+deep
     8    active+clean+remapped
```
```
[root@ceph1 ~]# ceph health detail
HEALTH_WARN 378895/6223876069 objects misplaced (0.006%)
OBJECT_MISPLACED 378895/6223876069 objects misplaced (0.006%)
```
If relevant, the OSD was replaced by:
```
ceph osd out 193
```
...waiting until `ceph osd safe-to-destroy 193` came back positive, physically replacing the drive, then:
```
ceph osd crush remove osd.193
```
(Perhaps `ceph osd purge` should have been used?)

`ceph-volume lvm create --osd-id 193 --data /dev/sds` failed with `RuntimeError: The osd ID 193 is already in use or does not exist.`, so the new drive was added using:
```
ceph-volume lvm create --bluestore --data /dev/sds
```
This added the drive as osd.0 (osd.0 had been removed some time ago). `ceph osd tree | grep 193` gives no reply.
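For reference, the replacement flow that keeps the OSD id (so that `--osd-id 193` works) uses `ceph osd destroy` rather than `ceph osd crush remove`. A sketch of that documented procedure, not a record of what was run above:

```shell
ceph osd out 193
# wait until the OSD can be removed without reducing data durability
while ! ceph osd safe-to-destroy 193; do sleep 60; done
# mark the OSD destroyed but keep its id and CRUSH position reusable
ceph osd destroy 193 --yes-i-really-mean-it
# recreate the OSD on the new drive under the same id
ceph-volume lvm create --osd-id 193 --data /dev/sds
```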
Updated by Марк Коренберг about 5 years ago
Exactly the same here. To heal it, I changed all my reweights back to 1, and that helped. But I still don't understand how to debug this; I need to understand why it happens.
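Resetting the override reweights (the REWEIGHT column in `ceph osd df tree`) back to 1 can be done per OSD; the id list below is illustrative, taken from the reweighted OSDs in the table above:

```shell
# restore the default override reweight of 1.0 on each previously
# reweighted OSD; CRUSH then recomputes the placements
for id in 1 2 3 4 8 12 13 14 16; do
    ceph osd reweight "$id" 1.0
done
```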
Updated by Jake Grimmett about 5 years ago
I've not tried changing reweights to 1, though last week I ran `ceph osd reweight-by-utilization 110`.
Cluster is still showing:
```
[root@ceph1 ~]# ceph health
HEALTH_WARN 382063/6296785540 objects misplaced (0.006%)
```
Happy to pull any debug info or logs for the dev team...
:)
Updated by Sage Weil about 5 years ago
- Status changed from New to Closed
This looks like CRUSH's fault. Can you check which tunables you are running? (`ceph osd crush show-tunables`)
Using newer tunables may help.
I think the better solution is to move away from the old reweight-by-utilization and `osd reweight` values. Instead, use the balancer in crush-compat mode. The balancer will even do a smooth/gradual transition away from the old reweights. See http://docs.ceph.com/docs/master/rados/operations/balancer/?highlight=balancer
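Per the linked documentation, a minimal sequence to switch to the balancer in crush-compat mode looks like this (a sketch; command availability as of Mimic):

```shell
ceph mgr module enable balancer   # enable the mgr balancer module
ceph balancer mode crush-compat   # optimize via CRUSH compat weight-set, not reweights
ceph balancer on                  # start automatic optimization
ceph balancer status              # verify the mode and whether a plan is active
```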