Bug #51254
deep-scrub stat mismatch on last PG in pool
Status: open
Description
In the past few weeks, we have hit inconsistent PGs in deep-scrub a few times, always on the very last PG in the pool:
[root@popeye-mon-0-07 ~]# ceph --version
ceph version 14.2.20 (36274af6eb7f2a5055f2d53ad448f2694e9046a0) nautilus (stable)
[root@popeye-mon-0-07 ~]# ceph pg ls inconsistent
PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS* LOG STATE SINCE VERSION REPORTED UP ACTING SCRUB_STAMP DEEP_SCRUB_STAMP
1.1fff 66209 0 0 0 108207801181 0 0 3049 active+clean+inconsistent 20h 307780'5182385 307780:6211466 [780,1444,273]p780 [780,1444,273]p780 2021-06-15 15:57:11.670864 2021-06-13 19:42:47.051981
Pool 1 has 8192 PGs, so 1fff is exactly the last PG. This has been the case in every instance I've seen.
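The arithmetic behind "1fff is the last PG" is simple enough to verify directly (a throwaway sketch using the pg_num from this report, not anything from the Ceph code base):

```python
# Pool 1 has 8192 placement groups (from `ceph osd pool ls detail`).
pg_num = 8192
# PGs are numbered 0 .. pg_num - 1, and Ceph prints PG ids as <pool>.<hex>.
last_pg = pg_num - 1
pgid = f"1.{last_pg:x}"
print(pgid)  # -> 1.1fff
```

So any bug that singles out PG id pg_num - 1 would show up as 1.1fff on this pool.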
Repeating the deep-scrub also shows the same error:
[root@popeye-mon-0-07 ~]# ceph pg deep-scrub 1.1fff
2021-06-16 12:47:23.008 7fffd2d89700 0 log_channel(cluster) log [DBG] : 1.1fff deep-scrub starts
2021-06-16 13:17:27.430 7fffd2d89700 -1 log_channel(cluster) log [ERR] : 1.1fff deep-scrub : stat mismatch, got 66234/66232 objects, 16496/16495 clones, 66234/66232 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 16483/16482 whiteouts, 108311671045/108307476741 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes.
2021-06-16 13:17:27.430 7fffd2d89700 -1 log_channel(cluster) log [ERR] : 1.1fff deep-scrub 1 errors
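The scrub error line packs ten counter pairs into one message, each formatted as scrubbed/stored. To see at a glance which counters actually disagree, the line can be pulled apart with a small script (a disposable parsing sketch written for this log format, not part of Ceph's tooling):

```python
import re

# The deep-scrub error line from above, verbatim minus the log prefix.
log_line = ("1.1fff deep-scrub : stat mismatch, got 66234/66232 objects, "
            "16496/16495 clones, 66234/66232 dirty, 0/0 omap, 0/0 pinned, "
            "0/0 hit_set_archive, 16483/16482 whiteouts, "
            "108311671045/108307476741 bytes, 0/0 manifest objects, "
            "0/0 hit_set_archive bytes.")

# Each field is "<scrubbed>/<stored> <counter name>"; report only the mismatches.
for got, stored, name in re.findall(r"(\d+)/(\d+) ([a-z_ ]+?)(?:,|\.)", log_line):
    if got != stored:
        print(f"{name}: scrubbed={got} stored={stored} "
              f"delta={int(got) - int(stored)}")
```

For this PG the mismatched counters are objects, clones, dirty, whiteouts, and bytes; notably the byte delta is exactly 4194304 (4 MiB), consistent with one unaccounted object plus its clone.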
Repairing the PG works correctly:
[root@popeye-mon-0-07 ~]# ceph pg repair 1.1fff
2021-06-16 13:59:14.404 7fffd2d89700 0 log_channel(cluster) log [DBG] : 1.1fff repair starts
2021-06-16 14:30:39.399 7fffd2d89700 -1 log_channel(cluster) log [ERR] : 1.1fff repair : stat mismatch, got 66247/66245 objects, 16496/16495 clones, 66247/66245 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 16483/16482 whiteouts, 108365838605/108361644301 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes.
2021-06-16 14:30:39.399 7fffd2d89700 -1 log_channel(cluster) log [ERR] : 1.1fff repair 1 errors, 1 fixed
We have had 4 recorded instances of this so far - always the last PG in the pool.
Updated by Neha Ojha almost 3 years ago
- Category changed from Scrub/Repair to Tiering
It seems like you are using cache tiering, and similar bugs have been reported before. I don't understand why it would only affect the last PG in the pool, though.
Updated by Andras Pataki almost 3 years ago
We definitely do not use cache tiering on any of our clusters. On the cluster above, we do use snapshots (via CephFS): we create one daily snapshot and remove the oldest, so there are rolling snaptrims on PGs. I have also seen this on our other large cluster, which doesn't use snapshots. Each OSD is on a spinning disk with its db/wal on an NVMe partition.