Bug #44286
Cache tiering shows unfound objects after OSD reboots
Status: Open
Description
We've got a cluster with a 3/2 size/min_size replicated cache pool in front of an erasure coded pool used for RBD.
Restarting OSDs sometimes results in unfound objects, example:
2/543658058 objects unfound (0.000%)
    pg 19.12 has 1 unfound objects
    pg 19.2d has 1 unfound objects
Possible data damage: 2 pgs recovery_unfound
    pg 19.12 is active+recovery_unfound+undersized+degraded+remapped, acting [299,310], 1 unfound
    pg 19.2d is active+recovery_unfound+undersized+degraded+remapped, acting [290,309], 1 unfound

# ceph pg 19.12 list_unfound
{
    "num_missing": 1,
    "num_unfound": 1,
    "objects": [
        {
            "oid": {
                "oid": "hit_set_19.12_archive_2020-02-25 13:43:50.256316Z_2020-02-25 13:43:50.325825Z",
                "key": "",
                "snapid": -2,
                "hash": 18,
                "max": 0,
                "pool": 19,
                "namespace": ".ceph-internal"
            },
            "need": "3312398'55868341",
            "have": "0'0",
            "flags": "none",
            "locations": []
        }
    ],
    "more": false
}
Both PGs affected here share an OSD (the one that's offline).
The cache tiering agent is busy flushing with around 300-500 MB/s while this happens.
The unfound objects stay unfound even after all OSDs are back online. The affected PGs never drop below 2 online OSDs.
Restarting the OSDs does not change the state, so it's not an instance of https://tracker.ceph.com/issues/37439
Ceph version 14.2.6 (restarting to upgrade to 14.2.7). Also seen on 14.2.4 a few months ago.
Attached is a pg query on a PG in that state (from an earlier instance of this issue, also 14.2.6)
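For anyone hitting the same state, this is roughly the inspection sequence used above, as a sketch; the pg id 19.12 is taken from the output above, and "cachepool" is a placeholder for the cache tier pool name:

```shell
# Overall health, including the list of PGs with unfound objects
ceph health detail

# Per-PG detail: which objects are unfound, and the full peering state
ceph pg 19.12 list_unfound
ceph pg 19.12 query > pg-19.12.json

# hit_set archive objects live in the .ceph-internal namespace of the
# cache pool; listing them shows which archives actually exist.
rados -p cachepool --namespace=.ceph-internal ls | grep hit_set
```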
Updated by Paul Emmerich almost 4 years ago

This occasionally comes up on the mailing list as well. It's not reproducible on my test setup, though :(
Updated by Jan-Philipp Litza about 3 years ago
We even hit that bug twice today by rebooting two of our cache servers.
What's interesting is that only hit_set objects ever went missing. What's even more peculiar is that the timestamps in their object IDs are from the downtime of the host, but they are only reported unfound after the host rejoined the cluster.
So either the objects were never created in the first place (but Ceph somehow assumes that they must exist), or they were created on another host but then somehow got lost during recovery. But since the cache pool has a size of 2, the latter seems highly implausible.
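One way to distinguish the two cases is to look at each replica's object store directly with ceph-objectstore-tool (the OSD must be stopped first). A sketch, with the OSD id 310 and the default data path as placeholders; if the hit_set archive object never shows up on any replica, it was most likely never written:

```shell
# Stop the OSD before touching its store (unit name may differ per deployment)
systemctl stop ceph-osd@310

# List every object this replica holds for the PG and look for the
# hit_set archive objects reported as unfound.
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-310 \
    --pgid 19.12 --op list | grep hit_set

systemctl start ceph-osd@310
```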
BTW, this happened on version 14.2.16, and after understanding the situation we simply marked the objects lost without any apparent adverse consequences.
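For reference, marking the objects lost is done per PG. Since "have" is "0'0" for these hit_set archives (no older version of the object exists), "delete" is the applicable mode rather than "revert"; a sketch using the pg ids from the description:

```shell
# Give up on the unfound objects in a PG. "revert" rolls back to the
# last known version; "delete" forgets the object entirely. With
# have == 0'0 there is nothing to revert to, so delete is used here.
ceph pg 19.12 mark_unfound_lost delete
ceph pg 19.2d mark_unfound_lost delete
```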
Updated by Pawel Stefanski over 2 years ago
Jan-Philipp Litza wrote:
> We even hit that bug twice today by rebooting two of our cache servers.
> What's interesting is that only hit_set objects ever went missing. What's even more peculiar is the timestamps in their object IDs are from the downtime of the host, but they are only reported unfound after the host rejoined the cluster.
> So either the objects were never created in the first place (but Ceph somehow assumes that they must exist), or they are created on another host but then somehow get lost during recovery. But since the cache pool has a size of 2, the latter seems highly implausible.
> BTW, this happened on version 14.2.16, and after understanding the situation we simply marked the objects lost without any apparent adverse consequences.
I can confirm at 14.2.22 it still occurs.
Updated by Jan-Philipp Litza over 2 years ago
Update: Also happens with 16.2.5 :-(
Updated by Marek Czardybon over 2 years ago
The problem still exists on 15.2.15.
I've also got replicated size 3, min_size 2.
The problem occurs even when only one OSD is restarted.