Backport #18183

Updated by Loïc Dachary over 7 years ago

From time to time we experience deep-scrub errors in pgs inside our cephfs metadata pool. We do not see any hardware errors on the osds themselves, so the errors must have some other source. We also never saw this with Infernalis.

The error itself is the following:

# cat /var/log/ceph/ceph.log| grep 5.3d0
2016-08-30 00:30:53.492626 osd.78 192.231.127.171:6828/6072 331 : cluster [INF] 5.3d0 deep-scrub starts
2016-08-30 00:30:54.276134 osd.78 192.231.127.171:6828/6072 332 : cluster [ERR] 5.3d0 shard 78: soid 5:0bd6d154:::602.00000000:head omap_digest 0xf3fdfd0c != best guess omap_digest 0x23b2eae0 from auth shard 49
2016-08-30 00:30:54.747795 osd.78 192.231.127.171:6828/6072 333 : cluster [ERR] 5.3d0 deep-scrub 0 missing, 1 inconsistent objects
2016-08-30 00:30:54.747801 osd.78 192.231.127.171:6828/6072 334 : cluster [ERR] 5.3d0 deep-scrub 1 errors
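
For reference, the same inconsistency can also be inspected without grepping the cluster log. The following is a minimal sketch using the jewel-era rados scrub commands; the pool name cephfs_metadata is an assumption, substitute the actual cephfs metadata pool:

# Summarise the scrub errors and the affected pgs.
ceph health detail

# List the pgs of the metadata pool that currently hold inconsistencies
# (pool name is an assumption for this cluster).
rados list-inconsistent-pg cephfs_metadata

# Dump the per-shard scrub findings for the pg, including each shard's omap_digest.
rados list-inconsistent-obj 5.3d0 --format=json-pretty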

The result of a pg query for the inconsistent pg is here: https://dl.dropboxusercontent.com/u/2946024/pg_query.debug.log
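
For reference, the linked output can be regenerated with the standard pg query command (shown here as a sketch):

# Dump the full state of the inconsistent pg as JSON.
ceph pg 5.3d0 query > pg_query.debug.log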

The results of a manual deep-scrub after increasing the log levels to debug_ms 1, debug_filestore 20 and debug_osd 30 are here:
- osd.78 (primary): https://dl.dropboxusercontent.com/u/2946024/ceph-osd.78.debug.log.gz
- osd.59 (peer): https://dl.dropboxusercontent.com/u/2946024/ceph-osd.59.debug.log.gz
- osd.49 (peer): https://dl.dropboxusercontent.com/u/2946024/ceph-osd.49.debug.log.gz
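
For anyone reproducing the capture, the debug levels can be raised at runtime and the scrub re-triggered roughly as follows (a sketch, using the acting set [78,59,49] of this pg):

# Raise the debug levels on the three osds in the acting set of 5.3d0.
for osd in 78 59 49; do
    ceph tell osd.$osd injectargs '--debug_ms 1 --debug_filestore 20 --debug_osd 30'
done

# Kick off another deep-scrub of the inconsistent pg.
ceph pg deep-scrub 5.3d0

# Once the scrub has finished, remember to lower the debug levels again.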

Diving into the previous logs, it seems that the primary is the problematic osd, because the two peers osd.49 and osd.59 are consistent with each other and the pg was only recently migrated to osd.78. The logs of osd.49 and osd.59 suggest that the divergence is in the getattrs for the object:

2016-08-31 00:55:01.444902 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) stat 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0 (size 0)
2016-08-31 00:55:01.444909 7f2dcbaaa700 15 filestore(/var/lib/ceph/osd/ceph-49) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head#
2016-08-31 00:55:01.444953 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) -ERANGE, len is 208
2016-08-31 00:55:01.444964 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) -ERANGE, got 104
2016-08-31 00:55:01.444967 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_'
2016-08-31 00:55:01.444974 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_parent'
2016-08-31 00:55:01.444980 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting 'snapset'
2016-08-31 00:55:01.444986 7f2dcbaaa700 20 filestore(/var/lib/ceph/osd/ceph-49) fgetattrs 315 getting '_layout'
2016-08-31 00:55:01.444992 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) getattrs no xattr exists in object_map r = 0
2016-08-31 00:55:01.444994 7f2dcbaaa700 10 filestore(/var/lib/ceph/osd/ceph-49) getattrs 5.3d0_head/#5:0bd6d154:::602.00000000:head# = 0
2016-08-31 00:55:01.444998 7f2dcbaaa700 10 osd.49 pg_epoch: 23099 pg[5.3d0( v 23099'104738 (23099'101639,23099'104738] local-les=22440 n=257 ec=339 les/c/f 22440/22440/0 19928/22439/22439) [78,59,49] r=2 lpr=22439 pi=4173-22438/25 luod=0'0 crt=23099'104736 lcod 23099'104737 active] be_deep_scrub 5:0bd6d154:::602.00000000:head seed 4294967295

I conclude that the replication of the pg did not go as expected. On the mailing list there are other reports of similar situations, which seems to point to a systematic issue. For us it has already happened more than once, and that is why I am opening this bug.
