Bug #48216

Spanning blobs list might have zombie blobs that aren't of use any more

Added by Igor Fedotov over 3 years ago. Updated over 1 year ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

As reported at https://tracker.ceph.com/issues/40449#note-9, users are still hitting the "no blob id" assertion. The provided log shows that the onode's spanning blob list contains 32K spanning blobs while the actual number of active blobs is far smaller. This means spanning blobs are being "leaked": they remain in the spanning blob list although nothing in the extent map references them. Such fake (zombie) blobs bloat the list (making its encoding/decoding slower) and in the worst case can lead to spanning blob id overflow. Hence the assertion.

ceph-osd.28.log.xz - Log with onode dump prior to "no blob id" assertion (425 KB) Igor Fedotov, 11/12/2020 05:16 PM


Related issues

Related to bluestore - Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag Resolved

History

#1 Updated by Igor Fedotov over 3 years ago

#2 Updated by Igor Fedotov over 3 years ago

Related PR to detect leaked spanning blobs and fix with fsck: https://github.com/ceph/ceph/pull/38050

#3 Updated by Konstantin Shalygin almost 3 years ago

I have a case: Luminous 12.2.13 -> Nautilus 14.2.20 upgrade:

hosts 0-5: redeployed within the last 60 days (ceph-disk -> ceph-volume migration)
hosts 6-8: deployed with ceph-volume on 12.2.13 at 2020-04-21

Hosts 0-5 were restarted one by one: CPU usage during fsck was at 100%, zombies were fixed, per-pool stats were fixed, then the OSDs booted. Zombie stats:

[root@ceph-osd0 ceph]# for e in ceph-osd*.log ; do echo -e "${e}:" ; grep zombie ${e} | wc -l | grep -v 0; done
ceph-osd.0.log:
161
ceph-osd.126.log:
225
ceph-osd.127.log:
ceph-osd.128.log:
238
ceph-osd.129.log:
289
ceph-osd.130.log:
246
ceph-osd.131.log:
228
ceph-osd.24.log:
ceph-osd.46.log:
164
ceph-osd.47.log:
ceph-osd.4.log:
232
ceph-osd.74.log:
ceph-osd.7.log:
134
ceph-osd.84.log:
ceph-osd.9.log:
226
ceph-osd.admin.log:
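
A note on the one-liner above: `grep -v 0` drops every count that contains the digit 0 (10, 105, 200, ...), not only a count of zero, so a blank entry does not necessarily mean that no zombies were logged. A minimal sketch of a stricter counter, using only standard `grep -c`; the function name `count_zombies` is made up for illustration:

```shell
# Print "<file>: <count>" for every log that has at least one "zombie" line.
count_zombies() {
  local f n
  for f in "$@"; do
    # grep -c prints the number of matching lines (0 if there are none).
    n=$(grep -c zombie "$f" 2>/dev/null) || true
    if [ "${n:-0}" -gt 0 ]; then
      printf '%s: %s\n' "$f" "$n"
    fi
  done
}

# Usage (file pattern taken from the command above):
#   count_zombies ceph-osd*.log
```

Logs without matches are simply omitted, so every printed line carries a real non-zero count.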

Host 6: all OSDs died. Some segfaulted, some just stalled in fsck (no log output for 10+ minutes). To continue the upgrade I redeployed this host; it is currently backfilling. As it happens, this host lacked the crond service, so the log was never rotated - we have a full log from OSD birth to death.

In the logs I noticed a trend: almost all errors come from the prefix "rbd_data.6". The "six" is the pool, right?

[root@ceph-osd0 ceph]# for e in ceph-osd*.log ; do echo -e "${e}:" ; grep zombie ${e} | grep rbd_data.6 | wc -l | grep -v 0; done
ceph-osd.0.log:
159
ceph-osd.126.log:
223
ceph-osd.127.log:
ceph-osd.128.log:
236
ceph-osd.129.log:
287
ceph-osd.130.log:
246
ceph-osd.131.log:
227
ceph-osd.24.log:
ceph-osd.46.log:
163
ceph-osd.47.log:
ceph-osd.4.log:
232
ceph-osd.74.log:
ceph-osd.7.log:
133
ceph-osd.84.log:
ceph-osd.9.log:
224
ceph-osd.admin.log:

Pool six is the metadata pool for an EC data pool:

pool 6 'erasure_rbd_meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode warn last_change 73436 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
        removed_snaps [1~3]
pool 7 'erasure_rbd_data' erasure size 5 min_size 4 crush_rule 3 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 73438 lfor 0/0/46778 flags hashpspool,ec_overwrites stripe_width 12288 application rbd
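
To map the ids seen in object names to pools, a listing like the one above can be reduced to id, name and type. It looks like the output of `ceph osd pool ls detail`, but that is an assumption here, and `pool_table` is a hypothetical helper name:

```shell
# Reduce a pool listing to "<id> <name> <type>".
# Reads the listing from the given files, or from stdin if none are given.
pool_table() {
  # Only lines starting with "pool" describe a pool; \047 is a single quote,
  # which is stripped from the pool name in field 3.
  awk '$1 == "pool" { gsub(/\047/, "", $3); print $2, $3, $4 }' "$@"
}

# Usage, assuming the listing comes from this command:
#   ceph osd pool ls detail | pool_table
```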

osd.100 log (285MB, zstd compressed to 15MB): https://www.icloud.com/iclouddrive/01q2sncWoeiKGidnbwcxB2XCw#ceph-osd.100

#4 Updated by Igor Fedotov almost 3 years ago

Konstantin Shalygin wrote:

In the logs I noticed a trend: almost all errors come from the prefix "rbd_data.6". The "six" is the pool, right?

Can't verify right now, but I presume you're getting zombies for pools 1 & 7:
2021-05-08 20:57:25.740 7fe0350d6700 -1 bluestore(/var/lib/ceph/osd/ceph-100) fsck error: #1:1f652b1b:::rbd_data.5530d72bfd633b.0000000000005886:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55dccd5190a0 spanning 31 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x55dcce406e00 sbid 0x0))
...
2021-05-08 20:58:12.353 7fe0358d7700 -1 bluestore(/var/lib/ceph/osd/ceph-100) fsck error: 0#7:2f83c506:::rbd_data.6.fb84c259e11888.00000000000378ed:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55dcbfc067e0 spanning 6 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55dca78031f0 sbid 0x0))
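
The two lines above suggest that the pool id is the number between `#` and the first `:` of the object name, optionally preceded by a shard id (as in `0#7:...`). Under that assumption, fsck errors can be tallied per pool roughly like this; `zombies_per_pool` is a hypothetical helper:

```shell
# Count objects with zombie spanning blobs per pool id.
# Assumes object names of the form "[shard]#<pool>:<hash>:::<name>:head#".
zombies_per_pool() {
  grep -h 'zombie spanning blob' "$@" \
    | sed -n 's/.*fsck error: [0-9]*#\([0-9][0-9]*\):.*/\1/p' \
    | sort -n | uniq -c \
    | awk '{ printf "pool %s: %s object(s)\n", $2, $1 }'
}

# Usage:
#   zombies_per_pool ceph-osd.100.log
```

This counts affected objects (one per error line), not individual zombie blobs.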

#5 Updated by Igor Fedotov almost 3 years ago

Konstantin Shalygin wrote:

I have a case: Luminous 12.2.13 -> Nautilus 14.2.20 upgrade:

hosts 0-5: redeployed within the last 60 days (ceph-disk -> ceph-volume migration)
hosts 6-8: deployed with ceph-volume on 12.2.13 at 2020-04-21

Hosts 0-5 were restarted one by one: CPU usage during fsck was at 100%, zombies were fixed, per-pool stats were fixed, then the OSDs booted. Zombie stats:

[...]

Host 6: all OSDs died. Some segfaulted, some just stalled in fsck (no log output for 10+ minutes). To continue the upgrade I redeployed this host; it is currently backfilling. As it happens, this host lacked the crond service, so the log was never rotated - we have a full log from OSD birth to death.

In the logs I noticed a trend: almost all errors come from the prefix "rbd_data.6". The "six" is the pool, right?

[...]

Pool six is the metadata pool for an EC data pool:

[...]

osd.100 log (285MB, zstd compressed to 15MB): https://www.icloud.com/iclouddrive/01q2sncWoeiKGidnbwcxB2XCw#ceph-osd.100

It seems you're suffering from two rather different issues here:

1) Zombie blobs themselves which shouldn't exist
2) Faulty bulk repair for these blobs which looks like a duplicate of https://tracker.ceph.com/issues/50017

For 1) - could you please monitor on occasion if zombies appear again (e.g. by fsck-ing upgraded and repaired OSDs)

For 2) - it looks like the repair transaction size matters: when a lot of zombies are present, the repair transaction corrupts RocksDB. Perhaps we need to cap such transactions and/or investigate what's wrong with them. I don't see gigabytes of data in your case: just ~21K errors at 3-4 KB per update op, which comes to roughly 64 MB. So this shouldn't be that destructive...
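
The size estimate above can be checked with shell arithmetic. This sketch takes ~21K as 21000 errors and treats 3-4 KB per update op as the bounds; both are rounded figures from the comment rather than measurements, and `repair_txn_mib` is a made-up name:

```shell
# Rough repair transaction size in MiB: errors * KB per update op / 1024
# (integer division, so the result is rounded down).
repair_txn_mib() {
  echo $(( $1 * $2 / 1024 ))
}

for kb_per_op in 3 4; do
  echo "${kb_per_op} KB/op: $(repair_txn_mib 21000 "$kb_per_op") MiB"
done
```

That gives a range of about 61 to 82 MiB, the same order of magnitude as the quoted figure and nowhere near gigabytes.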

#6 Updated by Igor Fedotov almost 3 years ago

  • Related to Bug #50017: OSDs broken after nautilus->octopus upgrade: rocksdb Corruption: unknown WriteBatch tag added

#7 Updated by Konstantin Shalygin almost 3 years ago

Can't verify right now, but I presume you're getting zombies for pools 1 & 7:

1 is replicated RBD pool, and 7 is EC RBD pool.

For 1) - could you please monitor on occasion if zombies appear again (e.g. by fsck-ing upgraded and repaired OSDs)

Some OSDs were redeployed on Nautilus; others were fixed before starting the Nautilus ceph-osd via

sudo ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-<osd_id> --command repair

Now all OSDs are on 14.2.20 and currently deep scrubbing. I think I could check for zombies in a couple of weeks via fsck with ceph-bluestore-tool. What command should I run?

#8 Updated by Igor Fedotov almost 3 years ago

Konstantin Shalygin wrote:

Can't verify right now, but I presume you're getting zombies for pools 1 & 7:

1 is replicated RBD pool, and 7 is EC RBD pool.

For 1) - could you please monitor on occasion if zombies appear again (e.g. by fsck-ing upgraded and repaired OSDs)

Some OSDs were redeployed on Nautilus; others were fixed before starting the Nautilus ceph-osd via
[...]
Now all OSDs are on 14.2.20 and currently deep scrubbing. I think I could check for zombies in a couple of weeks via fsck with ceph-bluestore-tool. What command should I run?

ceph-bluestore-tool --path <..> --command fsck
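
For periodic rechecks, the command can be wrapped in a small function. This is a sketch under assumptions: the OSD is stopped while fsck runs, fsck errors appear on stdout or stderr, and `BLUESTORE_TOOL` is a hypothetical override (not a Ceph setting) that exists only so the function can be tried against a stub:

```shell
# Run fsck on one OSD and report how many output lines mention zombies.
# Assumes the OSD is stopped and its data lives in /var/lib/ceph/osd/ceph-<id>.
recheck_osd() {
  local id n tool
  id=$1
  tool=${BLUESTORE_TOOL:-ceph-bluestore-tool}   # hypothetical override
  # Merge stderr into stdout so error lines are counted wherever they land.
  n=$("$tool" --path "/var/lib/ceph/osd/ceph-$id" --command fsck 2>&1 \
        | grep -c zombie) || true
  echo "osd.$id: ${n:-0} zombie report line(s)"
}

# Usage (OSD ids are examples):
#   for id in 0 4 7; do recheck_osd "$id"; done
```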

#9 Updated by Konstantin Shalygin almost 3 years ago

Igor, after a couple of weeks I ran fsck on the upgraded and repaired OSDs:

<snip_snap>

2021-06-09 00:25:00.395 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:72f3aec5:::rbd_data.6.fa27fd7d8bc873.000000000001e724:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb36015e0 spanning 11 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x55beb3600380 sbid 0x0))
2021-06-09 00:25:00.524 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:72fad359:::rbd_data.6.fa280624a4606d.0000000000008731:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb38e33b0 spanning 7 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb38e31f0 sbid 0x0))
2021-06-09 00:25:00.840 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a40b722d:::rbd_data.6.fb83c973866f89.000000000000ee8d:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb3fa3810 spanning 0 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb3fa31f0 sbid 0x0))
2021-06-09 00:25:00.913 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a40fb5ff:::rbd_data.6.1efb85dc2db12.0000000000037109:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb4145030 spanning 13 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb4145340 sbid 0x0))
2021-06-09 00:25:01.168 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a41d8e85:::rbd_data.6.fb7c6d13673f08.000000000000265f:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb46e6540 spanning 10 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55beb46e61c0 sbid 0x0))
2021-06-09 00:25:01.299 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4248d71:::rbd_data.6.fb400f7c100c2e.0000000000038028:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb491fea0 spanning 14 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55beb491fe30 sbid 0x0))
2021-06-09 00:25:01.384 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4292e75:::rbd_data.6.fb358079d534c0.000000000001ecd4:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb4b7fa40 spanning 8 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x55beb4b7ea80 sbid 0x0))
2021-06-09 00:25:01.539 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a431ab18:::rbd_data.6.fb33ca7495b7b3.0000000000031b2e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb4f04150 spanning 11 blob([!~50000] csum crc32c/0x1000) use_tracker(0x5*0x10000 0x[0,0,0,0,0]) SharedBlob(0x55beb4ef9d50 sbid 0x0))
2021-06-09 00:25:01.551 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a43275de:::rbd_data.6.fb878f57d7f7de.000000000001d460:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb4f4a930 spanning 1 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb4f4aee0 sbid 0x0))
2021-06-09 00:25:01.561 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a432db1e:::rbd_data.6.fa29b91550277d.000000000001cf1e:head# - 14 zombie spanning blob(s) found, the first one: Blob(0x55beb4f87650 spanning 12 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb4f872d0 sbid 0x0))
2021-06-09 00:25:01.749 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a43d0864:::rbd_data.6.efb821c1dfd5f0.000000000000ab44:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb53bc150 spanning 12 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb53b3030 sbid 0x0))
2021-06-09 00:25:01.903 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4455e6e:::rbd_data.6.fb85df54800234.000000000003ef3f:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb57325b0 spanning 6 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb5732930 sbid 0x0))
2021-06-09 00:25:01.917 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4462821:::rbd_data.6.fb2bfc5f466437.0000000000031cd2:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb5784070 spanning 12 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55beb5784cb0 sbid 0x0))
2021-06-09 00:25:02.052 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a44d341c:::rbd_data.6.fa2b721436cd81.0000000000018c08:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb5a69e30 spanning 11 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x55beb5a69810 sbid 0x0))
2021-06-09 00:25:02.072 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a44e4843:::rbd_data.6.fb7c70430e503e.0000000000002525:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb5ae0930 spanning 1 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55beb5ae08c0 sbid 0x0))
2021-06-09 00:25:02.163 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a452cc26:::rbd_data.6.1efb85dc2db12.0000000000002b03:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb5dbb420 spanning 20 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x55beb5dbba40 sbid 0x0))
2021-06-09 00:25:02.494 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4640d0d:::rbd_data.6.fa2b721436cd81.000000000003eddc:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb64c9f10 spanning 8 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x55beb64c92d0 sbid 0x0))
2021-06-09 00:25:02.521 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4655add:::rbd_data.6.efb8405cd35a2d.000000000001815e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb6563ab0 spanning 4 blob([!~70000] csum crc32c/0x1000) use_tracker(0x7*0x10000 0x[0,0,0,0,0,0,0]) SharedBlob(0x55beb6563a40 sbid 0x0))
2021-06-09 00:25:02.693 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a46e9428:::rbd_data.6.fb830f1360cbe1.000000000002bf21:head# - 2 zombie spanning blob(s) found, the first one: Blob(0x55beb6a94150 spanning 4 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x55beb6a940e0 sbid 0x0))
2021-06-09 00:25:03.103 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae8523d7:::rbd_data.6.fb85df54800234.000000000000c414:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beb7376230 spanning 1 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55beb7377180 sbid 0x0))
2021-06-09 00:25:03.169 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae891c09:::rbd_data.6.fb84c259e11888.0000000000004cee:head# - 36 zombie spanning blob(s) found, the first one: Blob(0x55beb74cb880 spanning 9 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x55beb74cb030 sbid 0x0))
2021-06-09 00:25:03.522 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae9d6e64:::rbd_data.6.fb84bf37e34862.000000000001594e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55bdcc1337a0 spanning 8 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x55bdcc133ab0 sbid 0x0))
2021-06-09 00:25:03.598 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aea1c3ef:::rbd_data.6.fa28ba10a28bad.0000000000022d96:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55be986d49a0 spanning 7 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x55bdca16d3b0 sbid 0x0))
2021-06-09 00:25:04.155 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aec1d2a3:::rbd_data.6.fb90716445198f.0000000000035763:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55be56b24230 spanning 5 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55be56b250a0 sbid 0x0))
2021-06-09 00:25:04.264 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aec83912:::rbd_data.6.fb85df54800234.000000000000e6f5:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55bdca9d8380 spanning 8 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x55bdce9ad9d0 sbid 0x0))
2021-06-09 00:25:04.314 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aecaf554:::rbd_data.6.fa27fd7d8bc873.000000000002811d:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55beaed0df10 spanning 3 blob([!~50000] csum crc32c/0x1000) use_tracker(0x5*0x10000 0x[0,0,0,0,0]) SharedBlob(0x55beaed18000 sbid 0x0))
2021-06-09 00:25:05.130 7f7423b38ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aefa840d:::rbd_data.6.fb324a6322404f.0000000000038370:head# - 12 zombie spanning blob(s) found, the first one: Blob(0x55beb900bb90 spanning 33 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55beb900bc00 sbid 0x0))
fsck found 309 error(s)

Then repair:

<snip_snap>

2021-06-09 00:28:46.457 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:72f3aec5:::rbd_data.6.fa27fd7d8bc873.000000000001e724:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555826485810 spanning 11 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x5558264845b0 sbid 0x0))
2021-06-09 00:28:46.591 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:72fad359:::rbd_data.6.fa280624a4606d.0000000000008731:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558289f55e0 spanning 7 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x5558289f5420 sbid 0x0))
2021-06-09 00:28:46.903 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a40b722d:::rbd_data.6.fb83c973866f89.000000000000ee8d:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555822b0f9d0 spanning 0 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x555822b0f3b0 sbid 0x0))
2021-06-09 00:28:46.982 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a40fb5ff:::rbd_data.6.1efb85dc2db12.0000000000037109:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55585d8d31f0 spanning 13 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55585d8d3500 sbid 0x0))
2021-06-09 00:28:47.245 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a41d8e85:::rbd_data.6.fb7c6d13673f08.000000000000265f:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558261c2700 spanning 10 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x5558261c2380 sbid 0x0))
2021-06-09 00:28:47.383 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4248d71:::rbd_data.6.fb400f7c100c2e.0000000000038028:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582038c070 spanning 14 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55582038c000 sbid 0x0))
2021-06-09 00:28:47.472 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4292e75:::rbd_data.6.fb358079d534c0.000000000001ecd4:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582ac0dc00 spanning 8 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x55582ac0cc40 sbid 0x0))
2021-06-09 00:28:47.642 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a431ab18:::rbd_data.6.fb33ca7495b7b3.0000000000031b2e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555807146310 spanning 11 blob([!~50000] csum crc32c/0x1000) use_tracker(0x5*0x10000 0x[0,0,0,0,0]) SharedBlob(0x55582b675f10 sbid 0x0))
2021-06-09 00:28:47.655 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a43275de:::rbd_data.6.fb878f57d7f7de.000000000001d460:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558249e2af0 spanning 1 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x5558249e30a0 sbid 0x0))
2021-06-09 00:28:47.665 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a432db1e:::rbd_data.6.fa29b91550277d.000000000001cf1e:head# - 14 zombie spanning blob(s) found, the first one: Blob(0x555824c01810 spanning 12 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x555824c01490 sbid 0x0))
2021-06-09 00:28:47.861 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a43d0864:::rbd_data.6.efb821c1dfd5f0.000000000000ab44:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555891676310 spanning 12 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55589208b1f0 sbid 0x0))
2021-06-09 00:28:48.019 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4455e6e:::rbd_data.6.fb85df54800234.000000000003ef3f:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555831944770 spanning 6 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x555831944af0 sbid 0x0))
2021-06-09 00:28:48.033 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4462821:::rbd_data.6.fb2bfc5f466437.0000000000031cd2:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555854188230 spanning 12 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x555854188e70 sbid 0x0))
2021-06-09 00:28:48.164 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a44d341c:::rbd_data.6.fa2b721436cd81.0000000000018c08:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582c6d8000 spanning 11 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x555833a279d0 sbid 0x0))
2021-06-09 00:28:48.185 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a44e4843:::rbd_data.6.fb7c70430e503e.0000000000002525:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582f1aeaf0 spanning 1 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55582f1aea80 sbid 0x0))
2021-06-09 00:28:48.278 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a452cc26:::rbd_data.6.1efb85dc2db12.0000000000002b03:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558909ad5e0 spanning 20 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x5558909adc00 sbid 0x0))
2021-06-09 00:28:48.599 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4640d0d:::rbd_data.6.fa2b721436cd81.000000000003eddc:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555830b640e0 spanning 8 blob([!~10000] csum crc32c/0x1000) use_tracker(0x10000 0x0) SharedBlob(0x5558324a9490 sbid 0x0))
2021-06-09 00:28:48.625 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a4655add:::rbd_data.6.efb8405cd35a2d.000000000001815e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582f8e7c70 spanning 4 blob([!~70000] csum crc32c/0x1000) use_tracker(0x7*0x10000 0x[0,0,0,0,0,0,0]) SharedBlob(0x55582f8e7c00 sbid 0x0))
2021-06-09 00:28:48.810 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:a46e9428:::rbd_data.6.fb830f1360cbe1.000000000002bf21:head# - 2 zombie spanning blob(s) found, the first one: Blob(0x555851e40310 spanning 4 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x555851e402a0 sbid 0x0))
2021-06-09 00:28:49.221 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae8523d7:::rbd_data.6.fb85df54800234.000000000000c414:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55583246c3f0 spanning 1 blob([!~30000] csum crc32c/0x1000) use_tracker(0x3*0x10000 0x[0,0,0]) SharedBlob(0x55583246d340 sbid 0x0))
2021-06-09 00:28:49.285 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae891c09:::rbd_data.6.fb84c259e11888.0000000000004cee:head# - 36 zombie spanning blob(s) found, the first one: Blob(0x555850787a40 spanning 9 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x5558507871f0 sbid 0x0))
2021-06-09 00:28:49.629 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:ae9d6e64:::rbd_data.6.fb84bf37e34862.000000000001594e:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555832b9f960 spanning 8 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x555832b9fc70 sbid 0x0))
2021-06-09 00:28:49.702 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aea1c3ef:::rbd_data.6.fa28ba10a28bad.0000000000022d96:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558c216cb60 spanning 7 blob([!~60000] csum crc32c/0x1000) use_tracker(0x6*0x10000 0x[0,0,0,0,0,0]) SharedBlob(0x5558273c9570 sbid 0x0))
2021-06-09 00:28:50.242 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aec1d2a3:::rbd_data.6.fb90716445198f.0000000000035763:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x55582d0ca3f0 spanning 5 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x55582d0cb260 sbid 0x0))
2021-06-09 00:28:50.350 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aec83912:::rbd_data.6.fb85df54800234.000000000000e6f5:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x555830f10540 spanning 8 blob([!~20000] csum crc32c/0x1000) use_tracker(0x2*0x10000 0x[0,0]) SharedBlob(0x555836809b90 sbid 0x0))
2021-06-09 00:28:50.400 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aecaf554:::rbd_data.6.fa27fd7d8bc873.000000000002811d:head# - 1 zombie spanning blob(s) found, the first one: Blob(0x5558318780e0 spanning 3 blob([!~50000] csum crc32c/0x1000) use_tracker(0x5*0x10000 0x[0,0,0,0,0]) SharedBlob(0x5558318781c0 sbid 0x0))
2021-06-09 00:28:51.238 7f3b7a8d5ec0 -1 bluestore(/var/lib/ceph/osd/ceph-0) fsck error: 4#7:aefa840d:::rbd_data.6.fb324a6322404f.0000000000038370:head# - 12 zombie spanning blob(s) found, the first one: Blob(0x5558c6a3dd50 spanning 33 blob([!~40000] csum crc32c/0x1000) use_tracker(0x4*0x10000 0x[0,0,0,0]) SharedBlob(0x5558c6a3ddc0 sbid 0x0))
repair success
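
Each fsck error line above covers one object but may report several zombie blobs (1, 14, 36, ...), so the error count and the blob count differ. A sketch that totals both from saved fsck output, assuming the `N zombie spanning blob(s) found` wording shown above; `total_zombie_blobs` is a hypothetical helper:

```shell
# Sum affected objects and zombie blobs from fsck output.
# Reads the given files, or stdin if none are given.
total_zombie_blobs() {
  # grep -o keeps only the "<N> zombie spanning blob" fragment of each line.
  grep -h -o '[0-9][0-9]* zombie spanning blob' "$@" \
    | awk '{ objects++; blobs += $1 }
           END { printf "%d object(s), %d zombie blob(s)\n", objects, blobs }'
}

# Usage, assuming fsck output was saved to a file first:
#   total_zombie_blobs fsck-osd0.log
```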

#10 Updated by Gilles Mocellin over 1 year ago

Hello,

Any news on this?
Does anyone know whether the problem also happens on Quincy?

#11 Updated by Igor Fedotov over 1 year ago

Gilles Mocellin wrote:

Hello,

Any news on this?
Does anyone know whether the problem also happens on Quincy?

Unfortunately the root cause is still unclear and hence Quincy is likely to suffer from this as well.

The good news is that Quincy is able to fix such issues through BlueStore's repair, thanks to https://github.com/ceph/ceph/pull/38050
