Bug #59660

open

Corruption: unknown checksum type 4 (ceph-osd fails to start)

Added by Kaleb KEITHLEY 12 months ago. Updated 11 months ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

(Note that ceph-17.2.6 in fedora rawhide is built with the bundled rocksdb.)

After updating the ceph packages to ceph-osd-17.2.6-5.fc39.x86_64, the OSD no longer starts.
The previously working version was ceph-osd-2:17.2.5-13.fc39.x86_64.

Relevant ceph-osd logs (full logfile attached):
#v+

2023-05-05T15:49:55.044+0200 7fabe775a6c0 2 rocksdb: [table/block_based/block_based_table_reader.cc:1161] Encountered error while reading data from properties block Corruption: unknown checksum type 4 from footer of db/002868.sst, while checking block at offset 68813363 size 86
2023-05-05T15:49:55.044+0200 7fabf99d92c0 4 rocksdb: [db/db_impl/db_impl.cc:446] Shutdown: canceling all background work
2023-05-05T15:49:55.044+0200 7fabf99d92c0 4 rocksdb: [db/db_impl/db_impl.cc:625] Shutdown complete
2023-05-05T15:49:55.044+0200 7fabf99d92c0 1 rocksdb: Corruption: unknown checksum type 4 from footer of db/002878.sst, while checking block at offset 17055611 size 86
2023-05-05T15:49:55.044+0200 7fabf99d92c0 -1 bluestore(/var/lib/ceph/osd/ceph-2) _open_db erroring opening db:
2023-05-05T15:49:55.044+0200 7fabf99d92c0 1 bluefs umount
2023-05-05T15:49:55.044+0200 7fabf99d92c0 1 bdev(0x55ac9fce9800 /var/lib/ceph/osd/ceph-2/block) close
2023-05-05T15:49:55.055+0200 7fabf99d92c0 1 bdev(0x55ac9fce8000 /var/lib/ceph/osd/ceph-2/block) close
2023-05-05T15:49:55.316+0200 7fabf99d92c0 -1 osd.2 0 OSD:init: unable to mount object store
2023-05-05T15:49:55.316+0200 7fabf99d92c0 -1 ** ERROR: osd init failed: (5) Input/output error
#v

Reproducible: Always

Steps to Reproduce:
1. Start ceph-osd

Actual Results:
Ceph-OSD fails to start.

Expected Results:
Ceph-OSD running.

Actions #1

Updated by Adam Kupczyk 12 months ago

  • Status changed from New to Need More Info

Checksum method 4 (kXXH3) was introduced in RocksDB 6.27.

ceph-17.2.5-13 does not have it.
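
For triage, the checksum type and the affected SST file can be pulled out of an OSD log line like the ones quoted in the description. The following is a minimal sketch, not part of Ceph or RocksDB: the function name and the regular expression are hypothetical and tailored to the log format shown in this report, and the id-to-name table mirrors RocksDB's `ChecksumType` enum values as I understand them.

```python
import re

# Values of RocksDB's ChecksumType enum; type 4 is the one named in the log.
CHECKSUM_TYPES = {
    0: "kNoChecksum",
    1: "kCRC32c",
    2: "kxxHash",
    3: "kxxHash64",
    4: "kXXH3",
}

# Hypothetical pattern matching the error text quoted in this report.
LOG_PATTERN = re.compile(
    r"unknown checksum type (?P<type>\d+) from footer of (?P<sst>\S+?),"
)


def explain_checksum_error(line):
    """Return (sst_file, type_id, type_name) for a matching log line, else None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    type_id = int(match.group("type"))
    # Ids missing from the table are reported as "unrecognized".
    return match.group("sst"), type_id, CHECKSUM_TYPES.get(type_id, "unrecognized")
```

Feeding the sketch the `_open_db` failure line from the attached log would identify `db/002878.sst` as written with checksum type 4.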

Actions #2

Updated by Kaleb KEITHLEY 12 months ago

Until two weeks ago (19 April to be exact), rawhide had rocksdb-7.8.3. And ceph (17.2.6) was built with that. Now rawhide has rocksdb-8.1.1, requiring that the build on rawhide switch to use the bundled rocksdb. This has nothing to do with reef, or the rocksdb that's in reef.

And as an aside, even reef won't build with rocksdb-8.x.

Fedora rocksdb builds: https://koji.fedoraproject.org/koji/packageinfo?packageID=24329

Actions #3

Updated by Tomasz Torcz 11 months ago

Hi Adam, I'm the original reporter of this issue (https://bugzilla.redhat.com/show_bug.cgi?id=2193399).
To summarize, the cluster was created with the system RocksDB in Fedora, which resulted in checksum type 4 being used.
Then the Fedora package was switched to use the bundled RocksDB, which is too old to know this checksum type.

Is any more information needed to clear "need info" from this issue?

Could the bundled RocksDB be updated to 6.27.0?
Or could Ceph be fixed to compile with RocksDB 8.1?
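
To confirm which checksum type an SST file was written with, its footer can be inspected directly once the file is available outside BlueFS (for example after exporting it with `ceph-bluestore-tool bluefs-export`). The sketch below rests on assumptions that nothing in this thread confirms: that the file uses the block-based table footer layout of format_version 1 to 5 (one checksum-type byte, 40 bytes of block handles and padding, a 4-byte little-endian format version, and an 8-byte little-endian magic number), and that the magic number is the block-based table constant. The function names are hypothetical.

```python
import struct

# Assumed constants for the block-based table footer (format_version 1 to 5).
BLOCK_BASED_MAGIC = 0x88E241B785F4CFF7
FOOTER_LEN = 53  # 1 checksum byte + 40 handle bytes + 4 version + 8 magic


def footer_checksum_type(footer):
    """Return (checksum_type, format_version) from a 53-byte SST footer."""
    if len(footer) != FOOTER_LEN:
        raise ValueError("expected exactly 53 footer bytes")
    # The last 12 bytes hold the format version and the table magic number.
    version, magic = struct.unpack("<IQ", footer[-12:])
    if magic != BLOCK_BASED_MAGIC:
        raise ValueError("not a block-based table footer")
    # The first footer byte is the checksum type id (4 would be kXXH3).
    return footer[0], version


def read_footer(path):
    """Read the trailing footer bytes of an SST file exported from BlueFS."""
    with open(path, "rb") as f:
        f.seek(-FOOTER_LEN, 2)  # seek relative to the end of the file
        return f.read(FOOTER_LEN)
```

Under these assumptions, `footer_checksum_type(read_footer(path))` on an exported copy of one of the affected files would be expected to report type 4, matching the error in the log.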

Actions #4

Updated by Ilya Dryomov 11 months ago

  • Target version deleted (v17.2.6)