Bug #22044

closed

rocksdb log replay - corruption: missing start of fragmented record

Added by Michael Schmid over 6 years ago. Updated almost 6 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
rocksdb corruption record log replay bluestore
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've conducted some crash tests (unplugging drives, unplugging the machine, terminating and restarting the Ceph systemd services) with Ceph 12.2.0 on Ubuntu and quite easily managed to corrupt what appears to be rocksdb's log on a bluestore OSD, so that log replay fails:

  1. ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2/
    [...]
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:db/MANIFEST-000975 succeeded,manifest_file_number is 975, next_file_number is 1008, last_sequence is 51965907, log_number is 0,prev_log_number is 0,max_column_family is 0
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 1005
    4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job": 1, "event": "recovery_started", "log_files": [1003, 1005]}
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] Recovering log #1003 mode 0
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] Recovering log #1005 mode 0
    3 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424] db/001005.log: dropping 3225 bytes; Corruption: missing start of fragmented record(2)
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343] Shutdown complete
    -1 rocksdb: Corruption: missing start of fragmented record(2)
    -1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening db:
    1 bluefs umount
    1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close
This prevents the OSD from starting. If I understand this correctly, rocksdb is trying to replay its WAL-type logs, and "001005.log" is presumably the corrupted one; the resulting error then stops everything.
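
If the damage really is confined to the tail of that one log file, one avenue might be to let rocksdb tolerate the corruption instead of aborting the replay. The following is an untested sketch and an assumption on my part: it appends rocksdb's wal_recovery_mode option to the OSD's bluestore_rocksdb_options string (the existing default options for the running version must be kept; only the appended option is shown), and kSkipAnyCorruptedRecords can silently drop whatever could not be replayed, so it trades consistency for a starting OSD:

    # ceph.conf -- untested sketch; keep the existing default value of
    # bluestore_rocksdb_options for your version and only append the last option
    [osd]
    bluestore rocksdb options = <existing default options>,wal_recovery_mode=kSkipAnyCorruptedRecords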

I also tried to mount the bluestore, since I assumed that is probably where I would find rocksdb's files, so that I could maybe deal with the problem manually, but that does not seem possible either (an export-based alternative is sketched after the commands below):

  1. ceph-objectstore-tool --op fsck --data-path /var/lib/ceph/osd/ceph-2/ --mountpoint /mnt/bluestore-repair/
    fsck failed: (5) Input/output error
  2. ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
    Mount failed with '(5) Input/output error'
  3. ceph-objectstore-tool --op fuse --force --skip-journal-replay --data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
    Mount failed with '(5) Input/output error'
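
Since the fuse mount fails on the same error, another way to get at rocksdb's files might be to export the BlueFS contents to a regular directory instead of mounting them. This is only a sketch based on ceph-bluestore-tool's bluefs-export command; the output directory is my assumption, and the result is a read-only copy for inspection, not something the OSD would use afterwards:

    # export the BlueFS file system (which holds rocksdb's files) to a plain directory
    ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-2 --out-dir /mnt/bluestore-repair
    # assuming the layout matches the paths in the log above, the suspect file
    # should appear as /mnt/bluestore-repair/db/001005.log
    ls /mnt/bluestore-repair/db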

Adding --debug to these last three commands shows that the ultimate culprit is the same rocksdb error as above.

I don't know of a way to deal with this, and the ceph-users list couldn't help either. Deploying v12.2.1 from Ubuntu bionic also didn't fix it (same error).
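
For completeness: once the db directory has been exported as sketched above, rocksdb's own ldb tool could be pointed at the copy to inspect or repair it. This rests on several assumptions: that an ldb binary matching the bundled rocksdb version is available (e.g. from a rocksdb-tools package or a local rocksdb build), and that repairing an exported copy even helps, since I know of no supported way on 12.2.x to write the repaired files back into BlueFS:

    # dump the suspect WAL file to see how far it is readable
    ldb dump_wal --walfile=/mnt/bluestore-repair/db/001005.log --header
    # attempt an offline repair of the exported copy (affects the copy only)
    ldb repair --db=/mnt/bluestore-repair/db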


Files

log (104 KB), Stephen Lastname, 06/18/2018 12:19 PM