Bug #22044

closed

rocksdb log replay - corruption: missing start of fragmented record

Added by Michael Schmid over 6 years ago. Updated almost 6 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
rocksdb corruption record log replay bluestore
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've conducted some crash tests (unplugging drives, the machine, terminating and restarting ceph systemd services) with Ceph 12.2.0 on Ubuntu and quite easily managed to corrupt what appears to be rocksdb's log replay on a bluestore OSD:

  1. ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-2/
    [...]
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:db/MANIFEST-000975 succeeded,manifest_file_number is 975, next_file_number is 1008, last_sequence is 51965907, log_number is 0,prev_log_number is 0,max_column_family is 0
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 1005
    4 rocksdb: EVENT_LOG_v1 {"time_micros": 1509298585082794, "job": 1, "event": "recovery_started", "log_files": [1003, 1005]}
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] Recovering log #1003 mode 0
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:482] Recovering log #1005 mode 0
    3 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl_open.cc:424] db/001005.log: dropping 3225 bytes; Corruption: missing start of fragmented record(2)
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all background work
    4 rocksdb: [/build/ceph-pKGC1D/ceph-12.2.0/src/rocksdb/db/db_impl.cc:343] Shutdown complete
    -1 rocksdb: Corruption: missing start of fragmented record(2)
    -1 bluestore(/var/lib/ceph/osd/ceph-2/) _open_db erroring opening db:
    1 bluefs umount
    1 bdev(0x557f5b6a4240 /var/lib/ceph/osd/ceph-2//block) close

This certainly prevents the OSD from starting. If I understand this right, rocksdb is just replaying its WAL-type logs, of which "001005.log" is presumably the corrupted one. It then throws an error that aborts opening the database.
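
For readers unfamiliar with the message, here is a minimal sketch of how a LevelDB/RocksDB-style log reader reassembles fragmented records, and why a fragment without a preceding start fragment is reported as corruption. It is a simplified illustration, not RocksDB's code: the 32 KB block structure is ignored, `zlib.crc32` stands in for the masked CRC32C of the real format, and the names `encode_fragment`, `read_records` and `Corruption` are made up for this example. The "(1)"/"(2)" suffixes are assumed to distinguish a stray middle fragment from a stray last fragment.

```python
import struct
import zlib

# Fragment types as in the LevelDB/RocksDB log format (simplified).
FULL, FIRST, MIDDLE, LAST = 1, 2, 3, 4
HEADER = struct.Struct("<IHB")  # checksum (4 bytes), length (2 bytes), type (1 byte)


class Corruption(Exception):
    """Hypothetical error type for this sketch."""


def encode_fragment(rtype, payload):
    # Assumption: plain CRC32 instead of the masked CRC32C used by the real format.
    crc = zlib.crc32(bytes([rtype]) + payload) & 0xFFFFFFFF
    return HEADER.pack(crc, len(payload), rtype) + payload


def read_records(buf):
    """Reassemble logical records from a byte string of fragments."""
    records = []
    scratch = bytearray()
    in_fragmented_record = False
    pos = 0
    while pos + HEADER.size <= len(buf):
        crc, length, rtype = HEADER.unpack_from(buf, pos)
        payload = buf[pos + HEADER.size : pos + HEADER.size + length]
        pos += HEADER.size + length
        if len(payload) < length:
            raise Corruption("truncated record")
        if zlib.crc32(bytes([rtype]) + payload) & 0xFFFFFFFF != crc:
            raise Corruption("checksum mismatch")
        if rtype == FULL:
            records.append(bytes(payload))
        elif rtype == FIRST:
            scratch = bytearray(payload)
            in_fragmented_record = True
        elif rtype == MIDDLE:
            if not in_fragmented_record:
                raise Corruption("missing start of fragmented record(1)")
            scratch += payload
        elif rtype == LAST:
            if not in_fragmented_record:
                raise Corruption("missing start of fragmented record(2)")
            scratch += payload
            records.append(bytes(scratch))
            in_fragmented_record = False
        else:
            raise Corruption("unknown record type %d" % rtype)
    return records
```

Feeding this reader a FIRST fragment followed by a LAST fragment yields one joined record, while a LAST fragment on its own raises the same kind of error as in the output above. That matches a scenario where the beginning of a multi-fragment record never reached the disk but its tail did.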

I also tried to mount the bluestore, since I assumed that is where I would find rocksdb's files and could perhaps deal with the problem manually, but that doesn't seem possible either:

  1. ceph-objectstore-tool --op fsck --data-path /var/lib/ceph/osd/ceph-2/ --mountpoint /mnt/bluestore-repair/
    fsck failed: (5) Input/output error
  2. ceph-objectstore-tool --op fuse --data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
    Mount failed with '(5) Input/output error'
  3. ceph-objectstore-tool --op fuse --force --skip-journal-replay --data-path /var/lib/ceph/osd/ceph-2 --mountpoint /mnt/bluestore-repair/
    Mount failed with '(5) Input/output error'

Adding --debug to these last 3 commands shows the ultimate culprit is just the above rocksdb error again.
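
Since the debug output is long, a small filter can help locate the relevant line. The following is only a sketch: the regular expression is modelled on the single "dropping ... bytes; Corruption: ..." line quoted in this report, and the function name `find_wal_corruption` is hypothetical.

```python
import re

# Modelled on the one message quoted in this report; other RocksDB
# versions may word it differently.
DROP_RE = re.compile(
    r"(?P<wal>\S+\.log): dropping (?P<nbytes>\d+) bytes; "
    r"Corruption: (?P<reason>.+)$"
)


def find_wal_corruption(lines):
    """Return (wal_file, dropped_bytes, reason) for each matching log line."""
    hits = []
    for line in lines:
        m = DROP_RE.search(line.rstrip("\n"))
        if m:
            hits.append((m.group("wal"), int(m.group("nbytes")), m.group("reason")))
    return hits
```

Passing an open log file (or any iterable of lines) to `find_wal_corruption` returns one tuple per affected WAL file, which makes it easy to confirm that the fsck and fuse attempts all fail on the same log.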

I don't know how to deal with this, and ceph-users couldn't help either. Deploying v12.2.1 from Ubuntu bionic didn't fix it (same error).


Files

log (104 KB) log Stephen Lastname, 06/18/2018 12:19 PM
Actions #1

Updated by Greg Farnum over 6 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)
Actions #2

Updated by Shinobu Kinjo over 6 years ago

With 12.2.1, I can't reproduce.
Can you run ceph-bluestore-tool fsck --path <path> --debug-bluestore=20 --log-file=c --no-log-to-stderr, and upload the log file?

Actions #3

Updated by Sage Weil about 6 years ago

  • Status changed from New to Need More Info
  • Priority changed from Normal to Urgent

Can you share a bit about how you reproduced this?

Our test suite is doing failure injection at the block layer that should uncover anything that a simple device pull would, but... maybe not. What kind of SSD is it?

Actions #4

Updated by Sage Weil about 6 years ago

  • Status changed from Need More Info to Can't reproduce

Please let us know if this is still an issue with the latest code, and we can reopen.

Actions #5

Updated by Michael Schmid about 6 years ago

Excuse the late response. On the advice of the ceph-users ML I had wiped and recreated the OSD. Of course, this made the problem disappear, along with any chance of getting at the problematic rocksdb.

Actions #6

Updated by Stephen Lastname almost 6 years ago

I was able to reproduce.

Actions #7

Updated by Stephen Lastname almost 6 years ago

wrong file
