Bug #52079: bluefs mount failed to replay log: (5) Input/output error
Status: Closed
Description
In the test lab, after a simultaneous power-off of all three OSD nodes, two of them cannot start.
h2 node:
...
debug 2021-08-06T04:49:14.966+0000 7f42e40d5080 -1 bluefs _replay 0xb5000: stop: failed to decode: bad crc 1492738775 expected 0: Malformed input
debug 2021-08-06T04:49:14.966+0000 7f42e40d5080 -1 bluefs mount failed to replay log: (5) Input/output error
debug 2021-08-06T04:49:14.966+0000 7f42e40d5080 -1 bluestore(/var/lib/ceph/osd/ceph-1) _open_bluefs failed bluefs mount: (5) Input/output error
debug 2021-08-06T04:49:14.966+0000 7f42e40d5080 -1 bluestore(/var/lib/ceph/osd/ceph-1) _open_db failed to prepare db environment:
debug 2021-08-06T04:49:14.966+0000 7f42e40d5080 1 bdev(0x5621b7fce400 /var/lib/ceph/osd/ceph-1/block) close
debug 2021-08-06T04:49:15.226+0000 7f42e40d5080 -1 osd.1 0 OSD:init: unable to mount object store
debug 2021-08-06T04:49:15.226+0000 7f42e40d5080 -1 ** ERROR: osd init failed: (5) Input/output error
h3 node:
...
debug 2021-08-06T04:38:55.526+0000 7f139f28e080 -1 bluefs _replay 0xc07000: stop: failed to decode: bad crc 3449997429 expected 0: Malformed input
debug 2021-08-06T04:38:55.526+0000 7f139f28e080 -1 bluefs mount failed to replay log: (5) Input/output error
debug 2021-08-06T04:38:55.526+0000 7f139f28e080 -1 bluestore(/var/lib/ceph/osd/ceph-2) _open_bluefs failed bluefs mount: (5) Input/output error
debug 2021-08-06T04:38:55.526+0000 7f139f28e080 -1 bluestore(/var/lib/ceph/osd/ceph-2) _open_db failed to prepare db environment:
debug 2021-08-06T04:38:55.526+0000 7f139f28e080 1 bdev(0x55898df4e400 /var/lib/ceph/osd/ceph-2/block) close
debug 2021-08-06T04:38:55.686+0000 7f139f28e080 -1 osd.2 0 OSD:init: unable to mount object store
debug 2021-08-06T04:38:55.686+0000 7f139f28e080 -1 ** ERROR: osd init failed: (5) Input/output error
Full log files are attached.
ceph-bluestore-tool and ceph-objectstore-tool output the same error messages.
I am not sure whether this problem is related to the existing Bug #50965.
Updated by Igor Fedotov over 2 years ago
Could you please set debug_bluefs to 20, retry the startup attempt, and share the log?
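For reference, one way to raise the BlueFS log level for a daemon that fails at startup is via ceph.conf on the affected node; a minimal sketch (the exact section and option placement may vary with your deployment):

```
# ceph.conf on the affected OSD node; restart the OSD attempt afterwards
[osd]
    debug_bluefs = 20
```

Since the OSD never comes up, the admin-socket path for runtime config changes is not available here, so the config-file route is the practical one.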
Updated by Viktor Svecov over 2 years ago
- File debug_bluefs_20.7z debug_bluefs_20.7z added
Thank you for the help. I have attached log files with 'debug_bluefs = 20' from the two nodes.
Updated by Igor Fedotov over 2 years ago
Viktor Svecov wrote:
Thank you for the help. I have attached log files with 'debug_bluefs = 20' from the two nodes.
One of the new log files looks incomplete; could you please update it?
Updated by Viktor Svecov over 2 years ago
- File h3_debug_bluefs_20_2.7z h3_debug_bluefs_20_2.7z added
Sorry, I didn't notice that the stderr output stopped before the end of the actual log on node h3. The log for OSD node h3 is now complete.
Updated by Igor Fedotov over 2 years ago
Igor Fedotov wrote:
Viktor Svecov wrote:
Thank you for the help. I have attached log files with 'debug_bluefs = 20' from the two nodes.
One of the new log files looks incomplete; could you please update it?
Thanks for the update.
I've just shared my analysis and related questions at Ceph dev's mailing list, see
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/DNDJQ656DMGLXJG7FPRAKXDVQYSJ7XMP/
and I think you can try to recover the OSDs (and thereby additionally confirm my analysis) with the following steps:
For osd.1:
fill the 4K block at offset 0xB558715000 (=0xb558660000 + 0xb5000) with zeros (please make a backup first)
then try to start the OSD.
For osd.2:
this should be offset (if my math is valid) 0xcb72b60000 - 0x10000 + 0xc07000 = 0xCB73757000
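The offset arithmetic and the zeroing pattern above can be sketched in shell. The offsets are the ones computed in this thread; the dd invocation is demonstrated against a throwaway scratch file, not a real OSD device (on a live OSD the same bs/seek/count pattern would target /var/lib/ceph/osd/ceph-N/block at offset/4096 blocks, after backing the block up first):

```shell
# Verify the offset arithmetic from this thread.
printf 'osd.1 block offset: 0x%X\n' $(( 0xb558660000 + 0xb5000 ))
printf 'osd.2 block offset: 0x%X\n' $(( 0xcb72b60000 - 0x10000 + 0xc07000 ))

# Demonstrate the zeroing pattern on a scratch file: block 2 of 4 stands in
# for the damaged 4K block. Illustrative only -- never run dd against a live
# OSD device without a backup.
img=$(mktemp)
dd if=/dev/urandom of="$img" bs=4096 count=4 status=none

# Back up the target 4K block, then overwrite it with zeros in place
# (conv=notrunc keeps the rest of the file intact).
dd if="$img" of="$img.bak" bs=4096 skip=2 count=1 status=none
dd if=/dev/zero of="$img" bs=4096 seek=2 count=1 conv=notrunc status=none

# Count non-zero bytes left in the zeroed block (should be 0).
nonzero=$(dd if="$img" bs=4096 skip=2 count=1 status=none | tr -d '\0' | wc -c | tr -d ' ')
echo "non-zero bytes in zeroed block: $nonzero"
rm -f "$img" "$img.bak"
```

On a real OSD the equivalent step is the same dd pattern with seek set to the computed byte offset divided by 4096, which is why making a backup copy of the block first (the second dd) matters.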
Updated by Viktor Svecov over 2 years ago
You are right. After zeroing the appropriate areas of the OSD block devices, the OSD daemons started. Now all PGs of the Ceph storage cluster are active+clean. Thank you.
What are the future plans? How can I prevent such BlueFS behaviour in the future?
Updated by Igor Fedotov over 2 years ago
- Status changed from New to In Progress
- Backport set to pacific, octopus
Updated by Igor Fedotov over 2 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 42830
Updated by Igor Fedotov over 2 years ago
Viktor Svecov wrote:
You are right. After zeroing the appropriate areas of the OSD block devices, the OSD daemons started. Now all PGs of the Ceph storage cluster are active+clean. Thank you.
What are the future plans? How can I prevent such BlueFS behaviour in the future?
You'll need to upgrade to the relevant pacific release that carries the proper patch. I've just submitted one to master, so it still has to pass all the stages: review, merge into master, backport to pacific, minor pacific release...
Updated by Igor Fedotov over 2 years ago
- Status changed from Fix Under Review to Resolved
Updated by Igor Fedotov over 2 years ago
- Status changed from Resolved to Pending Backport
Updated by Backport Bot over 2 years ago
- Copied to Backport #52492: pacific: bluefs mount failed to replay log: (5) Input/output error added
Updated by Backport Bot over 2 years ago
- Copied to Backport #52493: octopus: bluefs mount failed to replay log: (5) Input/output error added
Updated by Igor Fedotov over 2 years ago
- Status changed from Pending Backport to Resolved