Feature #20801

ability to rebuild BlueStore WAL journals is missing

Added by Dmitry Smirnov over 3 years ago. Updated almost 3 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

We have a six-node Ceph cluster (v12.0.1) with the BlueStore back end; all RocksDB WAL journals were located on Flashtec NVRAM PCIe cards. Due to a power outage and a hardware issue that has not yet been identified, we have lost all WAL partitions (5 partitions for the 5 SAS OSDs in each server). The data disks are not affected and look healthy, and the OSD tree map is healthy as well, but the entire cluster is down.
Is there any way to rebuild those journals from scratch? I tried creating GPT partitions with the correct WAL type GUID (5CE17FCE-4087-4169-B7FF-056CC58473F9), the same size as the originals, and the respective partition UUIDs (taken from /var/lib/ceph/osd/ceph-X/block.wal_uuid). That does not seem to be enough; apparently some data must be written to the new partitions as well.
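
For reference, the partition re-creation step described above can be sketched roughly as follows. This is only an illustration of what was attempted, not a recovery procedure; as noted, recreating the partition did not bring the OSDs back. The helper names `read_wal_uuid` and `build_sgdisk_args` and the 576 MiB size in the usage line are hypothetical; the type GUID and the `block.wal_uuid` path come from the report, and the `sgdisk` options used are `--new`, `--typecode` and `--partition-guid`. The sketch only builds the command line and does not run it.

```python
import uuid
from pathlib import Path

# GPT type GUID for a BlueStore WAL partition, as quoted in the report.
WAL_TYPE_GUID = "5CE17FCE-4087-4169-B7FF-056CC58473F9"


def read_wal_uuid(osd_dir):
    """Return the partition UUID stored in <osd_dir>/block.wal_uuid."""
    text = Path(osd_dir, "block.wal_uuid").read_text().strip()
    # uuid.UUID raises ValueError if the file does not hold a valid UUID.
    return str(uuid.UUID(text))


def build_sgdisk_args(device, partnum, size_mib, part_uuid):
    """Build (but do not run) an sgdisk command line for one WAL partition."""
    return [
        "sgdisk",
        f"--new={partnum}:0:+{size_mib}M",           # start 0 = next free sector
        f"--typecode={partnum}:{WAL_TYPE_GUID}",     # mark as BlueStore WAL
        f"--partition-guid={partnum}:{part_uuid}",   # reuse the recorded UUID
        device,
    ]


# Hypothetical usage; the size must match the original partition:
# args = build_sgdisk_args("/dev/nvme0n1", 1, 576,
#                          read_wal_uuid("/var/lib/ceph/osd/ceph-0"))
```

The command is returned as a list so it can be reviewed before being passed to something like `subprocess.run`.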

/usr/sbin/ceph-disk --verbose activate-block /dev/nvme0n1p1

get_dm_uuid: get_dm_uuid /dev/nvme0n1p1 uuid path is /sys/dev/block/259:2/dm/uuid
command: Running command: /usr/sbin/blkid -o udev -p /dev/nvme0n1p1
command: Running command: /usr/bin/ceph-osd --get-device-fsid /dev/nvme0n1p1
get_space_osd_uuid: Block /dev/nvme0n1p1 has OSD UUID 00000000-0000-0000-0000-000000000000
main_activate_space: activate: OSD device not present, not starting, yet
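
The key line in the output above is the all-zero OSD UUID. A plausible reading, which is an assumption and not confirmed in this ticket, is that the freshly created partition carries no OSD identity, which is consistent with ceph-disk reporting the device as "not present". A minimal sketch for spotting this condition in such logs, using the hypothetical helpers `osd_uuid_from_log` and `is_nil_osd_uuid`:

```python
import re
import uuid

# Matches the "has OSD UUID <uuid>" fragment printed by get_space_osd_uuid.
OSD_UUID_RE = re.compile(r"has OSD UUID ([0-9a-fA-F-]{36})")


def osd_uuid_from_log(line):
    """Extract the OSD UUID from a ceph-disk log line, or None if absent."""
    match = OSD_UUID_RE.search(line)
    return uuid.UUID(match.group(1)) if match else None


def is_nil_osd_uuid(osd_uuid):
    """True when a UUID was reported and it is all zeros."""
    return osd_uuid is not None and osd_uuid.int == 0


line = ("get_space_osd_uuid: Block /dev/nvme0n1p1 has OSD UUID "
        "00000000-0000-0000-0000-000000000000")
print(is_nil_osd_uuid(osd_uuid_from_log(line)))
```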

History

#1 Updated by Sage Weil about 3 years ago

  • Project changed from Ceph to bluestore
  • Category deleted (OSD)

#2 Updated by Sage Weil almost 3 years ago

  • Status changed from New to Rejected

The WAL (or journal) is an integral part of the store. The data store cannot be reconstructed without it.
