Bug #62438

object-map rebuild might cause rbd data corruption

Added by Igor Fedotov 9 months ago. Updated 9 months ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy, reef
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The following data corruption scenario is 100% reproducible in my vstart cluster (built from the main branch). I have also seen reports of the same issue on real 17.2.5 clusters.

1) ceph osd pool create rbd 64 64; rbd create --image-feature layering,exclusive-lock -s 10G rbd/corruption; rbd map rbd/corruption; mkfs.ext4 /dev/rbd0; rbd feature enable rbd/corruption object-map,fast-diff

2) rbd object-map rebuild rbd/corruption; mount /dev/rbd0 /mnt/rbd0; dd if=/dev/urandom of=/mnt/rbd0/random bs=4k count=1000000; umount /mnt/rbd0; rbd unmap /dev/rbd0

3) rbd map rbd/corruption; e2fsck -n -f /dev/rbd0; rbd unmap /dev/rbd0

As a result, e2fsck in step 3 reports errors:

e2fsck 1.46.5 (30-Dec-2021)
Pass 1: Checking inodes, blocks, and sizes
Inode 12 has an invalid extent node (blk 33795, lblk 0)
Clear? no

Inode 12 extent tree (at level 1) could be shorter. Optimize? no

Inode 12, i_blocks is 8000008, should be 0. Fix? no

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(9264--31791) -33795 -(34816--98303) -(100352--163839) -(165888--229375) -(231424--294911) -(296960--524287) -(532512--555039) -(557056--819199) -(821248--884735) -(886784--969279) -(2555904--2621439)
Fix? no

/dev/rbd0: *** WARNING: Filesystem still has errors ***
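
For convenience, the three steps can be wrapped into a single reproducer script. This is only a sketch of the commands above: it assumes a root shell against a scratch (e.g. vstart) cluster, that the image gets mapped to /dev/rbd0, and that /mnt/rbd0 can be used as the mount point.

#!/bin/sh
# Reproducer sketch for the scenario described above. Assumptions: runs as
# root against a disposable (e.g. vstart) cluster, the kernel assigns
# /dev/rbd0 on map, and /mnt/rbd0 is usable as a mount point.
set -e

# Step 1: create the image without object-map, write a filesystem, then
# enable object-map,fast-diff on the already-populated image.
ceph osd pool create rbd 64 64
rbd create --image-feature layering,exclusive-lock -s 10G rbd/corruption
rbd map rbd/corruption
mkfs.ext4 /dev/rbd0
rbd feature enable rbd/corruption object-map,fast-diff

# Step 2: rebuild the object map, then write data through the filesystem.
rbd object-map rebuild rbd/corruption
mkdir -p /mnt/rbd0
mount /dev/rbd0 /mnt/rbd0
dd if=/dev/urandom of=/mnt/rbd0/random bs=4k count=1000000
umount /mnt/rbd0
rbd unmap /dev/rbd0

# Step 3: re-map and run a read-only fsck; e2fsck exits non-zero when it
# finds errors such as the invalid extent nodes and bitmap differences above.
rbd map rbd/corruption
e2fsck -n -f /dev/rbd0 || echo "filesystem errors detected"
rbd unmap /dev/rbd0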
